Category: data center

  • AnandTech – Applied Micro's X-Gene: The First ARMv8 SoC

    APM expects that even with a late 2012 launch it will have a 1 – 2 year lead on the competition. If it can get the X-Gene out on time (hitting power and clock targets, both very difficult goals), the headstart will be tangible. Note that by the end of 2012 we'll only just begin to see the first Cortex A15 implementations. ARMv8 based competitors will likely be a full year out, at least. 

    via AnandTech – Applied Micro's X-Gene: The First ARMv8 SoC.

    Chip Diagram for the ARM version 8 as implemented by APM

    It's nice to get confirmation of the production timelines for the Cortex A15 and the next-generation ARM version 8 architecture. Don't expect to see shipping chips, much less finished product using those chips, until well into 2013 or even later. As for the 4-core Cortex A15, finished product will not appear until well into 2012. This means that if Intel is able to scramble, it has time to further refine its Atom chips to reach the power level and Thermal Design Power (TDP) of the competing ARM version 8 architecture. The goal seems to be to jam more cores into each CPU socket than is currently done on the Intel architecture (up to almost 32 in one of the graphics presented with the article).

    The target we are talking about is 2W per core @ 3GHz, and it is going to be a hard, hard target to hit for any chip designer or manufacturer. One can only hope that TSMC can help APM get a finished chip out the door on its finest-ruling production lines (although an update to the article indicates it will ship on 40nm to get it out the door quicker). The finer the ruling of signal lines on the chip, the lower the TDP and the higher the clock rate can go. If APM can hit that goal of 2W per CPU core @ 3GHz, I think everyone will be astounded. And if this same chip can be sampled at the earliest prototype stages by a current ARM server manufacturer, say Calxeda or even SeaMicro, then hopefully we can get benchmarks showing what kind of performance to expect from the ARMv8 architecture and instruction set. These will be interesting times.
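
    Just to put that 2W-per-core @ 3GHz target in perspective, here's a quick back-of-envelope sketch in Python. The 32-cores-per-socket figure comes from the graphic mentioned above; the Xeon numbers used for comparison are my own rough assumptions, not figures from the article.

    ```python
    # Back-of-envelope: what a 2W-per-core ARMv8 socket looks like.
    # From the article: 2W per core at 3GHz, up to ~32 cores per socket.
    # The Xeon TDP and core count below are assumptions for comparison.
    watts_per_core = 2.0
    cores_per_socket = 32

    socket_watts = watts_per_core * cores_per_socket
    print(f"X-Gene-class socket: {cores_per_socket} cores @ 3GHz ~= {socket_watts:.0f}W")

    xeon_tdp_watts = 95.0   # assumed typical server Xeon TDP of the era
    xeon_cores = 6          # assumed core count
    print(f"Watts per core: ARMv8 {watts_per_core:.1f}W vs Xeon ~{xeon_tdp_watts / xeon_cores:.1f}W")
    ```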

    Intel Atom CPU Z520, 1.33GHz
    Image via Wikipedia
  • Fusion plays its card: The Ten of Terabytes • The Register

    Fusion-io has crammed eight ioDrive flash modules on one PCIe card to give servers 10TB of app-accelerating flash.

    This follows on from its second generation ioDrives: PCIe-connected flash cards using single level cell and multi-level cell flash to provide from 400GB to 2.4TB of flash memory, which can be used by applications to get stored data many times faster than from disk. By putting eight 1.28TB multi-level cell ioDrive 2 modules on a single wide ioDrive Octal PCIe card Fusion reaches a 10TB capacity level.

    via Fusion plays its card: The Ten of Terabytes • The Register.

    Image representing Fusion-io, via CrunchBase

    This is some big news in the fight to be king of the PCIe SSD market. I declare: Advantage Fusion-io. They now have the lead in terms of not just speed but also overall capacity at the price point they have targeted. As densities increase and prices more or less stay flat, the value-add is that more data can stay resident on the PCIe card instead of being swapped out to Fibre Channel array storage on the Storage Area Network (SAN). Performance is likely to be wicked cool, and early adopters will no doubt reap big benefits in transaction processing and online analytic processing as well.
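
    The arithmetic behind the headline capacity is worth spelling out, since it's just the eight ioDrive 2 modules from the quote added together:

    ```python
    # The ioDrive Octal headline number is eight ioDrive 2 modules added up.
    modules = 8
    tb_per_module = 1.28
    print(f"{modules} x {tb_per_module}TB = {modules * tb_per_module:.2f}TB, marketed as 10TB")
    ```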

  • Intel Responds to Calxeda/HP ARM Server News (Wired.com)

    Now, you're probably thinking, isn't Xeon the exact opposite of the kind of extreme low-power computing envisioned by HP with Project Moonshot? Surely this is just crazy talk from Intel? Maybe, but Walcyzk raised some valid points that are worth airing.

    via Cloudline | Blog | Intel Responds to Calxeda/HP ARM Server News: Xeon Still Wins for Big Data.

    Structure of the TILE64 mesh network processor from Tilera
    Image via Wikipedia
    Image representing Tilera, via CrunchBase

    So Intel gets an interview with a Condé Nast writer for a sub-blog of Wired.com. I doubt many purchasers or data center architects consult Cloudline@Wired.com. All the same, I saw through the many thinly veiled bits of handwaving and old saws from Intel saying, "Yes, this exists, but we're already addressing it with our existing product lines . . ." So I wrote a comment on this very article, especially regarding a throw-away line mentioning the 'future' of the data center and the direction the Data Center and Cloud Computing market is headed. However, the moderator never published the comment. In effect, I raised the question: whither Tilera? And the Quanta SM-2 server based on the Tilera chip?

    Aren't they exactly what the author John Stokes describes as a network of cores on a chip? And given the scale of Tilera's own product plans going forward, and the fact that they are not just concentrating on network gear but on actual compute clouds too, I'd say both Stokes and Walcyzk are asking the wrong questions and directing our attention in the wrong direction. This is not a PR battle but a flat-out technology battle. You cannot win it with words and white papers; it requires benchmarks, deployments and case histories. Technical merit and superior technology will differentiate the players in the Cloud-in-a-Box race. That hasn't been the case in the past as Intel battled AMD in the desktop consumer market, but in the data center, Fear, Uncertainty and Doubt is the only weapon Intel has.

    And I’ll quote directly from John Stokes’s article here describing EXACTLY the kind of product that Tilera has been shipping already:

    “Instead of Xeon with virtualization, I could easily see a many-core Atom or ARM cluster-on-a-chip emerging as the best way to tackle batch-oriented Big Data workloads. Until then, though, it’s clear that Intel isn’t going to roll over and let ARM just take over one of the hottest emerging markets for compute power.”

    The key phrase here is cluster on a chip: in essence, exactly what Tilera has strived to achieve with its TILE64-based architecture. For a review, see the previous blog entries on this website following the announcements and timelines published by Tilera.
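
    To make the cluster-on-a-chip argument a little more concrete, here's a toy throughput model comparing many small cores against a few big cores under the same socket power budget. Every number in it (the budget, the per-core wattage, the relative per-core performance) is an illustrative assumption of mine, not a Tilera, Intel or ARM figure, and it assumes a perfectly parallel batch workload:

    ```python
    # Toy model: aggregate throughput of many small cores vs a few big cores
    # under the same socket power budget, for an embarrassingly parallel
    # batch workload (no serial bottleneck assumed).
    # All figures are illustrative assumptions, not vendor numbers.
    power_budget_watts = 100.0
    small_core = {"watts": 2.0, "relative_perf": 0.3}   # Atom/ARM/Tilera-class
    big_core = {"watts": 12.5, "relative_perf": 1.0}    # Xeon-class

    def cluster_throughput(core, budget):
        cores = int(budget // core["watts"])
        return cores, cores * core["relative_perf"]

    for name, core in (("small cores", small_core), ("big cores", big_core)):
        n, agg = cluster_throughput(core, power_budget_watts)
        print(f"{name}: {n} cores in {power_budget_watts:.0f}W, aggregate throughput {agg:.1f}x")
    ```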

  • ARM specs out first 64-bit RISC chips • The Register

    The ARM RISC processor is getting true 64-bit processing and memory addressing – removing the last practical barrier to seeing an army of ARM chips take a run at the desktops and servers that give Intel and AMD their moolah.

    via ARM specs out first 64-bit RISC chips • The Register.

    The downside to this announcement is the timeline ARM lays out for the first generation of chips to use the new v8 architecture. Due to limited demand, as ARM defines it, chips will not be shipping until 2013 or as late as 2014. However, according to this Register article, the existing IT data center infrastructure will not adopt ANY ARM-based chips until they are designed as a 64-bit clean architecture. Sounds like the potential for a chicken-and-egg scenario, except ARM will get that egg out the door on schedule with TSMC as its test chip partner. Other details from the article: the recently announced top-end Cortex A15 already addresses more than 32 bits of memory through a workaround that allows enterprising programmers to address as many as 40 bits of memory if they need it. The best argument made for the real market need for 64-bit memory addressing is programmers currently on other chip architectures who might want to port their apps to ARM. THEY are the real target market for the v8 architecture, and will have a much easier time porting over to another chip architecture that has the same level of memory addressing capability (64 bits all around).
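
    For a sense of what those extra address bits actually buy, the powers of two work out like this (the 40-bit figure is the Cortex A15 workaround described in the article; the rest is plain arithmetic):

    ```python
    # What each addressing width can reach, expressed in GiB.
    def addressable_gib(bits):
        return 2 ** bits / 2 ** 30

    for bits in (32, 40, 64):
        print(f"{bits}-bit addressing: {addressable_gib(bits):,.0f} GiB")
    # 32 bits -> 4 GiB, 40 bits -> 1,024 GiB (1 TiB), 64 bits -> 16 EiB worth
    ```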

    As for companies like Calxeda, which are adopting the Cortex A15 architecture and the current Cortex A9 chips (both of which fall under the previous-generation v7 architecture), 32 bits of memory addressing (4GB in total) is enough to get by, depending on the application being run. Highly parallel apps or simple things like single-threaded webservers will perform well under these circumstances, according to The Register. And I am inclined to believe this based on the current practices of data center giants like Facebook and Google (virtualization is sacrificed for massively parallel architectures). Also, given the plans folks like Calxeda have for hardware interconnects, the ability of all those low-power 32-bit chips to communicate with one another holds a lot of promise too. I'm still curious to see whether Calxeda can come up with a unique product using the 64-bit ARM v8 architecture when the chip is finally taped out and test chips are shipped by TSMC.

  • HP hooks up with Calxeda to form server ARMy • The Register

    Calxeda is producing 4-core, 32-bit, ARM-based system-on-chip (SoC) designs, developed from ARM's Cortex A9. It says it can deliver a server node with a thermal envelope of less than 5 watts. In the summer it was designing an interconnect to link thousands of these things together. A 2U rack enclosure could hold 120 server nodes: that's 480 cores.

    via HP hooks up with Calxeda to form server ARMy • The Register.

    EnergyCore prototype card
    The first attempt at making an OEM compute node from Calxeda

    HP signing on as an OEM for Calxeda-designed equipment is going to push ARM-based massively parallel server designs into a lot more data centers. Add to this the announcement of the new Cortex A15 CPU and its timeline for addressing 64-bit memory, and you have a battle royale shaping up against Intel. Currently the Intel Xeon is the preferred choice for applications requiring large amounts of DRAM to hold whole databases and memcached webpages for lightning-quick fetches. At the other end of the scale are the low-power 4-core ARM SoCs, each server node dissipating a mere 5 watts. Intel is trying to drive down the Thermal Design Power of its chips, even resorting to 64-bit Atom chips to keep the memory-addressing advantage. But the timeline for decreasing that TDP doesn't quite match up with the 64-bit ARM timeline. So I suspect ARM will have the advantage, as will Calxeda, for quite some time to come.
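
    Working the density and power out from The Register's numbers is instructive. The 120 nodes per 2U, 4 cores per node and 5 watts per node come from the quote above; the 42U rack used to extrapolate is my own assumption:

    ```python
    # Density and power from the quoted figures: 120 nodes per 2U,
    # 4 cores per node, under 5W per node. 42U rack is my assumption.
    nodes_per_2u = 120
    cores_per_node = 4
    watts_per_node = 5.0

    cores_per_2u = nodes_per_2u * cores_per_node      # 480, as quoted
    watts_per_2u = nodes_per_2u * watts_per_node      # worst case ~600W
    enclosures_per_rack = 42 // 2

    print(f"Per 2U enclosure: {cores_per_2u} cores at <= {watts_per_2u:.0f}W")
    print(f"Per 42U rack: {cores_per_2u * enclosures_per_rack} cores "
          f"at <= {watts_per_2u * enclosures_per_rack / 1000:.1f}kW")
    ```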

    While I had hoped the recent Cortex A15 announcement was also going to usher in a fully 64-bit capable CPU, it will at least be able to fake larger memory access: the datapath I remember being quoted was 40 bits wide, and that can be further extended using software. And it doesn't seem to have discouraged HP at all, who are testing the Calxeda-designed prototype EnergyCore evaluation board. This is all new territory for both Calxeda and HP, so a fully engineered and designed prototype is absolutely necessary to get this project off the ground. My hope is HP can do a large-scale test and figure out some of the software configuration optimization that needs to occur to gain an advantage in power savings, density and speed over an Intel Atom server (like SeaMicro's).

  • U.S. Requests for Google User Data Spike 29 Percent in Six Months | Threat Level | Wired.com

    Image representing Google, via CrunchBase

    The number of U.S. government requests for data on Google users for use in criminal investigations rose 29 percent in the last six months, according to data released by the search giant Monday.

    via U.S. Requests for Google User Data Spike 29 Percent in Six Months | Threat Level | Wired.com.

    Not good news, imho. The reason: the mission creep and abuses that come with absolute power in the form of a National Security Letter. The other part of the equation is that Google's business model runs opposite to the idea of protecting people's information. If you disagree, I ask that you read this blog post from Christopher Soghoian, where he details just what it is Google does when it keeps all your data unencrypted in its data centers. In order to sell AdWords and serve advertisements to you, Google needs to keep everything open and unencrypted. To be fair, they aren't careless stewards of your data, but they do respond to law enforcement requests for customer data. To quote Soghoian at the end of his blog entry:

    "The end result is that law enforcement agencies can, and regularly do request user data from the company — requests that would lead to nothing if the company put user security and privacy first."

    And that indeed is the moral of the story, which leaves everyone asking: what's the alternative? Earlier in the same story the blame is placed squarely on the end-user for not protecting themselves. Encryption tools for email and personal documents have been around for a long time, and there are often commercial products available to help achieve some level of privacy even for so-called Cloud-hosted data. But the friction points are always going to be familiarity, ease of use and cost before any such product is as widely used and adopted as webmail has been since the days of desktop email clients like Eudora.

    So if you really have concerns, take action; don't wait for Google to act to defend your rights. Encrypt your email and your documents, and make Google one bit less culpable for any law enforcement requests that may or may not include your personal data.
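
    For anyone who wants to act on that advice for documents headed to the cloud, here is one possible minimal sketch of client-side encryption using the Python cryptography package's Fernet recipe. This is not something the article prescribes; the filenames are placeholders, and key management, arguably the hard part, is left out entirely:

    ```python
    # Client-side encryption sketch using the "cryptography" package
    # (pip install cryptography). Encrypt locally, upload only the
    # ciphertext; the provider never holds plaintext or the key.
    # Filenames are placeholders; key storage is up to you.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()   # keep this safe, and NOT in the same cloud
    cipher = Fernet(key)

    with open("my_document.txt", "rb") as f:
        ciphertext = cipher.encrypt(f.read())

    with open("my_document.txt.enc", "wb") as f:
        f.write(ciphertext)       # this is the only thing you upload

    # Later, after pulling the ciphertext back down:
    with open("my_document.txt.enc", "rb") as f:
        plaintext = cipher.decrypt(f.read())
    ```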

  • AnandTech – ARM & Cadence Tape Out 20nm Cortex A15 Test Chip

    Wordmark of Cadence Design Systems
    Image via Wikipedia

    The test chip will be fabbed at TSMC on its next-generation 20nm process, a full node reduction (~50% transistor scaling) over its 28nm process. With the first 28nm ARM based products due out from TSMC in 2012, this 20nm tape-out announcement is an important milestone but we're still around two years away from productization. 

    via AnandTech – ARM & Cadence Tape Out 20nm Cortex A15 Test Chip.

    Data Centre
    Image by Route79 via Flickr (Now that's scary isn't it! Boo!)

    Happy Halloween! And like most years, there are some tricks up ARM's sleeve announced this past week, along with some partnerships that should make things trickier for the engineers trying to equip ever more energy-efficient and dense data centers the world over.

    It's been announced: the Cortex A15 is coming to market some time in the future, albeit a ways off yet. And it's going to use a really narrow design rule to ensure it's as low-power as it possibly can be. I know the manufacturers of massively parallel compute-cloud-in-a-box products will be seeking out this chip as soon as samples can arrive. The 64-bit follow-on (ARM v8) is the real potential jewel in the crown for Calxeda, which is attempting to balance low power and 64-bit performance in the same design.
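
    As for the "full node reduction (~50% transistor scaling)" claim in the AnandTech quote, that figure is just the square of the feature-size ratio under ideal scaling; a quick sketch:

    ```python
    # Ideal-scaling arithmetic behind "~50% transistor scaling":
    # area per transistor shrinks roughly with the square of the feature size
    # (real processes fall somewhat short of this).
    old_nm, new_nm = 28.0, 20.0
    area_ratio = (new_nm / old_nm) ** 2
    print(f"20nm area per transistor ~= {area_ratio:.0%} of 28nm, "
          f"i.e. roughly twice the transistors in the same die area")
    ```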

    I can't wait to see the first benchmarks of these chips, apart from whatever benchmarks come out of the first shipping product Calxeda can build around the 64-bit parts. Also note that just this week Hewlett-Packard signed on to sell Calxeda designs in forthcoming servers targeted at energy-efficient data center build-outs. So more news to come regarding that partnership, and you can read it right here @ Carpetbomberz.com.

  • AnandTech – OCZ Z-Drive R4 CM88 1.6TB PCIe SSD Review

    In the enterprise segment where 1U and 2U servers are common, PCI Express SSDs are very attractive. You may not always have a ton of 2.5″ drive bays but there's usually at least one high-bandwidth PCIe slot unused. The RevoDrive family of PCIe SSDs were targeted at the high-end desktop or workstation market, but for an enterprise-specific solution OCZ has its Z-Drive line.

    via AnandTech – OCZ Z-Drive R4 CM88 1.6TB PCIe SSD Review.

    AnandTech is breaking new ground covering some enterprise-level segments of the Solid State Disk industry. While I doubt they'll be rating Violin and Texas Memory Systems gear any time soon, OCZ's low-end enterprise PCIe cards are beginning to approach that territory. We're talking $10,000 USD and up for anyone who wants to participate, which puts the Z-Drive in the middle to high end of Fusion-io's range and barely touches the lower end of Violin and TMS, not to mention Virident. Even so, it is still wild to see what kind of architecture and performance optimization one gets for the money. SandForce rules the day at OCZ for anything requiring top write speeds. It's also interesting to learn that the SandForce 25xx series uses supercapacitors to hold enough reserve power to flush the write caches on a power outage. It's expensive, but it moves the product up a few notches on the enterprise reliability scale.
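
    That supercapacitor detail matters because a drive with a volatile write cache can acknowledge a write that hasn't actually reached flash yet. On the application side the usual belt-and-braces move is an explicit flush and fsync, sketched below; this is a generic pattern, not anything specific to OCZ or SandForce, and the file name is a placeholder:

    ```python
    # Durable-write pattern: make sure data has left the application and the
    # OS page cache before treating a write as committed. With power-loss
    # protection (supercapacitors flushing the drive's write cache), the drive
    # can safely acknowledge from cache; without it, data can still be lost
    # between the fsync and the actual flash write.
    import os

    def durable_append(path, payload: bytes):
        with open(path, "ab") as f:
            f.write(payload)
            f.flush()               # push from Python's buffer to the OS
            os.fsync(f.fileno())    # ask the OS to push down to the device

    durable_append("transactions.log", b"order=42,status=paid\n")
    ```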

  • $1,279-per-hour, 30,000-core cluster built on Amazon EC2 cloud

    Amazon Web Services logo
    Image via Wikipedia

    Amazon EC2 and other cloud services are expanding the market for high-performance computing. Without access to a national lab or a supercomputer in your own data center, cloud computing lets businesses spin up temporary clusters at will and stop paying for them as soon as the computing needs are met.

    via $1,279-per-hour, 30,000-core cluster built on Amazon EC2 cloud.

    If you own your data center, you might be a little nervous right now, as even a data center can be outsourced on an as-needed basis. Especially if you are doing scientific computing, you should weigh the sunk capital costs of acquiring and maintaining a cluster after it is up and running. This story provides one great example of what I think cloud computing could one day become. Rent-a-Center-style data centers and compute clusters seem like an incredible value, especially for a university, but even more so for a business that may not need to keep a real live data center under its control. Examples abound: even online services like Dropbox lease their compute cycles from the likes of Amazon Web Services and the Elastic Compute Cloud (EC2). And if migrating an application into a data center, along with the data set to be analyzed, can be sped up sufficiently and the cost kept down, who knows what might be possible.

    The opportunities are many when it comes to having access to a sufficiently large number of nodes in a compute cluster. With modeling applications especially, you get to run a simulation at finer time slices and higher resolution, possibly gaining a better understanding of how closely your algorithms match the real world. This isn't just for business but for science as well, and I think being saddled with a typical data center installation, with its infrastructure, depreciation and staffing costs, seems much less attractive if the big data center providers are willing to sell their spare compute cycles at a reasonable rate. The best part is you can shop around, too. In the bad old days of batch computing and the glassed-in data center, before desktops and mini-computers, people were dying to get access to the machine and run their jobs. Now the surplus of computing cycles is so great for the big players that they help subsidize the costs of build-outs and redundancy by letting people bid on the spare compute cycles they have just lying around generating heat. It's a whole new era of compute-cycle auctions, and I for one am dying to see more stories like this in the future.
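
    The headline price works out to a surprisingly small per-core figure. Here's the arithmetic, plus a purely hypothetical break-even against an owned cluster; the $5M capital-plus-operations figure is an assumption of mine, not anything from the article:

    ```python
    # Cost arithmetic from the headline: $1,279/hour for 30,000 cores.
    hourly_cost = 1279.0
    cores = 30_000
    print(f"~${hourly_cost / cores:.4f} per core-hour")   # a bit over 4 cents

    # Hypothetical break-even against an owned cluster; the capital figure
    # below is purely an assumption for illustration.
    owned_cluster_cost = 5_000_000.0
    breakeven_hours = owned_cluster_cost / hourly_cost
    print(f"Break-even vs a ${owned_cluster_cost:,.0f} owned cluster: "
          f"{breakeven_hours:,.0f} rented hours "
          f"(~{breakeven_hours / (24 * 365):.1f} years of 24/7 use)")
    ```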

  • David May, parallel processing pioneer • reghardware

    INMOS T800 Transputer
    Image via Wikipedia

    The key idea was to create a component that could be scaled from use as a single embedded chip in dedicated devices like a TV set-top box, all the way up to a vast supercomputer built from a huge array of interconnected Transputers.

    Connect them up and you had, what was, for its era, a hugely powerful system, able to render Mandelbrot Set images and even do ray tracing in real time – a complex computing task only now coming into the reach of the latest GPUs, but solved by British boffins 30-odd years ago.

    via David May, parallel processing pioneer • reghardware.

    I remember the Transputer. I remember seeing ISA-based add-on cards for desktop computers back in the 1980s. They would advertise in the back of the popular computer technology magazines of the day. And while it seemed really mysterious what you could do with a Transputer, the price premium on those boards made you realize it must have been pretty magical.

    Most recently, while attending a workshop on open source software, I met a couple of former employees of a famous manufacturer of camera film. In their research labs these guys used to build custom machines using arrays of Transputers to speed up image processing tasks inside the products they were developing. Knowing that there are even denser architectures today, using chips like Tilera, Intel Atom and ARM, absolutely blows them away. The old Transputer arrays don't come close on price/performance.

    Software was probably the biggest point of friction, in that the tools needed to integrate the Transputer into an overall design required another level of expertise. That is true too of the general-purpose GPU (GPGPU) computing that nVidia championed and now markets with its Tesla product line. And the Chinese have created a hybrid supercomputer mating Tesla boards with commodity CPUs. It's too bad that the economics of designing and producing the Transputer didn't scale over time (the way they have for Intel, by comparison). Clock speeds also fell behind, which allowed general-purpose microprocessors to spend the extra clock cycles performing the same calculations, only faster. That was also the advantage RISC chips held until they could no longer overcome the performance increases Intel designed in.