Categories
cloud data center fpga science & technology

MIT Puts 36-Core Internet on a Chip | EE Times

Partially connected mesh topology (Photo credit: Wikipedia)

Today many different interconnection topologies are used for multicore chips. For as few as eight cores direct bus connections can be made — cores taking turns using the same bus. MIT’s 36-core processors, on the other hand, are connected by an on-chip mesh network reminiscent of Intel’s 2007 Teraflop Research Chip — code-named Polaris — where direct connections were made to adjacent cores, with data intended for remote cores passed from core-to-core until reaching its destination. For its 50-core Xeon Phi, however, Intel settled instead on using multiple high-speed rings for data, address, and acknowledgement instead of a mesh.

via MIT Puts 36-Core Internet on a Chip | EE Times.
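
The core-to-core passing described in the pull-quote can be sketched in a few lines. This is only a rough illustration: it assumes a 6 x 6 grid for the 36 cores and a simple dimension-ordered (XY) routing scheme, neither of which is confirmed by the article, and the function name `xy_route` is made up for this example.

```python
# Sketch: hop-by-hop routing on a mesh where each core only links to
# its adjacent neighbours. Assumptions (not from the article): a 6 x 6
# layout and dimension-ordered XY routing.

MESH_WIDTH = 6  # 6 x 6 = 36 cores


def xy_route(src, dst, width=MESH_WIDTH):
    """Return the list of core IDs a packet visits going from src to dst."""
    x, y = src % width, src // width
    dst_x, dst_y = dst % width, dst // width
    path = [src]
    # Move along the row first, one adjacent core at a time.
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append(y * width + x)
    # Then move along the column.
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append(y * width + x)
    return path


if __name__ == "__main__":
    corner_to_corner = xy_route(0, 35)
    print(corner_to_corner)
    print(len(corner_to_corner) - 1, "hops")
```

Under those assumptions the worst case, corner to corner, is 10 hops, which is the price a mesh pays for avoiding one shared bus that every core has to take turns on.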

I commented some time back on a similar article on the same topic. It now appears the MIT research group has working silicon of the design. As mentioned in the pull-quote, the Xeon Phi (which has made some news in the Top 500 supercomputer stories recently) is a massively multicore architecture but uses a different interconnect that Intel designed on its own. Stories like these get filed into the category of massively multicore or low power CPU developments. Most times these CPUs add cores without drawing significantly more power and thus provide a net increase in compute ability. Tilera, Calxeda and yes, even SeaMicro were all working towards those ends. Whether through mergers or cuts in funding, each one has seemed to trail off and not succeed at its original goal (massively multicore, low power designs). Also along the way Intel has done everything it can to dull and dent the novelty of the new designs by revising an Atom based or Celeron based CPU to provide much lower power at the scale of maybe 2 cores per CPU.

Like the chip MIT just announced, Tilera too was originally an MIT research product spun off from the university campus. Its principals were the PI and a research associate, if I remember correctly. Now that MIT has working silicon, they’re going to benchmark, test and verify their design. The researchers will release the Verilog hardware description of the chip for anyone to use, research or verify for themselves once they’ve completed their own study. It will be interesting to see how much of an incremental improvement this design provides, and it could possibly be the launch of another Tilera style product out of MIT.

Categories
cloud computers data center google technology

Tilera | Wired Enterprise | Wired.com

Tilera’s roadmap calls for its next generation of processors, code-named Stratton, to be released in 2013. The product line will expand the number of processors in both directions, down to as few as four and up to as many as 200 cores. The company is going from a 40-nm to a 28-nm process, meaning they’re able to cram more circuits in a given area. The chip will have improvements to interfaces, memory, I/O and instruction set, and will have more cache memory.

via Tilera | Wired Enterprise | Wired.com.

I’m enjoying the survey of companies doing massively parallel, low power computing products. Wired.com|Enterprise started last week with a look at SeaMicro and how the two principal founders got their start observing Google’s initial stabs at a warehouse-sized computer. Since that time things have fractured somewhat instead of coalescing, and now three big attempts are competing to fulfil the promise of the low power, massively parallel computer in a box. Tilera is a longer-running startup out of MIT, going back further than Calxeda or SeaMicro.

However, application of this technology has been completely dependent on the software. Whether it be OSes or applications, they all have to be constructed carefully to take full advantage of the Tile processor architecture. To their credit, Tilera has attempted to insulate application developers from some of the vagaries of the underlying chip by creating an OS that does the heavy lifting of queuing and scheduling. But still, there’s got to be a learning curve there, even if it isn’t quite as daunting as the one facing folks who develop applications for the supercomputers at National Labs here in the U.S. Suffice it to say it’s a non-trivial choice to adopt a Tilera CPU for a product/project you are working on. And the people who need a Tilera GX CPU for their app already know all they need to know about the chip in advance. It’s that kind of choice they are making.

I’m also relieved to know they are continuing development to shrink down the design rules. Intel, the biggest leader in silicon semiconductor manufacturing, continues to shrink its design, development and manufacturing design rules. We’re fast approaching a 20nm-18nm production line in both Oregon and Arizona. Both are Intel fabrication plants and they’re not about to stop and take a breath. Companies like Tilera, Calxeda and SeaMicro need to do continuous development on their products to keep from being blindsided by Intel’s continuous product development juggernaut. So Tilera is very wise to shrink its design rule from 40nm down to 28nm as fast as it can and then get good yields on the production lines once they start sampling chips at this size.
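
For a back-of-the-envelope feel of what a shrink like that buys, here is a small sketch. It assumes ideal scaling, where the area of a given circuit shrinks with the square of the feature size. Real processes rarely scale that cleanly, and node names are not literal measurements, so treat the output as a rough upper bound. The function name is made up for this example.

```python
# Sketch: idealized area scaling between process nodes.
# Assumption: circuit area scales with the square of the node size,
# which real manufacturing processes only approximate.

def ideal_area_ratio(old_nm, new_nm):
    """Fraction of the original area the same circuit would need."""
    return (new_nm / old_nm) ** 2


if __name__ == "__main__":
    for old_nm, new_nm in [(40, 28), (28, 20)]:
        ratio = ideal_area_ratio(old_nm, new_nm)
        print(f"{old_nm}nm -> {new_nm}nm: about {ratio:.0%} of the area")
```

Both the 40nm to 28nm step and the 28nm to 20nm step come out at roughly half the area, which is why each counts as a full node.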

*UPDATE: Just saw this run through my blogroll last week. Tilera has announced a new chip coming in March. Glad to see Tilera is still duking it out, battling for design wins with manufacturers selling into the Data Center. Larger memory addressing will help make the Tilera chips more competitive in commodity Intel hardware shops, and maybe we’ll see full 64bit memory extensions at some point as a follow-on to the 40bit address space extensions being touted in this article from The Register.

English: Block diagram of the Tilera TILEPro64...
Image via Wikipedia
Categories
cloud computers data center technology

HP hooks up with Calxeda to form server ARMy • The Register

Calxeda is producing 4-core, 32-bit, ARM-based system-on-chip (SoC) designs, developed from ARM’s Cortex A9. It says it can deliver a server node with a thermal envelope of less than 5 watts. In the summer it was designing an interconnect to link thousands of these things together. A 2U rack enclosure could hold 120 server nodes: that’s 480 cores.

via HP hooks up with Calxeda to form server ARMy • The Register.

EnergyCore prototype card
The first attempt at making an OEM compute node from Calxeda

HP signing on as an OEM for Calxeda designed equipment is going to push ARM based massively parallel server designs into a lot more data centers. Add to this the announcement of the new ARM Cortex-A15 CPU and its timeline for addressing 64-bit memory and you have a battle royale going up against Intel. Currently the Intel Xeon is the preferred choice for applications requiring large amounts of DRAM to hold whole databases and Memcached webpages for lightning quick fetches. On the other end of the scale are the low power 4 core ARM chips dissipating a mere 5 watts. Intel is trying to drive down the Thermal Design Power (TDP) for their chips, even resorting to 64bit Atom chips to keep the memory addressing advantage. But the timeline for decreasing the TDP doesn’t quite match up to the 64-bit ARM timeline. So I suspect ARM will have the advantage, as will Calxeda, for quite some time to come.

While I had hoped the recent Cortex-A15 announcement was also going to usher in a fully 64-bit capable CPU, it will at least be able to fake larger size memory access. The address width I remember being quoted was 40 bits, and that can be further extended using software. And it doesn’t seem to have discouraged HP at all, who are testing the Calxeda designed prototype EnergyCore evaluation board. This is all new territory for both Calxeda and HP, so a fully engineered and designed prototype is absolutely necessary to get this project off the ground. My hope is HP can do a large scale test and figure out some of the software configuration optimization that needs to occur to gain an advantage in power savings, density and speed over an Intel Atom server (like SeaMicro).
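
To put the 40-bit figure in perspective, here is a quick sketch of how much memory each address width can reach. It assumes byte-addressable memory and ignores whatever limits a real chip, board or OS would add. The helper names are made up for this example.

```python
# Sketch: how much memory a given address width can reach.
# Assumption: byte-addressable memory; real platform limits are ignored.

UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def human_size(num_bytes):
    """Format a power-of-two byte count using binary units."""
    unit = 0
    while num_bytes >= 1024 and unit < len(UNITS) - 1:
        num_bytes //= 1024
        unit += 1
    return f"{num_bytes} {UNITS[unit]}"


if __name__ == "__main__":
    for bits in (32, 40, 64):
        print(f"{bits}-bit addressing: {human_size(addressable_bytes(bits))}")
```

So 40 bits is a jump from 4 GiB to 1 TiB, a long way short of a full 64-bit space but plenty for the kind of in-memory caching a server node like this might do.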

Categories
cloud computers data center mobile technology

AnandTech – ARM & Cadence Tape Out 20nm Cortex A15 Test Chip

The test chip will be fabbed at TSMC on its next-generation 20nm process, a full node reduction (~50% transistor scaling) over its 28nm process. With the first 28nm ARM based products due out from TSMC in 2012, this 20nm tape-out announcement is an important milestone but we’re still around two years away from productization.

via AnandTech – ARM & Cadence Tape Out 20nm Cortex A15 Test Chip.

Data Centre
Image by Route79 via Flickr (Now that's scary isn't it! Boo!)

Happy Halloween! And like most years there are some tricks up ARM’s sleeve announced this past week along with some partnerships that should make things trickier for the Engineers trying to equip ever more energy efficient and dense Data Centers the world over.

It’s been announced: the ARM Cortex-A15 is coming to market some time in the future, albeit a ways off yet. And it’s going to be using a really narrow design rule to ensure it’s as low power as it possibly can be. I know manufacturers of the massively parallel compute cloud in a box will be seeking out this chip as soon as samples can arrive. The 64bit version of the Cortex-A15 is the real potential jewel in the crown for Calxeda, who is attempting to balance low power and 64bit performance in the same design.

I can’t wait to see the first benchmarks of these chips, apart from the benchmarks of the first shipping product Calxeda can get out with the 64bit Cortex-A15. Also note just this week Hewlett-Packard has signed on to sell designs by Calxeda in forthcoming servers targeted at Energy Efficient Data Center build-outs. So more news to come regarding that partnership and you can read it right here @ Carpetbomberz.com

Categories
cloud data center google technology wintel

ARM server hero Calxeda lines up software super friends • The Register

Company Logo
Maker of the massively parallel ARM-based server

via ARM server hero Calxeda lines up software super friends • The Register.

Calxeda is in the news again this week with some more announcements regarding its plans. Recalling the last article I posted on Calxeda, this company boasts an ARM based server packing 120 CPUs (each with four cores) into a 2U high rack enclosure (making it just 3-1/2″ tall *see note). With every evolution in hardware one must needs get an equal if not greater revolution in software, which is the point of the announcement by Calxeda of its new software partners.

It’s mostly cloud apps, cloud provisioning and cloud management types of vendors. And with the partnership each company gets early access to the hardware Calxeda is promising to design, prototype and eventually manufacture. Both Google and Intel have pooh-poohed the idea of using “wimpy processors” on massively parallel workloads, claiming faster serialized workloads are still easier to manage through existing software/programming techniques. For many years, even as Intel has complained about the programming tools, it has still gone the multi-core/multi-thread route, hoping to continue its domination by offering up ‘newer’ and higher performing products. So while Intel bad mouths parallelism on competing CPUs, it seems to be desperate to sell multi-core to willing customers year over year.

Even as power efficient as those cores may be, Intel’s old culture of maximum performance for the money still holds sway. Even the most recent Ultra-low Voltage i-series CPUs are still hitting about 17 watts of power for chips clocking in around 1.8GHz (speed boosting up to 2.9GHz in a pinch). Even if Intel allowed these chips to be installed into servers, we’re still talking about a lot of Thermal Design Power (TDP) that has to be chilled to keep running.

Categories
computers data center mobile technology

Calxeda boasts of 5 watt ARM server node • The Register

Calxeda is not going to make and sell servers, but rather make chips and reference machines that it hopes other server makers will pick up and sell in their product lines. The company hopes to start sampling its first ARM chips and reference servers later this year. The first reference machine has 120 server nodes in a 2U rack-mounted format, and the fabric linking the nodes together internally can be extended to interconnect multiple enclosures together.

via Calxeda boasts of 5 watt ARM server node • The Register.

SeaMicro and now Calxeda are going gangbusters for the ultra dense low power server market. Unlike SeaMicro, Calxeda wants to create reference designs it licenses to manufacturers who will build machines with 120 server nodes in a 2 Unit rack enclosure. SeaMicro’s record right now is 512 cores per 10U rack, or roughly 102+ cores in a 2 Unit rack. The difference is the SeaMicro product uses an Intel low power Atom CPU, whereas Calxeda is using a processor used more often in smart phones and tablet computers. SeaMicro has hinted they are not wedded to the Intel Architecture, but they are more interested in shipping real live product than coming up with generic designs others can license. In the long run it’s entirely possible SeaMicro may switch to a different CPU; they have indicated previously they designed their servers with enough flexibility to swap out the processor for any other CPU if necessary. It would be really cool to see an apples-to-apples comparison of a SeaMicro server using first Intel CPUs versus ARM-based CPUs.
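
The density numbers are easy to check with a few lines. A sketch only: the SeaMicro figures and the 120-node count come from this post, while the four cores per node and the 5 watt ceiling per node are assumptions borrowed from the other Calxeda coverage on this page. Keep in mind an Atom core and an ARM core are not equal units of work, so this compares counts, not performance.

```python
# Sketch: cores per 2U of rack space, using figures quoted on this page.
# Assumptions: 4 cores per Calxeda node and at most 5 watts per node,
# taken from the other Calxeda posts here rather than this article.

CALXEDA_NODES_PER_2U = 120
CALXEDA_CORES_PER_NODE = 4
CALXEDA_MAX_WATTS_PER_NODE = 5

SEAMICRO_CORES = 512
SEAMICRO_RACK_UNITS = 10


def cores_per_2u(total_cores, rack_units):
    """Normalize a core count to 2 rack units of space."""
    return total_cores * 2 / rack_units


if __name__ == "__main__":
    calxeda_cores = CALXEDA_NODES_PER_2U * CALXEDA_CORES_PER_NODE
    calxeda_watts = CALXEDA_NODES_PER_2U * CALXEDA_MAX_WATTS_PER_NODE
    print("SeaMicro cores per 2U:", cores_per_2u(SEAMICRO_CORES, SEAMICRO_RACK_UNITS))
    print("Calxeda cores per 2U:", cores_per_2u(calxeda_cores, 2))
    print("Calxeda node power per 2U, upper bound in watts:", calxeda_watts)
```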