Carpet Bomberz Inc.

Focusing in on desktop and data center news and analysis

Archive for the ‘data center’ Category

SeaMicro adds Xeons to Atom smasher microservers • The Register

leave a comment »

Theres some interesting future possibilities for the SeaMicro machines. First, SeaMicro could extend that torus interconnect to span multiple chassis. Second, it could put a “Patsburg” C600 chipset on an auxiliary card and actually make fatter SMP nodes out of single processor cards and then link them into the torus interconnect. Finally, it could of course add other processors to the boards, such as Tileras 64-bit Tile Gx3000s or 64-bit ARM processors when they become available.

via SeaMicro adds Xeons to Atom smasher microservers • The Register.

SeaMicro SM10000

SeaMicro SM10000 (Photo credit: blogeee.net)

Timothy Prickett Morgan writing for The Register, has a great article on SeaMicro’s recent announcement of a Xeon-based 10U server chassis. Seemingly going against it’s first two generations of low power massively parallel server boxes, this one uses a brawny Intel Xeon server chip (albeit one that is fairly low power and low Thermal Design Point).

Sad as it may seem to me, the popularity of the low power, massively parallel cpu box must not be very lucrative. But a true testament to the flexibility of their original 10U server rack design is the ability to do a ‘swap’ of the higher power Intel Xeon cpus. I doubt there’s too many competitors in this section of the market that could ‘turn on a dime’ the way SeaMicro has appeared to do with this Xeon based server. Most often designs will be so heavily optimized for a particular cpu, power supply and form factor layout that changing one component might force a bigger change order in the design department. And the product would take longer to develop and ship as a result.

So even though I hope the 64bit Intel Atom will still be SeaMicro’s flagship product, I’m also glad they can stay in the fight longer selling into the ‘established’ older data center accounts worldwide. Adapt or die is the cliche adage of some technology writers and I would mark this with a plus (+) in the adapt column.

Written by carpetbomberz

February 20, 2012 at 3:00 pm

Tilera preps many-cored Gx chips for March launch • The Register

leave a comment »

“Were here today shipping a 64-bit processor core and we are what looks like two years ahead of ARM,” says Bishara. “The architecture of the Tile-Gx is aligned to the workload and gives one server node per chip rather than a sea of wimpy nodes not acting in a cache coherent manner. We have been in this market for two years now and we know what hurts in data centers and what works. And 32-bit ARM just is not going to cut it. Applied Micro is doing their own core, and that adds a lot of risks.”

via Tilera preps many-cored Gx chips for March launch • The Register.

Tile of a TILE64 processor from Tilera
Image via Wikipedia

Tilera is preparing to ship a 36 core Tile-Gx cpu in March. It’s going to be packaged with a re-compiled Linux distribution of CentOS on a development board (TILEencore). It will also have a number of re-compiled Unix utilities and packages included, so OEM shops can begin product development as soon as possible.

I’m glad to see Tilera is still duking it out, battling for the design wins with manufacturers selling into the Data Center as it were. Larger Memory addressing will help make the Tilera chips more competitive with Commodity Intel Hardware data center shops who build their own hardware. Maybe we’ll see full 64bit memory extensions at some point as a follow on to the current 40bit address space extensions currently. The memory extensions are necessary to address more than the 32bit limit of 4GBytes, so an extra 8 bits goes a long, long way to competing against a fully 64bit address space.

Also considering work being done at ARM for optimizing their chip designs for narrower design rules, Tilera should follow suit and attempt to shrink their chip architecture too. This would allow clock speeds to ease upward and keep the thermal design point consistent with previous generation Tile architecture chips, making Tile-Gx more competitive in the coming years. ARM announced 1 month ago they will be developing a 22nm sized cpu core for future licensing by ARM customers. As it is now Tilera uses an older fabrication design rule of around 40nm (which is still quite good given the expense required to shrink to lower design rules). And they have plans to eventually migrate to a narrower design rule. However ideally they would not stay farther behind that 1 generation from the top-end process lines of Intel (who is targeting 14nm production lines in the near future).

Written by carpetbomberz

February 16, 2012 at 10:28 pm

Tilera | Wired Enterprise | Wired.com

leave a comment »

Tilera’s roadmap calls for its next generation of processors, code-named Stratton, to be released in 2013. The product line will expand the number of processors in both directions, down to as few as four and up to as many as 200 cores. The company is going from a 40-nm to a 28-nm process, meaning they’re able to cram more circuits in a given area. The chip will have improvements to interfaces, memory, I/O and instruction set, and will have more cache memory.

via Tilera | Wired Enterprise | Wired.com.

Image representing Wired Magazine as depicted ...

Image via CrunchBase

I’m enjoying the survey of companies doing massively parallel, low power computing products. Wired.com|Enterprise started last week with a look at SeaMicro and how the two principal founders got their start observing Google’s initial stabs at a warehouse sized computer. Since that time things have fractured somewhat instead of coalescing and now three big attempts are competing to fulfil the low power, massively parallel computer in a box. Tilera is a longer term project startup from MIT going back further than Calxeda or SeaMicro.

However application of this technology has been completely dependent on the software. Whether it be OSes or Applications, they all have to be constructed carefully to take full advantage of the Tile processor architecture. To their credit Tilera has attempted to insulate application developers from some of the vagaries of the underlying chip by creating an OS that does the heavy lifting of queuing and scheduling. But still, there’s got to be a learning curve there even if it isn’t quite as daunting as say folks who develop applications for the super computers at National Labs here in the U.S. Suffice it to say it’s a non-trivial choice to adopt a Tilera cpu for a product/project you are working on. And the people who need a Tilera GX cpu for their app, already know all they need to know about the the chip in advance. It’s that kind of choice they are making.

I’m also relieved to know they are continuing development to shrink down the design rules. Intel being the biggest leader in silicon semi-conductor manufacturing, continues to shrink its design, development and manufacturing design rules. We’re fast approaching a 20nm-18nm production line in both Oregon and Arizona. Both are Intel design fabrication plants and there not about to stop and take a breath. Companies like Tilera, Calxeda and SeaMicro need to do continuous development on their products to keep from being blind sided by Intel’s continuous product development juggernaut. So Tilera is very wise to shrink its design rule from 40nm down to 28nm as fast as it can and then get good yields on the production lines once they start sampling chips at this size.

*UPDATE: Just saw this run through my blogroll last week. Tilera has announced a new chip coming in March. Glad to see Tilera is still duking it out, battling for the design wins with manufacturers selling into the Data Center as it were. Larger Memory addressing will help make the Tilera chips more competitive with Commodity Intel Hardware shops, and maybe we’ll see full 64bit memory extensions at some point as a follow on to the current 40bit address space extenstions currently being touted in this article from The Register.

English: Block diagram of the Tilera TILEPro64...

Image via Wikipedia

Written by carpetbomberz

February 6, 2012 at 3:00 pm

How Google Spawned The 384-Chip Server | Wired Enterprise | Wired.com

leave a comment »

SeaMicro’s latest server includes 384 Intel Atom chips, and each chip has two “cores,” which are essentially processors unto themselves. This means the machine can handle 768 tasks at once, and if you’re running software suited to this massively parallel setup, you can indeed save power and space.

via How Google Spawned The 384-Chip Server | Wired Enterprise | Wired.com.

Image representing Wired Magazine as depicted ...

Image via CrunchBase

Great article from Wired.com on SeaMicro and the two principle minds behind its formation. Both of these fellows were quite impressed with Google’s data center infrastructure at the points in time when they both got to visit a Google Data Center. But rather than just sit back and gawk, they decided to take action and borrow, nay steal some of those interesting ideas the Google Engineers adopted early on. However, the typical naysayers pull a page out of the Google white paper arguing against SeaMicro and the large number of smaller, lower-powered cores they use in the SM-10000 product.

SeaMicro SM10000

Image by blogeee.net via Flickr

But nothing speaks of success more than product sales and SeaMicro is selling it’s product into data centers. While they may not achieve the level of commerce reached by Apple Inc., it’s a good start. What still needs to be done is more benchmarks and real world comparisons that reproduce or negate the results of Google’s whitepaper promoting their choice of off the shelf commodity Intel chips. Google is adamant that higher clock speed ‘server’ chips attached to single motherboards connected to one another in large quantity is the best way to go. However, the two guys who started SeaMicro insist that while Google’s choice for itself makes perfect sense, NO ONE else is quite like Google in their compute infrastructure requirements. Nobody has such a large enterprise or the scale Google requires (except for maybe Facebook, and possibly Amazon). So maybe there is a market at the middle and lower end of the data center owner’s market? Every data center’s needs will be different especially when it comes to available space, available power and cooling restrictions for a given application. And SeaMicro might be the secret weapon for shops constrained by all three: power/cooling/space.

*UPDATE: Just saw this flash through my Google Reader blogroll this past Wednesday, Seamicro is now selling an Intel Xeon based server. I guess the market for larger numbers of lower power chips just isn’t strong enough to sustain a business. Sadly this makes all the wonder and speculation surrounding the SM10000 seem kinda moot now. But hopefully there’s enough intellectual property rights and patents in the original design to keep the idea going for a while. Seamicro does have quite a headstart over competitors like Tilera, Calxeda and Applied Micro. And if they can help finance further developments of Atom based servers by selling a few Xeons along the way, all the better.

Written by carpetbomberz

February 2, 2012 at 9:51 pm

Fusion-io demos billion IOPS server config • The Register

leave a comment »

Fusion-io has achieved a billion IOPS from eight servers in a demonstration at the DEMO Enterprise event

Image representing Fusion-io as depicted in Cr...

Image via CrunchBase

in San Francisco.

The cracking performance needed just eight HP DL370 G6 servers, running Linux 2.6.35.6-45 on two, 6-core Intel processors, 96GB RAM. Each server was fitted with eight 2.4TB ioDrive2 Duo PCIE flash drives; thats 19.2TB of flash per server and 153.6TB of flash in total.

via Fusion-io demos billion IOPS server config • The Register.

This is in a word, no mean feat. 1 Million IOPS was the target to beat not just 2 years ago for anyone attempting to buy/build their own Flash based storage from the top Enterprise Level manufacturers. So the bar has risen no less than 3 orders of magnitude higher than the top end from 1 year ago. Add to that the magic sauce of bypassing the host OS and using the Flash memory as just an enhanced large memory.

This makes me wonder, how exactly does the Flash memory get used alongside the RAM memory pool?

How do the Applications use the Flash memory, and how does the OS use it?

Those are the details I think that no one else other than Fusion-io can provide as a value-add beyond the PCIe based flash memory modules itself. Instead of hardware being the main differentiator (drive controllers, Single Level Cells, etc.) Fusion-io is using a different path through the OS to the Flash memory. The File I/O system traditionally tied to hard disk storage and more generically ‘storage’ of some kind is being sacrificed. But I understand the logic, design and engineering of bypassing the overhead of the ‘storage’ route and redefining the Flash memory as another form of system memory.

Maybe the old style Von Neumann architecture or Harvard architecture computers are too old school for this new paradigm of a larger tiered memory pool with DRAM and Flash memory modules consisting of the most important parts of the computer. Maybe disk storage could be used as a mere backup of the data held in the Flash memory? Hard to say, and I think Fusion-io is right to hold this info close as they might be able to make this a more general case solution to the I/O problems facing some customers (not just Wall Street type high frequency traders).

Written by carpetbomberz

January 26, 2012 at 3:00 pm

Maxeler Makes Waves With Dataflow Design – Digits – WSJ

with 2 comments

In the dataflow approach, the chip or computer is essentially tailored for a particular program, and works a bit like a factory floor.

via Maxeler Makes Waves With Dataflow Design – Digits – WSJ.

English: Altera Stratix IV EP4SGX230 FPGA on a PCB

Image via Wikipedia

My supercomputer can beat your supercomputer, and money is no object. FPGAs (Field Programmable Gate Arrays) are used most often in prototyping new computer processors. You can design a chip, then ‘program’ the FPGA to match the circuit design so that it can be verified. Verification is the process by which you do exhaustive tests on the logic and circuits to see if you’ve left anything out or didn’t get the timing right for the circuits that may run at different speeds within the chip itself. They are expensive niche products that chip design outfits and occasionally product manufacturers use to solve problems. Less often they might be used in data network gear to help classify and reroute packets in a data center and optimize performance over time.

This by itself would be a pretty good roster of applications, but something near and dear to my heart is the use of FPGAs as a kind of reconfigurable processor. I am certain one day we will see the application of FPGA  in desktop computers. But until then, we’ll have to settle for using FPGAs as special purpose application accelerators in high volume trading and Wall Street type data centers. This article in WSJ is going to change a few opinions about the application of FPGAs for real computing tasks. The speedups quoted for different analysis and reports derived from the transactions show multiple orders of magnitude speedups. In extreme examples sometimes 1,000 times faster speed-ups occurred when using a fully optimized FPGA versus a general purpose CPU.

When someone can tout 1,000X speedups everyone is going to take notice. And hopefully it won’t be simply a bunch of copycats trying to speed up their reports and management dashboards. There’s a renaissance out there waiting to happen with FPGAs and I still have hope I’ll see it in my lifetime.

Written by carpetbomberz

January 16, 2012 at 3:00 pm

Xen hypervisor ported to ARM chips • The Register

leave a comment »

Deutsch: Offizielles Logo der ARM-Prozessorarc...

Image via Wikipedia

You can bet that if ARM servers suddenly look like they will be taking off that Red Hat and Canonical will kick in some help and move these Xen and KVM projects along. Server maker HP, which has launched the “Redstone” experimental server line using Calxedas new quad-core EnergyCore ARM chips, might also help out. Dell has been playing around with ARM servers, too, and might help with the hypervisor efforts as well.

via Xen hypervisor ported to ARM chips • The Register.

This is an interesting note, some open source Hypervisor projects are popping up now that the ARM Cortex A15 has been announced and some manufacturers are doling out development boards. What it means longer term is hard to say other than it will potentially be a boon to manufacturers using the ARM15 in massively parallel boxes like Calxeda. Or who are trying to ‘roll their own’ ARM based server farms and want to have the flexibility of virtual machines running under a hypervisor environment. However, the argument remains, “Why use virtual servers on massively parallel cpu architectures when a  1:1 cpu core to app ratio is more often preferred?”

However, I would say old habits of application and hardware consolidation die hard and virtualization is going to be expected because that’s what ‘everyone’ does in their data centers these days. So knowing that a hypervisor is available will help foster some more hardware sales of what will most likely be a niche products for very specific workloads (ie. Calxeda, Qanta SM-2, SeaMicro). And who knows maybe this will foster more manufacturers or even giant data center owners (like Apple, Facebook and Google) to attempt experiments of rolling their own ARM15 environments knowing there’s a ready made hypervisor out there that they can compile on the new ARM chip.

However, I think all eyes are really still going to be on the next generation ARM version 8 with the full 64bit memory and instruction set. Toolsets nowadays are developed in house by a lot of the datacenters and the dominant instruction set is Intel x64 (IA64) which means the migration to 64bits has already happened. Going back to 32bits just to gain the advantage of the lower power ARM architecture is far to costly for most. Whereas porting from IA64 to 64bit ARM architecture is something more datacenters might be willing to do if the potential cost/benefit ratio is high enough to cross-compile and debug. So legacy management software toolsets are really going to drive a lot of testing and adoption decisions by data centers looking at their workloads and seeing if ARM cpus fit their longer term goals of saving money by using less power.

Written by carpetbomberz

January 12, 2012 at 3:00 pm

AnandTech – Applied Micros X-Gene: The First ARMv8 SoC

with one comment

APM expects that even with a late 2012 launch it will have a 1 – 2 year lead on the competition. If it can get the X-Gene out on time, hitting power and clock targets both very difficult goals, the headstart will be tangible. Note that by the end of 2012 well only just begin to see the first Cortex A15 implementations. ARMv8 based competitors will likey be a full year out, at least. 

via AnandTech – Applied Micros X-Gene: The First ARMv8 SoC.

Chip Diagram for the ARM version 8 as implemented by APM

It’s nice to get a confirmation of the production time lines for the Cortex A15 and the next generation ARM version 8 architecture. So don’t expect to see shipping chips, much less finished product using those chips well into 2013 or even later. As for the 4 core ARM A15, finished product will not appear until well into 2012. This means if Intel is able to scramble, they have time to further refine their Atom chips to reach the power level and Thermal Design Point (TDP) for the competing ARM version 8 architecture. What seems to be the goal is to jam in more cores per CPU socket than is currently done on the Intel architecture (up to almost 32 in on of the graphics presented with the article).

The target we are talking about is 2W per core @ 3Ghz, and it is going to be a hard, hard target to hit for any chip designer or manufacturer. One can only hope that TMSC can help APM get a finished chip out the door on it’s finest ruling chip production lines (although an update to the article indicates it will ship on 40nm to get it out the door quicker). The finer the ruling of signal lines on the chip the lower the TDP, and the higher they can run the clock rate. If ARM version 8 can accomplish their goal of 2W per cpu core @ 3 Gigahertz, I think everyone will be astounded. And if this same chip can be sampled at the earliest prototypes stages by a current ARM Server manufacturer say, like Calxeda or even SeaMicro then hopefully we can get benchmarks to show what kind of performance can be expected from the ARM v.8  architecture and instruction set. These will be interesting times.

Intel Atom CPU Z520, 1,333GHz

Image via Wikipedia

Written by carpetbomberz

December 5, 2011 at 3:00 pm

Fusion plays its card: The Ten of Terabytes • The Register

leave a comment »

Fusion-io has crammed eight ioDrive flash modules on one PCIe card to give servers 10TB of app-accelerating flash.

This follows on from its second generation ioDrives: PCIe-connected flash cards using single level cell and multi-level cell flash to provide from 400GB to 2.4TB of flash memory, which can be used by applications to get stored data many times faster than from disk. By putting eight 1.28TB multi-level cell ioDrive 2 modules on a single wide ioDrive Octal PCIe card Fusion reaches a 10TB capacity level.

via Fusion plays its card: The Ten of Terabytes • The Register.

Image representing Fusion-io as depicted in Cr...

Image via CrunchBase

This is some big news in the fight to be king of the PCIe SSD market. I declare: Advantage Fusion-io. They now have the lead in terms of not just speed but also overall capacity at the price point they have targeted.  As densities increase and prices more or less stay flat, the value add is more data can stay resident on the PCIe card and not be swapped out to Fibre-Channel array storage on the Storage Area Network (SAN). Performance is likely to be wicked cool and early adopters will now doubt reap big benefits from transaction processing and online analytic processing as well.

Written by carpetbomberz

November 28, 2011 at 3:00 pm

Intel Responds to Calxeda/HP ARM Server News (Wired.com)

with 3 comments

Now, you’re probably thinking, isn’t Xeon the exact opposite of the kind of extreme low-power computing envisioned by HP with Project Moonshot? Surely this is just crazy talk from Intel? Maybe, but Walcyzk raised some valid points that are worth airing.via Cloudline | Blog | Intel Responds to Calxeda/HP ARM Server News: Xeon Still Wins for Big Data.

Structure of the TILE64 Processor from Tilera

Image via Wikipedia: Tile64 mesh network processor from Tilera

Image representing Tilera as depicted in Crunc...

Image via CrunchBase

So Intel gets an interview with a Conde-Nast writer for a sub-blog of Wired.com. I doubt too many purchasers or data center architects consult Cloudline@Wired.com. But all the same, I saw through many thinly veiled bits of handwaving and old saws from Intel saying, “Yes, this exists but we’re already addressing it with our exiting product lines,. . .” So, I wrote in a comment to this very article. Especially regarding a throw-away line mentioning the ‘future’ of the data center and the direction the Data Center and Cloud Computing market was headed. However the moderator never published the comment. In effect, I raised the Question: Whither Tilera? And the Quanta SM-2 server based on the Tilera Chip?

Aren’t they exactly what is described by the author John Stokes as a network of cores on a chip? And given the scale of Tilera’s own product plans going into the future and the fact they are not just concentrating on Network gear but actual Compute Clouds too, I’d say both Stokes and Walcyzk are asking the wrong questions and directing our attention in the wrong direction. This is not a PR battle but a flat out technology battle. You cannot win this with words and white papers but in fact it requires benchmarks and deployments and Case Histories. Technical merit and superior technology will differentiate the players in the  Cloud in a Box race. And this hasn’t been the case in the past as Intel has battled AMD in the desktop consumer market. In the data center Intel Fear Uncertainty and Doubt is the only weapon they have.

And I’ll quote directly from John Stokes’s article here describing EXACTLY the kind of product that Tilera has been shipping already:

“Instead of Xeon with virtualization, I could easily see a many-core Atom or ARM cluster-on-a-chip emerging as the best way to tackle batch-oriented Big Data workloads. Until then, though, it’s clear that Intel isn’t going to roll over and let ARM just take over one of the hottest emerging markets for compute power.”

The key phrase here is cluster on a chip, in essence exactly what Tilera has strived to achieve with its Tilera64 based architecture. To review from previous blog entries of this website following the announcements and timelines published by Tilera:

Written by carpetbomberz

November 17, 2011 at 3:00 pm

Follow

Get every new post delivered to your Inbox.

Join 217 other followers