Fusion-io demos billion IOPS server config • The Register
Fusion-io has achieved a billion IOPS from eight servers in a demonstration at the DEMO Enterprise event in San Francisco.
The cracking performance needed just eight HP DL370 G6 servers, running Linux 2.6.35.6-45 on two, 6-core Intel processors, 96GB RAM. Each server was fitted with eight 2.4TB ioDrive2 Duo PCIE flash drives; that’s 19.2TB of flash per server and 153.6TB of flash in total.
via Fusion-io demos billion IOPS server config • The Register.
This is, in a word, no mean feat. 1 Million IOPS was the target to beat just 2 years ago for anyone attempting to buy or build their own Flash-based storage from the top Enterprise-level manufacturers. So the bar has risen no less than 3 orders of magnitude, from a million to a billion, in that short time. On top of that comes the magic sauce of bypassing the host OS and using the Flash memory as just an enhanced large memory.
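The arithmetic behind those figures is easy to check. Here is a minimal sketch that uses only the numbers quoted from The Register and the 1 million IOPS target mentioned here. The variable names are mine, and the per-server and per-drive IOPS are naive averages, not measured results.

```python
import math

# Figures from the quoted Register article
SERVERS = 8
DRIVES_PER_SERVER = 8
TB_PER_DRIVE = 2.4
TOTAL_IOPS = 1_000_000_000
# The older "1 million IOPS" target mentioned in the commentary
OLD_TARGET_IOPS = 1_000_000

tb_per_server = DRIVES_PER_SERVER * TB_PER_DRIVE       # flash per server, in TB
total_tb = SERVERS * tb_per_server                     # flash across all servers, in TB
iops_per_server = TOTAL_IOPS // SERVERS                # naive average per server
iops_per_drive = iops_per_server // DRIVES_PER_SERVER  # naive average per drive
orders_of_magnitude = math.log10(TOTAL_IOPS / OLD_TARGET_IOPS)

print(f"{tb_per_server:.1f} TB per server, {total_tb:.1f} TB total")
print(f"{iops_per_server:,} IOPS per server, {iops_per_drive:,} per drive")
print(f"{orders_of_magnitude:.0f} orders of magnitude over 1 million IOPS")
```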
This makes me wonder, how exactly does the Flash memory get used alongside the RAM memory pool?
How do the Applications use the Flash memory, and how does the OS use it?
Those are the details that I think no one other than Fusion-io can provide as a value-add beyond the PCIe-based flash memory modules themselves. Instead of hardware being the main differentiator (drive controllers, Single Level Cells, etc.), Fusion-io is using a different path through the OS to the Flash memory. The file I/O system, traditionally tied to hard disk storage (and more generically to ‘storage’ of some kind), is being sacrificed. But I understand the logic, design and engineering of bypassing the overhead of the ‘storage’ route and redefining the Flash memory as another form of system memory.
Maybe the old style Von Neumann architecture or Harvard architecture computers are too old school for this new paradigm of a larger tiered memory pool, with DRAM and Flash memory modules making up the most important parts of the computer. Maybe disk storage could be used as a mere backup of the data held in the Flash memory? Hard to say, and I think Fusion-io is right to hold this info close, as they might be able to make this a more general-case solution to the I/O problems facing some customers (not just Wall Street type high frequency traders).
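To make the difference between the ‘storage’ route and the ‘memory’ route a little more concrete, here is a rough sketch using the ordinary memory-mapping facility that operating systems already offer. This is NOT Fusion-io's API, which I have not seen. It is only an analogy, and the function name and the temporary backing file are assumptions made for illustration. The OS is still involved behind the scenes here, which is exactly the overhead Fusion-io says it bypasses.

```python
import mmap
import os
import tempfile


def demo_memory_mapped_access(size: int = 4096) -> bytes:
    """Store and load bytes through a memory mapping instead of read()/write() calls."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)             # give the backing file a fixed size
        with mmap.mmap(fd, size) as mem:   # map the file into the process address space
            mem[0:5] = b"hello"            # looks like a memory store, not a file write
            mem.flush()                    # ask the OS to push dirty pages to the device
            return bytes(mem[0:5])         # looks like a memory load, not a file read
    finally:
        os.close(fd)
        os.remove(path)


print(demo_memory_mapped_access())
```

The application code only ever touches something that looks like a byte array. Where those bytes physically live is hidden from it.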
More PCI-express SSD cards coming to OS X | MacFixIt – CNET Reviews
The card will use the Marvell 88SE9455 RAID controller that will interface with the SandForce 2200-based daughter cards that can be added to the main controller on demand. This will allow for user-configurable drive sizes from between 60GB and 2TB in size, allowing you to expand your storage as your need for it increases.
via More PCI-express SSD cards coming to OS X | MacFixIt – CNET Reviews.

Other World Computing
I’m a big fan of Other World Computing (OWC) and have always marveled at their ability to create new products under their own brand. In the article they talk about a new Mac-compatible PCIe SSD. It sounds like an uncanny doppelganger of the Angelbird board, which was announced about 2 years ago and started shipping in Fall 2011. The add-on sockets especially remind me of the upgradeable Angelbird board. There are not many PCIe SSD cards that have sockets for Flash memory modules, and Other World Computing’s would be only the second one I have seen since I started commenting on these devices as they hit the consumer market. Putting sockets on the board makes it easier to come into the market at a lower price point for users to whom price is most important. However, at the high end, capacity is king for some purchasers of PCIe SSD drives. So the oddball upgradeable PCIe SSD fills a niche, that’s for sure.
Performance projections for this card are really good and typical of most competing PCIe SSD cards. So depending on your needs you might find this perfect. Price however is always harder to pin down. Angelbird sold a bare PCIe card with no SSDs for around $249. It came with 32GB onboard for that price. What was really nice was the card used SATA sockets set far enough apart to place full sized SSDs on the card without crowding each other. This brought the possibility of slowly upgrading to higher speed drives or larger capacity drives over time to the consumer market.

Welcome to Wings from Angelbird - Mac compatible PCIe SSD
But what’s cooler still is that Angelbird’s card could run under ANY OS, even Mac OS, as it was engineered to be a free-standing computer with a large Flash memory attached to it. That allowed it to pre-boot into an embedded OS before handing over control to the Host OS, whatever flavor it might be. I don’t know if the OWC card works similarly, but it does NOT use SATA sockets or provide enough room to plug in SSD drives. The plug-in modules for this device use mSATA-style sockets, as found in tablets and netbook-style computers. So the modules will most likely need to be purchased direct from OWC to perform capacity upgrades over the life of the PCIe card itself. Prices have not yet been set, according to this article.
Related articles
- Marvell brews ARM-based native PCIe SSD Controller IC: 88NV9145 handles direct PCIe to NAND Flash I/O for high-performance, low-overhead SSD designs (denalimemoryreport.wordpress.com)
- OWC gives Mac Pro users the first PCI Express SSD option (9to5mac.com)
- Angelbird’s Wings PCIe-based SSD preview and benchmarks (engadget.com)
RE: Eric’s Archived Thoughts: Vigilance and Victory
Eric’s Archived Thoughts: Vigilance and Victory.
While I agree there might be a better technical solution to the DNS blocking adopted by the SOPA and PIPA bills, less formal networks are in essence filling the gap. By this I mean the MegaUpload takedown that occurred yesterday at the order of the U.S. Justice Department. Without even the benefit of SOPA or PIPA, they ordered investigations, arrests and takedowns of the whole MegaUpload enterprise. But what is interesting is the knock-on effect social networks had in the vacuum left by the DNS blocking. Within hours the DNS was replaced by its immediate precursor. That’s right, folks were sending the IP addresses of available MegaUpload hosts as plain text in Tweets the world ’round. And given the announcement today that Twitter is closing in on its 500 millionth account, I’m not too worried about a technical solution to DNS blocking. That too is already moot, by virtue of social networking and simple numeric IP addresses. Long live IPv4 and the quadruple octets 255.255.255.xxx
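For anyone wondering what those ‘quadruple octets’ actually are: an IPv4 address is just a 32-bit number written as four values from 0 to 255. A small sketch using Python's standard `ipaddress` module shows this. The address below is from the range reserved for documentation, not a real MegaUpload host, and the function name is my own.

```python
import ipaddress


def describe_ipv4(dotted: str):
    """Return the four octets and the single 32-bit integer behind a dotted-quad address."""
    addr = ipaddress.IPv4Address(dotted)   # raises ValueError if the text is not valid IPv4
    octets = list(addr.packed)             # four bytes, each in the range 0..255
    return octets, int(addr)


octets, as_integer = describe_ipv4("192.0.2.10")  # documentation-only example address
print(octets, as_integer)
```

No DNS lookup is involved anywhere in this. A number like that, pasted into a Tweet, is all a browser needs.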
AnandTech – AMD Radeon HD 7970 Review: 28nm And Graphics Core Next, Together As One
Quick Sync made real-time H.264 encoding practical on even low-power devices, and made GPU encoding redundant at the time. AMD of course isn’t one to sit idle, and they have been hard at work at their own implementation of that technology: the Video Codec Engine VCE.
via AnandTech – AMD Radeon HD 7970 Review: 28nm And Graphics Core Next, Together As One.
Intel’s QuickSync helped speed up the realtime encoding of H.264 video. AMD is striking back and has Hybrid Mode VCE operations that will speed things up EVEN MORE! The key to having this hit the market and get widely adopted of course is the compatibility of the software with a wide range of video cards from AMD. The original CUDA software environment from nVidia took a while to disperse into the mainstream as it had a limited number of graphics cards it could support when it rolled out. Now it’s part of the infrastructure and more or less provided gratis whenever you buy ANY nVidia graphics card today. AMD has to follow this semi-forced adoption of this technology as fast as possible to deliver the benefit quickly. At the same time the User Interface to this VCE software had better be a great design and easy to use. Any type of configuration file dependencies and tweaking through preference files should be eliminated to the point where you merely move a slider up and down a scale (Slower->Faster). And that should be it.
And if need be, AMD should commission an encoder App or a plug-in to an open source project like HandBrake to utilize the VCE capability upon detection of the graphics chip on the computer. Make it ‘just happen’, without the tempting early adopter approach of making a tool available and forcing people to ‘build’ a version of an open source encoder to utilize the hardware properly. Hands-off approaches that favor early adopters are going to consign this technology to the margins for a number of years if AMD doesn’t take a more activist role. QuickSync on Intel hasn’t been widely touted either, so maybe it’s a moot point to urge anyone to treat their technology as an insanely great offering. But I think there’s definitely brand loyalty that could be brought into play if the performance gains to be had with a discrete graphics card far outpace the integrated graphics solution of QuickSync provided by Intel. If you can achieve a 10x boost, a full order of magnitude, you should be pushing that to all the potential computer purchasers from this announcement forward.
Maxeler Makes Waves With Dataflow Design – Digits – WSJ
In the dataflow approach, the chip or computer is essentially tailored for a particular program, and works a bit like a factory floor.
via Maxeler Makes Waves With Dataflow Design – Digits – WSJ.
My supercomputer can beat your supercomputer, and money is no object. FPGAs (Field Programmable Gate Arrays) are used most often in prototyping new computer processors. You can design a chip, then ‘program’ the FPGA to match the circuit design so that it can be verified. Verification is the process by which you do exhaustive tests on the logic and circuits to see if you’ve left anything out or didn’t get the timing right for the circuits that may run at different speeds within the chip itself. They are expensive niche products that chip design outfits and occasionally product manufacturers use to solve problems. Less often they might be used in data network gear to help classify and reroute packets in a data center and optimize performance over time.
This by itself would be a pretty good roster of applications, but something near and dear to my heart is the use of FPGAs as a kind of reconfigurable processor. I am certain one day we will see the application of FPGAs in desktop computers. But until then, we’ll have to settle for using FPGAs as special purpose application accelerators in high volume trading and Wall Street type data centers. This article in WSJ is going to change a few opinions about the application of FPGAs for real computing tasks. The speedups quoted for different analyses and reports derived from the transactions span multiple orders of magnitude. In extreme examples, a fully optimized FPGA ran 1,000 times faster than a general purpose CPU.
When someone can tout 1,000X speedups everyone is going to take notice. And hopefully it won’t be simply a bunch of copycats trying to speed up their reports and management dashboards. There’s a renaissance out there waiting to happen with FPGAs and I still have hope I’ll see it in my lifetime.
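The ‘factory floor’ idea from the WSJ quote can be imitated in software, though only loosely. In the sketch below each stage does one fixed job and data streams from one stage to the next. It is an analogy only: the stage names are invented, this is not Maxeler's tooling, and Python generators take turns running, whereas on an FPGA every stage would be its own circuit working at the same time.

```python
def source(values):
    """First station on the floor: feed raw items into the pipeline one at a time."""
    for value in values:
        yield value


def scale(stream, factor):
    """Second station: multiply every item by a fixed factor as it passes through."""
    for item in stream:
        yield item * factor


def running_sum(stream):
    """Third station: emit the running total of everything seen so far."""
    total = 0
    for item in stream:
        total += item
        yield total


def run_pipeline(values, factor=2):
    """Wire the stations together; the wiring is fixed for this one 'program'."""
    return list(running_sum(scale(source(values), factor)))


print(run_pipeline([1, 2, 3]))
```

The point of the analogy is that the wiring between stations is decided once, for one specific job, instead of a general purpose CPU fetching and decoding instructions for every step.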
Related articles
- Xilinx Accelerates Design Cycles and Lowers Costs for Industrial Networking and Motor Control Systems (prnewswire.com)
- JPMorgan Rolls Out (Another) FPGA Supercomputer (news.slashdot.org)
- Xtreme Compute Technologies (XCT) Announces New “r-BriX” Line of FPGA Powered “Reconfigurable” Compute Solutions (prweb.com)
Xen hypervisor ported to ARM chips • The Register
You can bet that if ARM servers suddenly look like they will be taking off that Red Hat and Canonical will kick in some help and move these Xen and KVM projects along. Server maker HP, which has launched the “Redstone” experimental server line using Calxeda’s new quad-core EnergyCore ARM chips, might also help out. Dell has been playing around with ARM servers, too, and might help with the hypervisor efforts as well.
via Xen hypervisor ported to ARM chips • The Register.
This is an interesting note: some open source hypervisor projects are popping up now that the ARM Cortex A15 has been announced and some manufacturers are doling out development boards. What it means longer term is hard to say, other than that it will potentially be a boon to manufacturers like Calxeda using the Cortex A15 in massively parallel boxes, or to those who are trying to ‘roll their own’ ARM-based server farms and want the flexibility of virtual machines running under a hypervisor. However, the argument remains, “Why use virtual servers on massively parallel cpu architectures when a 1:1 cpu core to app ratio is more often preferred?”
However, I would say old habits of application and hardware consolidation die hard, and virtualization is going to be expected because that’s what ‘everyone’ does in their data centers these days. So knowing that a hypervisor is available will help foster some more hardware sales of what will most likely be niche products for very specific workloads (i.e. Calxeda, Qanta SM-2, SeaMicro). And who knows, maybe this will encourage more manufacturers or even giant data center owners (like Apple, Facebook and Google) to experiment with rolling their own Cortex A15 environments, knowing there’s a ready-made hypervisor out there that they can compile on the new ARM chip.
However, I think all eyes are really still going to be on the next generation ARM version 8 with the full 64-bit memory and instruction set. Toolsets nowadays are developed in house by a lot of the datacenters, and the dominant instruction set is Intel’s 64-bit x86 (x86-64, often called x64, not to be confused with Itanium’s IA-64), which means the migration to 64 bits has already happened. Going back to 32 bits just to gain the advantage of the lower power ARM architecture is far too costly for most. Whereas porting from x86-64 to a 64-bit ARM architecture is something more datacenters might be willing to do if the potential benefit is high enough to justify the cost of cross-compiling and debugging. So legacy management software toolsets are really going to drive a lot of testing and adoption decisions by data centers looking at their workloads and seeing if ARM cpus fit their longer term goals of saving money by using less power.
Related articles
- HP and Calxeda’s Moonshot ARM servers will bring all the boys to the yard (video) (engadget.com)
- ARM V8 Architecture (perspectives.mvdirona.com)
The PC is dead. Why no angry nerds? :: The Future of the Internet — And How to Stop It
Famously proprietary Microsoft never dared to extract a tax on every piece of software written by others for Windows—perhaps because, in the absence of consistent Internet access in the 1990s through which to manage purchases and licenses, there’d be no realistic way to make it happen.
via The PC is dead. Why no angry nerds? :: The Future of the Internet — And How to Stop It.
While it’s true that Microsoft didn’t tax software developers who sold product running on the Windows OS, a kind of tax levy did exist for hardware manufacturers creating desktop PCs with Intel chips inside. But message received, I get the bigger point: cul-de-sacs don’t make good computers. They do however make good appliances. But as the author Jonathan Zittrain points out, we are becoming less aware of the distinction between a computer and an appliance, and have lowered our expectations accordingly.
In fact this points to a bigger trend than just computers becoming silos of information/entertainment consumption. This trend was preceded by the wild popularity of MySpace, followed quickly by Facebook and now Twitter. All are platforms, as described by their owners, with some amount of API publishing and hooks to let in 3rd party developers (like game maker Zynga). But so what if I can play Scrabble or Farmville with my ‘friends’ on a social networking ‘platform’? Am I still getting access to the Internet? Probably not, as I am most likely reading whatever filters into or out of the central all-encompassing data store of the Social Networking Platform.
Like the old World Maps in the days before Columbus, there be Dragons and the world ends HERE, even though platform owners might say otherwise. It is an Intranet pure and simple, a gated community that forces unique identities on all participants. Worse yet, it is a Big Brother-like panopticon where each step and every little movement is monitored and tallied. You take quizzes, you like, you share; all these things are collection points, check points to get more data about you. And that is the TAX levied on anyone who voluntarily participates in a social networking platform.
So long live the Internet, even though its frontier, wild-catting days are nearly over. There will be books and movies like How the Cyberspace was Won, and the pioneers will all be noted and revered. We’ll remember when we could go anywhere we wanted and do lots of things we never dreamed. But those days are slipping away as new laws get passed under very suspicious pretenses, all in the name of Commerce. As for me, I much prefer Freedom over Commerce, and you can log that in your stupid little database.
Related articles
- Now You Can Tether Your iPhone to Your Laptop Without a Monthly Fee (readwriteweb.com)
- Did Steve Jobs Favor or Oppose Internet Freedom? (scientificamerican.com)
- Apple pulls iTether from App Store, cites carrier burden (macnn.com)
- The Personal Computer Is Dead (technologyreview.in)
AnandTech – Intel and Micron IMFT Announce World’s First 128Gb 20nm MLC NAND
The big question is endurance, however we won’t see a reduction in write cycles this time around. IMFT’s 20nm client-grade compute NAND used in consumer SSDs is designed for 3K – 5K write cycles, identical to its 25nm process.
via AnandTech – Intel and Micron IMFT Announce World’s First 128Gb 20nm MLC NAND.
If true this will help considerably in driving down cost of Flash memory chips while maintaining the current level of wear and performance drop seen over the lifetime of a chip. Stories I have read previously indicated that Flash memory might not continue to evolve using the current generation of silicon chip manufacturing technology. Performance drops occur as memory cells wear out. Memory cells were wearing out faster and faster as the wires and transistors got smaller and narrower on the Flash memory chip.
The reason for this is that memory cells have to be erased in order to free them up, and writing and erasing take a toll on the memory cell each time one of these operations is performed. Single Level memory cells (SLC) are the most robust, and can go through many thousands, even millions, of write and erase cycles before they wear out. However, the cost per megabyte of Single Level memory cells puts them at a premium price level aimed, generally speaking, at Enterprise and Corporate customers. Two Level memory cells (MLC, which store two bits per cell) are much more cost effective, but the structure of the cells makes them less durable than Single Level cells. And as the wires connecting them get thinner and narrower, the number of write and erase cycles they can endure without failing drops significantly. Enterprise customers in the past would not purchase products specifically because of this limitation of the Two Level memory cell.
As companies like Intel and Samsung tried to make Flash memory chips smaller and less expensive to manufacture, the durability of the chips dropped. The question everyone asked: is there a point of diminishing returns, where smaller design rules and thinner wires make chips too fragile to use? The solution for most manufacturers is to add spare memory cells, “over-provisioning”, so that when a cell fails you can unlock a spare and continue using the whole chip. This not-so-secret over-provisioning trick has been the way most Solid State Disks (SSDs) have handled the write/erase problem for Two Level memory cells. But even then, the question is how much do you over-provision? Another technique is called wear-levelling, where a memory controller distributes writes/erases over ALL the chips available to it. A statistical scheme is used to make sure each and every chip suffers equally and gets the same amount of wear and tear applied to it. It’s a difficult balancing act for the manufacturers of Flash memory, and for the storage product manufacturers who consume those chips, to make products that perform adequately, do not fail unexpectedly and do not cost too much for laptop and desktop manufacturers to offer to their customers.
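Over-provisioning and wear-levelling are easier to picture with a toy model. The sketch below is a deliberate simplification, not how any real SSD controller works. The class name, the block counts and the tiny cycle limit are all made up for illustration, and real controllers juggle far more than a simple erase count.

```python
class ToyFlash:
    """Toy flash device: equal-sized blocks, a fixed erase-cycle limit, and hidden spares."""

    def __init__(self, visible_blocks, spare_blocks, max_cycles):
        self.max_cycles = max_cycles
        self.active = [0] * visible_blocks  # erase count of each block currently in use
        self.spares = spare_blocks          # over-provisioned blocks held in reserve

    def write(self):
        """Erase and rewrite one block, always choosing the least-worn (wear-levelling)."""
        if not self.active:
            raise RuntimeError("every block has worn out")
        idx = min(range(len(self.active)), key=self.active.__getitem__)
        self.active[idx] += 1
        if self.active[idx] >= self.max_cycles:
            if self.spares > 0:
                self.spares -= 1            # unlock a fresh spare in place of the dead block
                self.active[idx] = 0
            else:
                del self.active[idx]        # no spares left, so usable capacity shrinks
        return idx


flash = ToyFlash(visible_blocks=4, spare_blocks=1, max_cycles=3)
for _ in range(8):
    flash.write()
print(flash.active, flash.spares)
```

Because the least-worn block is always picked, the erase counts stay within one of each other, and the spare is only touched once a block finally hits its limit. How many spares to hide away is the ‘how much do you over-provision?’ question in miniature.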
If Intel and Micron can successfully address the fragility of Flash chips as the wiring and design rules get smaller and smaller, we will start to see larger memories included in more mobile devices. I predict you will see iPhones and Samsung Android smartphones with upwards of 128GBytes of Flash memory storage. Similarly, tablets and ultra-mobile laptops will also start to have larger and larger SSDs available. Costs should stay about where they are now in comparison to current shipping products. We’ll just have more products to choose from, say like 1TByte SSDs instead of the more typical high end 512GByte SSDs we see today. Prices might also come down, but that’s bound to take a little longer until all the other Flash memory manufacturers catch up.
Related articles
- IMTF exposes its incredible shrinking NAND (go.theregister.com)
Samsung: 2 GHz Cortex-A15 Exynos 5250 Chip
Samsung also previewed a 2 GHz dual-core ARM Cortex-A15 application processor, the Exynos 5250, also designed on its 32-nm process. The company said that the processor is twice as fast as a 1.5 GHz A9 design without having to jump to a quad-core layout.
via Samsung Reveals 2 GHz Cortex-A15 Exynos 5250 Chip.
More news on the release dates and the details of Samsung’s version of the ARM Cortex A15 cpu for mobile devices. Samsung is helping ramp up performance by shrinking the design rule down to 32nm, and by dropping two out of the four possible cores in the A15 cpu. This choice is to make room for the integrated graphics processor. It’s a deluxe system on a chip that will no doubt give any A9 equipped tablet a run for its money. Indications at this point by Samsung are that the A15 will be a tablet-only cpu and not adapted to smartphone use.
Early in the Fall there were some indications that the memory addressing of the Cortex A15 would be enhanced to allow larger memories (greater than 4GBytes) to be added to devices. Memory addressing shouldn’t be a big issue, as the Large Physical Address Extensions (LPAE) allow physical addresses of up to 40 bits; as far as I know that extension arrives with the Cortex A15 generation rather than with the current Cortex A9. However, the instructions are still the same 32-bit instruction set longtime users of the ARM architecture are familiar with, and as always are backward compatible with previous generation software. It would appear that the biggest advantage to moving to Cortex A15 would be the potential for higher clock rates, decent power management and room to grow on the die for embedded graphics.
Apple in its designs using the Cortex processors has stayed one generation behind the rest of the manufacturers and used all possible knowledge and brute force to eke out a little more power savings. Witness the iPad battery life, which still tops most other devices on the market. By creating a fully customized Cortex A8, Apple has absolutely set the bar on power management on die, and on the motherboard as well. If Samsung decides to go the route of pure power and clock, but sacrifices two cores to get the power level down, I just hope they can justify that effort with equally amazing advancements in the software that runs on this new chip. Whether it be a game or, better yet, a snazzy User Interface, they need to differentiate themselves and try to show off their new cpu.
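The jump from 32-bit to 40-bit physical addresses is bigger than it sounds. A quick calculation (the helper name is mine) shows where the 4GByte ceiling comes from and how far 40 bits moves it:

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


for bits in (32, 40):
    print(f"{bits}-bit addresses reach {addressable_bytes(bits) // GIB} GiB")
```

Eight extra address bits multiply the reachable memory by 256, from 4 GiB to 1024 GiB (1 TiB), even though the instruction set itself stays 32-bit.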
Related articles
- How fast can an ARM Cortex-A15 run? 2GHz in Samsung’s 32nm process technology. That’s fast! (eda360insider.wordpress.com)