NoSQL database supplier Couchbase says it is tweaking its key-value storage server to hook into Fusion-io’s PCIe flash ioMemory products – caching the hottest data in RAM and storing lukewarm info in flash. Couchbase will use the ioMemory SDK to bypass the host operating system’s IO subsystems and buffers to drill straight into the flash cache.
Can you hear it? It’s starting to happen. Can you feel it? The biggest single meme of the last 2 years, Big Data/NoSQL, is mashing up with PCIe SSDs and in-memory databases. What does it mean? One can only guess, but the performance gains to be had from using a product like Couchbase to overcome the limits of a traditional tables/rows SQL database will be amplified when it is optimized for and paired with PCIe SSD data stores. I’m imagining something like a 10X boost in data reads/writes on the Couchbase back end, and something closer to realtime performance from a system that might previously have been treated like a data mart/data warehouse. If the move to use the ioMemory SDK and directFS technology with Couchbase is successful, you are going to see some interesting benchmarks and white papers about the performance gains.
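To make the "hottest data in RAM, lukewarm data in flash" idea concrete, here is a minimal sketch of a two-tier cache. It is not Couchbase’s or Fusion-io’s implementation: the `TieredCache` class, its slot counts and its least-recently-used eviction policy are all assumptions chosen for illustration, and both tiers are plain Python dictionaries standing in for RAM and flash.

```python
from collections import OrderedDict


class TieredCache:
    """Toy two-tier LRU cache: a small 'RAM' tier in front of a larger 'flash' tier.

    Hypothetical illustration only; both tiers are in-process dictionaries.
    """

    def __init__(self, ram_slots, flash_slots):
        self.ram = OrderedDict()     # hottest items, most recently used last
        self.flash = OrderedDict()   # lukewarm items demoted from RAM
        self.ram_slots = ram_slots
        self.flash_slots = flash_slots

    def put(self, key, value):
        self.flash.pop(key, None)    # a key lives in exactly one tier
        self.ram[key] = value
        self.ram.move_to_end(key)
        self._demote_overflow()

    def get(self, key):
        if key in self.ram:          # RAM hit: refresh recency
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.flash:        # flash hit: promote back into RAM
            value = self.flash.pop(key)
            self.put(key, value)
            return value
        return None                  # miss: caller would go to the backing store

    def _demote_overflow(self):
        # Push the least recently used RAM items down into the flash tier.
        while len(self.ram) > self.ram_slots:
            old_key, old_value = self.ram.popitem(last=False)
            self.flash[old_key] = old_value
            self.flash.move_to_end(old_key)
        # The coldest flash items fall out of the cache entirely.
        while len(self.flash) > self.flash_slots:
            self.flash.popitem(last=False)
```

With `TieredCache(ram_slots=2, flash_slots=2)`, writing keys `a`, `b` and `c` leaves `b` and `c` in the RAM tier and demotes `a` to the flash tier; reading `a` again promotes it back and pushes `b` down in its place.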
What is Violin Memory Inc. doing in this market segment of tiered database caches? Violin is teaming with SAP to create a tiered cache for the HANA in-memory database from SAP. The SSD SAN array provided by Violin could be multi-tasked to do other duties (providing a cache to any machine on the SAN). However, this product would most likely be a dedicated caching store to speed up all operations of a RAM-based HANA installation, speeding up online transaction processing and parallel queries on realtime data. No doubt SAP users could stand to gain a lot if they are already heavily invested in the SAP universe of products. But for the more enterprising, entrepreneurial types I think Fusion-io and Couchbase could help get a legacy-free group of developers up and running with equal performance and scale. Whichever one you pick is likely to do the job once it’s been purchased, installed and is up and running in a QA environment.
Like the native API libraries, directFS is implemented directly on ioMemory, significantly reducing latency by entirely bypassing operating system buffer caches, file system and kernel block I/O layers. Fusion-io directFS will be released as a practical working example of an application running natively on flash to help developers explore the use of Fusion-io APIs.
Another interesting announcement from the folks at Fusion-io regarding their brand of PCIe SSD cards. There was a proof of concept project covered previously by Chris Mellor in which Fusion-io attempted to top out at 1 billion IOPS using a novel architecture where PCIe SSD drives were not treated as storage. In fact the Fusion-io card was turned into a memory tier, bypassing most of the OS’s own buffers and queues for handling a traditional filesystem. Doing this reaped many benefits in terms of cutting the latency inherent in a filesystem and in how it has to communicate through the OS kernel to the memory subsystem and back again.
Considering also the work done over the last 4 years or more on so-called “in-memory” databases and big data projects in general, a product like directFS might pair nicely with them. The limit with in-memory databases is always the amount of RAM available and the total number of CPU nodes managing those memory subsystems. Tack on the necessary storage to load and snapshot the database over time and you have a very traditional looking database server. However, if you supplement that traditional looking architecture with a tier of storage like directFS, the SAN becomes a 3rd tier of storage, almost like a tape backup device. It sounds more interesting the more I daydream about it.
Some interesting notes about future directions SandForce might take, especially now that SandForce has been bought out by LSI. They are hard at work attempting to optimize other parts of their current memory controller technology (speeding up small random reads and writes). There might be another 2X performance gain to be had at least on the SSD front, but more important is the PCI Express market. Fusion-io has been the team to beat when it comes to integrating components and moving data across the PCIe interface. Now SandForce is looking to come out with a bona fide PCIe SSD controller, which up until now has been a roll-your-own type of affair. The engineering and design expertise of companies like Fusion-io was absolutely necessary to get a PCIe SSD card to market. Now that playing field too will be leveled somewhat, and possibly new competitors will enter the market with equally good performance numbers.
But even more interesting than this wrinkle in the parts design for PCIe SSDs is the announcement earlier this month of Fusion-io’s new software interface for getting around the limits of File I/O on modern day OSes. Auto Commit Memory: “ACM is a software layer which allows developers to send and receive data stored on Fusion-io’s ioDrive cards directly to and from the CPU, rather than relying upon the operating system”(Link to The Verge article listed in my Fusion-io article). SandForce is up against a moving target if they hope to compete more directly with Fusion-io who is now investing in hardware AND software engineering at the same time. 1 Billion IOPS is nothing to sneeze at given the pace of change since SATA SSDs and PCIe SSDs hit the market in quantity.
Fusion-io has achieved a billion IOPS from eight servers in a demonstration at the DEMO Enterprise event in San Francisco.
The cracking performance needed just eight HP DL370 G6 servers, running Linux 220.127.116.11-45 on two, 6-core Intel processors, 96GB RAM. Each server was fitted with eight 2.4TB ioDrive2 Duo PCIe flash drives; that’s 19.2TB of flash per server and 153.6TB of flash in total.
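The capacity figures in that excerpt are easy to check, and the same numbers give a rough feel for what each box had to deliver. The sketch below only redoes the arithmetic; the per-server and per-card IOPS figures assume the load was spread evenly, which the excerpt does not actually state.

```python
SERVERS = 8
CARDS_PER_SERVER = 8
TB_PER_CARD = 2.4
TOTAL_IOPS = 1_000_000_000

# Capacity, as quoted in the excerpt.
flash_per_server_tb = CARDS_PER_SERVER * TB_PER_CARD
total_flash_tb = SERVERS * flash_per_server_tb

# Assumption: the billion IOPS were split evenly across servers and cards.
iops_per_server = TOTAL_IOPS // SERVERS
iops_per_card = TOTAL_IOPS // (SERVERS * CARDS_PER_SERVER)

print(f"{flash_per_server_tb:.1f} TB per server, {total_flash_tb:.1f} TB total")
print(f"{iops_per_server:,} IOPS per server, {iops_per_card:,} IOPS per card")
```

Under that even-split assumption each server would be handling 125 million IOPS, or roughly 15.6 million per card.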
This is, in short, no mean feat. 1 million IOPS was the target to beat just 2 years ago for anyone attempting to buy or build their own Flash-based storage from the top enterprise-level manufacturers. So the bar has risen no less than 3 orders of magnitude above that earlier top end. Add to that the magic sauce of bypassing the host OS and using the Flash memory as just an enhanced, larger pool of memory.
This makes me wonder, how exactly does the Flash memory get used alongside the RAM memory pool?
How do the Applications use the Flash memory, and how does the OS use it?
Those are the details I think no one other than Fusion-io can provide as a value-add beyond the PCIe-based flash memory modules themselves. Instead of hardware being the main differentiator (drive controllers, Single Level Cells, etc.), Fusion-io is using a different path through the OS to the Flash memory. The file I/O system traditionally tied to hard disk storage, and more generically to ‘storage’ of some kind, is being sacrificed. But I understand the logic, design and engineering of bypassing the overhead of the ‘storage’ route and redefining the Flash memory as another form of system memory.
Maybe the old style Von Neumann architecture or Harvard architecture computers are too old school for this new paradigm of a larger tiered memory pool, with DRAM and Flash memory modules constituting the most important parts of the computer. Maybe disk storage could be used as a mere backup of the data held in the Flash memory? Hard to say, and I think Fusion-io is right to hold this info close as they might be able to make this a more general case solution to the I/O problems facing some customers (not just Wall Street type high frequency traders).
Fusion-io has crammed eight ioDrive flash modules on one PCIe card to give servers 10TB of app-accelerating flash.
This follows on from its second generation ioDrives: PCIe-connected flash cards using single level cell and multi-level cell flash to provide from 400GB to 2.4TB of flash memory, which can be used by applications to get stored data many times faster than from disk. By putting eight 1.28TB multi-level cell ioDrive 2 modules on a single wide ioDrive Octal PCIe card Fusion reaches a 10TB capacity level.
This is some big news in the fight to be king of the PCIe SSD market. I declare: Advantage Fusion-io. They now have the lead in terms of not just speed but also overall capacity at the price point they have targeted. As densities increase and prices more or less stay flat, the value add is that more data can stay resident on the PCIe card and not be swapped out to Fibre Channel array storage on the Storage Area Network (SAN). Performance is likely to be wicked cool and early adopters will no doubt reap big benefits in transaction processing and online analytic processing as well.
Between the RevoDrive and the Z-Drive, OCZ is tearing up the charts with product releases announced at the Computex 2011 trade show in Taipei, Taiwan. This particular one-off demonstration used a number of OCZ’s announced but as yet unreleased Z-Drive R4 88 cards packed into a 3U Colfax International enclosure. In other words, it’s an idealized demonstration of what kind of performance you might achieve in a best case scenario. The speeds are in excess of 3Gbytes/sec. for writing and reading, which for web serving or database hosting is going to make a big difference for people who need the I/O. Previously you would have had to use a very expensive large scale Fibre Channel hard drive array that split and RAID’d the data across so many spinning hard drive spindles that you might come partially close to matching these speeds. But the SIZE! Ohmigosh. You would not be able to fit that amount of hardware into a 3U enclosure, never. So space constrained data centers will benefit enormously from dumping some of their drive array infrastructure for these more compact I/O monsters (some are from other manufacturers too, like Violin, RamSan and Fusion-io). Again, as I have said before, when Anandtech and Tom’s Hardware can get sample hardware to benchmark I will be happy to see what else these PCIe SSDs can do.
This sounds like an interesting evolution of the SSD type of storage. But I don’t know if there is a big advantage in forcing a RAM memory controller to be the bridge to a Flash memory controller. In terms of bandwidth, the speed seems comparable to a 4x PCIe interface. I’m thinking now of how it might compare to PCIe-based SSDs from OCZ or Fusion-io. It seems like the advantage is still held by PCIe in terms of total bandwidth and capacity (above 500MB/sec and 2 Terabytes total storage). It may be slightly lower in cost, but the use of Single Level Cell Flash memory chips raises the cost considerably for any given size of storage, and this product from Viking uses Single Level Cell flash memory. I think if this product ships, it will not compete very well against products like consumer level SSDs, PCIe SSDs, etc. However, if they continue to develop the product and evolve it, there might be a niche where it can be performance or price competitive.
The main categories here are SF-2100, SF-2200, SF-2500 and SF-2600. The 2500/2600 parts are focused on the enterprise. They’re put through more aggressive testing, their firmware supports enterprise specific features and they support the use of a supercap to minimize dataloss in the event of a power failure. The difference between the SF-2582 and the SF-2682 boils down to one feature: support for non-512B sectors. Whether or not you need support for this really depends on the type of system it’s going into. Some SANs demand non-512B sectors in which case the SF-2682 is the right choice.
The cat is out of the bag: OCZ has not one but two SandForce SF-2000 series based SSDs out on the market now. And performance-wise the consumer level product is even slightly better than the enterprise level product, at less cost. These indeed are interesting times. The newer SandForce drive controllers are so fast that with a SATA 6Gb/s drive interface you get speeds close to what could previously only be purchased on a PCIe-based SSD drive array for $1200 or so. The economics of this are getting topsy-turvy, with new generations of single drives outdistancing previous top-end products (I’m talking about you, Fusion-io, and you, Violin Memory). SandForce has become the drive controller for the rest of us, and with speeds like this, 500MB/sec. read and write, what more could you possibly ask for? I would say the final bottleneck on the desktop/laptop computer is quickly vanishing and we’ll have to wait and see just how much faster SSD drives become. My suspicion is that a computer motherboard’s BIOS will now slowly creep up to be the last link in the chain of noticeable computer speed. Once we get a full range of UEFI motherboards and fully optimized embedded software to configure them we will theoretically have the fastest personal computers one could possibly design.
One cannot make this stuff up: two weeks ago Angelbird announced its bootable PCI Express SSD. Late yesterday OCZ, one of the biggest third-party aftermarket makers of SSDs, announced a new PCI Express SSD which is also bootable. The big difference between the Angelbird product and OCZ’s RevoDrive is the throughput on the top end. If you purchase the most expensive fully equipped card from either manufacturer you will get 900+MBytes/sec. on the Angelbird versus 700+MBytes/sec. on the RevoDrive from OCZ. Other differences include the ‘native’ support of the OCZ on the host OS. I think this means that they aren’t using a ‘virtual OS’ on the embedded chips to boot so much as having the PCIe drive electronics make everything appear to be a real native boot drive. Angelbird uses an embedded OS to virtualize and abstract the hardware so that you get to boot any OS you want and run it off the flash memory onboard.
The other difference I can see from reading the announcements is that only the largest configured size of the Angelbird gets you the fastest throughput. As drives are added, the RAID array is striped over more available flash drives. The OCZ product also uses a RAID array to increase speed; however, it hits the maximum throughput at an intermediate size (the ~250GByte configuration) as well as at the maximum size. So if you want a ‘normal’ to ‘average’ amount of storage but better throughput, you don’t have to buy the maxed out, most expensive version of the OCZ RevoDrive to get there. That means this could come in at a more manageable price for the gaming market or for the PC fanboys who want faster boot times. Don’t get me wrong though, I’m not recommending buying an expensive 250GByte RevoDrive if a similarly sized SATA SSD costs a good deal less. No, far from it, the speed difference may not be worth the price you pay. But the RevoDrive could be upgraded over time and keep your speeds at the max 700+MBytes/sec. you get with its high throughput intermediate configuration. Right now, I don’t have any prices to compare for either the Angelbird or OCZ RevoDrive products. I can tell you however that the Fusion-io low end desktop product is in the $700-$800 range and doesn’t come with upgradeable storage; you get a few sizes to choose from, and that’s it. If either of the two products ships at a price significantly less than the Fusion-io product, everyone will flock to them, I’m sure.
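For anyone wondering why adding flash drives to a striped array raises throughput, here is a minimal sketch of round-robin (RAID 0 style) striping. The function names, the four-drive layout and the 200 MB/s per-drive figure are all illustrative assumptions, not details from either the Angelbird or OCZ announcements.

```python
def stripe_location(logical_block, num_drives):
    """Return (drive index, block index on that drive) for round-robin striping."""
    return logical_block % num_drives, logical_block // num_drives


def ideal_throughput(num_drives, mb_per_s_per_drive):
    """Best-case aggregate throughput if every drive streams in parallel."""
    return num_drives * mb_per_s_per_drive


# Hypothetical card with 4 flash drives: consecutive blocks rotate across drives.
layout = [stripe_location(block, 4) for block in range(8)]
print(layout)

# Per-drive figure is illustrative, not a spec from either vendor.
print(ideal_throughput(4, 200))
```

Because consecutive blocks land on different drives, a large sequential read can keep every drive busy at once, which is where the best-case multiplication comes from. Real cards fall short of that ideal once controller and bus overheads are counted.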
One other significant feature touted by both product announcements is the SandForce SF-1200 flash controller. Right now that controller is the de facto standard high throughput part everyone is using for SATA SSD products. There’s even a higher-end part on the market called the SF-1500 (their top end offering). So it’s de rigueur to include the SandForce SF-1200 in any product you hope to sell to a wide audience (especially hardware fanboys). However, let me caution you that in the flurry of product announcements, and always keeping an eye on preventing buyer’s remorse, SandForce did announce very recently a new drive controller they have labelled the SF-2000 series. This part may or may not be targeted at the consumer desktop market, but depending on how well it performs once it starts shipping you may want to wait and see if the next revision of this crop of newly announced PCIe cards adopts the new SandForce controller chip to gain the extra throughput it is touting. The new controller is rated at 740MBytes/sec. all by itself. With 4 SSDs attached to it on a PCIe card, theoretically four times 740 equals 2,960, and that is a substantially large quantity of data coming through the PCI Express data bus. Luckily for most of us the PCI Express interface on a 4X (four lane) data bus has a while to go before it gets saturated by all this disk throughput. The question is how long will it take to overwhelm a four lane PCI Express connector? I hope to see the day this happens.
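The saturation question comes down to simple arithmetic, sketched below. The 740 MBytes/sec. controller rating comes from the announcement discussed above; the per-lane bandwidth figures are my assumptions (roughly 250 MB/s per lane for first-generation PCI Express and 500 MB/s per lane for PCIe 2.0, after encoding overhead), and the function names are made up for this example.

```python
import math


def aggregate_mb_per_s(controllers, mb_per_s_each):
    """Theoretical sum if every controller runs at its rated speed at once."""
    return controllers * mb_per_s_each


def lanes_needed(total_mb_per_s, mb_per_s_per_lane):
    """Smallest whole number of PCIe lanes able to carry the given throughput."""
    return math.ceil(total_mb_per_s / mb_per_s_per_lane)


total = aggregate_mb_per_s(4, 740)
print(total)

# Assumed usable bandwidth per lane, per direction (not from the announcements).
for generation, per_lane in (("PCIe 1.x", 250), ("PCIe 2.0", 500)):
    print(generation, "x4 link:", 4 * per_lane, "MB/s;",
          "lanes needed for", total, "MB/s:", lanes_needed(total, per_lane))
```

Under those assumed per-lane figures, the theoretical four-controller sum would already be more than a four-lane link can carry, while the 700 to 900 MBytes/sec. cards announced so far sit comfortably below it. Real-world throughput would be lower than the rated sum in any case.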
First let’s just take a quick look backwards to see what was considered state of the art a year ago. A company called STEC was making Flash-based hard drives and selling them to big players in the enterprise storage market like IBM and NetApp. I depend solely on The Register for this information, as you can read here: STEC becalmed as Fusion-io streaks ahead
STEC flooded the market, according to The Register, and subsequently the people using their product were suddenly left with a glut of these Fibre Channel based Flash Drives (Solid State Disk Drives – SSD). The gains in storage array performance followed. However, the supply exceeded the demand and EMC is stuck with a raft of last year’s product that it hasn’t marked up and re-sold to its current customers. That created an opening for a similar but sexier product: Fusion-io and its PCIe-based Flash hard drive. Why sexy?
The necessity of a Fibre Channel interface for the Enterprise Storage market has long been an accepted performance standard. You need at minimum the theoretical 6GB/sec of FC interfaces to compete. But for those in the middle levels of the Enterprise who don’t own the heavy iron of giant multi-terabyte storage arrays, there was/is now an entry point through the magic of the PCIe 2.0 interface. Any given PC whether a server or not will have open PCIe slots in which a Fusion-io SSD card could be installed. That lower threshold (though not a lower price necessarily) has made Fusion-io the new darling for anyone wanting to add SSD throughput to their servers and storage systems. And now everyone wants Fusion-io not the re-branded STEC Fibre Channel SSDs everyone was buying a year ago.
Anyone who has studied history knows in the chain of human relations there’s always another competitor out there that wants to sit on your head. Enter LSI and Seagate with a new product for the wealthy, well-heeled purchasing agent at your local data center: LSI and Seagate take on Fusion-io with flash
Rather than create a better/smarter Fibre Channel SSD, LSI and Seagate are assembling a card that plugs into a PCIe slot of a storage array or server to act as a high speed cache for the slower spinning disks. The Register refers to three form factors in the market now: RamSan, STEC and Fusion-io. Because Fusion-io seems to have moved into the market at the right time and is selling like hot cakes, LSI/Seagate are targeting that particular form factor with the SSS6200.
STEC is also going to create a product with a PCIe interface and Micron is going to design a product too. LSI’s product will not be available to ship until the end of the year. In terms of performance, the speeds being targeted are comparable between the Fusion-io Duo and the LSI SSS6200 (both using single level cell memory). So let the price war begin! Once we finally get some competition in the market I would hope the entry level price of Fusion-io (~$35,000) finally erodes a bit. It is a premium product right now intended to help some folks do some heavy lifting.
My hope for the future is that we could see something comparable (though much less expensive and scaled down) available on desktop machines. I don’t care if it’s built into a spinning SATA hard drive (say as a high speed but very large cache) or some kind of card plugging into a bus on the motherboard (like the failed Intel Speed Boost cache). If a high speed flash cache could become part of the standard desktop PC architecture to sit in front of monstrous single hard drives (2TB or higher nowadays) we might get faster response from our OS of choice, and possibly better optimization of reads/writes to fairly fast but incredibly dense and possibly more error prone HDDs. I say this after reading about the big charge by Western Digital to move from smaller blocks of data to the 4K block.
Much wailing and gnashing of teeth has accompanied the recent move by WD to address the issue of error correcting Cyclic Redundancy Check (CRC) algorithms on hard drives. Because 2 Terabyte drives have so many 512-byte blocks, more and more time and space is taken up doing the CRC check as data is read from and written to the drive. A larger block made up of 4096 bytes instead of 512 makes the whole thing 4x less wasteful and possibly more reliable, even if some space is wasted on small text files or web pages. I understand the implication completely, and even more so, old-timers like Steve Gibson at GRC.com understand the danger of ever larger single hard drives. The potential for catastrophic loss of data as more data blocks need to be audited can become numerically overwhelming to even the fastest CPU and SATA bus. I think I remember Steve Gibson expressing doubts as to how large hard drives could theoretically become.
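To put a number on "so many blocks", here is a quick back-of-the-envelope count of how many sectors a 2 Terabyte drive has to keep track of at each block size. It assumes the block sizes in question are 512-byte and 4096-byte sectors and that 2 Terabytes means 2 x 10^12 bytes, as drive makers count it; it says nothing about how much error correcting data each sector carries.

```python
def sector_count(capacity_bytes, sector_bytes):
    """Number of whole sectors that fit in the given capacity."""
    return capacity_bytes // sector_bytes


TWO_TB = 2 * 10**12  # assumption: decimal terabytes, as drive makers count them

small = sector_count(TWO_TB, 512)
large = sector_count(TWO_TB, 4096)

print(f"512-byte sectors:  {small:,}")
print(f"4096-byte sectors: {large:,}")
print(f"ratio: {small // large}x fewer sectors to check")
```

That works out to about 3.9 billion small sectors versus about 488 million large ones, so the drive has one eighth as many individual blocks to audit.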
As the creator of the SpinRite data recovery utility he knows fundamentally the limits of the design of the Parallel ATA interface. Despite advances in speeds, error-correcting hasn’t changed and neither has the quality of the magnetic medium used on the spinning disks. One thing that has changed is the physical size of the blocks of data. They have gotten infinitesimally smaller with each larger size of disk storage. The smaller the block of data, the more error correcting must be done. The more error-correcting, the more space is needed to write the error-correcting information. Gibson himself observes that something as random as cosmic rays can flip bits within a block of data at the incredibly small scales of a block on a 2TByte disk.
So my hope for the future is a new look at the current state of the art in motherboard, chipset and I/O bus architecture. Let’s find a middle level, safe area to store the data we’re working on, one that doesn’t spontaneously degrade and isn’t too susceptible to random errors (i.e. cosmic rays). Let the Flash caches flow, let’s get better throughput and let’s put disks into the class of reliable but slower backing stores for our SSDs.