This is a really good technical article on the attempts made by Micron and Intel to fix read/write errors in their solid state memory based on Flash memory chips. Each revision of their design and manufacturing materials helps decrease the size of the individual memory cells on the flash memory chip; however, as the design rules (the distance between the wires) shrink, random errors increase. And the materials themselves suffer fatigue with each read and write cycle. That fatigue is due in no small part (pun intended) to the size, specifically the thickness, of some layers in the sandwich that makes up a flash memory cell. Thinner materials just wear out quicker. Typically this wearing out was addressed by adding extra unused memory cells that could act as spares whenever a cell finally gave up the ghost and stopped working altogether. Another technique is to spread reads and writes over an area greater (sometimes 23% bigger) than the capacity advertised on the outside of the packaging. This is called wear leveling, and it's like rotating your tires to ensure they don't develop bare patches too quickly.
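To make the tire-rotation analogy concrete, here is a toy wear-leveling sketch in Python. It is purely illustrative, not any vendor's actual algorithm; the class name, block counts, and the 23% spare figure are just placeholders. Each write lands on the least-worn free physical block, so even hammering one logical address spreads erases across the whole pool.

```python
# Toy wear leveler: remap logical blocks to the least-worn free
# physical block so erase cycles spread evenly across the chip.

class WearLeveler:
    def __init__(self, logical_blocks, spare_blocks):
        # Over-provisioning: more physical blocks than the drive advertises.
        total = logical_blocks + spare_blocks
        self.erase_counts = [0] * total
        self.mapping = {}                  # logical -> physical
        self.free = set(range(total))

    def write(self, logical):
        old = self.mapping.get(logical)
        if old is not None:
            self.free.add(old)             # retire the old copy to the free pool
        # Place the new data on the least-worn free block.
        target = min(self.free, key=lambda b: self.erase_counts[b])
        self.free.remove(target)
        self.erase_counts[target] += 1
        self.mapping[logical] = target

wl = WearLeveler(logical_blocks=100, spare_blocks=23)  # ~23% over-provisioned
for _ in range(12_300):
    wl.write(0)                            # hammer a single logical block
print(max(wl.erase_counts) - min(wl.erase_counts))     # prints 0: wear stays even
```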
All these techniques will only go so far as the sizes and thicknesses continue to shrink. So, taking a page out of the bad old days of computing, we are back to Error Correcting Codes, or ECC. When memory errors were common and you needed to guarantee your electronic logic was not creating spontaneous errors, extra bits of data called parity bits would be woven into all the operations to ensure something didn't accidentally flip from a 1 to a 0. ECC memory is still widely used in data center computers that need to guarantee bits don't get spontaneously flipped by, say, a stray cosmic ray raining down upon us. Now, however, ECC is becoming the next tool after spare memory cells and wear leveling to ensure flash memory can continue to shrink and still be reliable.
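As a minimal illustration of the parity idea, here is a sketch of the simplest detect-only scheme. Real flash ECC uses stronger codes that can also correct errors, not just detect them; this just shows how one extra bit exposes a single flipped bit.

```python
# Even parity: append one bit so the count of 1s in the word is even.
# A single flipped bit makes the count odd, exposing the error.

def add_parity(bits):
    return bits + [sum(bits) % 2]

def parity_ok(word):
    return sum(word) % 2 == 0

word = add_parity([1, 0, 1, 1, 0, 0, 1])
assert parity_ok(word)        # clean word passes the check

word[3] ^= 1                  # a stray cosmic ray flips one bit...
assert not parity_ok(word)    # ...and the check catches it
```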
One method in operation today is to build the ECC memory controllers into the Flash memory modules themselves. This raises the cost of the chip but lowers the cost to the manufacturer of a Solid State Disk or MP3 player: they don't have to add error correction after the fact or buy another part and integrate it into their design. The other, more 'state of the art' method is to build the error correction into the Flash memory controller (as opposed to the memory cells), providing much more leeway in how it can be implemented and updated over time. As it turns out, the premier designer of Flash memory controllers, SandForce, already does this in the current shipping version of its SF-1200 Flash memory controller. SandForce still has two more advanced controllers yet to hit the market, so its position is only going to get stronger given that ECC is already in its current shipping product.
Which way the market goes will depend on how low the target price is for the final shipping product. Low margin, high volume goods will most likely skip the extra error correction and take their chances. Higher end goods may adopt the embedded ECC from Micron and Intel. Top of the line data center purchasers will not stray far from the cream of the crop, high margin SandForce controllers, as those still provide great performance and value even in their early generations.
Being a student of the history of technology, I know the silicon semiconductor industry has been able to scale production according to Moore's Law. But apart from the advances in how small the transistors can be made (the real basis of Moore's Law), the other scaling factor has been the size of the wafers. Back in the old days, silicon crystals had to be drawn out of a furnace at a very even, steady rate, which forced them to be thin cylinders 1-2″ in diameter. As techniques improved (including a neat trick where the crystal was re-melted to purify it), the crystals increased in diameter to a nice 4″ size that helped bring down costs. Then came the big migrations to 6″ wafers, then 8″, and now the 300mm wafer (roughly 12″). Intel is still on its freight train to bring down costs further by moving up to the next largest wafer size (450mm) while still shrinking the parts (down to an unbelievably skinny 22nm). As wafers grow, the cost of processing equipment goes up, and so does the cost of the whole production facility. The big price tag for a new Intel production fab was always around $2 billion. There may be multiple production lines in that fab, but you needed that money upfront to be competitive. And Intel was more than competitive: it could put 3 lines into production in 3 years (blowing the competition out of the water for a while) and make things very difficult for the rest of the industry.
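The economics come straight from the geometry: chip count scales with wafer area, which goes up with the square of the diameter. A back-of-the-envelope sketch, where the 100 mm² die size is hypothetical and edge loss and defects are ignored:

```python
# Why bigger wafers cut cost per die: area grows with diameter squared.
import math

def dies_per_wafer(wafer_mm, die_mm2):
    area = math.pi * (wafer_mm / 2) ** 2
    return int(area / die_mm2)

die = 100  # hypothetical 100 mm^2 die
for d in (100, 150, 200, 300, 450):   # 4", 6", 8", 300mm, 450mm
    print(f"{d}mm wafer: ~{dies_per_wafer(d, die)} dies")
# A 450mm wafer has (450/300)^2 = 2.25x the area of a 300mm wafer,
# so roughly 2.25x the dies for a similar per-wafer processing cost.
```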
Where things will really shake up is in the Flash memory production lines. The design rules for current flash memory chips at Intel are right around 22nm. Intel and Samsung are both trying to shrink the feature sizes of all the circuits on their single- and multi-level cell Flash memory chips. Add to this the stacking of chips into super sandwiches, and they can glue together 8 of their 8GByte chips, making a single, very thin 64GByte memory package. That package is mated to a memory controller and, voila, the iPhone suddenly hits 64GBytes of storage for all your apps and MP4s from iTunes. Things should improve just as dramatically at the hard drive end of the scale. Solid State Disk capacities should keep creeping upward (beyond today's top of the line 512GByte SSDs), as should PCI Express based storage devices (probably doubling in capacity to 2 terabytes) once 450mm wafers take hold across the semiconductor industry. So it's going to be a big deal if Chinese, Japanese and American companies get on the large silicon wafer bandwagon.
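The capacity math behind those stacked super sandwiches is simple multiplication. A quick sketch using the article's figures; the 16GByte follow-on die is my own illustrative extrapolation, not an announced part:

```python
# Package capacity = dies per stack x capacity per die.
dies_per_stack = 8
for die_gbyte in (8, 16):   # 8GByte today; 16GByte after a shrink (illustrative)
    total = dies_per_stack * die_gbyte
    print(f"{dies_per_stack} x {die_gbyte} GByte dies = {total} GByte package")
# 8 x 8  GByte = 64 GByte  -> the iPhone's 64GBytes of storage
# 8 x 16 GByte = 128 GByte -> where a process shrink takes the same stack
```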
I need to remind myself of this cartoon every time I see the word "values" appear in mission statements.