Race to sleep is the new, new thing for mobile cpus. Power conservation is all done by parceling out a task across more cores, or running it at a higher clock speed, so that it finishes sooner. All cores execute and complete the task, then the cores are put to sleep or into a much lower power state. That’s how you get things done and maintain a 10 hour battery life for an iPad Air or iPhone 5s.
So even though a mobile processor could be the equal of the average desktop cpu, it’s the race to sleep state that is the big differentiator now. That is what Apple’s adoption of a 64bit ARM vers. 8 architecture is bringing to market: the race to sleep. At the very beginning of the hints and rumors, 64bit seemed more like an attempt to address more DRAM or gain some desktop level performance capability. But it’s all for the sake of executing quickly and going into sleep mode to preserve battery capacity.
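To make the race to sleep argument concrete, here is a minimal sketch of the energy arithmetic. The wattages and timings are made-up numbers picked purely for illustration, not measurements of any Apple chip, and the function name is my own. It compares a slow core that stays busy for a whole time window against a faster, hungrier core that finishes early and sleeps for the rest.

```python
def task_energy_joules(active_watts, sleep_watts, busy_seconds, window_seconds):
    """Energy used over a fixed window: run the task, then sleep for the remainder."""
    if busy_seconds > window_seconds:
        raise ValueError("task does not finish inside the window")
    idle_seconds = window_seconds - busy_seconds
    return active_watts * busy_seconds + sleep_watts * idle_seconds


# Hypothetical figures (assumptions, not vendor data).
WINDOW_SECONDS = 10.0
SLEEP_WATTS = 0.05

# Slow core: 1 W, needs the whole window to finish the task.
slow_joules = task_energy_joules(1.0, SLEEP_WATTS, 10.0, WINDOW_SECONDS)

# Fast core: draws twice the power but finishes in 3 s, then sleeps for 7 s.
fast_joules = task_energy_joules(2.0, SLEEP_WATTS, 3.0, WINDOW_SECONDS)

print(f"slow and steady: {slow_joules:.2f} J")
print(f"race to sleep:   {fast_joules:.2f} J")
```

With these assumed numbers the faster core uses less total energy even though it draws more power while awake. The result flips if the faster core's power draw rises more steeply than its speedup, which is why the sleep state matters as much as the raw performance.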
I’m thinking now of some past articles covering the nascent, emerging market for lower power, massively parallel data center servers. 64bits was an absolutely necessary first step to get ARM cpus into blades and rack servers destined for low power data centers. Memory addressing is considered a non-negotiable feature that even the most power efficient server must have. It doesn’t matter what CPU it is designed around, memory addressing HAS got to be 64bits or it cannot be considered. That rule still applies today and will be the sticking point for folks sitting back and ignoring the Tilera architecture or SeaMicro’s interesting cloud in a box designs. To date, it seems like Apple was first to market with a 64bit ARM design, without ARM actually supplying the base circuit design and layouts for the new generation of 64bit ARM. Apple instead did the heavy lifting and engineering themselves to get the 64bit memory addressing it needed to continue its drive to better battery life. Time will tell if this will herald other efficiency or performance improvements in raw compute power.
During Intel’s annual investor day on Thursday, CEO Paul Otellini outlined the company’s plan to leverage its multi-billion-dollar chip fabrication plants, thousands of developers and industry sway to catch up in the lucrative mobile device sector, reports Forbes.
But what you are seeing is a form of Fear, Uncertainty and Doubt (FUD) being spread about to sow the seeds of mobile Intel processor sales. The doubt is not as obvious as questioning the performance of ARM chips, or the ability of manufacturers like Samsung to meet their volume targets and reject rates for each new mobile chip. No, it’s more subtle than that and only noticeable to people who know details like what design rule Intel is currently using versus that which is used by Samsung or TSMC (Taiwan Semiconductor Manufacturing Company). Intel is currently just releasing its next gen 22nm chips as companies like Samsung are still trying to recoup their investment in 45nm and 32nm production lines. Apple is just now beginning to sample some 32nm chips from Samsung in iPad 2 and Apple TV products. Its current flagship iPad and iPhone models both use a 45nm chip produced by Samsung. Intel is trying to say that the old generation technology, while good, doesn’t have the weight of Intel’s massive investment in next generation chip technology behind it. The new chips will be smaller, more energy efficient and less expensive, all the things needed to make a higher profit on consumer devices using them. However, Intel doesn’t do ARM chips. It has Atom, and that is the one thing that has hampered any big design wins in cellphone or tablet designs to date. At any given design rule, ARM chips almost always use less power than a comparably sized Atom chip from Intel. So whether it’s really an attempt to spread FUD can easily be debated one way or another. But the message is clear: Intel is trying to fight back against ARM. Why? Let’s turn back the clock to March of this year in a previous article also appearing in Apple Insider:
This article is referenced in the original article quoted at the top of the page. And it points out why Intel is trying to get Apple to take notice of its own mobile chip commitments. Apple designs its own chips and has the manufacturing contracted out to a foundry. To date Samsung has been the sole source of the A-series processors used in iPhone/iPod/iPad devices, while Apple is trying to get TSMC up to speed as a second source. Meanwhile sales of the Apple devices continue to grow handsomely in spite of these supply limits. More important to Intel is the blistering growth in spite of being on older foundry technology and design rules. Intel has a technological and investment advantage over Samsung now. They do not, however, have a chip that is BETTER than Apple’s in house designed ARM chip. That’s why the underlying message for Intel is that it has to make its Atom chip so much better than an A4, A5 or A5X at ANY design rule that Apple cannot ignore Intel’s superior design and manufacturing capability. Apple will still use Intel chips, but not in its flagship products until Intel achieves that much greater level of technical capability and sophistication in its mobile microprocessors.
Intel is planning a two-pronged attack on the smartphone and tablet markets, with dual Atom lines going down to 14 nanometers and Android providing the special sauce to spur sales.
Lastly, Ian Thomson from The Register weighs in, looking at what the underlying message from Intel really is. It’s all about the future of microprocessors for the consumer market. However, the emphasis in this article is that Android OS devices, whether they be phones, tablets or netbooks, will be the way to compete AGAINST Apple. But again it’s not Apple as such, it’s the microprocessor Apple is using in its best selling devices that scares Intel the most. Intel has since its inception been geared towards the ‘mainstream’ market, selling into Enterprises and the Consumer area for years. It has milked the desktop PC revolution that it more or less helped create, starting with its forays into integrated microprocessor chips and chipsets. It reminds me a little of the old steel plants that existed in the U.S. during the 1970s as Japan was building NEW steel plants that used a much more energy efficient design and a steel making technology that created a higher quality product. Less expensive, higher quality steel was only possible by building brand new steel plants. But the old line U.S. plants couldn’t justify the expense and so just wrapped up and shut down operations all over the place. Intel, while it is able to make that type of investment in newer technology, is still not able to create the energy saving mobile processor that will outperform an ARM core cpu.
Unsung Heroes of Tech Back in the late 1970s you wouldn’t have guessed that this shy young Cambridge maths student named Wilson would be the seed for what has now become the hottest-selling microprocessor in the world.
This is an amazing story of how a small computer company in Britain was able to jump into the chip design business and accidentally create a new paradigm in low power chips. It’s astounding what seemingly small groups can come up with, creating complete product categories unto themselves. The BBC Micro was the single most important project that kept the company going, and was produced as a learning aid for the BBC television show The Computer Programme, a part of the BBC Computer Literacy Project. From that humble beginning of making the BBC Micro, Furber and Wilson’s ability to engineer a complete computer was well demonstrated.
But whereas the BBC Micro used an off the shelf MOS 6502 cpu, a later computer used a custom (bespoke) chip designed in house by Wilson and Furber. This is the vaunted Acorn RISC Machine (ARM) used in the Archimedes desktop computer. And that one chip helped launch a revolution unto itself, in that the very first time they powered up a sample chip, the multimeter hooked up to it registered no power draw. At first one would think this was a flaw, and ask “What the heck is happening here?” But when further inspection showed that the multimeter was correct, the engineers discovered that the whole cpu was running off power that was leaking from the logic circuits within the chip itself. Yes, this first sample chip of the ARM cpu in 1985 ran on 1/10 of a watt of electricity. And that ‘bug’ then went on to become a feature in later generations of the ARM architecture.
Today we know of the ARM cpu cores as a bit of licensed Intellectual Property that any chip maker can acquire and implement in their mobile processor designs. It has come to dominate many different designs from manufacturers as diverse as Qualcomm and Apple Inc. But none of it ever would have happened were it not for that somewhat surprising discovery of how power efficient that first sample chip really was when it was plugged into a development board. So thank you Sophie Wilson and Steve Furber, as the designers and engineers of today are able to stand upon your shoulders the way you once stood on the shoulders of the people who designed the MOS 6502.
Three dimensional transistors are in the news again. Previously Intel announced they were adopting a new transistor design at the next smaller design rule for the Ivy Bridge generation of Intel CPUs. Now ARM is doing work to integrate similar technology into their ARM cpu cores as well. No doubt the need to lower Thermal Design Point (TDP) while maintaining clock speed is driving this move to refine and narrow the design rules for the ARM architecture. Knowing Intel is still the top research and development outfit for silicon semiconductors would give pause to anyone directly competing with them, but ARM is king of the low power semiconductor and keeping pace with Intel’s design rules is an absolute necessity.
I don’t know how quickly ARM is going to be able to get a licensee to jump onboard and adopt the new design. Hopefully a large operation like Samsung can take this on and get the chip into its design, development and production lines at a chip fabrication facility as soon as possible. Likewise other contract manufacturers like Taiwan Semiconductor Manufacturing Company (TSMC) should try to get this chip into their facilities quickly too. That way the cell-phone and tablet markets can benefit as well, since they use a lot of ARM licensed cpu cores and similar intellectual property in their shipping products. My interest is not so much in the competition between Intel and ARM for low power computing, but more in the overall performance of any single ARM design once it’s been in production for a while and optimized the way Apple designs its custom CPUs using ARM licensed cpu cores. The single most outstanding achievement of Apple in their design and production of the iPad is the battery charge duration of 10 hours. To date that achievement has not been beaten, even by other manufacturers and products who also license ARM intellectual property. So if the ARM design is good and can be validated and prototyped with useful yields quickly, Apple will no doubt be the first to benefit, and by way of Apple so will the consumer (hopefully).
APM expects that even with a late 2012 launch it will have a 1 – 2 year lead on the competition. If it can get the X-Gene out on time, hitting power and clock targets both very difficult goals, the headstart will be tangible. Note that by the end of 2012 we’ll only just begin to see the first Cortex A15 implementations. ARMv8 based competitors will likely be a full year out, at least.
It’s nice to get a confirmation of the production time lines for the Cortex A15 and the next generation ARM version 8 architecture. So don’t expect to see shipping chips, much less finished product using those chips, until well into 2013 or even later. As for the 4 core ARM A15, finished product will not appear until well into 2012. This means if Intel is able to scramble, they have time to further refine their Atom chips to reach the power level and Thermal Design Point (TDP) of the competing ARM version 8 architecture. What seems to be the goal is to jam in more cores per CPU socket than is currently done on the Intel architecture (up to almost 32 in one of the graphics presented with the article).
The target we are talking about is 2W per core @ 3GHz, and it is going to be a hard, hard target to hit for any chip designer or manufacturer. One can only hope that TSMC can help APM get a finished chip out the door on its finest ruling chip production lines (although an update to the article indicates it will ship on 40nm to get it out the door quicker). The finer the ruling of signal lines on the chip, the lower the TDP and the higher they can run the clock rate. If ARM version 8 can accomplish the goal of 2W per cpu core @ 3 Gigahertz, I think everyone will be astounded. And if this same chip can be sampled at the earliest prototype stages by a current ARM server manufacturer like, say, Calxeda or even SeaMicro, then hopefully we can get benchmarks to show what kind of performance can be expected from the ARM v.8 architecture and instruction set. These will be interesting times.
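To get a feel for what 2W per core means at the socket level, here is a quick back-of-the-envelope sketch. It takes the 2W @ 3GHz target at face value and multiplies it out for a few core counts, up to the roughly 32 cores per socket shown in the article's graphics. The function name is my own, and the sum deliberately ignores caches, memory controllers and I/O, so treat it as a floor rather than a real TDP estimate.

```python
WATTS_PER_CORE = 2.0  # target quoted above
CLOCK_GHZ = 3.0       # target quoted above


def socket_core_power(cores, watts_per_core=WATTS_PER_CORE):
    """Power for the cores alone; uncore (cache, memory controller, I/O) is not counted."""
    return cores * watts_per_core


for cores in (4, 8, 16, 32):
    watts = socket_core_power(cores)
    print(f"{cores:2d} cores @ {CLOCK_GHZ:.0f} GHz: {watts:.0f} W for the cores alone")
```

Even at 32 cores the cores alone would come to 64W under this assumption, which shows why hitting the per-core target is the whole game for a many-core server chip.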
So Intel gets an interview with a Conde-Nast writer for a sub-blog of Wired.com. I doubt too many purchasers or data center architects consult Cloudline@Wired.com. But all the same, I saw through many thinly veiled bits of handwaving and old saws from Intel saying, “Yes, this exists but we’re already addressing it with our existing product lines...” So I wrote a comment on this very article, especially regarding a throw-away line mentioning the ‘future’ of the data center and the direction the Data Center and Cloud Computing market was headed. However the moderator never published the comment. In effect, I raised the Question: Whither Tilera? And the Quanta SM-2 server based on the Tilera Chip?
Aren’t they exactly what is described by the author John Stokes as a network of cores on a chip? And given the scale of Tilera’s own product plans going into the future, and the fact they are not just concentrating on Network gear but actual Compute Clouds too, I’d say both Stokes and Walcyzk are asking the wrong questions and directing our attention in the wrong direction. This is not a PR battle but a flat out technology battle. You cannot win this with words and white papers; it requires benchmarks and deployments and Case Histories. Technical merit and superior technology will differentiate the players in the Cloud in a Box race. And this hasn’t been the case in the past as Intel has battled AMD in the desktop consumer market. In the data center, Fear, Uncertainty and Doubt is the only weapon Intel has.
And I’ll quote directly from John Stokes’s article here describing EXACTLY the kind of product that Tilera has been shipping already:
“Instead of Xeon with virtualization, I could easily see a many-core Atom or ARM cluster-on-a-chip emerging as the best way to tackle batch-oriented Big Data workloads. Until then, though, it’s clear that Intel isn’t going to roll over and let ARM just take over one of the hottest emerging markets for compute power.”
The key phrase here is cluster on a chip, in essence exactly what Tilera has striven to achieve with its TILE64 based architecture. To review from previous blog entries on this website following the announcements and timelines published by Tilera:
The ARM RISC processor is getting true 64-bit processing and memory addressing – removing the last practical barrier to seeing an army of ARM chips take a run at the desktops and servers that give Intel and AMD their moolah.
The downside to this announcement is the timeline ARM lays out for the first generation of chips to use the new Vers. 8 architecture. Due to limited demand, as ARM defines it, chips will not be shipping until 2013 or as late as 2014. However, according to this Register article, the existing IT Data center infrastructure will not adopt ANY ARM-based chips until they are designed as a 64-bit clean architecture. Sounds like the potential for a chicken and egg scenario, except ARM will get that Egg out the door on schedule with TSMC as its test chip partner. Some other details from the article: the top end Cortex-A15 chip just announced already addresses more than 32 bits of memory, through a workaround that allows enterprising programmers to address as many as 40 bits of memory if they need it. The best argument made for the real market need for 64-bit memory addressing is programmers currently on different chip architectures who might want to port their apps to ARM. THEY are the real target market for the Vers. 8 architecture, and will have a much easier time porting over to another chip architecture that has the same level of memory addressing capability (64 bits all around).
As for companies like Calxeda who are adopting the Cortex-A15 and the current Cortex-A8 chips (both of which fall under the previous gen. vers. 7 architecture), 32 bits of memory addressing (4Gbytes in total) is enough to get by, depending on the application being run. Highly parallel apps or simple things like single threaded webservers will perform well under these circumstances, according to The Register. And I am inclined to believe this based on the current practices of Data Center giants like Facebook and Google (virtualization is sacrificed for massively parallel architectures). Also, given the plans folks like Calxeda have for hardware interconnects, the ability of all those low power 32-bit chips to communicate with one another holds a lot of promise too. I’m still curious to see if Calxeda can come up with a unique product utilizing the 64-bit ARM vers. 8 architecture when the chip is finally taped out and test chips are shipped out by TSMC.
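The jump from 32 to 40 to 64 bits of addressing is easier to appreciate with the numbers written out. This is just the powers-of-two arithmetic, assuming byte-addressable memory; the helper name is my own, and the labels map onto the three cases discussed above.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits):
    """Bytes reachable with a flat, byte-addressed space of the given width."""
    return 2 ** address_bits


for label, bits in (
    ("32-bit (vers. 7 chips)", 32),
    ("40-bit workaround", 40),
    ("64-bit (vers. 8)", 64),
):
    gib = addressable_bytes(bits) // GIB
    print(f"{label}: {gib:,} GiB")
```

So the 40-bit workaround moves the ceiling from 4 gigabytes to a terabyte, which covers most servers. The case for a 64-bit clean architecture is less about today's DIMM counts and more about porting software that already assumes 64-bit pointers.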
The test chip will be fabbed at TSMC on its next-generation 20nm process, a full node reduction ~50% transistor scaling over its 28nm process. With the first 28nm ARM based products due out from TSMC in 2012, this 20nm tape-out announcement is an important milestone but we’re still around two years away from productization.
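The "~50% transistor scaling" figure in that quote falls straight out of the geometry, if you assume an idealized shrink where every feature scales with the node name. That is a simplification (real processes do not shrink every dimension equally), and the function name is my own, but it shows where the number comes from.

```python
def ideal_area_ratio(old_nm, new_nm):
    """Area of a feature after an idealized linear shrink, relative to before."""
    return (new_nm / old_nm) ** 2


node_ratio = ideal_area_ratio(28, 20)
print(f"28nm -> 20nm: each transistor takes about {node_ratio:.0%} of its old area")
print(f"ideal density gain: about {1 / node_ratio:.2f}x")
```

A 20nm feature is about 71% as wide as a 28nm one, and since area goes with the square, the same circuit fits in roughly half the silicon.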
Happy Halloween! And like most years there are some tricks up ARM’s sleeve announced this past week along with some partnerships that should make things trickier for the Engineers trying to equip ever more energy efficient and dense Data Centers the world over.
It’s been announced: the ARM Cortex-A15 (ARM15) is coming to market some time in the future, albeit a ways off yet. And it’s going to be using a really narrow design rule to ensure it’s as low power as it possibly can be. I know manufacturers of the massively parallel compute cloud in a box will be seeking out this chip as soon as samples can arrive. The 64bit version of ARM15 is the real potential jewel in the crown for Calxeda, who is attempting to balance low power and 64bit performance in the same design.
I can’t wait to see the first benchmarks of these chips, apart from the benchmarks from the first shipping product Calxeda can get out with the ARM15 x64. Also note that just this week Hewlett-Packard has signed on to sell designs by Calxeda in forthcoming servers targeted at Energy Efficient Data Center build-outs. So there’s more news to come regarding that partnership, and you can read it right here @ Carpetbomberz.com
Many-core processors are apparently the new black for 2011. Intel continues to work on both its single chip cloud computer and Knights Corner, Tilera made headlines earlier this year, and now a new company, Adapteva, has announced its own entry into the field.
A competitor to Tilera and Intel’s MIC has entered the field as a mobile co-processor. Given the volatile nature of chip architectures in the mobile market, this is going to be a hard sell for some device designers, I think. I say this because each new generation of mobile CPU gets more and more integrated features as each new die shrink allows more embedded functions. Graphics processors are now being embedded wholesale into every smartphone cpu. Other features like memory controllers and baseband processors will no doubt soon be added to the list as well. If Adapteva wants any traction at all in the mobile market they will need to further their development of the Epiphany into a synthesizable core that can be added to an existing cpu (most likely a design from ARM). Otherwise, sticking with being a separate auxiliary chip is going to hamper and severely limit the potential applications of their product.
Witness the integration of the graphics processing unit. Not long ago it was a way to differentiate a phone, but it had to be integrated into the motherboard design along with all of its power requirements. Within a very short time after GPUs were added to cell phones, they were integrated into the CPU chip sandwich to help keep manufacturing and power budgets in check. If the Epiphany had been introduced around the golden age of discrete chips on cell phone motherboards, it would make a lot more sense. But now you need to be embedded, integrated and 100% ARM compatible with a fully baked developer toolkit. Otherwise, it’s all uphill from the product introduction forward. If there’s an application for the Epiphany co-processor, I hope they concentrate more on the tools to fully use the device and develop a niche right out of the gate, rather than attempt to get some big name but small scale wins on individual devices from the Android market. Those seem like the most likely candidates for shipping product right now.
Harkening back to when he joined ARM, Segars said: “2G, back in the early 90s, was a hard problem. It was solved with a general-purpose processor, DSP, and a bit of control logic, but essentially it was a programmable thing. It was hard then – but by today’s standards that was a complete walk in the park.”
He wasn’t merely indulging in “Hey you kids, get off my lawn!” old-guy nostalgia. He had a point to make about increasing silicon complexity – and he had figures to back it up: “A 4G modem,” he said, “which is going to deliver about 100X the bandwidth … is going to be about 500 times more complex than a 2G solution.”
A very interesting look at the state of the art in microprocessor manufacturing: The Register talks with one of the principals at ARM, the folks who license their processor designs to almost every cell phone manufacturer worldwide. Looking at the trends in manufacturing, Simon Segars is predicting that sustained performance gains will be more difficult to achieve in the near future. Most advancement, he feels, will be had by integrating more kinds of processing and coordinating the I/O between those processors on the same processor die. Which is kind of what Intel is attempting to do by integrating graphics cores, memory controllers and CPU all on one slice of silicon. But the software integration is the trickiest part, and Intel still sees fit to just add more general purpose CPU cores to continue making new sales. Processor clocks stay pretty rigidly near the 3GHz boundary and have not shifted significantly since the end of the Pentium 4 era.
Note too the difficulty of scaling up manufacturing as well as designing the next gen chips. Referring back to my article from Dec. 21, 2010, 450mm wafers (commentary on an Electronista article): Intel is among the very few companies rich enough to scale up to the next size of wafer. Every step in the manufacturing process has become so specialized that the motivation to create new devices for manufacture and test just isn’t there, because the total number of manufacturers who can scale up to the next largest size of silicon wafer is probably 4 companies worldwide. That’s a measure of how exorbitantly expensive large scale chip manufacturing has become. It seems more and more that a plateau is being reached in terms of clock speeds and the size of wafers finished in manufacturing. With these limits, Simon Segars’s thesis becomes even stronger.
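For a sense of why anyone would pay for the move to 450mm wafers at all, the area arithmetic is worth a quick sketch. The 450mm figure comes from the earlier article mentioned above; the 300mm figure for the current wafer size is my own addition here, as is the function name. The calculation ignores edge loss and die shape, so it is only a rough upper bound on the gain.

```python
import math


def wafer_area_mm2(diameter_mm):
    """Surface area of a circular wafer, ignoring the flat or notch and edge exclusion."""
    return math.pi * (diameter_mm / 2) ** 2


CURRENT_MM = 300  # assumed current wafer size, not stated in the post
NEXT_MM = 450     # next wafer size discussed in the post

wafer_ratio = wafer_area_mm2(NEXT_MM) / wafer_area_mm2(CURRENT_MM)
print(f"{NEXT_MM}mm vs {CURRENT_MM}mm wafer: about {wafer_ratio:.2f}x the area per wafer")
```

A bit more than twice the silicon per wafer pass is a big prize, but only for a company that ships enough chips to keep the new and far more expensive equipment busy, which is the point about how few manufacturers can play at this level.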