AnandTech | Apple’s Cyclone Microarchitecture Detailed

Image representing Apple as depicted in CrunchBase

So for now, Cyclone’s performance is really used to exploit race to sleep and get the device into a low power state as quickly as possible.

via AnandTech | Apple’s Cyclone Microarchitecture Detailed.

Race to sleep, is the new, new thing for mobile cpus. Power conservation at a given clock speed is all done through parceling out a task and with more cores or higher clock speed. All cores execute and comple the task then cores are put to sleep or a much lower power state. That’s how you get things done and maintain a 10 hour battery life for an iPad Air or iPhone 5s.

So even though a mobile processor could be the equal of the average desktop cpu, it’s the race to sleep state that is the big differentiation now. That is what Apple’s adopting of a 64bit ARM vers. 8 architecture is bringing to market, the race to sleep. At the very beginning of the hints and rumors 64bit seemed more like an attempt to address more DRAM or gain some desktop level performance capability. But it’s all for the sake of executing quick and going into sleep mode to preserve the battery capacity.

I’m thinking now of some past articles covering the nascent, emerging market for lower power, massively parallel data center servers. 64bits was an absolute necessary first step to get ARM cpus into blades and rack servers destined for low power data centers. Memory addressing is considered a non-negotiable feature that even the most power efficient server must have. Didn’t matter what CPU it is designed around, memory address HAS got to be 64bits or it cannot be considered. That rule still applies today and will be the sticking point still for folks sitting back and ignoring the Tilera architecture or SeaMicro’s interesting cloud in a box designs. To date, it seems like Apple was first to market with a 64bit ARM design, without ARM actually supplying the base circuit design and layouts for the new generation of 64bit ARM. Apple instead did the heavy lifting and engineering themselves to get the 64bit memory addressing it needed to continue its drive to better battery life. Time will tell if this will herald other efficiency or performance improvements in raw compute power.

Enhanced by Zemanta

Intel looks to build ultra-efficient mobile chips Apple cant ignore

English: Paul Otellini, CEO of Intel
Paul Otellini, CEO of Intel (Photo credit: Wikipedia)

During Intels annual investor day on Thursday, CEO Paul Otellini outlined the companys plan to leverage its multi-billion-dollar chip fabrication plants, thousands of developers and industry sway to catch up in the lucrative mobile device sector, reports Forbes.

via Intel looks to build ultra-efficient mobile chips Apple cant ignore (Apple Insider)

But what you are seeing is a form of Fear, Uncertainty and Doubt (FUD) being spread about to sow the seeds of mobile Intel processors sales. The doubt is not as obvious as questioning the performance of ARM chips, or the ability of manufacturers like Samsung to meet their volume targets and reject rates for each new mobile chip. No it’s more subtle than that and only noticeable to people who know details like what design rule Intel is currently using versus that which is used by Samsung or TSMC (Taiwan Semiconductor Manufacturing Corp.) Intel is currently just releasing its next gen 22nm chips as companies like Samsung are still trying to recoup their investment in 45nm and 32nm production lines. Apple is just now beginning to sample some 32nm chips from Samsung in iPad 2 and Apple TV products. It’s current flagship model iPad/iPhone both use a 45nm chip produced by Samsung. Intel is trying to say that the old generation technology while good doesn’t have the weight and just massive investment in the next generation chip technology. The new chips will be smaller, energy efficient, less expensive all the things need to make higher profit on consumer devices using them. However, Intel doesn’t do ARM chips, it has Atom and that is the one thing that has hampered any big design wins in cellphone or tablet designs to date. At any narrow size of the design rule, ARM chips almost always use less power than a comparably sized Atom chip from Intel. So whether it’s really an attempt to spread FUD, can easily be debated one way or another. But the message is clear, Intel is trying to fight back against ARM. Why? Let’s turn back the clock to March of this year in a previous article also appearing in Apple Insider:

Apple could be top mobile processor maker by end of 2012 (Apple Insider, March 20, 2012)

This article is referenced in the original article quoted at the top of the page. And it points out why Intel is trying to get Apple to take notice of its own mobile chip commitments. Apple designs its own chips and has the manufacturing contracted out to a foundry. To date Samsung has been the sole source of the A-processors used in iPhones/iPod/iPad devices as Apple is trying to get TSMC up to speed to get a second source. Meanwhile sales of the Apple devices continues to grow handsomely in spite of these supply limits. More important to Intel is the blistering growth in spite of being on older foundry technology and design rules. Intel has a technological and investment advantage over Samsung now. They do not have a chip however that is BETTER than Apple’s in house designed ARM chip. That’s why the underlying message for Intel is that it has to make it’s Atom chip so much better than an A4, A5, A5X at ANY design ruling that Apple cannot ignore Intel’s superior design and manufacturing capability. Apple will still use Intel chips, but not in its flagship products until Intel achieves that much greater level of technical capability and sophistication in its Mobile microprocessors.

Twin-track development plan for Intel’s expansion into smartphones (The Register, May 11, 2012)

Intel is planning a two-pronged attack on the smartphone and tablet markets, with dual Atom lines going down to 14 nanometers and Android providing the special sauce to spur sales. 

Lastly, Ian Thomson from The Register weighs in looking at what the underlying message from Intel really is. It’s all about the future of microprocessors for the consumer market. However the emphasis in this article is that Android OS devices whether they be phones or tablets or netbooks will be the way to compete AGAINST Apple. But again it’s not Apple as such it’s the microprocessor Apple is using in it’s best selling devices that scares Intel the most. Intel has since its inception been geared towards the ‘mainstream’ market selling into Enterprises and the Consumer area for years. It has milked the desktop PC revolution as it helped create it more or less starting with its forays into integrated micro-processor chips and chipsets. It reminds me a little of the old steel plants that existed in the U.S. during the 1970s as Japan was building NEW steel plants that used a much more energy efficient design, and a steel making technology that created  a higher quality product. So less expensive higher quality steel was only possible by creating brand new steel plants. But the old line U.S. plants couldn’t justify the expense and so just wrapped up and shutdown operations all over the place. Intel while it is able to make that type of investment in newer technology is still not able to create the energy saving mobile processor that will out perform an ARM core cpu.

ARM creators Sophie Wilson and Steve Furber • reghardware

BBC Micro
BBC Micro (Photo credit: Wikipedia)

Unsung Heroes of Tech Back in the late 1970s you wouldnt have guessed that this shy young Cambridge maths student named Wilson would be the seed for what has now become the hottest-selling microprocessor in the world.

via Chris Bidmead: ARM creators Sophie Wilson and Steve Furber • reghardware.

This is an amazing story of how a small computer company in Britain was able to jump into the chip design business and accidentally create a new paradigm in low power chips. Astounding what seemingly small groups can come with as complete product categories unto themselves. The BBC Micro was the single most important project that kept the company going and was produced as a learning aid for the BBC television show: The_Computer_Programme, a part of the BBC Computer Literacy Project. From that humble beginning of making the BBC Micro, Furber and Wilson’s ability to engineer a complete computer was well demonstrated.

But whereas the BBC Micro used an off the shelf MOS 6502 cpu, a later computer used a custom (bespoke) designed chip created in house by Wilson and Furber. This is the vaunted Acorn Risc Machine (ARM) used in the Archimedes desktop computer. And that one chip helped launch a revolution unto itself in that the very first time the powered up a sample chip, the multimeter hooked up to registered no power draw. At first one would think this was a flaw, and ask “What the heck is happening here?” But in fact when further inspection showed that the multimeter was correct, the engineers discovered that the whole cpu was running of power that was leaking from the logic circuits within the chip itself. Yes, the low power requirement of this first sample chip of the ARM cpu in 1985 ran on 1/10 of a watt of electricity. And that ‘bug’ then went on to become a feature in later generations of the ARM architecture.

Today we know of the ARM cpu cores as a bit of licensed Intellectual Property that any chip make can acquire and implement in their mobile processor designs. It has come to dominate many different architectures by different manufacturers as diverse as Qualcomm and Apple Inc. But none of it ever would have happened were it not for that somewhat surprising discovery of how power efficient that first sample chip really was when it was plugged into a development board. So thankyou Sophie Wilson and Steve Furber, as the designers and engineers today are able to stand upon your shoulders the way you once stood on the shoulders of people who designed the MOS 6502.

MOS 6502 microprocessor in a dual in-line pack...
MOS 6502 microprocessor in a dual in-line package, an extremely popular 8-bit design (Photo credit: Wikipedia)

ARM Pitches Tri-gate Transistors for 20nm and Beyond

English: I am the author of this image.
Image via Wikipedia

. . . 20 nm may represent an inflection point in which it will be necessary to transition from a metal-oxide semiconductor field-effect transistor MOSFET to Fin-Shaped Field Effect Transistors FinFET or 3D transistors, which Intel refers to as tri-gate designs that are set to debut with the companys 22 nm Ivy Bridge product generation.

via ARM Pitches Tri-gate Transistors for 20nm and Beyond.

Three Dimensional transistors in the news again. Previously Intel announced they were adopting a new design for their next generation next smaller design rule for the Ivy Bridge generation Intel CPUs. Now ARM is also doing work to integrate similar technology into their ARM cpu cores as well. No doubt in order to lower Thermal Design Point and maintain clock speed as well are both driving this move to refine and narrow the design rules for the ARM architecture. Knowing Intel is still the top research and development outfit for silicon semi-conductors would give pause to anyone directly competing with them, but ARM is king of the low power semi-conductor and keeping pace with Intel’s design rules is an absolute necessity.

I don’t know how quickly ARM is going to be able to get a licensee to jump onboard and adopt the new design. Hopefully a large operation like Samsung can take this on and get the chip into it’s design, development, production lines at a chip fabrication facility as soon as possible. Likewise other contract manufacturers like Taiwan Semiconductor Manufacturing Company (TSMC) should also try to get this chip into their facilities quickly too. That way the cell-phone and tablet markets can benefit too as they use a lot of ARM licensed cpu cores and similar intellectual property in their shipping products. And my interest is not so much invested in the competition between Intel and ARM for low power computing but more the overall performance of any single ARM design once it’s been in production for a while and optimized the way Apple designs its custom CPUs using ARM licensed cpu cores. The single most outstanding achievement of Apple in their design and production of the iPad is the battery charge duration of 10 hours. Which to date, is an achievement that has not been beaten, even by other manufacturers and products who also license ARM intellectual property. So if  the ARM design is good and can be validated and proto-typed with useful yields quickly, Apple will no doubt be the first to benefit, and by way of Apple so will the consumer (hopefully).

Schematic view (L) and SEM view (R) of Intel t...
Image via Wikipedia

AnandTech – Applied Micros X-Gene: The First ARMv8 SoC

While everyone was focusing on the ARM v.7 (current generatation) chip architecture, APM has been focused on the 64bit ARM v.8 which promises to herald a new era of low power with Server level features. By estimates made in the article APM has a 1 year lead on any other current licensees of ARM designs. And with any luck they will be sampling products that meet their performance targets in March of 2012 (albeit in FPGA eval form) Read On:

APM expects that even with a late 2012 launch it will have a 1 – 2 year lead on the competition. If it can get the X-Gene out on time, hitting power and clock targets both very difficult goals, the headstart will be tangible. Note that by the end of 2012 well only just begin to see the first Cortex A15 implementations. ARMv8 based competitors will likey be a full year out, at least. 

via AnandTech – Applied Micros X-Gene: The First ARMv8 SoC.

Chip Diagram for the ARM version 8 as implemented by APM

It’s nice to get a confirmation of the production time lines for the Cortex A15 and the next generation ARM version 8 architecture. So don’t expect to see shipping chips, much less finished product using those chips well into 2013 or even later. As for the 4 core ARM A15, finished product will not appear until well into 2012. This means if Intel is able to scramble, they have time to further refine their Atom chips to reach the power level and Thermal Design Point (TDP) for the competing ARM version 8 architecture. What seems to be the goal is to jam in more cores per CPU socket than is currently done on the Intel architecture (up to almost 32 in on of the graphics presented with the article).

The target we are talking about is 2W per core @ 3Ghz, and it is going to be a hard, hard target to hit for any chip designer or manufacturer. One can only hope that TMSC can help APM get a finished chip out the door on it’s finest ruling chip production lines (although an update to the article indicates it will ship on 40nm to get it out the door quicker). The finer the ruling of signal lines on the chip the lower the TDP, and the higher they can run the clock rate. If ARM version 8 can accomplish their goal of 2W per cpu core @ 3 Gigahertz, I think everyone will be astounded. And if this same chip can be sampled at the earliest prototypes stages by a current ARM Server manufacturer say, like Calxeda or even SeaMicro then hopefully we can get benchmarks to show what kind of performance can be expected from the ARM v.8  architecture and instruction set. These will be interesting times.

Intel Atom CPU Z520, 1,333GHz
Image via Wikipedia

Intel Responds to Calxeda/HP ARM Server News ( isn’t the best at following the Cloud Data Industry. In fact at least they partially want to keep their advertisers happy so they will publish a Fear, Uncertainty and Doubt raising response direct from an Intel PR Engineer. Happily the Intel folks aren’t even fully aware of what people are doing with their SeaMicro and Quanta SQ-2 boxes and continue to beat the drum on Virtualized Servers on Multi-core, high-clocked chips. That’s the old school thinking on what a Compute Cloud can be. The New School says put the cloud in a single box, let the clock run slower and use less power and everybody wins. Read On:

Now, you’re probably thinking, isn’t Xeon the exact opposite of the kind of extreme low-power computing envisioned by HP with Project Moonshot? Surely this is just crazy talk from Intel? Maybe, but Walcyzk raised some valid points that are worth airing.via Cloudline | Blog | Intel Responds to Calxeda/HP ARM Server News: Xeon Still Wins for Big Data.

Structure of the TILE64 Processor from Tilera
Image via Wikipedia: Tile64 mesh network processor from Tilera
Image representing Tilera as depicted in Crunc...
Image via CrunchBase

So Intel gets an interview with a Conde-Nast writer for a sub-blog of I doubt too many purchasers or data center architects consult But all the same, I saw through many thinly veiled bits of handwaving and old saws from Intel saying, “Yes, this exists but we’re already addressing it with our exiting product lines,. . .” So, I wrote in a comment to this very article. Especially regarding a throw-away line mentioning the ‘future’ of the data center and the direction the Data Center and Cloud Computing market was headed. However the moderator never published the comment. In effect, I raised the Question: Whither Tilera? And the Quanta SM-2 server based on the Tilera Chip?

Aren’t they exactly what is described by the author John Stokes as a network of cores on a chip? And given the scale of Tilera’s own product plans going into the future and the fact they are not just concentrating on Network gear but actual Compute Clouds too, I’d say both Stokes and Walcyzk are asking the wrong questions and directing our attention in the wrong direction. This is not a PR battle but a flat out technology battle. You cannot win this with words and white papers but in fact it requires benchmarks and deployments and Case Histories. Technical merit and superior technology will differentiate the players in the  Cloud in a Box race. And this hasn’t been the case in the past as Intel has battled AMD in the desktop consumer market. In the data center Intel Fear Uncertainty and Doubt is the only weapon they have.

And I’ll quote directly from John Stokes’s article here describing EXACTLY the kind of product that Tilera has been shipping already:

“Instead of Xeon with virtualization, I could easily see a many-core Atom or ARM cluster-on-a-chip emerging as the best way to tackle batch-oriented Big Data workloads. Until then, though, it’s clear that Intel isn’t going to roll over and let ARM just take over one of the hottest emerging markets for compute power.”

The key phrase here is cluster on a chip, in essence exactly what Tilera has strived to achieve with its Tilera64 based architecture. To review from previous blog entries of this website following the announcements and timelines published by Tilera:

ARM specs out first 64-bit RISC chips • The Register

With the announcements recently from Calxeda and HP now ARM is jumping back into the limelight. They are working on the next version of the ARM architecture which will feature a 64bit clean design (which some speculated was going to arrive with ARM-15). Although the current version of the ARM architecture and support 40bit data paths, anything wider than that has to be worked around through OS level support of the underlying workaround. In a word, it’s a non-trivial task. But things will get much easier with the next Rev. All hail Version 8

Image by krunkwerke via Flickr

The ARM RISC processor is getting true 64-bit processing and memory addressing – removing the last practical barrier to seeing an army of ARM chips take a run at the desktops and servers that give Intel and AMD their moolah.

via ARM specs out first 64-bit RISC chips • The Register.

The downside to this announcement is the timeline ARM lays out for the first generation chips to use the new Vers. 8 architecture. Due to limited demand, as ARM defines it, chips will not be shipping until 2013 or as late as 2014. However according to this Register article the existing IT Data center infrastructure will not adopt ANY ARM-based chips until they are designed as a 64-bit clean architecture. Sounds like a potential for a chicken and egg scenario except ARM will get that Egg out the door on schedule with TMSC as it’s test chip partner. Some other details that come from the article include that the top end ARM-15 chip just announced already addresses more than 32-bits of Memory through a workaround that allows enterprising programmers to address as many as 40bits of memory if they need it. The best argument made for the real market need of 64-bit Memory addressing is for programmers currently on different chip architectures who might want to port their apps to ARM. THEY are are the real target market for the Vers. 8 architecture, and will have a much easier time porting over to another chip architecture that has the same level of memory addressing capability (64-bits all around).

As for companies like Calxeda who are adopting the ARM-15 architecture and the current ARM-8 Cortex chips (both of which fall under the previous gen. vers. 7 architecture), 32-bits of memory (4Gbytes in total) is enough to get by depending on the application being run. Highly parallel apps or simple things like single threaded webservers will perform well under these circumstances, according to The Register. And I am inclined to believe this based on current practices of Data Center giants like Facebook and Google (virtualization is sacrificed for massively parallel architectures). Also given the plans folks like Calxeda have for hardware interconnects, the ability off all those low power 32-bit chips all communicating with one another holds a lot of promise too.  I’m still curious to see if Calxeda can come up with a unique product utilizing the 64-bit ARM vers. 8 architecture when the chip finally is taped out and test chips are shipped out my TMSC.