Clearly, ARM and Tilera are a potential threat to Intel’s server business. But it should be noted that even Google has called for caution when it comes to massively multicore systems. In a paper published in IEEE Micro last year, Google senior vice president of operations Urs Hölzle said that chips that spread workloads across more energy-efficient but slower cores may not be preferable to processors with faster but power-hungry cores.
“So why doesn’t everyone want wimpy-core systems?” Hölzle writes. “Because in many corners of the real world, they’re prohibited by law – Amdahl’s law.
The explanation given here by Google’s top systems person is that latency versus parallel processes overhead. Which means if you have to do all the steps in order, with a very low level of parallel tasks that results in much higher performance. And that is the measure that all the users of your service will judge you by. Making things massively parallel might provide the same level of response, but at a lower energy cost. However the complications due to communication and processing overhead to assemble all the data and send it over the wire will offset any advantage in power efficiency. In other words, everything takes longer and latency increases, and the users will deem your service to be slow and unresponsive. That’s the dilemna of Amdahl’s Law, the point of diminishing returns when adopting parallel computer architectures.
Now compare this to something say we know much more concretely, like the Airline Industry. As the cost of tickets came down, the attempt to cut costs went up. Schedules for landings and gate assignments got more complicated and service levels have suffered terribly. No one is really all that happy about the service they get, even from the best airline currently operating. So maybe Amdahl’s Law doesn’t apply when there’s a false ceiling placed on what is acceptable in terms of the latency of a ‘system’. If airlines are not on time, but you still make your connection 99% of the time, who will complain? So by way of comparison there is a middle ground that may be achieved where more parallelizing of compute tasks will lower the energy required by a data center. It will require greater latency, and a worse experience for the users. But if everyone suffers equally from this and the service is not great but adequate, then the company will be able to cut costs through implementing more parallel processors in their data centers.
I think Tilera holds a special attraction potentially for Facebook. Especially since Quanta their hardware assembler of choice is already putting together computers with the Tilera chip for customers now. It seems like this chain of associations might prove a way for Facebook to test the waters on a scale large enough to figure out the cost/benefits of massively parallel cpus in the data center. Maybe it will take another build out of a new data center to get there, but it will happen no doubt eventually.