Why Microsoft is building programmable chips that specialize in search — Tech News and Analysis

English: Altera Stratix IV EP4SGX230 FPGA on a PCB
English: Altera Stratix IV EP4SGX230 FPGA on a PCB (Photo credit: Wikipedia)

SUMMARY: Microsoft has been experimenting with its own custom chip effort in order to make its data centers more efficient, and these chips aren’t centered around ARM-based cores, but rather FPGAs from Altera.

via Why Microsoft is building programmable chips that specialize in search — Tech News and Analysis.

FPGAs for the win, at least for eliminating unnecessary Xeon CPUs for doing online analytic processing for the Bing Search service. MS are saying they can process the same amount of data with half the number of CPUs by offloading some of the heavy lifting from general purpose CPUs to specially programmed FPGAs tune to the MS algorithms to deliver up the best search results. For MS the cost of the data center will out, and if you can drop half of the Xeons in a data center you just cut your per transaction costs by half. That is quite an accomplishment these days of radical incrementalism when it comes to Data Center ops and DevOps. The Field Programmable Gate Array is known as a niche, discipline specific kind of hardware solution. But when flashed, and programmed properly and re-configured as workloads and needs change it can do some magical heavy lifting from a computing standpoint.

Specifically I’m thinking really repetitive loops or recursive algorithms that take forever to unwind and deliver a final result are things best done in hardware versus software. For Search Engines that might be the process used to determine the authority of a page in the rankings (like Google’s PageRank). And knowing you can further tune the hardware to fit the algorithm means you’ll spend less time attempting to do heavy lifting on the General CPU using really fast C/C++ code instead. In Microsoft’s plan that means less CPUs need to do the same amount of work. And better yet, if you determine a better algorithm for your daily batch processes, you can spin up a new hardware/circuit diagram and apply that to the compute cluster over time (and not have to do a pull and replace of large sections of the cluster). It will be interesting to see if Microsoft reports out any efficiencies in a final report, as of now this seems somewhat theoretical though it may have been tested at least in a production test bed of some sort using real data.

Advertisement

Posted

in

, ,

by

Tags:

%d bloggers like this: