Previous generations of multi-core, massively parallel, ARM based servers were one off manufacturers with their own toolsets and Linux distros. HP’s attempt to really market to this segment of the market will hopefully be substantial enough to get an Ubuntu distro that has enough Libraries and packages to make it function right out of the box. In the article it says companies are using the Proliant ARM-based system as a memcached server. I would speculate that if that’s what people want, the easier you can make that happen from an OS and app server standpoint the better. There’s a reason folks like to buy Synology and BuffaloTech NAS products and that’s the ease with which you spin them up and get a lot of storage attached in a short amount of time. If Proliant can do that for people needing quicker and more predictable page loads on their web apps, then optimize for memcached performance and make it easy to configure and put into production.
Now what, you may ask, is memcached? If you’re running a web server or a web application that requires a lot of speed so that purchases or other transactions complete and show some visual cue that it was successful, the easiest way to do that is through cacheing. The web page contents are kept in a high speed storage location separate from the actual webpage and when required will redirect, or point to the stuff that sits over in that high speed location. By swapping the high speed stored stuff for the slower stuff, you get a really good experience with the web page refreshing automagically showing your purchases in a shopping cart, or that your tax refund is on it’s way. The web site world is built on caching so we don’t see spinning watches or other indications that processing is going on in the background.
To date, this type of caching has seen different software packages do this for first Apache web servers, but now in the world of Social Media, it’s doing it for any type of web server. Whether it’s Amazon, Google or Facebook, memcached or a similar cacheing server is sending you that actual webpage as you click, submit and wait for the page to refresh. And if a data center owner like Amazon, Google and Facebook can lower the cost for each of it’s memcached servers, they can lower their operating costs for each of these cached web pages and keep everyone happy with the speed of their respective websites. Whether or not ARM-based servers see a wider application is dependent on the apps being written specifically for that chip architecture. But at least now people can point to memcached and web page acceleration as a big first win that might see wider adoption longer term.
Since last year, Apple’s been hard at work building out their own CDN and now those efforts are paying off. Recently, Apple’s CDN has gone live in the U.S. and Europe and the company is now delivering some of their own content, directly to consumers. In addition, Apple has interconnect deals in place with multiple ISPs, including Comcast and others, and has paid to get direct access to their networks.
Given some of my experiences attempting to watch the Live Stream from Apple’s combined iPhone, Watch event, I wanted to address CDN. Content Distribution Networks are designed to speed the flow of many types of files from Data Centers or Video head ends for Live Events. So I note, I started this article back on August 1st when this original announcement went out. And now it’s doubly poignant as the video stream difficulties at the start of the show (1PM EDT) kind of ruined it for me and for a few others. They lost me in that scant few first 10 minutes and they never recovered. I did connect later but that was after the Apple Watch presentation was half done. Oh well, you get what you pay for. I paid nothing for the Live Event stream from Apple and got nothing in return.
Back during the Steve Jobs era, one of the biggest supporters of Akamais and its content delivery network was Apple Inc. And this was not just for streaming of the Keynote Speeches and MacWorld (before they withdrew from that event) but also the World Developers Conference (WWDC). At the time enjoyed great access to free streams and great performance levels for free. But Apple cut way back on that simulcasts and rivals like Eventbrite began to eat in to Akamai’s lower end. Since then the huge data center providers began to build out their own data centers worldwide. And in so doing, a kind of internal monopoly of content distribution went into effect. Google was first to really scale up in a massive way then scale out, to make sure all those GMail accounts ran faster and better in spite of the huge mail spools on each account member. Eventually the second wave of social media outlets joined in (with Facebook leading a revolution in Open Stack and Open Hardware specs) and created their own version of content delivery as well.
Now Apple has attempted to scale up and scale out to keep people tightly bound to brand. iCloud really is a thing, but more than that now the real heavy lifting is going on once and for all time. Peering arrangements (anathema to the open Internet) would be signed and deals made to scratch each other’s backs by sharing the load/burden of carrying not just your own internal traffic, but those of others too. And depending on the ISP you could really get gouged by those negotiations. But no matter Apple soldiered on and now they’re ready to really let all the prep work be put to good use. Hopefully the marketing will be sufficient to express the satisfaction and end user experience at all levels, iTunes, iApps, iCloud data storage and everything else would experience the boosts in speed. If Apple can hold its own against both Facebook and Gmail in this regard, they future’s so bright they’re gonna need shades.
Today many different interconnection topologies are used for multicore chips. For as few as eight cores direct bus connections can be made — cores taking turns using the same bus. MIT’s 36-core processors, on the other hand, are connected by an on-chip mesh network reminiscent of Intel’s 2007 Teraflop Research Chip — code-named Polaris — where direct connections were made to adjacent cores, with data intended for remote cores passed from core-to-core until reaching its destination. For its 50-core Xeon Phi, however, Intel settled instead on using multiple high-speed rings for data, address, and acknowledgement instead of a mesh.
I commented some time back on a similar article on the same topic. It appears now the MIT research group has working silicon of the design. As mentioned in the pull-quote, the Xeon Phi (which has made some news in the Top 500 SuperComputer stories recently) is a massively multicore architecture but uses a different interconnect that Intel designed on their own. These stories as they appear get filed into the category of massively multicore or low power CPU developments. Most times the same CPUs add cores without significantly drawing more power and thus provide a net increase in compute ability. Tilera, Calxeda and yes even SeaMicro were all working along towards those ends. Either through mergers, or cutting of funding each one has seemed to trail off and not succeed at its original goal (massively multicore, low power designs). Also along the way Intel has done everything it can to dull and dent the novelty of the new designs by revising an Atom based or Celeron based CPU to provide much lower power at the scale of maybe 2 cores per CPU.
Like this chip MIT announced Tilera too was originally an MIT research product spun off of the University campus. Its principals were the PI and a research associate if I remember correctly. Now that MIT has the working silicon they’re going to benchmark and test and verify their design. The researchers will release the verilog hardware description of chip for anyone use, research or verify for themselves once they’ve completed their own study. It will be interesting to see how much of an incremental improvement this design provides, and possibly could be the launch of another Tilera style product out of MIT.
SUMMARY: Microsoft has been experimenting with its own custom chip effort in order to make its data centers more efficient, and these chips aren’t centered around ARM-based cores, but rather FPGAs from Altera.
FPGAs for the win, at least for eliminating unnecessary Xeon CPUs for doing online analytic processing for the Bing Search service. MS are saying they can process the same amount of data with half the number of CPUs by offloading some of the heavy lifting from general purpose CPUs to specially programmed FPGAs tune to the MS algorithms to deliver up the best search results. For MS the cost of the data center will out, and if you can drop half of the Xeons in a data center you just cut your per transaction costs by half. That is quite an accomplishment these days of radical incrementalism when it comes to Data Center ops and DevOps. The Field Programmable Gate Array is known as a niche, discipline specific kind of hardware solution. But when flashed, and programmed properly and re-configured as workloads and needs change it can do some magical heavy lifting from a computing standpoint.
Specifically I’m thinking really repetitive loops or recursive algorithms that take forever to unwind and deliver a final result are things best done in hardware versus software. For Search Engines that might be the process used to determine the authority of a page in the rankings (like Google’s PageRank). And knowing you can further tune the hardware to fit the algorithm means you’ll spend less time attempting to do heavy lifting on the General CPU using really fast C/C++ code instead. In Microsoft’s plan that means less CPUs need to do the same amount of work. And better yet, if you determine a better algorithm for your daily batch processes, you can spin up a new hardware/circuit diagram and apply that to the compute cluster over time (and not have to do a pull and replace of large sections of the cluster). It will be interesting to see if Microsoft reports out any efficiencies in a final report, as of now this seems somewhat theoretical though it may have been tested at least in a production test bed of some sort using real data.
Another entry into the massively multi-core low power server race. Since the fading of other competitors like Calxeda, SeaMicro there hasn’t been a lot of announcements or shipping products that promised to be the low-power vendor of choice. Each time an inventor or entrepreneur stepped up with a lower power or more core device, Intel would kind of blunt the advantage by doing a benchmark and claiming shutting cores off saves more power than using an inherently low power design. The race today as designed by Intel is race to sleep and that’s the benchmark by which they are measuring their own progress in the low power massively multi-core cpu market. However now Cavium is stepping up with an ARM based cpu with 48 cores. So let’s find out what we can about this new chip from this EE Times article.
It appears the manufacturing partner for this new product is Gigabyte who are creating a 2-socket motherboard for the 48-core ARM based CPU. The 48-core cpu is ARMv.8 based and addresses 64bits, so large amounts of RAM can be used with this architecture (a failing of past products from previous manufacturers attempting ARM based servers). Cavium has network processors in the market already using MIPS based CPUs and this new architecture using ARM based chips tries to leverage a lot of their expertise in the network processor market. Architecturally the motherboard interfaces and protocols are still in place, with only a cpu swap being the most noticeable difference. To Cavium is primarily known as a network processor manufacturer, but this move could push them into large scale data cloud type applications, with a tight binding to network operations supplied by their existing network processor products. Dates are still a little hazy, with the end of the calendar year being the most likely time a product has been developed, tested, manufactured and shipped.
I’m so happy to see the pressure being kept up in this one niche of computing. I still think ARM-based CPUs with massive amounts of cores being a new growth area. Similarly the move to 64bits takes away one of the last impediments most buyers pointed out when folks like Calxeda tried to market their wares into the data centers. Bit by bit, each attempt by each startup and each design outfit gets a little closer to a competitive product that might yet go up against the mighty Intel Xeon multi-core cpu.
It’s not unprecedented: Google already offers a testing suite for Android apps, though that’s focused on making sure they run well on smartphones and tablets, not testing the cloud-based services they connect to. If Google added testing services for the websites and services those apps connect to, it would have an end-to-end lock on developing for both the Web and mobile.
Load testing websites and web-apps is a market whose time has come. I know where I work we have Project group who has a guy who manages an installation of Silk as a load tester. Behind that is a little farm of old Latitude E6400s that he manages from the Silk console to point at whichever app is in development/QA/testing before it goes into production. Knowing there’s potential for a cloud-based tool for this makes me very, very interested.
As outsourcing goes, the Software as a Service (SaaS) or Platform as a Service (PaaS) or even Infrastructure as a Service (IaaS) categories are great as raw materials. But if there was just an app that I could login to, spin up some VMs install my load-test tool of choice and then manage them from my desktop, I would feel like I had accomplished something. Or failing that even just a toolkit for load testing with whatever tool du jour is already available (nothing is perfect that way) would be cool too. And better yet, if I could do that with an updated tool whenever I needed to conduct a round of testing, the tool would take into account things like the Heart Bleed bug in a timely fashion. That’s the kind of benefit a cloud-based, centrally managed, centrally updated Load Test service could provide.
And now as Microsoft has just announced a partnership with Salesforce on their Azure cloud platform, things get even more interesting. Not only could you develop using an existing toolkit like Salesforce.com, but host it on more than one cloud platform (AWS or Azure) as your needs change. And I would hope this would include unit test, load test and the whole sweet suite of security auditing one would expect for a webapp (thereby helping prevent vulnerabilities like HeartBleed OpenSSL).
The Center IT outfit I work for is dumping as much on premise Exchange Mailbox hosting as it can. However we are sticking with Outlook365 as provisioned by Microsoft (essentially an Outlook’d version of Hotmail). It has the calendar and global address list we all have come to rely on. But as this article goes into great detail on the rest of the Office Suite, people aren’t creating as many documents as they once did. We’re viewing them yes, but we just aren’t creating them.
I wonder how much of this is due in part to re-use or the assignment of duties to much higher top level people to become the authors. Your average admin assistant or even secretary doesn’t draft anything dictated to them anymore. The top level types now generally would be embarrassed to dictate something out to anyone. Plus the culture of secrecy necessitates more 1-to-1 style communications. And long form writing? Who does that anymore? No one writes letters, they write brief email or even briefer text, Tweets or Facebook updates. Everything is abbreviated to such a degree you don’t need thesaurus, pagination, or any of the super specialized doo-dads and add-ons we all begged M$ and Novell to add to their première word processors back in the day.
From an evolutionary standpoint, we could get by with the original text editors first made available on timesharing systems. I’m thinking of utilities like line editors (that’s really a step backwards, so I’m being really facetious here). The point I’m making is we’ve gone through a very advanced stage in the evolution of our writing tool of choice and it became a monopoly. WordPerfect lost out and fell by the wayside. Primary, Secondary and Middle Schools across the U.S. adopted M$ Word. They made it a requirement. Every college freshman has been given discounts to further the loyalty to the Office Suite. Now we don’t write like we used to, much less read. What’s the use of writing something so long in pages, no one will ever read it? We’ve jumped the shark of long form writing, and therefore the premiere app, the killer app for the desktop computer is slowly receding behind us as we keep speeding ahead. Eventually we’ll see it on the horizon, it’s sails being the last visible part, the crow’s nest, then poof! It will disappear below the horizon line. We’ll be left with our nostalgic memories of the first time we used MS Word.