Amazon Now Storing 2 Trillion Objects in S3
Ed.'s Note: The original headline incorrectly stated that Amazon stored 2 billion objects, instead of 2 trillion. The headline has been corrected.
In the latest sign that Amazon's enterprise cloud business remains the envy of every other service provider, the number of objects stored in the Amazon Web Services (AWS) Simple Storage Service, or S3, has reached 2 trillion.
To put that in context, S3's object count has doubled since last June, when AWS hit the 1 trillion object milestone.
Amazon CTO Werner Vogels revealed the latest stat at the kickoff of his company's first AWS Summit, a 13-city roadshow that commenced in New York last week. While Amazon doesn't break out revenues for its AWS business, revenues categorized as "other" jumped 60 percent year-over-year in the first quarter, from $500 million to $798 million, the company reported after the markets closed today. It's widely presumed that the "other" revenues described by Amazon come primarily from AWS, underscoring the rapid growth of the business.
Launched in 2006, AWS now runs datacenters worldwide, offers 33 different services and 25 application categories in its marketplace, and -- with its hundreds of thousands of servers and scale -- has reduced the price of compute and storage services 31 times. Seven of those price cuts have come over the past three months.
"If we are able to drive the cost of compute to a point where you no longer have to think about it, tremendous new products will be built," Vogels said in his AWS Summit keynote address. Vogels showcased various customers that have opted to use EC2 and S3 to provide compute and storage on demand as an alternative to deploying server farms, among them Bristol-Myers Squibb and Nasdaq OMX.
Russell Towell, a senior solutions specialist at Bristol-Myers Squibb, explained how researchers use AWS to run highly complex compute jobs -- the lifeblood of a drug maker's research -- that were previously beyond reach. Towell's team built a Java application with a portal that uses EC2 API calls to let researchers self-provision a server or database.
"We empower the users to be able to log on through this Web screen. They select an image type, they select a server type, how much compute capacity they want, and they basically just hit the submit button," Towell said. "If you're a research cloud user and ask for a Linux server on the research cloud, you're going to get it in five minutes. If you choose one out of the four different Oracle databases that are available in the catalog, you can tell it what you want the database name to be, hit the submit button, and you will have an Oracle database in 12 minutes. If you ask for a Windows 2008 R2 server, you're going to get that in 20 minutes."
Nasdaq OMX managing director Scott Mullins talked up FinQloud, a platform for Nasdaq OMX's various clients that launched back in September. In addition to running the famous Nasdaq trading exchange, Nasdaq OMX provides technology to 70 different marketplaces.
"What this really means is we've taken those publicly available solutions that AWS offers such as S3, EMR [Elastic MapReduce] and EC2, and we custom-built solutions that are tailored to our industry specifications and then enabled our clients to really re-architect themselves," Mullins said.
Thousands of customers, developers and partners attended the New York AWS Summit, and I had a chance to chat with quite a few. Many validated what is already well understood: AWS is by far the most widely used provider of cloud services.
"We think Amazon has a three- to four-year headstart on product depth and pricing and a decade on global infrastructure," said Jeff Aden, president of 2nd Watch, a Seattle-based systems integrator that has deployed 200 core production enterprise systems using AWS. "You're talking potentially five to 10 years out until there's a serious contender." While most acknowledge AWS' lead in the market, some might beg to differ with the challenge its rivals are facing in catching up.
I asked Aden if he was exclusively tied to Amazon. He said he's cloud-agnostic, but to date AWS rivals have not been able to match the cost and level of infrastructure 2nd Watch requires. Aden said his company spends an extensive amount of time investigating alternatives, notably the newly expanded Windows Azure Infrastructure Services, as well as OpenStack-based services from HP, IBM and Rackspace.
"We continually test on Windows Azure and look at it," Aden said. "It's great for the marketplace overall, because competition leads to better products, but there are certain things that we have to test around security and being able to manage the services before we make recommendations on how to use it."
Vogels' message was clear: AWS is focused on "relentless cost reduction" in running racks of servers while bringing high-performance computing to customers who couldn't otherwise consider running simulations at that scale. Vogels called out one partner, Cycle Computing, which writes software that creates and automates environments for running compute jobs and handles the movement of data to the cloud.
Cycle Computing started out as a consultancy serving big pharmaceutical and financial services firms and now offers software that lets organizations run jobs on AWS by the hour.
Jason Stowe, Cycle Computing's CEO, told me his company recently used 10,600 server instances in Amazon's EC2 to run a simulation for a major pharmaceutical firm. Running that simulation in house would require the equivalent of a 14,400-square-foot datacenter, which, based on calculations from market researcher IDC, would cost $44 million.
"Essentially, we created an HPC cluster in two hours and ran 40 years of computing in approximately 11 hours," Stowe explained. "The total cost for the usage from Amazon was just $4,472."
Posted by Jeffrey Schwartz on 04/25/2013 at 12:48 PM