AWS Heads Up-Market with Redshift
Redshift literally represents a shift in Amazon’s targeting
By: Maureen O'Gara
Dec. 3, 2012 07:00 AM
Amazon Web Services used re:Invent, its very first customer and partner conference this week in Vegas, to announce the coming of a cloud data warehouse service called Redshift meant to undercut and disrupt the pricey "old guard" brands of Oracle, IBM, Teradata, EMC Greenplum and HP.
Redshift literally represents a shift in Amazon's targeting.
It's going up-market looking for customers among the big corporates that have supposedly overcome their doubts about running mission-critical apps in the public cloud and are now down to figuring out which ones to move first and how quickly.
Amazon evidently thinks it's on the cusp of this great inflection point and to bait the big fellas onto its cloud - as well as defend its 60% IaaS market share against the likes of Google, Rackspace and Microsoft - it's giving them an alternative to the on-premise data warehouses that large companies say are "too expensive and a pain in the butt to manage" and that smaller companies feel are simply beyond their reach, according to AWS boss Andy Jassy.
Redshift, which is probably the closest AWS has come to peddling an actual application, is a typical pay-as-you-go service with no payment upfront.
It's supposed to cost about one-tenth as much as a traditional data warehouse.
That's why Amazon figures its high-volume, low-margin model is going to take business away from the 60% to 80% gross margin crowd and their scale-lacking private "cloudwash" clouds.
Redshift is currently in limited preview and technical details are kind of thin, but it's reportedly designed to be easy to provision and to automate setup, operation and cluster regulation.
It promises to work with all the popular business analytics tools and give great performance. It's a columnar data store suited to certain kinds of ad hoc queries, and its query response on almost any size dataset is supposed to be really fast because of its basic design and because of compression on the server nodes.
The Register thinks it may be based on a parallelized version of the PostgreSQL open source database à la Netezza and Greenplum since it uses PostgreSQL drivers to link to third-party BI tools. Anyway, it speaks standard SQL and has JDBC and ODBC hooks for the BI programs.
ITProPortal says it consists of ParAccel-licensed components, available in two underlying node variants that can contain either 2TB or 16TB of compressed customer data per node. A user can start with a single small node and scale up to 32 nodes with 64TB of capacity or use fatter nodes and scale up to 1.6PB of capacity.
There was no mention of flash storage to boost I/O, likely as that may be.
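The capacity figures are simple multiples of the node sizes, and a rough sketch makes the implied cluster sizes explicit. The function names are illustrative, and the conversion of 1.6PB to 1,600TB assumes decimal units, which the reports do not specify.

```python
import math

# Node sizes as reported: compressed customer data per node, in TB.
SMALL_NODE_TB = 2
LARGE_NODE_TB = 16

def cluster_capacity_tb(nodes: int, tb_per_node: int) -> int:
    """Total compressed capacity of a cluster, in TB."""
    return nodes * tb_per_node

def nodes_needed(target_tb: float, tb_per_node: int) -> int:
    """Smallest node count that reaches the target capacity."""
    return math.ceil(target_tb / tb_per_node)

# 32 small nodes give the 64TB ceiling quoted for the small variant.
small_max_tb = cluster_capacity_tb(32, SMALL_NODE_TB)

# Assumption: 1.6PB = 1,600TB (decimal units).
large_nodes_for_max = nodes_needed(1600, LARGE_NODE_TB)

print(small_max_tb)         # 64
print(large_nodes_for_max)  # 100
```

On those assumptions, the 1.6PB ceiling would correspond to roughly 100 of the fatter nodes, a count the reports themselves do not state.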
Amazon, the Internet retailer and AWS' parent company, which spends a few million a year on a conventional data warehouse, tested Redshift on a two-node cluster and reportedly ran six of its toughest queries on a dataset with two billion rows of data. It found that Redshift ran 10 times faster than its on-premise warehouse and cost $3.65 an hour, or about $32,000 a year, peanuts compared to what it's now spending.
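The hourly and yearly figures can be reconciled with a quick calculation. This is only a sanity check, and it assumes the cluster runs around the clock for a 365-day year, which is not spelled out.

```python
# Assumption: the cluster runs 24 hours a day, 365 days a year.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours

def annual_cost(hourly_rate: float) -> float:
    """Yearly cost of a cluster billed at a flat hourly rate."""
    return hourly_rate * HOURS_PER_YEAR

test_cluster_annual = annual_cost(3.65)
print(f"${test_cluster_annual:,.0f} a year")  # $31,974 a year
```

That lands just under the "about $32,000" quoted for the two-node test.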
AWS says classic on-premise data warehouses run $19,000-$25,000 a terabyte of storage a year including a few administrators, hardware, software and maintenance.
Redshift is supposed to launch early next year with the ability to scale to a petabyte or more of storage, priced at under $1,000 a terabyte a year for those who promise to stick around for a while. It offers one-year and three-year reservations. The price quoted is for a three-year deal on a heavily used 13-node Redshift cluster.
Using it on-demand will cost more. Figure 85 cents an hour for 2TB nodes and $6.80 an hour for 16TB nodes.
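To put the on-demand rates on the same footing as the per-terabyte figures, they can be converted to dollars per terabyte per year. The sketch below assumes a node that runs all year and is filled to its quoted capacity; neither condition comes from AWS.

```python
# Assumption: nodes run 24x365 and are filled to their quoted capacity.
HOURS_PER_YEAR = 24 * 365

def per_tb_year(hourly_rate: float, tb_per_node: int) -> float:
    """On-demand cost per terabyte per year for one fully used node."""
    return hourly_rate * HOURS_PER_YEAR / tb_per_node

small_node_rate = per_tb_year(0.85, 2)    # 2TB node at 85 cents an hour
large_node_rate = per_tb_year(6.80, 16)   # 16TB node at $6.80 an hour

print(f"${small_node_rate:,.0f} per TB-year")  # $3,723 per TB-year
print(f"${large_node_rate:,.0f} per TB-year")  # $3,723 per TB-year
```

Both node sizes work out to the same rate, a bit under $4,000 a terabyte a year on demand, against the sub-$1,000 figure for the three-year reservation and the $19,000-$25,000 AWS cites for on-premise warehouses.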
NASA's Jet Propulsion Lab and Netflix are already users.