

Cloud Analytics: Dataflow versus Databases
Realtime analytics drives a migration away from databases to more scalable parallel dataflow architectures.

For twenty years, analytics has been viewed as just one specific area within the broader relational database industry, so analytics has meant databases. Today that view is changing. Over the past year or so, a new movement, the "NoSQL" movement, has emerged, promoting the advantages of doing many kinds of analytics without using any relational database technology at all.

Whatever one thinks of the capabilities and limitations of distributed key-value stores relative to relational databases, one thing is clear: the stranglehold that SQL has held over all aspects of data analytics since 1990 is now coming to an end. Other non-SQL approaches to analytics, such as MapReduce/Hadoop, a very simple dataflow architecture for batch computing, are now gaining ground. As the need for realtime analytics grows, we will continue to see a migration away from databases and towards more scalable parallel dataflow architectures for analytics.



The main differences between databases and dataflow can be summarized as follows:

Database          Dataflow
Historical        Realtime
Offline           Online
Pull Model        Push Model
High latency      Low latency
Demand-driven     Data-driven


The shift from databases to dataflow for enterprise cloud analytics mirrors what we have recently seen in another area, the "realtime web". The old demand-driven web model of polling, querying and pulling RSS feeds has proved unable to deliver the low latency required by the numerous new realtime web services being created by Twitter and others. New data-driven, realtime, push models such as PubSubHubbub and RSSCloud are now replacing the old approaches.
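
The pull/push contrast can be made concrete with a small sketch. The Python below is illustrative only: the class names PullStore and PushCounter, and the word-count workload, are assumptions made for this example and are not taken from any product or protocol mentioned in the article. The first class stores events and computes a result only when asked (demand-driven); the second updates its result as each event arrives and notifies subscribers immediately (data-driven).

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class PullStore:
    """Database-style (hypothetical): store events, compute only when queried."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def insert(self, word: str) -> None:
        # Arrival of data triggers no computation; it is only stored.
        self.events.append(word)

    def query_counts(self) -> Dict[str, int]:
        # The whole history is rescanned every time a client asks.
        counts: Dict[str, int] = defaultdict(int)
        for word in self.events:
            counts[word] += 1
        return dict(counts)


class PushCounter:
    """Dataflow-style (hypothetical): each event updates the result at once."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = defaultdict(int)
        self.subscribers: List[Callable[[str, int], None]] = []

    def subscribe(self, callback: Callable[[str, int], None]) -> None:
        self.subscribers.append(callback)

    def publish(self, word: str) -> None:
        # Incremental update, then the new value is pushed to every subscriber.
        self.counts[word] += 1
        for callback in self.subscribers:
            callback(word, self.counts[word])


stream = ["cloud", "data", "cloud"]

# Pull model: nothing is known until a query is issued.
store = PullStore()
for w in stream:
    store.insert(w)
pull_result = store.query_counts()

# Push model: subscribers see an updated count after every event.
updates: List[Tuple[str, int]] = []
counter = PushCounter()
counter.subscribe(lambda word, count: updates.append((word, count)))
for w in stream:
    counter.publish(w)
```

In the pull version, the latency a client sees depends on how often it queries and how much history must be rescanned. In the push version, the work is done once per event and the result is delivered as soon as it changes, which is the property the table above labels "low latency" and "data-driven".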

About Bill McColl
Bill McColl is Founder and CEO of Cloudscale Inc., which is developing a massively parallel cloud-based platform for continuous real-time intelligence on live data streams.

In 2006, he left Oxford University Computing Laboratory where for over twenty years he had been head of research in parallel computing and scalable systems. At the time of his departure, he was Professor of Computer Science and Chairman of the Faculty of Computer Science. McColl has published and lectured extensively on the design, analysis and implementation of massively parallel algorithms and systems.

He established and led Oxford Parallel, a major center for research on industrial and business applications of parallel computing at the university. He was also founder and CEO of Sychron Inc., a Silicon Valley VC-backed software company developing massively parallel system software for datacenter and desktop virtualization. Cloudscale Inc. is his second Silicon Valley company.
