Lightning in the Clouds, Big Data
Venture into the clouds, but prepare for lightning
By: Ajay Budhraja
Jan. 4, 2013 09:00 AM
Cloud computing has been marketed as one of the key advances in technology, and every day we hear about new areas where cloud services are being utilized. Cloud is the bright shining star that is being leveraged for its elastic, on-demand, resource pooling capabilities. However, there have been Cloud outages recently that have adversely impacted businesses. These outages highlight the risks of the Cloud and bring into focus that such risks need to be effectively managed. Cloud outages are like lightning in the Clouds: lightning can cause problems where it strikes, and preparation is important to avoid damage.
This year Amazon, Salesforce, Google, Gmail, Google App Engine, Microsoft Office 365, and Azure had outages that impacted businesses. During this holiday season, some Netflix subscribers were hit with an outage on Christmas Eve that was caused by Amazon cloud servers. Amazon tracked the issue to Elastic Load Balancing, which spreads traffic across many servers. This wasn't good timing for the outage, as subscribers were looking forward to watching movies during this period. Microsoft Azure storage had an outage during the holidays that impacted the management portal. Big Data services and providers have also reported access issues.
There are many ways of minimizing the risks associated with such outages. The key is to distribute the incoming load and to have redundancy, so that a failure in one area can be picked up by another area. One way to achieve such redundancy is to distribute systems across locations so that if one location is down, another can pick up the load. Multiple-location failover can be more expensive, so the costs have to be weighed against the benefits and the availability requirements. Another approach, for applications that require high availability, is to have backup clouds to which traffic can be diverted, reducing the risk of downtime in a specific cloud. Again, this can be more costly than running only the regular clouds. A similar approach is to spread the applications across many clouds, so that if one of the clouds goes down, not all applications go down at once.
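The failover idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the location names, the `pick_location` function, and the `status` table are hypothetical, and a real deployment would replace the table with actual health checks and would usually rely on a load balancer or DNS service instead of application code.

```python
from typing import Callable, Optional, Sequence


def pick_location(locations: Sequence[str],
                  is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first location, in priority order, that reports as up."""
    for location in locations:
        if is_up(location):
            return location
    return None  # every location is down


# Hypothetical status table standing in for real health checks.
status = {"primary-east": False, "secondary-west": True, "backup-cloud": True}

chosen = pick_location(
    ["primary-east", "secondary-west", "backup-cloud"],
    lambda name: status[name],
)
print(chosen)  # secondary-west
```

The priority order reflects the cost trade-off mentioned above: the backup cloud is listed last, so traffic is diverted to it only when the regular locations are unavailable.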
It is important to have adequate monitoring for applications and cloud services, so that you are informed when services are down and can take the necessary actions. Both availability and performance monitoring should be conducted to identify any problems. Availability monitoring tracks whether the applications and services are up and running; performance monitoring looks at performance metrics such as response time. Availability and performance targets should be specified for services and applications based on the requirements. Key events have to be identified for monitoring, and specific alerts have to be set up. As soon as these alerts send notifications, specific actions should be taken. All these aspects should be defined in a plan that lays out the details of events and related actions so that any outages can be handled effectively. Preparation is key to ensuring that the proper actions are taken to address and recover from any such issues. The risks related to Cloud outages can be managed by having the proper approaches for availability and performance, monitoring regularly, and taking appropriate actions in a timely manner.
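As a rough sketch of how availability and performance monitoring can be tied to a plan of actions, the Python example below classifies one measurement as an event and looks up the corresponding action. The `Sample` type, the event names, the 500 ms threshold, and the actions in `PLAN` are all assumptions made for illustration; they are not taken from any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    service: str        # name of the monitored service
    reachable: bool     # availability: did the service respond at all?
    response_ms: float  # performance: how long the response took


# Hypothetical plan mapping each monitored event to its action.
PLAN = {
    "DOWN": "divert traffic to the backup location",
    "SLOW": "notify the operations team",
}


def classify(sample: Sample, max_response_ms: float) -> Optional[str]:
    """Return the event raised by a sample, or None if all is well."""
    if not sample.reachable:
        return "DOWN"  # availability problem takes precedence
    if sample.response_ms > max_response_ms:
        return "SLOW"  # up, but missing the performance target
    return None


for sample in [Sample("portal", True, 120.0),
               Sample("portal", True, 900.0),
               Sample("storage", False, 0.0)]:
    event = classify(sample, max_response_ms=500.0)
    if event is not None:
        print(f"{sample.service}: {event} -> {PLAN[event]}")
```

Keeping the event-to-action mapping in one table mirrors the idea of a written plan: when an alert fires, the response is already decided.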
(This has been extracted from, and is a reference to, Ajay Budhraja's blog.)