I was amused by Steve Jobs' wi-fi overload issues during his iPhone 4 presentation. While he could ask the audience to "put your laptops on the floor" and turn off their 3G-to-wi-fi devices, most cloud providers won't have the luxury of asking customers not to use their services when their cloud platforms get oversaturated.

There have been many recent availability issues with cloud providers, such as Twitter's and Google Calendar's struggles, as well as Google App Engine's datastore taking a dirt nap under demand. Or, as Google puts it in a recent post: "There are a lot of different reasons for the problems [with data store] over the last few weeks, but at the root of all of them is ultimately growing pains. Our service has grown 25 percent every two months for the past six months."

There are also many cloud outages and availability issues that aren't reported but have the same negative effects on cloud users. What we hear in the press is the tip of the iceberg. I think this increase in outages caused by saturation is just the start. With the increased use of cloud computing this year and next, clouds falling over due to stress will become more commonplace.

The core issue is the saturation of resources by too many users doing too much on the cloud provider's servers. Putting any architecture and design issues aside for now, it is as simple as that. It is also a very old problem.

The solution is just as simple. It is called "capacity planning": making sure the capacity of your current system (in this case, virtualised and multitenant server clusters) will meet the demands of the number of users working in the cloud, as well as their patterns of consumption. Back in the day, there were many capacity planners running around, but with the advent of commodity hardware and software, capacity planning (including performance modeling) has become a lost art. For the most part, there is no excuse for availability issues due to a lack of capacity.
You know where your saturation point is, so you need to make sure you have enough resources on hand never to reach that precipice. That said, I suspect most of the capacity planning that occurs within cloud providers these days amounts to watching the usage graphs move upward and trying to add more equipment before processes run out of room. That is clearly not a successful strategy.
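
To make the arithmetic concrete, here is a minimal sketch of the most basic capacity-planning calculation: given a known saturation point and a compounding growth rate, how long until you hit the wall? It assumes smooth exponential growth, which real traffic rarely follows, and the function names, the load units, and the 30 percent headroom default are all hypothetical choices for illustration. The growth figure is the one from the Google quote (25 percent every two months).

```python
import math


def months_until_saturation(current_load, capacity, growth_per_period,
                            period_months=2.0):
    """Months until load reaches capacity under compounding growth.

    Assumes load grows by `growth_per_period` (0.25 means 25 percent)
    every `period_months` months. Units of load and capacity are
    arbitrary (requests/sec, CPU cores, ...) but must match.
    """
    if current_load <= 0 or capacity <= 0:
        raise ValueError("current_load and capacity must be positive")
    if growth_per_period <= 0:
        raise ValueError("growth_per_period must be positive")
    if current_load >= capacity:
        return 0.0  # already saturated
    # Solve current_load * (1 + g)^n = capacity for n periods.
    periods = math.log(capacity / current_load) / math.log(1.0 + growth_per_period)
    return periods * period_months


def capacity_needed(current_load, growth_per_period, horizon_months,
                    period_months=2.0, headroom=0.3):
    """Capacity to provision so projected load still leaves `headroom` spare.

    The 30 percent default headroom is an illustrative assumption,
    not a recommendation.
    """
    periods = horizon_months / period_months
    projected = current_load * (1.0 + growth_per_period) ** periods
    return projected * (1.0 + headroom)


if __name__ == "__main__":
    # A cluster running at half its saturation point, growing 25% per two months.
    print(round(months_until_saturation(50, 100, 0.25), 1))
    # Load after six months of that growth, with no headroom added.
    print(capacity_needed(100, 0.25, 6, headroom=0.0))
```

Under these assumptions, a platform running at half its saturation point has a little over six months before it falls over, and six months of that growth nearly doubles the load. A real capacity plan would layer consumption patterns, peak-to-average ratios, and hardware lead times on top of this, but even the crude version beats watching the graph.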