Stories by Dave Linthicum

Opinion: Why the shortage of cloud architects will lead to bad clouds

Designing and building cloud computing-based systems is nothing like building traditional applications and business systems. Unfortunately, many in IT find that out only when it's too late.
The complexities around multitenancy, resource sharing and management, security, and even version control lead cloud computing startups -- and enterprises that build private and public clouds -- down some rough roads before they start to learn from their mistakes. Or perhaps they just have to kill the project altogether as they discover all that investment is unsalvageable.
I've worked on cloud-based systems for years now, and the common thread to cloud architecture is that there are no common threads to cloud architecture. Although you would think that common architectural patterns would emerge, the fact is clouds do different things and must use very different architectural approaches and technologies. In the world of cloud computing, that means those who are smart, creative, and resourceful seem to win out over those who are just smart.
The demand has exploded for those who understand how to build clouds. However, you have pretty much the same number of cloud-experienced architects being chased by an increasing number of talent seekers. Something has to give, and that will be quality and innovation as organisations settle for what they can get versus what they need.
You won't see it happen right away. It will come in the form of outages and security breaches as those who are less than qualified to build clouds are actually allowed to build them. Moreover, new IaaS, SaaS, and PaaS clouds -- both public and private -- will be functional copies of what is offered by the existing larger providers, such as Google, Amazon Web Services, and Microsoft. After all, when you do something for the first time, you're more likely to copy rather than innovate.
If you're on the road to cloud computing, there are a few things you can do to secure the talent you need, including buying, building, and renting. Buy the talent by stealing it from other companies that are already building and deploying cloud-based technology -- but count on paying big for that move. Build by hiring consultants and mentors to both do and teach cloud deployment at the same time. Finally, rent by outsourcing your cloud design and build to an outside firm that has the talent and track record.
Of course, none of these options are perfect. But they're better than spending all that time and money on a bad cloud.

Analysis: Don't assume cloud computing is a great investment

Cloud computing seems like a safe bet. I mean, any startup or existing company that does cloud computing and needs capital must provide great returns for investors, right? Maybe not.
A wise venture capitalist once told me that those in the venture capital community move like flocks of birds. When they see the others moving in a certain direction, they all seem to follow. Cloud computing is another instance of that behavior. And if VCs and investors are naïve about cloud computing, IT will face business pressures to use cloud computing based on that naïveté, creating serious issues down the line.
The trouble is that cloud computing is both ill-defined and broadly defined. There is a lot of confusion about what's real cloud computing and what is not. I have to admit that I spend a good deal of my day trying to figure that out as well.
What are the top three mistakes that VCs and other investors will make as they move into the cloud computing space?
Cloud computing investor mistake No. 1: Assume a sustainable business model

Cloud outages have a simple solution

I was amused by Steve Jobs' wi-fi overload issues during his iPhone 4 presentation. While he could ask the audience to "put your laptops on the floor" and turn off their 3G-to-wi-fi devices, most cloud providers won't have the luxury of asking customers not to use their services when their cloud platforms get oversaturated.
There have been many recent availability issues with cloud providers, such as Twitter's and Google Calendar's struggles, as well as Google App Engine's datastore taking a dirt nap under demand. Or, as Google puts it in a recent post: "There are a lot of different reasons for the problems [with data store] over the last few weeks, but at the root of all of them is ultimately growing pains. Our service has grown 25 percent every two months for the past six months."
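Google's quoted growth rate compounds faster than it might first appear. A quick sketch of the arithmetic (using only the figures from the quote):

```python
# Compounding the growth rate Google quoted for App Engine's datastore:
# 25 percent every two months over six months is three compounding periods.
growth_per_period = 1.25   # 25 percent growth each two-month period
periods = 3                # six months = three two-month periods

total_growth = growth_per_period ** periods
print(f"Load after six months: {total_growth:.2f}x the starting load")
# 1.25 ** 3 = 1.953125, so demand nearly doubles in half a year
```

At that rate, a provider that was comfortably provisioned in January is running at twice the load by July.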
There are also many cloud outages and availability issues that aren't reported, but have the same negative effects on the cloud users. What we hear in the press is the tip of the iceberg.
I think this increase in outages caused by saturation is just the start. With the increased use of cloud computing this year and next, clouds falling over due to stress will be more commonplace.
The core issue is the saturation of resources by too many users doing too much on the cloud provider's servers. Putting any architecture and design issues aside for now, it is as simple as that — it is also a very old problem.
The solution is also as simple. It is called "capacity planning" — making sure the capacity of your current system (in this case, virtualised and multitenant server clusters) will meet the demands of the number of users working in the cloud, as well as their patterns of consumption. Back in the day, there were many capacity planners running around, but with the advent of commodity hardware and software, capacity planning (including performance modeling) has become a lost art.
For the most part, there is no excuse for availability issues due to lacking capacity. You know where your saturation point is, so you need to make sure you have enough resources on hand never to reach that precipice. That said, I suspect most of the capacity planning that occurs within cloud providers these days is to watch the usage graphs move upward and to try to add more equipment before processes run out of room. That is clearly not a successful strategy.
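The core of capacity planning can be reduced to a small calculation: given a measured saturation point and an observed growth rate, how many periods of headroom remain before demand crosses a safety threshold? A minimal sketch, using hypothetical numbers (the function name, loads, and 80 percent safety margin are all illustrative assumptions, not from the column):

```python
import math

def periods_until_saturation(current_load, capacity, growth_rate, safety=0.8):
    """Periods before compound-growing load exceeds `safety` * capacity."""
    threshold = capacity * safety
    if current_load >= threshold:
        return 0  # already past the safety margin; add capacity now
    # Solve current_load * (1 + growth_rate)^n >= threshold for n.
    return math.ceil(math.log(threshold / current_load) / math.log(1 + growth_rate))

# Example: a cluster saturates at 10,000 requests/sec, currently carries
# 4,000, and demand grows 25 percent per period. Plan upgrades before
# utilisation hits 80 percent of saturation.
print(periods_until_saturation(4_000, 10_000, 0.25))  # → 4
```

The point is not the formula itself but that it forces you to measure the saturation point and the consumption pattern up front, rather than reacting to a graph that is already climbing.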

Relational databases aren't right for the cloud

I attended the Cloud Connect 2010 conference in Santa Clara, California, one of the first major gatherings of the year on cloud computing. One of the main topics that came up is not using relational databases for data persistence. Called the "NoSQL" movement, it is about leveraging more efficient databases that are perhaps able to handle larger data sets more effectively. I have already written about the "big data" efforts that are emerging around cloud, but this is a more fundamental movement to drive data back to more primitive, but perhaps more efficient, models and physical storage approaches.
NoSQL systems typically work with data in memory or load chunks of data from many disks in parallel. The issue is that "traditional" relational databases do not provide the same models and, thus, the same performance. While this was fine in the days of databases with a few gigabytes of data, many cloud computing databases are blowing past a terabyte, and we will see huge databases supporting cloud-based systems going forward. Relational databases for operations on large data sets are contraindicated, because SQL queries tend to consume many CPU cycles and thrash the disk as they process data.
If you think we have heard this song before, you are correct. Object and XML databases made some inroads back in the 1990s, but many enterprises kept the relational databases around, such as Oracle, Sybase, and Informix, despite the fact that many nonrelational databases did indeed provide better performance. However, the cost and risks of moving from relational databases, as well as the relatively small sizes of the databases, kept it pretty much a relational world.
However, the cloud changes everything. The requirement to process huge amounts of data in the cloud is leading to new approaches to database processing, based on older models. MapReduce, the fundamental way Hadoop processes data, is based on the older "shared-nothing" database processing model from years ago, but now we have the processing power, the disk space, and the bandwidth.
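The shared-nothing idea behind MapReduce can be shown in a few lines. In this toy word-count sketch (the function names and sample data are illustrative, not Hadoop's actual API), each mapper sees only its own shard of the data, and each reducer owns a disjoint set of keys, so no node ever needs another node's data:

```python
from collections import defaultdict

def map_phase(shard):
    # Mapper: emit (word, 1) pairs from its own local shard only.
    return [(word, 1) for word in shard.split()]

def shuffle(mapped):
    # Shuffle: group values by key, so each key lands on exactly one reducer.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: aggregate the values for the keys it owns.
    return {key: sum(values) for key, values in groups.items()}

# Three "nodes", each holding one shard of the input.
shards = ["the cloud scales", "the cloud grows", "clouds everywhere"]
mapped = [pair for shard in shards for pair in map_phase(shard)]
counts = reduce_phase(shuffle(mapped))
print(counts["the"], counts["cloud"])  # → 2 2
```

Because mappers and reducers never share state, you scale by adding machines rather than by buying a bigger one, which is exactly why the model fits commodity cloud hardware.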
I believe the movement to cloud computing will indeed reduce the use of relational databases. It is nothing we have not heard before, but this time we have a true need.
When this column was posted online, it attracted the following comment: "Its always been the same. You want to process high volumes of data with large CPU requirements? Don't use a database. You want resilience, recoverability, reliability, consistency and a language that allows those without a PhD in machine code to join data up in new ways - ways not intended by the original design? You need a database. Yes, there's an overhead, go figure. But if it won't scale, you're either using a duff relational database or a duff designer. Built right, a set of relational tables will take 10 times as long to process 10 times the amount of data in a join. That's scalability. Anything claiming more is likely to be wool pulling or smoke and mirrors. (Also applies to your article's talk about a new database that can do memory-caching of data and pull data from many disks at a time. Any relational database which cannot do that does not deserve to be called a database)."

People, process and architecture key to SOA success

SOA may have seemed the saviour of bad software architecture and poor development project planning, but the reality is that it's a complex and difficult venture. Thus, the number of failed SOA projects is about equal to the number of successful ones.

Learning to live with enterprise mashups

With the advent of mashups, innovative developers all over the enterprise are seeking new ways to leverage the value of corporate information through the use of external web applications, APIs, or services.