First, let's define what we mean by lock-in. This description from Wikipedia provides a good, workable definition:
> In economics, vendor lock-in, also known as proprietary lock-in, or customer lock-in, makes a customer dependent on a vendor for products and services, unable to use another vendor without substantial switching costs. Lock-in costs which create barriers to market entry may result in antitrust action against a monopoly.
In the world of software development and IT, examples of things that have caused lock-in headaches include:
- Using proprietary operating system features
- Using proprietary database features
- Using hardware which can only run a proprietary OS
Consider a typical internally deployed application built on such a stack. In the process of building it, you have used features of the operating system, database, or other components that are unique to that vendor's product. There may have been good reasons at the time to do so (e.g. better performance, better integration with development tools, better manageability), but because of those decisions the cost of changing any of the significant components of that software becomes too high to be practical. You are locked in.
In that scenario, the root cause of the lock-in problem seems to be the use of proprietary APIs in the development of the application, so it kind of makes sense that the focus of concern in Cloud Computing Lock-In would also be the APIs. Here's why I don't think the same logic applies to the Cloud Services case:
- Sunk Cost - While the use of proprietary APIs in the above example represents one barrier to change, a more significant barrier is actually the sunk cost of the current solution. In order to deploy that solution internally, a lot of money had to be spent upfront (e.g. hardware, OS server and client licenses, database licenses). Moving the solution off the locked-in platform involves not only a considerable rewrite of the software (OpEx) but also new CapEx and a potential write-down of current capital. In the case of a Cloud Service, these sunk costs aren't a factor. The hardware and even the software licensing costs can be paid by the hour. When you turn that server off, your costs go to zero (the first sketch after this list makes that concrete).
- Tight Coupling vs. Loose Coupling - Even if you focus only on the APIs and the rework necessary to move the solution to a different platform, the fact that Cloud Computing services favor REST and other HTTP-based APIs dramatically changes the scope of that rework compared to moving from one low-level, tightly coupled API to another. By definition, your code that interacts with Cloud Services will be more abstracted and loosely coupled, which makes it much easier to get it working with another vendor's Cloud Service (the second sketch below shows what this looks like in practice).
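To make the sunk-cost point concrete, here's a back-of-the-envelope cost model in Python. Every number in it is made up purely for illustration; what matters is the shape of the curves. The on-premises option starts deep in the hole on day one, and that hole is exactly the capital you forfeit when you try to leave:

```python
# Hypothetical figures, chosen only to illustrate the sunk-cost asymmetry.
CAPEX_UPFRONT = 250_000.0      # hardware, OS/database licenses, paid on day one
ONPREM_MONTHLY_OPEX = 5_000.0  # power, space, administration
CLOUD_HOURLY_RATE = 2.50       # pay-by-the-hour; no upfront spend
HOURS_PER_MONTH = 730

def onprem_cost(months: int) -> float:
    # The CapEx is sunk immediately -- you walk away from it if you switch.
    return CAPEX_UPFRONT + ONPREM_MONTHLY_OPEX * months

def cloud_cost(months: int) -> float:
    # Costs accrue only while the service runs; turn it off and they stop.
    return CLOUD_HOURLY_RATE * HOURS_PER_MONTH * months

for months in (1, 12, 36):
    print(f"{months:>2} months: on-prem ${onprem_cost(months):>9,.0f}"
          f" vs. cloud ${cloud_cost(months):>9,.0f}")
```

Switching on-prem vendors means eating that upfront spend all over again; switching cloud vendors just means the meter starts running somewhere else.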
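And here's a minimal sketch of what "loosely coupled" looks like in practice. The `BlobStore` interface and the adapter classes are hypothetical names of my own invention, not any vendor's actual SDK; the point is that application code written against a put/get abstraction over HTTP needs a new adapter, not a rewrite, to change vendors:

```python
from abc import ABC, abstractmethod

class BlobStore(ABC):
    """Vendor-neutral facade over an HTTP-based storage service."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

class S3Store(BlobStore):
    """Adapter for an S3-style service (signed HTTP PUT/GET under the hood)."""
    def __init__(self, bucket: str):
        self.bucket = bucket
    def put(self, key: str, data: bytes) -> None:
        raise NotImplementedError("would issue a signed HTTP PUT to the service")
    def get(self, key: str) -> bytes:
        raise NotImplementedError("would issue a signed HTTP GET to the service")

class InMemoryStore(BlobStore):
    """Stand-in backend so the example runs without any vendor account."""
    def __init__(self):
        self._blobs: dict[str, bytes] = {}
    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data
    def get(self, key: str) -> bytes:
        return self._blobs[key]

def archive_report(store: BlobStore, report: bytes) -> None:
    # Application code sees only the abstraction. Switching vendors means
    # swapping the adapter passed in here, nothing more.
    store.put("reports/latest", report)

store = InMemoryStore()  # or S3Store("my-bucket"), or a Mosso adapter
archive_report(store, b"quarterly numbers")
print(store.get("reports/latest"))
```

Contrast that with an application riddled with calls to a proprietary OS or database API: there, the vendor-specific surface area is everywhere, not confined to one thin adapter.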
So, how do you mitigate that concern? Well, you could try to keep local backups of all data stored in a service like S3, but for large quantities of data that becomes impractical and undermines the value proposition of a data storage service in the first place. The best approach is to demand that your Cloud Service vendors provide mechanisms to get large quantities of data in and out of their services via some sort of bulk load service.
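For the record, the do-it-yourself escape hatch looks something like the sketch below (written against Amazon's boto3 library; the bucket name and destination path are placeholders, and it assumes AWS credentials are configured in the environment). It pulls every object out of a bucket one HTTP GET at a time, which is precisely the approach that stops scaling once you have serious amounts of data in the service:

```python
import os
import boto3  # assumes AWS credentials are configured in the environment

def export_bucket(bucket: str, dest_dir: str) -> None:
    """Naive bulk export: copy every object in `bucket` to local disk."""
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            local_path = os.path.join(dest_dir, key)
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            # One GET per object -- fine for gigabytes, hopeless for petabytes.
            s3.download_file(bucket, key, local_path)

export_bucket("my-bucket", "./s3-backup")  # placeholder names
```

A proper bulk load service from the vendor would replace that object-by-object trickle with something you could actually plan a migration around.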
Amazon doesn't yet offer such a service, but I was encouraged by this thread on their S3 forum, which suggests that AWS is at least thinking about the possibility. I encourage them and other Cloud Services vendors like Rackspace/Mosso to make it as easy as possible to get data in AND out of their services. That's the best way to minimize concerns about vendor lock-in.