Cloud computing has gotten enormous coverage lately, with claims for benefits that may or may not be realized. The cloud does not enable a wired and informed electorate; that comes from online services and Internet linkages, whether they are hosted in the cloud or in the government's own infrastructure. The cloud does not provide 100% uptime and availability of packages and data; the net will stay up, but a vendor can still go down, can still be knocked offline by a denial of service attack, and can still provide faulty data or faulty applications. Senior executives, politicians, and CIOs with corporate or government agency responsibilities need to know what the cloud is and is not, and what it can and cannot promise. They need to understand the risks and the rewards in order to use it safely and effectively, and in order to contract for cloud services safely and at a fair price.
Essentially, cloud computing is an extreme form of outsourcing, one in which hardware ownership and operation, software version updating, data storage and backup, and occasionally other functions as well, are all outsourced to a single vendor. Moreover, hardware is generally located at the cloud vendor site, where massive racks of servers and farms of storage devices process the applications and maintain the data of huge numbers of users to achieve efficient operation. Since the data and the programs are located "somewhere" in the apparently nebulous structure of the web and accessed remotely through the Internet, this form of shared remotely hosted service is called cloud computing.
Seen as a form of outsourcing, cloud computing offers a well-defined set of benefits; these are the benefits that are traditionally associated with outsourcing, and most are well known and well studied. The first, of course, are those associated with economies of scale: (1) a large vendor will make much more efficient use of personnel, and (2) a large vendor sees much less variation day by day in demand than each individual user will encounter and therefore can do more effective load leveling and will require less excess capacity or "safety stock" in computing resources. As a result, a large vendor can charge for actual usage, allowing those users with high demand at a given time to consume unusually high levels of resources and pay higher total fees, while allowing users with lower demand to consume fewer resources and to pay lower fees. (3) Economies of scale also allow large vendors, whether cloud-based or not, to perform more R&D than smaller users could perform.
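The load-leveling argument can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not a model of any real vendor: it assumes every user's demand is independent with the same mean and standard deviation, and that capacity is provisioned at the mean plus a fixed number of standard deviations. The function names and all of the numbers are hypothetical.

```python
import math


def capacity_separate(n_users, mean, std, k=3.0):
    """Total capacity if every user provisions for its own peak (mean + k*std)."""
    return n_users * (mean + k * std)


def capacity_pooled(n_users, mean, std, k=3.0):
    """Capacity if one vendor serves all users from a shared pool.

    ASSUMPTION: demands are independent, so the standard deviation of
    total demand grows with sqrt(n_users) rather than with n_users.
    """
    return n_users * mean + k * std * math.sqrt(n_users)


if __name__ == "__main__":
    n, mean, std = 100, 10.0, 5.0  # hypothetical figures
    separate = capacity_separate(n, mean, std)
    pooled = capacity_pooled(n, mean, std)
    print(f"separate: {separate:.0f} units, pooled: {pooled:.0f} units")
```

With these made-up figures, 100 users provisioning separately need 2,500 units in total, while a single pooled vendor needs 1,150 for the same level of protection. The gap is the "safety stock" that load leveling saves, and it depends on the independence assumption.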
There are several benefits of modern online computing that are wrongly lumped with cloud computing, like online access from any location, social networking, community outreach, and ubiquitous connectivity. These are more accurately attributed to remote web-based access, and indeed are not inextricably linked to the cloud. The cloud is an outsourcing service delivery mechanism, and the web is the medium for delivery.1
As an extreme form of outsourcing, cloud computing has the risks that are traditionally associated with outsourcing. Indeed, since cloud computing is an extreme form of outsourcing, which moves data storage and backup, ownership of all facilities, and all aspects of facilities management to a single vendor, its risks are somewhat exacerbated compared to other forms of outsourcing or facilities management. In a sense this may not appear to be very different from the timesharing model of computing that was prevalent in the late 1960s and early 1970s, except that in the era of timesharing we tended to use shared remote facilities to run ad hoc analyses, while now we use the cloud to run operational software that controls every aspect of an organization, from product scheduling, inventory control and vendor management to sales and customer support and relationship management. Thus, while cloud computing may be just another form of shared facilities outsourcing, its risks may be more extreme than the risks of earlier forms of outsourcing.
The risks of outsourcing are well-known and well-studied. These risks include:
- Shirking, or the principal-agent problem, which is deliberate underperformance while claiming full payment when the client cannot verify the vendor's effort or service levels
- Poaching, which is the deliberate misuse of the client's data, software, or intellectual property in ways that benefit the vendor while damaging the client, when the client cannot detect this misuse
- Opportunistic Renegotiation, or vendor holdup, which occurs in the presence of high switching costs, economic lock-in, and strategic dependence upon a single vendor
Why is the cloud emerging now?
Each of the pieces required for cloud computing has been around for some time, so why is the cloud emerging only now?
- While cheap computing hardware is not new, we now see almost total standardization of the entire stack of hardware computing resources. The Intel x86 architecture is emerging as the chip of choice for everything from laptops to mainframes. We are standardizing on a small set of server operating systems, usually either Windows-based or Unix-based. There is a small set of virtualization hypervisors, which allocate jobs to servers whether in a large data center or in the cloud. There is even a growing set of largely standardized enterprise applications, from office functions to ERP systems and vendor and customer management.
- Paradoxically, the decrease in hardware cost is driving data center consolidation. The non-hardware costs, especially systems administration personnel costs, greatly dominate the cost of hardware acquisition. Scale in systems administration personnel may represent the greatest cost advantage of cloud computing. Small and medium enterprises will be the greatest beneficiaries of this consolidation; in a small shop one sys admin may manage 50 servers, but with the move to cloud computing even SMEs can now share in the vendors' sys admin ratio of 1 per 150,000 servers, or even in the ratio of 1 per 1.5 million that some vendors hope to achieve. SMEs may adopt the best practices of the cloud vendors, but without sufficient size they can never obtain the economies of scale in the automation of systems administration.
- Rapid response to changing demand is the greatest cloud innovation, providing both the ability to scale back resource payments when the demand for them decreases, and most importantly the ability to burst overflow demand into the cloud when demand increases. While in some sense this is not so different from time-sharing, enterprises can now handle bursts in demand for core services, not just for analytics. Again, since small enterprises usually encounter wider fluctuation in demand, SMEs will be the greatest beneficiaries.
- And, with dedicated sys admin personnel supported by automated sys admin services, cloud vendors will be able to offer much more reliable backup, software release management, etc. Once again, the greatest beneficiaries will be SMEs.
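The sys admin ratios quoted above translate directly into a labor cost per server. The short calculation below is only a sketch: the ratios come from the text, but the annual cost of an administrator is an assumed round number chosen for illustration, and the function name is hypothetical.

```python
def admin_cost_per_server(annual_admin_cost, servers_per_admin):
    """Annual systems administration labor cost attributable to one server."""
    if servers_per_admin <= 0:
        raise ValueError("servers_per_admin must be positive")
    return annual_admin_cost / servers_per_admin


# ASSUMPTION: a fully loaded cost of 100,000 per administrator per year.
# This figure is illustrative only; the ratios below are taken from the text.
ANNUAL_ADMIN_COST = 100_000

RATIOS = {
    "small shop (1 per 50)": 50,
    "cloud vendor (1 per 150,000)": 150_000,
    "vendor target (1 per 1.5 million)": 1_500_000,
}

for label, servers in RATIOS.items():
    cost = admin_cost_per_server(ANNUAL_ADMIN_COST, servers)
    print(f"{label}: {cost:,.2f} per server per year")
```

Whatever salary figure is plugged in, the ratio between the small shop and the vendor stays the same: moving from 1 per 50 to 1 per 150,000 divides the administration cost per server by 3,000.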
Risks of cloud computing
While the risks of cloud-based outsourcing remain shirking, poaching, and opportunistic renegotiation, as with all outsourcing, the forms taken in cloud-sourcing are slightly different.
Shirking can have several forms.
- The vendor may fail to invest in sufficient peak load excess capacity -- Unique to the cloud is the risk that the vendor may not have invested in sufficient excess capacity for worst-case peak loads. We have learned how to cope with overwhelming and correlated peak demand, like the demand for holiday travel. No private company has invested in sufficient capacity for us all to book rail or air travel for American Thanksgiving weekend; we know this and we stagger our travel schedules accordingly. And yet, customers expect to be able to burst excess demand for computing services into the cloud; peak demand is, once again, likely to be correlated and likely to be overwhelming. Demand for services in the days leading up to Christmas is likely to overwhelm not only the resources of many firms in hospitality and retailing, but their cloud vendors as well. Likewise, enthusiasts of cloud-sourced government computing should remember that April 15 is tax day in many countries, not just the US, and last minute filers might produce correlated demand spikes and again overwhelm vendors. Vendors may be tempted not to provide the full computing resources needed for peak capacity; this underinvestment will be difficult to detect until service quality actually does degrade and is likely to catch many firms unprepared.
- The vendor may not make adequate investments in security and in security monitoring -- How thoroughly will data be protected? How quickly will security breaches be detected and how quickly will clients be notified? Delay in notification of identity theft can be catastrophic. Again, this underinvestment will be difficult to detect until it actually has become a problem.
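The peak-load risk in the first item above can be illustrated numerically. A vendor's capacity savings come from pooling demand that does not peak at the same time; when clients' demands are correlated, that saving shrinks. The sketch below assumes identical clients with a single pairwise correlation `rho` between any two of them, and capacity set at the mean plus `k` standard deviations of total demand. The function name, `rho`, `k`, and the figures are all illustrative assumptions.

```python
import math


def pooled_peak_capacity(n_clients, mean, std, rho, k=3.0):
    """Capacity needed to cover mean + k standard deviations of total demand.

    rho is the ASSUMED pairwise correlation between any two clients' demand
    (0 = independent, 1 = everyone peaks together).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1")
    # Variance of a sum of n equally correlated demands:
    # n * std^2 + n * (n - 1) * rho * std^2
    total_std = std * math.sqrt(n_clients * (1 + (n_clients - 1) * rho))
    return n_clients * mean + k * total_std


for rho in (0.0, 0.5, 1.0):
    needed = pooled_peak_capacity(100, 10.0, 5.0, rho)
    print(f"rho={rho:.1f}: capacity needed = {needed:.0f} units")
```

With fully correlated demand (`rho = 1`) the vendor needs as much capacity as 100 clients provisioning separately, so the pooling benefit disappears exactly when clients most expect to burst into the cloud.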
Poaching will be extremely difficult to monitor and detect, and therefore extremely difficult to limit should vendors choose to violate their ethical and legal obligations. This, combined with the potential for shirking security responsibilities, explains why security always features so prominently among clients' lists of concerns with cloud computing. These problems are not unique to the cloud, but the extended chain of custody, from the vendor, through the net, to the client, may make it more difficult to establish the source of leakage conclusively.
- The vendor may perform data mining in aggregate to learn the characteristics of a client's own customers, products, or order flow; while some data mining may be benign or harmless, and the vendor may grant itself some rights to data mining in the terms of the contract, it is not always clear what data mining is being performed, how it will benefit the vendor, or how it will affect the client.
- The vendor may profit from leakage of small amounts of critical, sensitive, or private information about a client, its personnel, or its customers; this is specific and identifiable individual data, not aggregate statistics resulting from data mining.
- The vendor may even profit from the leakage of critical business plans to the vendor if the vendor seeks to compete in the client's business (which some might call theft of IP rather than leakage), or more likely to the vendor's other clients. Remember, poaching is the misuse of information provided for one purpose but used for another in a way that harms the client; surely the client was aware that the vendor was handling its data, but surely the client did not expect this information to be used by the vendor or others in a way that competes directly with the client's core business.2
Opportunistic Renegotiation can again come in several forms:
- The first form of vendor hold-up actually comes from platforms that are uniquely innovative, resulting in a true source of vendor competitive advantage. If the vendor offers unique software (software as a service) that is not yet available elsewhere, or offers a superior development platform (platform as a service), then the vendor's innovation may make it more difficult to justify leaving. This is economic lock-in, not absolute lock-in.3 Although the client won't leave the vendor, this is because the client doesn't want to do so and not because the client can't; the client is not strictly trapped, and if the vendor's prices subsequently became too high leaving would be both possible and economically attractive. In some sense the vendor's superior development platform can be viewed as a platform for rapid prototyping, and the client can always move its systems to another platform if and when it decides that it makes sense to do so.
- Ecosystem holdup will occur if the client communicates with a large number of its customers or its suppliers through a single cloud vendor, using the vendor's equivalent of a proprietary commercial social networking service or commercial instant messaging service. This similarly turns out to be a manageable problem, as long as the client can maintain a small presence in the vendor's network, operated as a data embassy for communicating with other members of the cloud vendor's social ecosystem.
- The data hostage problem is truly the most dangerous source of client vulnerability, and therefore the most likely source of true or absolute lock-in. If necessary a client can rewrite all of its applications over time if it chooses to leave an abusive vendor, but the client cannot regenerate the history of all online interactions. A client cannot leave a vendor without its history as represented by its data, because it cannot operate without its data; loss of data would result in total corporate amnesia, and in many cases in total corporate paralysis. The solution is to ensure all clients access to their data, in a timely fashion, in a standardized PDBF (portable data base format).
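To make the idea of a portable export concrete, here is a minimal sketch of dumping a database into a neutral, self-describing text format. It is not a PDBF implementation, since no such standard is specified here: JSON Lines stands in for the portable format, a local SQLite database stands in for vendor-hosted storage, and the function name `export_database` is hypothetical. It also assumes text and numeric columns only; binary columns would need their own encoding.

```python
import io
import json
import sqlite3


def export_database(conn, out):
    """Write every user table in a SQLite database to `out` as JSON Lines.

    Each output line is a self-describing record: {"table": ..., "row": {...}}.
    Returns the number of rows written.
    """
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    tables = [
        r["name"]
        for r in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    count = 0
    for table in tables:
        # Table names come from sqlite_master, not from user input.
        for row in conn.execute(f'SELECT * FROM "{table}"'):
            out.write(json.dumps({"table": table, "row": dict(row)}) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.execute("INSERT INTO orders VALUES (1, 'Acme', 19.5)")
    buf = io.StringIO()
    export_database(conn, buf)
    print(buf.getvalue(), end="")
```

A client that can run an export of this kind on its own schedule, and can read the result without the vendor's software, keeps its history even if the vendor relationship ends.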
Of these risks, and indeed, of all risks, we believe that the data hostage problem is the most important. The other problems may create economic lock-in, since clients may find that it does not make economic sense to change vendors, but if the vendor raises prices high enough clients can and will flee. In the case of hostage data, fleeing is not an option. We believe that at least at present the data lock-in problem may also be the most overlooked.
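The difference between the two kinds of lock-in can be expressed as a simple decision rule. The sketch below is a deliberately crude model with hypothetical names and numbers: it assumes prices, switching costs, and profits are all expressed per period, and that a client who is able to leave does so as soon as leaving is cheaper.

```python
def max_price_economic_lock_in(alternative_price, switching_cost):
    """Highest price the vendor can charge before leaving becomes cheaper.

    ASSUMPTION: switching_cost is the cost of moving, spread over the
    same period as the prices.
    """
    return alternative_price + switching_cost


def max_price_absolute_lock_in(profit_before_vendor_fees):
    """With hostage data the client cannot leave, so the only ceiling on
    the vendor's price is the client's profit before paying the vendor."""
    return profit_before_vendor_fees


def client_response(price, alternative_price, switching_cost,
                    profit_before_vendor_fees, can_leave):
    """Return 'stay', 'leave', or 'bankrupt' for a given vendor price."""
    if can_leave:
        ceiling = max_price_economic_lock_in(alternative_price, switching_cost)
        return "leave" if price > ceiling else "stay"
    ceiling = max_price_absolute_lock_in(profit_before_vendor_fees)
    return "bankrupt" if price > ceiling else "stay"


if __name__ == "__main__":
    # Hypothetical figures: alternative costs 80, switching costs 30, profit 500.
    print(client_response(120, 80, 30, 500, can_leave=True))
    print(client_response(120, 80, 30, 500, can_leave=False))
    print(client_response(600, 80, 30, 500, can_leave=False))
```

In this toy example a price of 120 drives away a client that can leave, because leaving costs only 110, while a client whose data is held hostage stays at 120 and is pushed out only when the price exceeds its entire profit.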
We offer the following simple guidelines.
- Remember why you are moving to the cloud. This is a risk reward tradeoff.
- Remember what the rewards are, and what they are not. Cloud sourcing is a form of outsourcing intended to deliver economies of scale, in systems administration, in load leveling and the cost of serving excess capacity, and in the development of special platforms to facilitate software development. The cloud is not about achieving ubiquitous access and customer or constituent engagement; that is the role of the web, social networking, Facebook, and Twitter. Social networking is a cloud-based application, but it is not necessary to move core operations to the cloud in order to have a Facebook presence, and moving core operations to the cloud does not ensure effective social networking.
- Don't forget the risks! Shirking, or deliberate underperformance, and poaching, or the deliberate misuse of data resources and intellectual property, are always potential problems in any form of outsourcing. However, the threat of absolute lock-in that comes from the data hostage problem, and the degree of strategic dependence upon the vendor that this creates, may create unprecedented opportunities for vendor holdup. Indeed, the possibility of lock-in and opportunistic renegotiation, and the current lack of client protection, may represent the greatest limitation to the adoption of the cloud.
A subsequent post will address future sources of protection that may be available to clients, including the following:
- Improved standards on transparency, monitoring, and reporting should reduce all forms of shirking.
- The interaction between improved standards on reporting of data access and improved legal codes may provide protection from poaching, but at present there are unresolved legal and technical issues.
- A standard for a portable database format (PDBF), improved contracts, and clarified legal codes will reduce the data hostage problem.
At present, cloud standards and vendors' SLAs and contracts appear to offer very little explicit protection; this is the subject of our ongoing research and will be the subject of a future, more technical posting.
1 - I can own a kitchen and hire a chef, with the associated fixed costs, and then deliver meals to my constituents via a fleet of taxis or via Federal Express. When I choose to use FedEx I have chosen to use the web, rather than private connections, as I might have done decades ago. Likewise, I can choose to outsource meal preparation to ARA or Marriott, pay only for the meals I order, eliminate my fixed costs, and still use FedEx for delivery. Here ARA is roughly equivalent to "cooking in the cloud" and FedEx is roughly equivalent to remote online (Internet) access. By analogy, I can provide a host of eGovernment services to my constituents with or without the cloud.
2 - Poaching -- the deliberate misuse of information, expertise, algorithms, design, or other forms of IP -- seems so egregious that one is tempted to assume that it cannot occur. It has occurred, and will continue to occur, in outsourcing contracts in a range of industries, from manufacturing (kitchen appliances, stereos) to hosted reservations services.
3 - Economic lock-in occurs when it does not make rational economic sense to change vendors, because the combined costs and benefits of staying outweigh the combined costs and benefits of leaving. If the vendor raises prices beyond some level, the client will leave. Absolute lock-in occurs when leaving is virtually impossible. The vendor can raise prices until it captures virtually all of the client's profits. Raising prices beyond that level is pointless, not because the client will leave, but because the client will declare bankruptcy.