In the late 1990s and early 2000s, the idea of grid computing, a type of distributed computing that harnesses the power of many computers to handle large computational tasks, was all the rage, at least among organizations with high-performance computing (HPC) needs. One of the most notable projects to make use of grid computing was SETI@home, which utilized thousands of Internet-connected computers to search for extraterrestrial intelligence (and still does).
Yet despite the promise of grid computing and the efforts of major vendors such as Sun Microsystems (NASDAQ: JAVA), IBM (NYSE: IBM) and HP (NYSE: HPQ), grid computing failed to catch on in mainstream enterprises, remaining mainly the province of governmental and scientific institutions with data-intensive storage and computing needs and few users. Enterprise uses have been more along the lines of R&D and data-intensive financial simulations instead of the mainstream data centers the technology’s proponents had hoped to win over.
Now enterprises are embracing a similar technology: Cloud computing and services such as Amazon’s (NASDAQ: AMZN) Simple Storage Service (S3), which provide companies with scalable, high-speed data storage and services at an attractive price.
Can cloud computing succeed where grid failed and find widespread acceptance in enterprise data centers? And is there still room for grid computing in the brave new world of cloud computing? We asked some grid computing pioneers for their views on the issue.
Differences Between Clouds and Grids
While there are many similarities between grid and cloud computing, it is the differences that matter most. Grid computing is better suited for organizations with large amounts of data being requested by a small number of users (or few but large allocation requests), whereas cloud computing is better suited to environments where there are a large number of users requesting small amounts of data (or many but small allocation requests).
“Grids are well suited for complex scientific work in virtual organizations,” explained Wolfgang Gentzsch, who was behind Sun’s grid efforts and now sits on the board of directors of the Open Grid Forum and is an advisor to the EU DEISA project. Clouds, on the other hand, are well suited for simple work such as many short-running jobs, he said.
Another key difference between the two: Grids require batch job scheduling or sophisticated policies for allocating jobs, while clouds do not. Also, by their nature, clouds do not require as large an upfront investment, as the cloud provider is responsible for running and maintaining servers.
“If you have computations that are large, portable, … have stringent performance requirements, and can be done on a best-effort basis [submitted to a batch queue], then I’d say traditional grid computing is for you,” said Kate Keahey, a scientist in the Mathematics and Computer Science Division at Argonne National Laboratory who frequently writes about grid and cloud computing. Argonne was the birthplace of grid computing and the Globus project, the de facto grid computing standard.
“If, on the other hand, your computational needs are small, or very large but only occasionally, or irregular/bursty in general, or unpredictable, or exhibiting fast/irregular growth, then I would say go for cloud computing because each of these patterns will either keep your data center idle at times, or not give you the economy of scale to amortize the investment in running a data center,” she said.
However, she said, “if you have a steady, predictable stream of large computations that does not vary or fluctuate much … forget either cloud or grid computing and buy yourself a large data center and just keep filling it. You will have economy of scale enough to pay for your data center.”
Yet another option, particularly for enterprises that want to keep sensitive data proprietary but still hope to save some money, is a hybrid model: “Run a data center and use clouds for overflow,” said Keahey, who is seeing more organizations adopt that approach.
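Keahey's rules of thumb can be read as a rough decision procedure. The Python sketch below is only an illustration of that reading, not a tool she or Argonne provides: the `Workload` fields, the `Option` names, and the order in which the rules are checked are all assumptions made for this example, and real decisions would weigh cost, latency, and security in far more detail.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Option(Enum):
    GRID = "traditional grid"
    CLOUD = "cloud"
    OWN_DATA_CENTER = "own data center"
    HYBRID = "own data center with cloud overflow"


@dataclass
class Workload:
    # Hypothetical yes/no labels for the traits named in the quotes above.
    large: bool                    # computations are large
    steady: bool                   # predictable stream that does not fluctuate much
    bursty_or_unpredictable: bool  # occasional, irregular, or fast-growing demand
    batch_tolerant: bool           # can run best-effort from a batch queue
    sensitive_data: bool           # data the enterprise wants to keep proprietary


def recommend(w: Workload) -> Optional[Option]:
    """Map a workload to one of the four options; rule order is an assumption."""
    if w.sensitive_data:
        # Keep the data in house, send overflow to a cloud.
        return Option.HYBRID
    if w.large and w.steady:
        # Enough economy of scale to pay for a data center.
        return Option.OWN_DATA_CENTER
    if not w.large or w.bursty_or_unpredictable:
        # Small or irregular demand would leave a data center idle at times.
        return Option.CLOUD
    if w.batch_tolerant:
        # Large, best-effort jobs submitted to a batch queue.
        return Option.GRID
    return None  # Combination not covered by the rules of thumb above.


if __name__ == "__main__":
    simulation = Workload(large=True, steady=False, bursty_or_unpredictable=False,
                          batch_tolerant=True, sensitive_data=False)
    print(recommend(simulation).value)
```

Checking sensitive data first reflects the hybrid advice for enterprises that want to keep data proprietary; a different ordering would be just as defensible, which is why combinations the quotes do not address return `None` rather than a guess.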
Judith Myerson, a systems engineer and architect who has written extensively about distributed systems, said enterprises must also decide whether they are looking for a solution for large-scale problems (a grid) or to temporarily extend resources (a cloud). Budget, tolerance for latency, and the sensitivity of the data are other considerations.
While cloud computing has many advantages, most notably cost, “it’s not a good idea to put sensitive data on a public cloud,” cautioned Myerson, as it’s more susceptible to being hacked.
Similarly, while cloud computing seems much less expensive on the surface than operating your own data center or running dozens of servers in house, she warned CIOs to be aware of hidden costs, such as higher network charges from service providers for applications containing terabytes or petabytes of data, as well as potential latency issues during peak times.
Interoperability can also be an issue with cloud computing. “For instance, if your company outsources its data to one cloud computing vendor, there may be problems changing over the applications [later on] to a different computing vendor due to proprietary APIs for exporting and importing data to a public cloud,” she noted.
The World Wide Grid?
But while clouds may be all the rage, Myerson still sees plenty of room for grids.
“There will be instances where thousands of computer workstations will [still] be needed for computationally intensive operations,” said Myerson, because doing that work in the cloud (at least right now) is too expensive.
Gentzsch agreed. “Clouds will not replace grids, as grids have not replaced capability HPC, over the last 10 years, as some have predicted,” he stated. All three technologies have their place, he believes. “What we will see over the next couple of years is that these different computing nodes will more and more grow together with the World Wide Web and the Internet, until all these resources become one global infrastructure for information, knowledge, computation and communication, the World Wide Grid.”
But Keahey offered a slightly different prediction. “I think it is more likely that grids will be re-branded or merge into cloud computing,” she said. “Grid computing helped create a certain technology reality which made clouds possible. And when it comes to IaaS [infrastructure as a service], I think in five years something like 80 to 90 percent of the computation we are doing could be cloud-based.”