The grid can be not just a tool for a homogeneous set of calculations, as in SETI, but also a means of co-operative use of a number of different applications and resources, distributed geographically: for design, stress testing, materials requirements planning and budgeting in a construction project, for example.
Different modes of interaction will be required, for example, peer-to-peer and client-server.
Even complex but routine commercial tasks can be fragmented and put out on the grid as a secure model of distributed computing, say the proponents of the Open Grid Services Architecture (OGSA).
These different modes of relationship are discussed in a seminal paper, “The Anatomy of the Grid”.
Besides applications, there will be specific resources such as computer cycles, storage and special peripheral equipment that need to be brought into play “on the fly”, according to the demands of the problem at the time.
A companion paper, “The Physiology of the Grid”, details the practical ways in which these capabilities are invoked and terminated. This highlights the complexity of dealing with the interworking of computers, many of which belong to organisations beyond the bounds of the initiating one.

The grid's promoters have devised, as part of the Open Grid Services Architecture, procedures for authenticating the user of a process to the provider and vice versa. A process cannot simply be set to run until the calculation is done; it may jam or fail, or communication from the initiating computer may be disrupted.
This would mean the invoked process continues to use resources to no purpose. So the process is set up with an initial lifetime, and repeated “keep-alive” messages are sent from the initiating process for as long as it is needed. If something fails, the invoked process will stop receiving messages, terminate and free up resources.
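This lease-and-renewal scheme can be sketched in a few lines of Python. The class and its timings are illustrative assumptions, not part of any OGSA specification:

```python
import time

class GridProcess:
    """Sketch of soft-state lifetime management: the invoked process
    holds a lease that must be renewed by keep-alive messages."""

    def __init__(self, initial_lifetime: float):
        self.expires_at = time.monotonic() + initial_lifetime
        self.terminated = False

    def keep_alive(self, extension: float) -> None:
        # Each keep-alive from the initiating process extends the lease.
        self.expires_at = time.monotonic() + extension

    def check(self) -> None:
        # Called periodically by the hosting resource; if no keep-alive
        # arrived before the lease expired, terminate and free resources.
        if not self.terminated and time.monotonic() > self.expires_at:
            self.terminated = True

# If the initiating computer is cut off, keep_alive() is never called
# again, the lease lapses, and check() terminates the process.
proc = GridProcess(initial_lifetime=0.05)
proc.keep_alive(extension=0.05)
time.sleep(0.1)   # simulate loss of contact
proc.check()
print(proc.terminated)  # True: lease expired, resources freed
```

The attraction of the design is that no explicit clean-up message is needed; silence alone is enough to reclaim the resources.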
The grid maps well on to the notion of a “virtual organisation”, a co-operative effort by various organisations or parts of organisations, linked specifically for the duration of that project.
As Michael Nelson of IBM says, the arrival of web services mates nicely with the concept of the grid. OGSA uses pre-established web services standards such as the simple object access protocol (Soap), which defines how to pass both invocations of remote procedures and messages, including input data for the remote application to chew on.
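A Soap request is simply an XML envelope wrapping the procedure call and its input data. The sketch below builds a minimal one; the `runStressTest` operation and its namespace are hypothetical, invented for illustration:

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# The application namespace and operation name are made up for this sketch.
APP_NS = "http://example.org/grid/stress"

envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
call = ET.SubElement(body, f"{{{APP_NS}}}runStressTest")
# Input data for the remote application travels inside the same envelope.
ET.SubElement(call, f"{{{APP_NS}}}loadCase").text = "wind-load-case-1"

xml_text = ET.tostring(envelope, encoding="unicode")
print(xml_text)
```

Because the envelope is plain XML, any platform on the grid can parse it, regardless of the technology servicing the call.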
The contributing real organisations will have to set up their own constraints as to what can be done on their systems at what time, so as not to steal capacity or resources from their own tasks.
Parts of the grid may operate in a hierarchical way, with one resource whose services have been requested, in turn delegating parts of the task to another.
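Such delegation is naturally recursive: each resource does what it can and passes the remainder down. A minimal sketch, with invented resource names and a fixed per-node capacity:

```python
def run_task(resource: str, size: int, capacity: int = 4) -> list[str]:
    """Sketch of hierarchical delegation: a resource handles what it
    can and delegates the rest of the task to a subordinate resource."""
    if size <= capacity:
        return [f"{resource} computed {size} units"]
    # Handle a share locally, delegate the remainder down the hierarchy.
    log = [f"{resource} computed {capacity} units"]
    log += run_task(resource + "/sub", size - capacity, capacity)
    return log

for line in run_task("siteA", 10):
    print(line)
```

A ten-unit task submitted to "siteA" is split across three levels of the hierarchy, each level unaware of anything below its immediate delegate.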
The fundamental property of a grid is interoperability, says the “anatomy” paper, and this requires a set of protocols for consistent communication, as well as APIs defining standard interfaces for invoking various resources – both applications and generic resources like security services. One protocol may carry messages and information into several APIs, and one API may respond to a number of protocols.
An important principle is to provide a simple restricted set of abstractions for data and processes, which can be invoked by many different kinds of requests, serviced by many different technologies. The power and flexibility of the components of the grid is immense, but the protocols and APIs provide a narrow “neck” through which requests are passed. This the authors call the “hourglass model”.
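The hourglass idea can be illustrated with a single narrow abstraction serviced by different technologies. The interface and backends below are invented for the sketch, both reduced to in-memory dictionaries:

```python
from abc import ABC, abstractmethod

class StorageResource(ABC):
    """The narrow 'neck' of the hourglass: one small abstraction that
    many kinds of request pass through, serviced by many technologies."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

# Two very different technologies behind the same narrow interface
# (both simplified to dictionaries for this sketch).
class TapeArchive(StorageResource):
    def __init__(self): self._store = {}
    def put(self, key, data): self._store[key] = data
    def get(self, key): return self._store[key]

class ObjectStore(StorageResource):
    def __init__(self): self._store = {}
    def put(self, key, data): self._store[key] = data
    def get(self, key): return self._store[key]

def submit(resource: StorageResource) -> bytes:
    # The requester sees only the abstraction, never the technology.
    resource.put("result", b"42")
    return resource.get("result")

print(submit(TapeArchive()), submit(ObjectStore()))
```

The point of the narrow neck is that new request types above it and new technologies below it can both be added without touching each other.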
Routines for the effective use of resources, for example discovery of their current state and advance reservation for a particular timeslot, can be provided either by the supplier of the resource or by a separate grid toolkit.
Workload management is naturally an ingredient of grid operation, providing for co-ordination of tasks and authorisation to perform tasks, much as it does within a conventional organisation.
The machines and processes must be interoperable, but at the same time each process must be able to use particular capabilities of the hardware or software it is on, such as a particular application or database, an ability to perform rapid calculations of a certain type, or a connection to specialist peripherals.
This demands an elaborate combination of a standard virtual machine, summarising the capability that all platforms should have, and a series of special-purpose platform definitions with some kind of standard interface to the rest of the grid. The latter is necessary to avoid falling into the “lowest common denominator” trap often associated with virtual machines: to represent only the common characteristics of all actual machines is to cripple most of them.
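One way to picture this is a baseline node type that every platform provides, with special capabilities advertised through the same standard interface. The classes and capability names here are illustrative assumptions:

```python
class GridNode:
    """Baseline behaviour every platform is assumed to provide
    (a hypothetical sketch of the 'standard virtual machine')."""

    def run(self, task: str) -> str:
        return f"ran {task} on generic node"

    def capabilities(self) -> set:
        return set()

class VectorNode(GridNode):
    """A platform with special hardware, exposed through the same
    standard interface so the rest of the grid can still reach it."""

    def capabilities(self) -> set:
        return {"vector-maths"}

    def run(self, task: str) -> str:
        if task == "vector-maths":
            return "ran vector-maths on vector unit"
        return super().run(task)

def dispatch(nodes: list, task: str) -> str:
    # Prefer a node advertising the special capability; fall back to any.
    for node in nodes:
        if task in node.capabilities():
            return node.run(task)
    return nodes[0].run(task)

print(dispatch([GridNode(), VectorNode()], "vector-maths"))
```

Special hardware is not hidden behind the common baseline; it is advertised through the standard interface, so generic code still runs everywhere while capable platforms are used to the full.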
Previous attempts at implementing the principle that “the network is the computer” were based on local computing facilities sitting self-sufficiently on the edge of a huge network. This overlooked the fact that the “network” linking machines and processes at the local end needs managing in a similar way, say the OGSA creators. It is effectively part of the grid.
One of the major challenges, they say, is the melding of grid applications with legacy applications on the local machine.
What commercial IT really wants is the return of the mainframe’s tight centralised control and reliability, the OGSA champions say.
“New applications are being developed to programming models, such as the Enterprise Java Beans model, that insulate the application from the underlying computer platform and support portable deployment across multiple platforms.”
The fast services required for e-business have led to further fragmentation, and a need for “reintegration”. Many of these problems, they say, can be traced to an inadequate infrastructure. OGSA is a way of providing a more stable, reliable, secure and scalable framework to underlie distribution.