In recent years a number of Grid projects have emerged to help coordinate institutions and enable Grids. Today we face a situation where most of these projects, many of which have a strong regional presence, have slightly different middleware.

Establishing interoperation between Grids is vital to bridge these differences and enable virtual organizations to access resources at the institutions, regardless of which Grid project an institution is affiliated with. Without Grid interoperation, a collaboration would be artificially limited to one Grid, or it would have to create multiple virtual organizations and manage the diversity itself.

Anatomy of the Grid

To understand the nature of this problem it is worth stepping back and comparing the situation today with the original concept of a Grid. In the influential paper The Anatomy of the Grid [1], a Grid is defined as being "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations".

In this statement there are three fundamental entities: resources, institutions and virtual organizations. A virtual organization is a group of users from multiple institutions who collaborate to achieve a specific goal. An institution is an administrative domain and has complete control over the resources within its domain. Institutions support a virtual organization and hence allow its users, who may belong to different institutions, access to the resources.

As each institution has control over its domain, each institution may have different systems and policies. To overcome this heterogeneity, Grid middleware is used to provide an interface at the boundary of this administrative domain.

Grid middleware follows the "hourglass" model. At one end there is a diverse set of resources and at the other end there are many virtual organizations that have their own applications. The applications can gain access to the heterogeneous resources through a small set of well defined interfaces. As different Grids have their own middleware and policies, they can also be seen as different administrative domains. In a sense, the challenge of Grid interoperation can be viewed as a problem analogous to that of users accessing resources at different institutions, but now with virtual organizations accessing resources on different Grid infrastructures.

Understanding interfaces

How can interoperation between Grids be achieved? The first thing to do is compare the interface at the boundary of the administrative domain for both Grids. Once this has been done, it should be possible to create an interoperability matrix between two infrastructures. This matrix will reflect the similarities and differences of the interface. Not only is it necessary to compare the interfaces, it is also important to understand how the interfaces are used. Once the differences have been understood, steps can be taken to overcome these differences.
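As a rough illustration of such a matrix, the sketch below compares two infrastructures function by function. The interface categories and the values "interface-a", "schema-1" and so on are hypothetical placeholders, not the actual interfaces of any Grid; a real comparison would also need to record how each interface is used.

```python
# Hypothetical description of the interface each Grid exposes for each
# function at the boundary of its administrative domain.
GRID_A = {
    "job_submission": "interface-a",
    "information_schema": "schema-1",
    "security": "auth-1",
    "accounting": "records-a",
}
GRID_B = {
    "job_submission": "interface-b",
    "information_schema": "schema-1",
    "security": "auth-1",
}


def interoperability_matrix(a, b):
    """Return {function: (interface in a, interface in b, status)}."""
    matrix = {}
    for function in sorted(set(a) | set(b)):
        ia, ib = a.get(function), b.get(function)
        if ia is None or ib is None:
            status = "missing"   # one side offers no such interface
        elif ia == ib:
            status = "common"    # already interoperable
        else:
            status = "differs"   # candidate for an adapter or translator
        matrix[function] = (ia, ib, status)
    return matrix


matrix = interoperability_matrix(GRID_A, GRID_B)
for function, (ia, ib, status) in matrix.items():
    print(f"{function:20} {str(ia):14} {str(ib):14} {status}")
```

The rows marked "differs" or "missing" are the ones that need further work.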

Common interfaces seem to be the most straightforward approach. However, with the absence of standards, which interface should one choose?

As a Grid infrastructure has invested heavily in one interface, it may be difficult to move to another interface. So although a common interface is the ideal solution, reaching agreement on which interface to use and the deployment of a production quality implementation across all infrastructures will take time.

Adapters and gateways

In the short term, adapters and translators can be used in the higher level services so that the software can work with both interfaces. Adapters bridge incompatible interfaces, and translators convert information to a format that another system can understand. This approach requires some parts of the Grid middleware to be modified, but the existing interfaces remain unchanged, which means the institutions will not be affected. Where and how the adapters and translators are used will highlight areas that need standardization.
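The division of labour between the two can be sketched as follows. All class names, method names and the job description format here are invented for illustration and do not correspond to any real middleware: one Grid is assumed to accept a structured job, the other a flat text description.

```python
class GridAClient:
    """Hypothetical native interface: accepts a structured job."""

    def submit(self, job: dict) -> str:
        return f"A:{job['executable']}"


class GridBClient:
    """Hypothetical foreign interface: accepts a text description."""

    def enqueue(self, description: str) -> str:
        return f"B:{description}"


def translate_job(job: dict) -> str:
    """Translator: convert the structured job into the text format
    that the other Grid is assumed to understand."""
    return ";".join(f"{key}={job[key]}" for key in sorted(job))


class GridBAdapter:
    """Adapter: exposes submit() on top of GridBClient.enqueue(), so the
    higher level service can treat both Grids alike."""

    def __init__(self, client: GridBClient) -> None:
        self._client = client

    def submit(self, job: dict) -> str:
        return self._client.enqueue(translate_job(job))


def submit_everywhere(job: dict, backends) -> list:
    """Higher level service: only ever calls submit()."""
    return [backend.submit(job) for backend in backends]


results = submit_everywhere(
    {"executable": "sim", "cpus": 2},
    [GridAClient(), GridBAdapter(GridBClient())],
)
print(results)
```

Neither client class is changed; only the higher level service gains the adapter, which mirrors the point that the existing interfaces stay as they are.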

If it is not possible to modify the higher level services, then gateways can be used. A gateway is a bridge between Grid infrastructures. It uses the same technique as adapters and translators, but the gateway is a separate service rather than something built into the middleware. The problem with Grid gateways is that they can be a single point of failure and also a scalability bottleneck. As such they are only useful as a temporary solution.
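A minimal sketch of the gateway idea, again with invented names and an assumed text format for job descriptions, shows why it is fragile: every cross-Grid request passes through the one service, so when it is unavailable nothing gets through.

```python
class RemoteGrid:
    """Hypothetical interface of the other Grid infrastructure."""

    def enqueue(self, description: str) -> str:
        return f"accepted:{description}"


class Gateway:
    """Stand-alone bridging service. The middleware on either side is
    untouched; all translation happens here."""

    def __init__(self, remote: RemoteGrid) -> None:
        self._remote = remote
        self.available = True   # simulated health of the service
        self.forwarded = 0      # every request is funnelled through here

    def submit(self, job: dict) -> str:
        if not self.available:
            # Single point of failure: no gateway, no interoperation.
            raise ConnectionError("gateway unavailable")
        description = ";".join(f"{k}={job[k]}" for k in sorted(job))
        self.forwarded += 1
        return self._remote.enqueue(description)


gateway = Gateway(RemoteGrid())
reply = gateway.submit({"executable": "sim"})

gateway.available = False
try:
    gateway.submit({"executable": "sim"})
    outage_blocks_requests = False
except ConnectionError:
    outage_blocks_requests = True
```

The `forwarded` counter stands in for the load that accumulates on the single service, which is the scalability concern mentioned above.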

Grid operations

Once technical interoperability has been achieved, it is important to start looking at Grid operations, which cover everything that is needed to operate a Grid infrastructure. This includes service monitoring, user support, resource accounting, problem resolution and so on – issues that are all procedure oriented.

The support teams within the different Grids may rely on different software tools but it is not necessary to harmonize these tools. However, it must be ensured that the tools from one Grid infrastructure will work on the others. The procedures used on each Grid infrastructure need to be analysed to ensure that the necessary operations can still be carried out with the additional institutions and virtual organizations. For example, ways to route trouble tickets between Grid operations centres need to be investigated.
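Ticket routing could, for instance, start from a simple table that maps each site to the operations centre responsible for it. The site names, centre names and ticket fields below are hypothetical, and a production system would need far more than a lookup.

```python
# Hypothetical mapping from site to the responsible operations centre.
ROUTES = {
    "site-a.example.org": "grid-a-operations",
    "site-b.example.org": "grid-b-operations",
}


def route_ticket(ticket: dict, routes: dict, default: str = "unassigned") -> str:
    """Return the operations centre that should handle the ticket,
    based on the site where the problem was observed."""
    return routes.get(ticket["site"], default)


ticket = {"id": 1, "site": "site-b.example.org", "summary": "job submission fails"}
destination = route_ticket(ticket, ROUTES)
print(destination)
```

A ticket for a site that is not in the table falls back to "unassigned", which flags it for manual handling.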

EGEE activities

Grid interoperation is usually a bilateral activity between two Grid infrastructures. One of the first interoperation activities was between the infrastructure used by the Enabling Grids for E-science (EGEE) project and the Open Science Grid.

The initial analysis showed that the middleware used by both infrastructures was similar, and for this reason both infrastructures decided to use common interfaces. After an initial proof of concept was carried out in January 2005, the changes needed were integrated in the middleware stacks for the respective infrastructures. It took about six months for these changes to be included in the official software releases and rolled out across the infrastructures in August 2005.

After this work was done, discussions moved towards Grid operations. Policies had to be aligned and modifications were needed in the operational tools. After about another six months, by January 2006, the Grid infrastructures were seamlessly interoperating and virtual organizations were successfully using both infrastructures.

There are now several ongoing bilateral activities. The activity between EGEE and the Nordic DataGrid Facility is well under way and is trying to use a combination of gateways alongside adapters and translators to achieve interoperation. The results of initial testing look promising and discussions on Grid operations have begun. Other infrastructures where EGEE is involved in bilateral activities include NAREGI, Unicore and ChinaGrid.

Moving forward with GIN

Within the Open Grid Forum, the Grid Interoperability Now (GIN) community group has been trying to build upon these bilateral activities. The GIN group is a focal point where all the infrastructures can come together to share ideas and experiences on Grid interoperation. It is hoped that each bilateral activity will bring us a step closer to the overall goal of a uniform Grid landscape. The recent achievements of the GIN group will be demonstrated at Supercomputing 2006 in Tampa in November.

Grid evolution

To achieve interoperation, interoperability between middleware is not the only issue. As Grid infrastructures evolve, different middleware will be used. Of primary importance, then, is to ensure that even if the middleware evolves, interoperation is maintained. Using different Grid middleware stacks is a major obstacle but the problem can be overcome.

Production implementations of real Grid standards would go a long way to help harmonize the Grid middleware stacks and help interoperability. However, the current Grid paradigm is "a Grid of Grids" – different Grid federations working together to provide a seamless Grid infrastructure. As a result, even with technical interoperability assured, a truly federated Grid will bring a whole new set of operational challenges.

References

[1] I Foster et al. 2001 Intl J. of High Performance Computing Applications 15 3 200–222
[2] Maps generated by Google Earth (http://earth.google.com) using configuration file at http://lfield.web.cern.ch/lfield/gin.kml.