The recent luminosity upgrade of the Hadron Electron Ring Accelerator (HERA) at DESY has led to a sizable increase in electron-proton interaction rates (CERN Courier March 2005 p17). This means that much larger samples of simulated events are now required to exploit the new physics reach. Simulating the upgraded detectors is also much more demanding in terms of computing power, and insufficient simulation capacity could lead to significant uncertainties in the analysis of the data. It has therefore been essential to more than double the simulation resources for Monte Carlo production. Fortunately, Grid technology offers an attractive way to meet this challenge.

The ZEUS experiment at HERA is one of the first projects unrelated to the Large Hadron Collider (LHC) to employ Grid-based simulation successfully for day-to-day data analysis on a mass-production scale. After one year of study and development, the ZEUS group commissioned its automated production environment, which is built on top of the LHC Computing Grid 2 (LCG-2) middleware suite, and started production in November 2004. The core site at DESY acts as the hub, while most of the production jobs are sent to Grid clusters outside DESY.

The number of connected sites running ZEUS production jobs has grown steadily and now stands at 25, including sites in Canada, Germany, Italy, Spain and the UK. These countries also have strong groups participating in the ZEUS experiment. Production has recently reached a total of about 70 million events (figure 1), with weekly rates of up to 12 million events. Figure 2 shows the relative contributions of the individual sites.

The ZEUS collaboration had in fact been pioneering distributed Monte Carlo production since 1996 with a script-based system named Funnel, which served all of the collaboration's production needs until 2003. The ZEUS Grid scheme is now seamlessly integrated with Funnel, and Grid-based jobs have dramatically enhanced the production capacity: they contribute more than 60% of the total production, and the figure is still rising. Most importantly, the collisions the ZEUS team simulates on the Grid are not for testing purposes; they are used immediately by the physics groups in their analysis of the HERA data.

The routine operation achieved shows that Grid technology is ready for today's particle-physics projects and that production has been connected successfully to the end-users who request the events. The H1 collaboration at HERA is now also preparing to launch Grid-based Monte Carlo production.

Making it work

The DESY Grid infrastructure exploits the LCG-2 Grid middleware (a software suite developed for the global LHC data analysis), giving DESY a spot on the worldwide map of active LCG-2 sites. The DESY Production Grid provides core Grid services, such as a resource broker, an information index, and replica and data management, for the virtual organizations (VOs) maintained at DESY, along with a basic level of computing resources; the bulk of the computing power is brought in by the experiments. The system enables opportunistic use of the resources, exploiting idle CPU cycles and levelling out peaks in demand. Computing resources at the University of Hamburg contribute to the worldwide computing effort for the Compact Muon Solenoid experiment at the LHC.
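To give a flavour of how a production job reaches such an LCG-2 site, the sketch below writes a job description in the Job Description Language (JDL) and hands it to the resource broker with the LCG-2-era edg-job-submit command. This is only an illustration: the script name, sandbox files, event count and wall-clock requirement are hypothetical placeholders, and the actual ZEUS production environment automates these steps.

    # Minimal sketch of submitting a Monte Carlo job to an LCG-2 resource
    # broker. The JDL attributes are standard EDG/LCG-2 ones; the script,
    # steering card and event count are hypothetical placeholders.
    import subprocess
    import textwrap

    jdl = textwrap.dedent("""\
        Executable    = "run_mc.sh";
        Arguments     = "--events 50000";
        StdOutput     = "mc.out";
        StdError      = "mc.err";
        InputSandbox  = {"run_mc.sh", "steering.card"};
        OutputSandbox = {"mc.out", "mc.err"};
        VirtualOrganisation = "zeus";
        Requirements  = other.GlueCEPolicyMaxWallClockTime > 720;
    """)

    with open("mc_job.jdl", "w") as f:
        f.write(jdl)

    # edg-job-submit returns a job identifier that edg-job-status and
    # edg-job-get-output accept for monitoring and output retrieval.
    subprocess.run(["edg-job-submit", "--vo", "zeus", "mc_job.jdl"], check=True)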

In addition to hosting and supporting dedicated VOs for the HERA experiments H1 and ZEUS, the infrastructure at DESY also fosters the Grid activities of the lattice quantum chromodynamics community, in the framework of the International Lattice Data Grid, and of the International Linear Collider groups.

Storage is an important element of the DESY infrastructure. While the Grid infrastructures deployed today generally enable distributed scientific communities to collaborate and share resources, extra capabilities are needed to cope with the challenges that arise when scientists access and manipulate very large, distributed collections of data. In co-operation with Fermilab, DESY has developed a Grid Storage Element (SE) that consists of dCache as the core storage system and an implementation of the Storage Resource Manager (SRM), which was developed by a collaboration of European and US Grid groups. The SE allows both local (POSIX-like) access and Grid (GridFTP) access to mass-storage facilities based on hierarchies of tape and disk technology as well as on disk-only configurations. The SRM protocol supports secure data transfers with protocol negotiation and reliable replication mechanisms over wide-area networks. It has become a standard for Grid interfaces to managed storage, with implementations deployed for production use at major high-energy physics laboratories including CERN, DESY and Fermilab. Many LCG sites are expected to follow soon.
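As a rough illustration of these two access routes, the sketch below copies a file out of the SE, first through the SRM interface with dCache's srmcp client and then directly over GridFTP with globus-url-copy. The door host name, ports and /pnfs path are hypothetical examples, and a valid Grid proxy is assumed to exist.

    # Sketch of fetching a file from a dCache-based Storage Element.
    # Host, ports and /pnfs path are hypothetical; a Grid proxy
    # (e.g. from grid-proxy-init) must already be in place.
    import subprocess

    # srmcp, the SRM client shipped with dCache, negotiates the transfer
    # protocol with the SE and uses GridFTP as the transport underneath.
    subprocess.run([
        "srmcp",
        "srm://se.example.desy.de:8443/pnfs/desy.de/zeus/mc/run001.evt",
        "file:////tmp/run001.evt",
    ], check=True)

    # Equivalent plain GridFTP transfer, bypassing the SRM negotiation:
    subprocess.run([
        "globus-url-copy",
        "gsiftp://se.example.desy.de:2811/pnfs/desy.de/zeus/mc/run001.evt",
        "file:///tmp/run001.evt",
    ], check=True)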

Access to the entire DESY data space of 0.5 petabytes is provided by the SE based on dCache, a software-only Grid storage appliance that can manage the storage and exchange of hundreds of terabytes of data, transparently distributed among dozens of disk storage nodes (see figure 3). Although the location and multiplicity of the data are determined autonomously by the system, the name space is represented uniquely in a single file-system tree.
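For local, POSIX-like access a file is therefore addressed simply by its place in that single name-space tree, regardless of which pool node actually holds it. A minimal sketch using dCache's dccp copy client, again with a hypothetical door host and path:

    # Sketch of local access to dCache through a dcap door. The door
    # host, port and /pnfs path are hypothetical examples.
    import subprocess

    # dccp names the file by its path in the single pnfs name-space
    # tree; dCache resolves which disk pool (or tape back-end) serves it.
    subprocess.run([
        "dccp",
        "dcap://door.example.desy.de:22125/pnfs/desy.de/zeus/mc/run001.evt",
        "/tmp/run001.evt",
    ], check=True)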

Efficiency improvements

Designed as a caching system, dCache has been found to significantly improve the efficiency of back-end storage systems such as serial media (magnetic tape). When it detects hot spots, the system dynamically replicates files over multiple storage nodes, increasing the aggregate throughput to client applications. The system also tolerates failures in its data servers, allowing administrators to use commodity computing and disk-storage components. The increasing number of sites using dCache technology demonstrates that it is well suited to solving many storage and data-distribution issues in large and small institutions.

The HERA experiments have shown that Grid technology is ready to be employed in ongoing research programmes, and it is being explored for use in applications beyond its present scope. Grid computing and the development of novel storage technologies are strategic components in DESY's research programme. They are viewed as essential ingredients for conducting high-energy physics experiments and for participating in future global projects, such as developing and operating an international linear collider within a worldwide collaboration.

Further reading

For details about dCache see www.dcache.org.
For more information about the Storage Resource Manager see http://sdm.lbl.gov/srm-wg.