Geant4 is a toolkit to simulate particle interactions with matter (Agostinelli et al. 2003). The simulated particles are propagated through magnetic and electric fields, and through the materials of the detectors. It is used in high-energy physics experiments (BaBar, HARP, ATLAS, CMS, LHCb, etc.), as well as in other fields such as space science, medical physics and radiation studies (Pia and Knobloch 2002, Allison et al. 2006).
Geant4 consists of roughly half a million lines of C++ code containing components to model in detail the geometry and materials of complex particle detectors, in addition to a comprehensive set of electromagnetic and hadronic physics processes. Given its complexity and its wide range of applications, each of its components must be carefully and thoroughly tested and validated, especially before major releases, which generally take place twice a year.
During the testing phase we perform a set of regression tests, comparing two Geant4 versions to detect any significant difference between them by looking at calorimeter observables. The idea is to check a large number of distributions automatically, using a variety of statistical tests. For those that are statistically different we produce a figure showing the two distributions in the same plot. Finally, a Geant4 expert examines this set of figures visually to understand the cause of each difference: a statistical fluctuation, a bug or a change in the physics models.
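The automatic comparison step can be sketched with one common two-sample test. The text does not say which statistical tests are used, so the Kolmogorov-Smirnov test below, the function names and the 1% significance level are assumptions chosen for illustration only.

```python
from bisect import bisect_right
from math import log, sqrt


def ks_statistic(ref, cand):
    """Largest gap between the empirical CDFs of two samples."""
    ref, cand = sorted(ref), sorted(cand)
    n, m = len(ref), len(cand)
    # The gap can only change at observed values, so it is enough
    # to evaluate both empirical CDFs at every data point.
    return max(
        abs(bisect_right(ref, x) / n - bisect_right(cand, x) / m)
        for x in ref + cand
    )


def distributions_differ(ref, cand, alpha=0.01):
    """Flag a pair of distributions for visual inspection.

    Uses the large-sample critical value of the two-sample
    Kolmogorov-Smirnov test; alpha is an assumed significance level.
    """
    n, m = len(ref), len(cand)
    critical = sqrt(-0.5 * log(alpha / 2.0)) * sqrt((n + m) / (n * m))
    return ks_statistic(ref, cand) > critical
```

Only the pairs for which such a function returns True would need a plot and an expert's attention; a stricter or looser alpha trades missed differences against the number of figures to inspect.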
A test suite consisting of a simple, configurable, cylindrical sampling calorimeter aims to reproduce, in a simplified way (without the geometrical details or the instrumental effects of a real calorimeter), the major calorimeter types used in the LHC experiments: iron–scintillator; copper–scintillator; copper–liquid argon and tungsten–liquid argon (LHC hadronic calorimeters); lead–scintillator; lead–liquid argon; and PbWO4 (LHC electromagnetic calorimeters). As beam particles entering the calorimeter, we consider positive and negative pions; positive, negative and neutral kaons; and protons, neutrons and electrons.
Beam energies range from 1 GeV to 300 GeV – 23 energies in total. The motivation for having so many values is that showers are largely determined by the very first interaction. In addition there are several physics models available in Geant4 (during the regression tests we test five different physics lists, each of them being a consistent set of five physics models) that can be activated only within a certain kinetic energy interval of the incoming particle.
The observables considered are the total visible energy deposited in the active layers, the total energy deposited in the whole calorimeter, the energy deposited in each active layer (the longitudinal shower profile) and the energy deposited in each radial ring (the lateral shower profile).
Each parameter combination (energy–particle–material–physics list) defines a job and each job will produce 5000 events for the reference Geant4 version and 5000 events for the candidate one. Each set of events is stored in a separate n-tuple. Then a stand-alone program reads the two n-tuples and performs the statistical tests between all the pairs of observable distributions.
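The job definition can be sketched as a Cartesian product of the four parameters. The labels below are hypothetical: the 23 beam energies are represented by placeholder indices because their values are not listed, the physics lists are given made-up names, the neutral kaon is counted as a single beam particle, and we assume that every combination is actually run.

```python
from itertools import product

# Placeholder indices for the 23 beam energies between 1 and 300 GeV.
ENERGY_POINTS = range(23)
# Hypothetical labels for the beam particles named in the text.
PARTICLES = ["pi+", "pi-", "K+", "K-", "K0", "proton", "neutron", "e-"]
# Hypothetical labels for the seven calorimeter types.
MATERIALS = ["Fe-Sci", "Cu-Sci", "Cu-LAr", "W-LAr", "Pb-Sci", "Pb-LAr", "PbWO4"]
# Made-up names standing in for the five physics lists.
PHYSICS_LISTS = [f"physics_list_{i}" for i in range(1, 6)]
EVENTS_PER_VERSION = 5000


def enumerate_jobs():
    """Return one job description per parameter combination."""
    return [
        {"energy_index": e, "particle": p, "material": m, "physics_list": pl}
        for e, p, m, pl in product(
            ENERGY_POINTS, PARTICLES, MATERIALS, PHYSICS_LISTS
        )
    ]


jobs = enumerate_jobs()
# Each job simulates the reference and the candidate version.
total_events = len(jobs) * 2 * EVENTS_PER_VERSION
```

Under these assumptions the full grid has 23 × 8 × 7 × 5 = 6440 combinations, which gives a sense of why the production is split into bunches of jobs, one bunch per physics list.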
Geant4 in the Grid
Since December 2004, the regression tests of Geant4 have been executed in the Grid environment. To test the various hadronic models in a large number of configurations of beam particle type, beam energy, materials and physics configurations we need a few years of CPU time. All of this CPU time is needed during a short period, about two weeks before a Geant4 release.
During the first production the Worldwide LHC Computing Grid (WLCG; http://cern.ch/lcg) team created an initial framework that used the Grid to submit, keep track of and retrieve the outputs of large bunches of jobs. The main purpose of this tool was to hide much of the complexity of the Grid environment, so that new users could get started quickly and easily without needing to know how the Grid works.
During the first week of production we distribute the Geant4 software (also containing the reference version) to all participating sites. The software is provided as a tar file. Each site needs 5 Gbyte of mirrored disk space locally in a shared file system that is accessible by all worker nodes. During the software installation phase we distribute this tar file to the shared file system at every site and run a small production to validate the installation. This small production lets us select the sites that will run the full production. In addition, we can help sites that have configuration problems by debugging their systems.
During the second week we perform the whole production, sending bunches of jobs – one for each physics list. In this phase the Geant4 developers continuously check the results and fix any problems that arise in the new candidate.
After the first production with Geant4 on the Grid the production tool was extended and generalized for any community or experiment. Since December 2005 the Geant4 collaboration has been a fully recognized virtual organization in the WLCG/Enabling Grids for E-sciencE environment (EGEE; http://public.eu-egee.org). Several sites have provided the required resources, guaranteeing the success of the production and achieving an efficiency of 100% during the last two productions. Within the Geant4 production we define efficiency as the ratio of the number of successful jobs returning the desired output to the number of submitted jobs.
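Reading the efficiency as the fraction of submitted jobs that return the desired output, the definition can be written as a small helper. The function name and the input checks are illustrative assumptions, not part of the production tool.

```python
def production_efficiency(successful_jobs, submitted_jobs):
    """Fraction of submitted jobs that returned the desired output."""
    if submitted_jobs <= 0:
        raise ValueError("at least one job must have been submitted")
    if not 0 <= successful_jobs <= submitted_jobs:
        raise ValueError("successful jobs must lie between 0 and submitted jobs")
    return successful_jobs / submitted_jobs
```

An efficiency of 100% then simply means that the two counts are equal, for example `production_efficiency(500, 500)`.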
In terms of resources and services, Geant4 asks for a minimum of 120 CPUs for the whole production. Normally CERN provides 50% of the required CPUs. The rest is spread over 30 sites currently supporting the Geant4 collaboration.
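A rough consistency check relates the CPU count to the total CPU time. The sketch below assumes that the full production occupies a window of about seven days and that the CPUs are fully used throughout; both are simplifying assumptions, and the function names are illustrative.

```python
from math import ceil

DAYS_PER_YEAR = 365.25


def cpu_years_delivered(n_cpus, wall_days, utilization=1.0):
    """CPU time, in years, delivered by n_cpus over wall_days."""
    return n_cpus * wall_days * utilization / DAYS_PER_YEAR


def cpus_required(cpu_years, wall_days, utilization=1.0):
    """Smallest number of CPUs that delivers cpu_years within wall_days."""
    return ceil(cpu_years * DAYS_PER_YEAR / (wall_days * utilization))
```

With these assumptions, `cpu_years_delivered(120, 7)` is about 2.3 CPU-years, which is of the same order as the few years of CPU time that the regression tests need.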
Most of the Grid services are also provided by and centralized at CERN. The rest of the sites ensure access to their batch system with dedicated queues for Geant4 and access to the shared file system for software installation purposes.
The total output of the production, about 20 Gbyte, is fully retrieved in an AFS area of Geant4 created for this purpose.
Optimization of the Geant4 production
In June 2006 we included the Ganga/DIANE framework developed by the ARDA (A Realisation of Distributed Analysis) team at CERN, to increase the performance of the Geant4 production and to optimize the use of the Grid resources (Ganga: A Grid User Interface, http://cern.ch/ganga; DIANE: a Distributed Analysis Environment, http://cern.ch/DIANE; ARDA: http://cern.ch/arda).
Ganga is a tool jointly developed for the ATLAS and LHCb experiments at CERN. Ganga simplifies access to the Grid for the end users and is a convenient job management tool. The history of the user jobs is stored in a personal job repository; the status of submitted and running jobs is monitored and the output automatically retrieved. Ganga provides an extensible framework that supports various applications and helps to configure them, effectively hiding any differences between running the application locally or on the Grid. A user may interact with Ganga via a command line or a graphical user interface. Python is used as the interface language. Ganga is a tool suitable for running analysis jobs in high-energy physics but may also be used for any user Grid activities.
DIANE is a Grid optimization layer. When the user jobs are submitted to the Grid, DIANE creates an overlay "virtual cluster" using available worker nodes. The cluster consists of a "master" agent and a number of "worker" agents. User input is split into a large number of fine-grained tasks. The execution of the tasks is coordinated by the master agent providing automatic load balancing and instant error recovery. The task execution flow and error recovery policies may be easily customized according to the specific application needs.
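The master-worker pattern described here can be illustrated with a small, self-contained sketch. This is not DIANE's API: local threads stand in for worker agents on Grid nodes, a shared queue plays the role of the master's task list, and the retry limit is an assumed error recovery policy.

```python
import queue
import threading


def run_master_worker(tasks, execute, n_workers=4, max_attempts=3):
    """Run execute(task) for every task and return (results, failed)."""
    pending = queue.Queue()
    for task in tasks:
        pending.put((task, 1))  # (task, attempt number)
    results, failed = {}, {}
    lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:  # shutdown signal from the master
                pending.task_done()
                return
            task, attempt = item
            try:
                value = execute(task)
            except Exception as exc:
                if attempt < max_attempts:
                    # Error recovery: hand the task back for another try.
                    pending.put((task, attempt + 1))
                else:
                    with lock:
                        failed[task] = exc
            else:
                with lock:
                    results[task] = value
            finally:
                pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    pending.join()  # wait for every task, including retries
    for _ in threads:
        pending.put(None)
    for thread in threads:
        thread.join()
    return results, failed
```

Because each worker pulls a new task as soon as it finishes the previous one, faster workers naturally process more tasks, which is the load balancing effect; a task that fails on one worker is picked up again by whichever worker becomes free next.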
DIANE improves the job turnaround time and reliability of the application execution. It also enables estimates of the job completion time based on the delivered partial output.
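One simple way to estimate the completion time from partial output is to extrapolate the observed task rate. The sketch below is an assumption about how such an estimate could be made, not a description of DIANE's internal method, and it presumes tasks of roughly equal size processed at a steady rate.

```python
def estimate_remaining_seconds(tasks_done, tasks_total, elapsed_seconds):
    """Extrapolate the time still needed from the tasks completed so far.

    Returns None while no task has completed, because no rate is known yet.
    """
    if tasks_done <= 0:
        return None
    remaining = tasks_total - tasks_done
    return elapsed_seconds * remaining / tasks_done
```

Fine-grained tasks help here: the more tasks have already been delivered, the sooner the extrapolation becomes meaningful.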
The Ganga/DIANE framework greatly improved the performance of the Geant4 production. In addition, the run completion time is more predictable, which allows better planning of the production and requires less intervention from the production operator.
Further reading
S Agostinelli et al. 2003 Geant4: a simulation toolkit, NIMA 506 250–303
J Allison et al. 2006 Geant4 developments and applications, IEEE Transactions on Nuclear Science 53 270–278
M G Pia and J Knobloch 2002 Particle physics software aids space and medicine, CERN Courier 42 (5) 33; www.cerncourier.com/main/article/42/5/17/1