Computing in High Energy and Nuclear Physics (CHEP) is a major series of conferences that has been held at roughly 18 month intervals since the 1980s, alternating between Europe, North America and other parts of the world. The latest meeting - CHEP '04 - was organized by CERN and took place in Interlaken from 26 September to 1 October 2004. As the conference chairman, Wolfgang von Rüden from CERN, pointed out in his welcome address, this was the last opportunity for CERN to organize the conference prior to the start-up of the Large Hadron Collider (LHC). An important theme therefore was to review progress in making the Grid a powerful and reliable computing resource in time for processing LHC data. The conference also aimed to learn from the experience of experiments that are currently running, to stay in touch with other sciences and to have a look into the future.
The conference began with several interesting reviews of computing at existing experiments. Amber Boehnlein from Fermilab described the software and computing facilities that have been developed to support the CDF and D0 experiments in Run II of the Tevatron. These experiments are already starting to experience data management and data-access rates on the scale expected in the first years of the LHC. Good progress has been made in handling event data and structured metadata, with the CDF experiment reporting a disk-cache throughput of 60 terabytes (Tb) a day.
Nobu Katayama of KEK reported on the situation at the Belle experiment at the KEKB accelerator, which is currently operating at the world's highest luminosity of 10³⁴ cm⁻² s⁻¹. Belle accumulates more than 1.2 Tb of data each day and the complete dataset now exceeds 1.4 petabytes (Pb). Belle adopted a traditional computing model, which has been successfully implemented with very few people, and is planning to upgrade its system to prepare for the expected luminosity upgrade of the machine.
Peter Elmer, from Princeton, described the status of computing in the BaBar experiment at the PEP II collider at SLAC. BaBar has developed a highly distributed and automatic production facility for managing the generation of simulated data samples at more than 25 sites, as well as the full reconstruction of real data within 24 hours at a site (Padova) remote from the collaboration centre. A sophisticated analysis model has also been introduced in the past year that makes more than 100 analysis-specific skims of events that can be accessed via the same distributed-computing system. It is clear that much of the knowledge and experience being gained from providing computing at running experiments can be abstracted and applied to the design and planning for future experiments with large-scale computing.
Unprecedented challenges
High-energy physics software and computing infrastructures will be confronted with unprecedented challenges by data from the LHC, and much attention was focused on reporting progress in preparations for the start-up in 2007. It is the job of the LHC Computing Grid Project (LCG) to prepare, deploy and operate the computing environment to provide a general service to all four LHC experiments. The enabling technology is Grid computing, which is a paradigm for creating a worldwide network of computers interconnected so that they perform as a coherent system. The LCG project aims to deploy and operate a Grid comprising as many as 100,000 of today's fastest PCs for storage and analysis of the ~15 Pb of data that will be produced each year at the LHC.
The first production Grid (LCG-1) started to provide a bare-bones service to high-energy physicists a year ago, with at first only 12 centres involved. Currently version 2 of this Grid (LCG-2) has been deployed on some 80 computing centres in 25 countries contributing more than 9800 computers. Significant improvements in reliability, performance and scalability of the service offered by LCG-2 were reported, largely resulting from the improved stability of the middleware - the software that handles supply and demand of resources on the Grid effectively, as well as security and the other necessities that a distributed system entails.
However, LCG is by no means the only Grid project and particle physicists have access to substantial resources provided, for example, by NorduGrid, a Scandinavian Grid initiative, and Grid3, a consortium based in the US. In addition LCG is working closely with the EGEE project, which is funded to create a Europe-wide production-quality Grid infrastructure on top of present regional Grid programmes. The core infrastructure of the LCG and EGEE Grids is now operated as a single service spanning North America, the Asia-Pacific region and Europe, and encompasses other scientific disciplines in addition to high-energy physics. As Les Robertson, LCG project leader, pointed out, "It is clear that on the timescale of LHC start-up we are going to have to live with a few different middleware implementations and standards. Nevertheless strenuous efforts are being made to improve compatibility and inter-working."
Successful stress tests
LHC physicists reported on a number of "data challenges" designed to act as stress tests of the robustness, performance and quality of the complete computing infrastructure. The basic goal is to produce many millions of simulated events in a concentrated period and to analyse them, thereby exercising the full software chain. During 2004 more than 400 Tb of data have been generated and stored using new data-management software developed in the ROOT and POOL projects. Moreover, extensive comparisons with test-beam data have demonstrated the ability of physics models in the simulation engines (e.g. Geant 4 and Fluka) to predict the detector response to better than a few per cent. This validation activity is still in progress, but the vast improvements reported over the past two years have been achieved as a result of a huge combined effort by both experimentalists and simulation experts. Attention must now be given to providing a simple and transparent data-analysis framework so that physicists can get the fast delivery of physics results they need.
The data challenges also provide a way of measuring the progress made in deploying and operating resources on the Grid, as well as testing the production-software tools used to submit and manage jobs and the data they generate. For instance, one challenge used the Grid to process simulated LHC data, but at only one-quarter of the rate physicists expect from the collider. In another, more than 3500 production jobs were run concurrently on resources located in more than 30 computer centres. This is still far short of the requirements of the LHC, but with three years to go before the collider is ready to produce collisions the prospects for narrowing the gap look good.
Peter Clarke, from Edinburgh, reported on the status of the global wide-area network, which more than ever is playing its role as the great "enabler" for many branches of science and research. The LHC computing models estimate that a connectivity of around 100 Gbit/s between CERN and the Tier-1 centres will be required in order to distribute data for processing at remote sites. Clarke's main message was that wide-area networks currently provide our community with excellent performance and reliability, his primary concern being that they are in fact significantly underused at present. Most networks, such as GÉANT in Europe and ESNET in the US, currently have 10 Gbit/s backbones with 1.0-2.5 Gbit/s links into national networks, but they typically experience peak sustained loads of only 10-30%. High-energy physics traffic is barely visible. Clarke's appeal to the conference was for high-energy physics to perform network data challenges demonstrating sustained data flows of at least 1 Gbit/s between the main centres in 2005.
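As a rough cross-check of the scale involved, the short Python sketch below converts two data volumes quoted in this report into average sustained network rates. It assumes decimal prefixes (1 Tb = 10^12 bytes and 1 Pb = 10^15 bytes, in this article's notation for terabytes and petabytes) and a perfectly even transfer around the clock; the function and variable names are illustrative only. Real traffic is bursty, so peak requirements would be higher than these averages.

```python
# Illustrative arithmetic only: names and the even-transfer assumption
# are not taken from any experiment's computing model.
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

TB = 1e12  # bytes in one terabyte (decimal prefix, assumed)
PB = 1e15  # bytes in one petabyte (decimal prefix, assumed)


def sustained_gbit_per_s(n_bytes: float, seconds: float) -> float:
    """Average rate in Gbit/s needed to move n_bytes within `seconds`."""
    return n_bytes * 8 / seconds / 1e9


# ~15 Pb of LHC data per year, shipped once at a constant rate.
lhc_rate = sustained_gbit_per_s(15 * PB, SECONDS_PER_YEAR)

# 1.2 Tb of Belle data per day, shipped once at a constant rate.
belle_rate = sustained_gbit_per_s(1.2 * TB, SECONDS_PER_DAY)

print(f"LHC yearly volume:  {lhc_rate:.2f} Gbit/s average")   # about 3.8
print(f"Belle daily volume: {belle_rate:.3f} Gbit/s average")  # about 0.111
```

Under these assumptions, a single copy of one year of LHC data corresponds to an average of a few Gbit/s, which is the same order of magnitude as the 10 Gbit/s backbones mentioned above.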
Ken Peach, of the Rutherford Appleton Laboratory, gave an extremely informative and entertaining talk on e-Science, which is already making a big impact on many scientific disciplines and facilitating new scientific discoveries. The new methodology of e-Science claims that by connecting different sources of data collected independently and analysing them with computers, new knowledge and understanding can be extracted. Policy-makers in government, academia, and industry are driving the initiative. Clearly massive data storage and large-scale computing are required, which explains the significant investment worldwide in support of underlying technologies, in particular Grid computing.
Delegates were also able to interact with 16 high-tech companies through the CHEP '04 Industrial Programme - coordinated by Chris Parkman and Evelyne Dho of CERN - which featured an exhibition and a number of special seminars. The involvement and support of these companies added a vital dimension to the overall success of the conference. In particular, CERN's partners in the CERN openlab for DataGrid applications (Enterasys, Hewlett-Packard, IBM, Intel and Oracle) not only gave generously to sponsor the event but also provided keynote speakers from their research labs in a special plenary session dedicated to looking at future technology trends. Special thanks are due to Enterasys, which provided the wireless network at the conference venue and donated wireless cards to conference delegates.
Jai Menon, from IBM, predicted continuous improvements in storage density and disk capacity such that by 2010 desktop commodity machines will have 1 Tb storage capacity on a single 2.5 inch (10 cm) disk. Large systems with 10,000 spinning disks will have a total capacity of 10 Pb, with around 7 Tb/s of streaming bandwidth. However, the traditional RAID (redundant array of independent disks) systems will not be sufficient to protect against multiple simultaneous disk failures and new schemes are under development to provide higher forms of redundancy at the expense of additional storage overhead. A great deal of growth in tape capacity is also foreseen and attempts will be made to demonstrate 8 and 16 Tb of storage on a single cartridge in the near future.
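The storage figures quoted for a large system can be related to one another with simple arithmetic. The sketch below is a back-of-the-envelope check, assuming decimal prefixes and that capacity and bandwidth are spread evenly over all disks; the variable names are illustrative and not taken from the talk.

```python
# Back-of-the-envelope check of the quoted large-system figures.
# Assumptions (not from the talk): decimal prefixes, load spread evenly.
n_disks = 10_000
capacity_per_disk_tb = 1.0   # 1 Tb (terabyte) per disk, as predicted for 2010
aggregate_bw_tb_s = 7.0      # ~7 Tb/s of streaming bandwidth for the system

total_capacity_pb = n_disks * capacity_per_disk_tb / 1000
per_disk_bw_mb_s = aggregate_bw_tb_s * 1e6 / n_disks
full_scan_minutes = (n_disks * capacity_per_disk_tb) / aggregate_bw_tb_s / 60

print(f"Total capacity:      {total_capacity_pb:.0f} Pb")
print(f"Bandwidth per disk:  {per_disk_bw_mb_s:.0f} Mb/s (megabytes)")
print(f"Full sequential scan: {full_scan_minutes:.1f} minutes")
```

With these inputs the total comes to 10 Pb, matching the quoted capacity, and the quoted aggregate bandwidth implies roughly 700 megabytes per second from each disk.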
Stan Williams from HP reported on work that his group is doing on fundamental research in nanotechnology, which is expected to have an impact on a 10-20 year timescale. The combined wisdom of the semiconductor industry foresees feature sizes of semiconductor devices reducing to 65 nm by 2007-2008 and 45 nm by 2010-2011, giving a 10 to 100-fold increase in performance over today's CPUs. Beyond that, problems arise due to fundamental limits on the ability to improve the electrical-power efficiency of traditional microprocessor chips as well as the relatively large number of "defects" that must be contended with at the nanometre scale. New CPU architectures that run at very low clock speeds but with many processes in parallel were presented as a way to achieve the best electrical-power efficiency.
Conference wrap-up
In the last talk of the conference Lothar Bauerdick from Fermilab gave his personal impressions of the conference highlights. He looked forward to the opportunity of working with people from existing experiments that are joining the LHC programme. He concluded that our Grid systems are successfully enabling broad participation although much needs to be done to improve the completion efficiency of the jobs that run and to use all the resources made available by Grids more effectively. He reiterated a comment, made earlier in the conference by CERN's David Williams, that the LHC experiments are running at the limit of what is feasible, not in funding, nor in complexity of the detectors, but rather in the possibility of keeping a large number of smart people working actively towards a common goal.
CHEP '04 was one of the best attended CHEP conferences with 520 delegates coming from all over the world. The organizers are very grateful to INTAS, UNESCO-ROSTE and the Abdus Salam International Centre for Theoretical Physics for agreeing to sponsor more than 25 of these delegates who would otherwise have been unable to attend. The Programme Committee considered a total of 493 abstracts of which 34 were scheduled as plenary talks, 219 more were scheduled for oral presentation in seven parallel sessions and 153 were presented as posters. They also edited and produced the proceedings published on DVD.
Thanks to the hard work of the enthusiastic team of organizers and student volunteers, led by Alan Silverman and Miguel Marquina of CERN, there were no noticeable problems during the whole week of the conference. The network worked perfectly with a total of 246 portables connected (and only four were infected with viruses, a good omen for the future). Even the weather was kind and permitted the delegates to explore the surrounding mountains and lakes in clement conditions during the half-day devoted to excursions.
By coincidence, the day of the conference banquet (29 September) was also CERN's 50th birthday, and the conference delegates celebrated the event in the presence of the director-general, Robert Aymar. His speech was transmitted live from Interlaken to the birthday party held near Geneva via a special teleconference link. In his closing address von Rüden warmly thanked all those who had contributed to the success of the conference and invited delegates to reconvene for CHEP '06 in February 2006 in the Tata Institute of Fundamental Research (TIFR), Mumbai, India, the venue selected by the International Advisory Committee.
Author:
John Harvey, CERN, chair of the CHEP '04 Programme Committee.