A new model for partnership between CERN and industry is both integrating and testing emerging computer technologies for the LHC Computing Grid.
Grid computing is the computer buzzword of the decade. Not since the World Wide Web was developed at CERN more than 10 years ago has a new networking technology held so much promise for both science and society. The philosophy of the Grid is to provide vast amounts of computer power at the click of a mouse, by linking geographically distributed computers and developing “middleware” to run the computers as though they were an integrated resource. Whereas the Web gives access to distributed information, the Grid does the same for distributed processing power and storage capacity.
There are many varieties of Grid technology. In the commercial arena, Grids that harness the combined power of many workstations within a single organization are already common. But CERN’s objective is altogether more ambitious: to store petabytes of data from the Large Hadron Collider (LHC) experiments in a distributed fashion and make the data easily accessible to thousands of scientists around the world. This requires much more than just spare PC capacity – a network of major computer centres around the world must provide their resources in a seamless way.
CERN and a range of academic partners have launched several major projects in order to achieve this objective. In the European arena, CERN is leading the European DataGrid (EDG) project, which addresses the needs of several scientific communities, including high-energy particle physics. The EDG has already developed the middleware necessary to run a Grid testbed involving more than 20 sites. CERN is also leading a follow-on project funded by the European Union, EGEE (Enabling Grids for E-Science in Europe), which aims to provide a reliable Grid service to European science. Last year, the LHC Computing Grid (LCG) project was launched by CERN and partners to deploy a global Grid dedicated to LHC needs, drawing on the experience of the EDG and other international efforts. This project has started running a global Grid, called LCG-1.
Enter the openlab
The CERN openlab for DataGrid applications fits into CERN’s portfolio of Grid activities by addressing a key issue, namely the impact on the LCG of cutting-edge IT technologies that are currently emerging from industry. Peering into the technological crystal ball in this way can only be done in close collaboration with leading industrial partners. The benefits are mutual: through generous sponsorship of state-of-the-art equipment from the partners, CERN acquires early access to valuable technology that is still several years away from the commodity computing market on which the LCG will be based.
In return, CERN provides demanding data challenges, which push these new technologies to their limits – this is the “lab” part of the openlab. CERN also provides a neutral environment for integrating solutions from different partners, to test their interoperability. This is a vital role in an age where open standards (the “open” part of openlab) are increasingly guiding the development of the IT industry.
The CERN openlab for DataGrid applications was launched in 2001 by Manuel Delfino, then the IT Division leader at CERN. After a hiatus, during which the IT industry was rocked by the telecoms crash, the partnership took off in September 2002, when HP joined founding members Intel and Enterasys Networks, and integration of technologies from all three led to the CERN opencluster project.
IBM joins CERN openlab to tackle the petabyte challenge
The LHC will generate more than 10 petabytes of data per year, the equivalent of a stack of CD-ROMs 20 km high. There is no obvious way to extend conventional data-storage technology to this scale, so new solutions must be considered. IBM was therefore keen to join the CERN openlab in April 2003, in order to establish a research collaboration aimed at creating a massive data-management system built on Grid computing, which will use innovative storage virtualization and file-management technology.
IBM has been a strong supporter of Grid computing, from its sponsorship of the first Global Grid Forum in Amsterdam in 2001 to its participation in the European DataGrid project. The company sees Grid computing as an important technological realization of the vision of “computing on demand”, and expects that as Grid computing moves from exclusive use in the scientific and technical world into commercial applications, it will indeed be the foundation for the first wave of e-business on demand.
The technology that IBM brings to the CERN openlab partnership is called Storage Tank. Conceived in IBM Research, the new technology is designed to provide scalable, high-performance and highly available management of huge amounts of data using a single file namespace, regardless of where or on what operating system the data reside. (Recently, IBM announced that the commercial version will be named IBM TotalStorage SAN File System.) IBM and CERN will work together to extend Storage Tank’s capabilities so it can manage the LHC data and provide access to it from any location worldwide.
Brian E Carpenter, IBM Systems Group, and Jai Menon, IBM Research.
At present, the CERN opencluster consists of 32 Linux-based HP rack-mounted servers, each equipped with two 1 GHz Intel Itanium 2 processors. Itanium uses 64-bit processor technology, which is anticipated to displace today’s 32-bit technology over the next few years. As part of the agreement with the CERN openlab partners, this cluster is planned to double in size during 2003, and double again in 2004, making it an extremely high-performance computing engine. In April this year, IBM joined the CERN openlab, contributing advanced storage technology that will be combined with the CERN opencluster (see “IBM joins CERN openlab to tackle the petabyte challenge” box).
For high-speed data-transfer challenges, Intel has delivered 10 gigabit per second (Gbps) Ethernet network interface cards (NICs), which have been installed in the HP computers, and Enterasys Networks has delivered three switches equipped to operate at 10 Gbps, with additional port capacity for 1 Gbps.
Over the next few months, the CERN opencluster will be linked to the EDG testbed to see how these new technologies perform in a Grid environment. The results will be closely monitored by the LCG project to determine the potential impact of the technologies involved. Already at this stage, however, much has been learned that has implications for the LCG.
For example, thanks to the preinstalled management cards in each node of the cluster, automation has been developed to allow remote system restart and remote power control. This development confirmed the notion that – for a modest hardware investment – large clusters can be controlled with no operator present. This is highly relevant to the LCG, which will need to deploy such automation on a large scale.
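A minimal sketch of what such unattended control might look like: restarts issued to every node in parallel. The node names and the restart call are hypothetical placeholders, not CERN’s actual management-card tooling, which is not described here.

```python
# Hedged sketch of lights-out cluster control: request a restart on every
# node concurrently, with a placeholder standing in for the real call to
# each node's management card.
from concurrent.futures import ThreadPoolExecutor

NODES = [f"node{i:02d}" for i in range(1, 33)]   # hypothetical node names

def remote_restart(node):
    # Placeholder for the management-card interface (assumption).
    return f"{node}: restart requested"

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(remote_restart, NODES))

print(f"{len(results)} nodes restarted with no operator present")
```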
Several major physics software packages have been successfully ported and tested on the 64-bit environment of the CERN opencluster, in collaboration with the groups responsible for maintaining the various packages. Benchmarking of the physics packages has begun and the first results are promising. For example, PROOF (Parallel ROOT Facility) is a parallel version of ROOT, the popular CERN-developed data-analysis software, which is being developed for interactive analysis of very large ROOT data files on a cluster of computers. The CERN opencluster has shown that the amount of data PROOF can handle scales nearly linearly with cluster size – an analysis that takes 325 s on one cluster node takes only 12 s when all 32 nodes are used.
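The quoted timings can be turned into the standard speedup and parallel-efficiency figures; the short sketch below simply does that arithmetic on the numbers reported above (it is not a new measurement):

```python
# Parallel speedup and efficiency for the PROOF benchmark quoted above.
def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, nodes):
    return speedup(t_serial, t_parallel) / nodes

t1, t32, nodes = 325.0, 12.0, 32   # seconds, from the opencluster test
s = speedup(t1, t32)               # roughly 27x faster on 32 nodes
e = efficiency(t1, t32, nodes)     # roughly 85% parallel efficiency
print(f"speedup {s:.1f}x, efficiency {e:.0%}")
```

A perfectly linear scan would give a 32x speedup; the measured 27x (about 85% efficiency) is what “nearly linear” means in practice.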
Data challenges galore
One of the major challenges of the CERN opencluster project is to take maximum advantage of the partners’ 10 Gbps technology. In April, a first series of tests was conducted between two of the nodes in the cluster, which were directly connected (“back-to-back”) through their 10 Gbps Ethernet NICs. The transfer reached a data rate of 755 megabytes per second (MB/s), a record, and double the maximum rate obtained with 32-bit processors. The transfer took place over a 10 km fibre and used jumbo frames (16 kB) in a single stream, together with the standard Linux kernel TCP/IP stack.
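The shape of such a throughput measurement can be sketched in a few lines of Python over loopback sockets. This illustrates only the pattern – time a bulk transfer, divide bytes by seconds – and not the tooling used at CERN; the 16 kB write size merely echoes the jumbo-frame size mentioned above.

```python
import socket, threading, time

# Minimal TCP throughput sketch over loopback (illustration only).
CHUNK = 1 << 14            # 16 kB per write, echoing the jumbo-frame size
TOTAL = 64 * (1 << 20)     # 64 MB to transfer

def server(sock):
    conn, _ = sock.accept()
    received = 0
    while received < TOTAL:
        data = conn.recv(CHUNK)
        if not data:
            break
        received += len(data)
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]
t = threading.Thread(target=server, args=(srv,))
t.start()

cli = socket.create_connection(("127.0.0.1", port))
payload = b"\0" * CHUNK
start = time.time()
sent = 0
while sent < TOTAL:
    cli.sendall(payload)
    sent += len(payload)
cli.close()
t.join()
srv.close()
elapsed = time.time() - start
print(f"{sent / elapsed / 1e6:.0f} MB/s over loopback")
```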
The best results through the Enterasys switches were obtained when aggregating bi-directional 1 Gbps traffic from 10 nodes on each side; the peak traffic between the switches was measured at 8.2 Gbps. The next stages of this data challenge will include evaluating the next version of the Intel processors.
In May, CERN announced the successful completion of a major data challenge aimed at pushing the limits of data storage to tape. This involved, in a critical way, several components of the CERN opencluster. Using 45 newly installed StorageTek tape drives, each capable of writing to tape at 30 MB/s, storage-to-tape rates of 1.1 GB/s were achieved for periods of several hours, with peaks of 1.2 GB/s – roughly equivalent to storing a whole movie on DVD every four seconds. The average sustained over a three-day period was 920 MB/s. Previous best results at other research laboratories were typically below 850 MB/s.
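A back-of-the-envelope check of these figures, assuming a 4.7 GB single-layer DVD for the movie analogy (the DVD capacity is our assumption, not stated in the text):

```python
# Sanity check of the tape data-challenge figures quoted above.
drives, per_drive = 45, 30.0          # MB/s per StorageTek drive
ceiling = drives * per_drive          # aggregate hardware ceiling in MB/s
sustained = 1100.0                    # MB/s achieved for hours on end
utilisation = sustained / ceiling     # fraction of the theoretical peak

dvd_gb = 4.7                          # single-layer DVD capacity (assumption)
secs_per_dvd = dvd_gb * 1000 / 1200   # seconds per DVD at the 1.2 GB/s peak
print(f"{ceiling:.0f} MB/s ceiling, {utilisation:.0%} utilised, "
      f"one DVD every {secs_per_dvd:.1f} s")
```

The 1.1 GB/s sustained rate thus used over 80% of the drives’ aggregate 1.35 GB/s ceiling, and a DVD every ~3.9 s matches the “every four seconds” claim.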
The significance of this result, and the purpose of the data challenge, was to show that CERN’s IT Division is on track to cope with the enormous data rates expected from the LHC. One experiment alone, ALICE, is expected to produce data at rates of 1.25 GB/s.
In order to simulate the LHC data-acquisition procedure, an equivalent stream of artificial data was generated using 40 computer servers. These data were stored temporarily on 60 disk servers, which included the CERN opencluster servers, before being transferred to the tape servers. A key contributing factor to the success of the data challenge was a high-performance switched network from Enterasys Networks with 10 Gbps Ethernet capability, which routed the data from PC to disk and from disk to tape.
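The layout described above – generators feeding a disk buffer that drains to tape – is a classic producer/consumer pipeline. A drastically scaled-down sketch of that pattern (queue size and block count are arbitrary stand-ins, not the real data volumes):

```python
import queue, threading

# Toy producer/buffer/consumer pipeline mirroring the data-challenge layout:
# generators -> disk buffer -> tape.
buffer = queue.Queue(maxsize=60)      # stands in for the 60 disk servers
written_to_tape = []

def generator(n_blocks):
    for i in range(n_blocks):
        buffer.put(f"block-{i}")      # data lands in the disk buffer
    buffer.put(None)                  # sentinel: stream finished

def tape_writer():
    while True:
        block = buffer.get()
        if block is None:
            break
        written_to_tape.append(block) # stands in for a tape-drive write

t_gen = threading.Thread(target=generator, args=(100,))
t_tape = threading.Thread(target=tape_writer)
t_gen.start(); t_tape.start()
t_gen.join(); t_tape.join()
print(f"{len(written_to_tape)} blocks written to tape")
```

The bounded queue is the essential design point: the disk tier absorbs bursts from the generators so the tape drives can be kept streaming at a steady rate.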
An open dialogue
While many of the benefits of the CERN openlab for the industrial partners stem from such data challenges, there is also a strong emphasis in openlab’s mission on the opportunities that this novel partnership provides for enhanced communication and cross-fertilization between CERN and the partners, and between the partners themselves. Top engineers from the partner companies collaborate closely with the CERN openlab team in CERN’s IT Division, so that the inevitable technical challenges that arise when dealing with new technologies are dealt with rapidly and efficiently. Furthermore, as part of its sponsorship, HP is funding two CERN fellows to work on the CERN opencluster. The CERN openlab team also organizes thematic workshops on specific topics of interest, bringing together leading technical experts from the partner companies. In addition, it holds public “First Tuesday” events on general technology issues related to the openlab agenda, which attract hundreds of participants from the industrial and investor communities.
A CERN openlab student programme has also been created, bringing together teams of students from different European universities to work on applications of Grid technology. And the CERN openlab is actively supporting the establishment of a Grid café for the CERN Microcosm exhibition – a Web café for the general public with a focus on Grid technologies, including a dedicated website that will link to instructive Grid demos.
Efforts are ongoing in the CERN openlab to evaluate other possible areas of technological collaboration with current or future partners. The concept is certainly proving popular, with other major IT companies expressing an interest in joining. This could occur by using complementary technologies to provide added functionality and performance to the existing opencluster. Or it could involve launching new projects that deal with other aspects of Grid technology relevant to the LCG, such as Grid security and mobile access to the Grid.
In conclusion, the CERN openlab puts a new twist on an activity – collaboration with leading IT companies – that has been going on at CERN for decades. Whereas traditionally such collaboration was bilateral and focused on “here-and-now” solutions, the CERN openlab brings a multilateral long-term perspective into play. This may be a useful prototype for future industrial partnerships in other high-tech areas, where CERN and a range of partners can spread their risks and increase their potential for success by working on long-term development projects together.
For more information about CERN openlab, see the website at www.cern.ch/openlab.