The commissioning of a new class of massively parallel computers, called QCDOC, should lead to important advances in the numerical study of quantum chromodynamics (QCD), the theory of quarks and gluons. Physicists from Columbia University, the University of Edinburgh, the RIKEN BNL Research Center and IBM's T J Watson Research Center have designed and constructed three 10-teraflop computers optimized for lattice QCD calculations. One machine is now installed in Edinburgh for the use of the UKQCD collaboration, and two computers, funded by the RIKEN Laboratory in Japan and the US Department of Energy (DOE), have been installed at the Brookhaven National Laboratory.

QCD, in which quarks and gluons participate in strong interactions and bind to form hadrons, has been viewed as a correct description of nature for the past 30 years. Nevertheless, low-energy QCD poses daunting theoretical challenges. So far this mathematically elegant theory has stymied traditional analytic approaches. However, propelled by the rapid advances in computer technology and a series of important algorithmic innovations, numerical simulation, known as lattice QCD, has proved to be a fruitful pursuit. It offers a "first-principles" demonstration of extraordinary phenomena such as the confinement of quarks, as well as increasingly accurate predictions of particle masses, matrix elements and collective behaviour under extreme conditions (see CERN Courier June 2004 p23).

The new computers, which are specially designed to tackle this problem, are called QCDOC for "QCD-on-a-chip", reflecting the system-on-a-chip architecture of the computer. Each of the large machines contains 12,288 independent processing nodes. Each node is a stand-alone computer that is fabricated on a 50-million-transistor chip and attached to a standard memory module, providing 128 MB of memory per node.

Because of the nearest-neighbour interaction that is inherent in lattice QCD, the processors in QCDOC are joined by a mesh network. A six-dimensional mesh is used to permit efficient calculation in four-dimensional space-time. The extra two dimensions permit four-dimensional surfaces to be folded into the computer and support a new five-dimensional lattice fermion formulation.
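The nearest-neighbour structure of such a network can be illustrated with a short Python sketch. It is not taken from the QCDOC software: the function name `mesh_neighbours`, the machine shape `DIMS` (chosen only because its product is 12,288) and the periodic wrap-around at the mesh edges are all assumptions made for illustration.

```python
import math


def mesh_neighbours(coord, dims, periodic=True):
    """Return the distinct nearest neighbours of `coord` in a mesh of shape `dims`.

    `periodic=True` wraps each axis around (an assumption for this sketch);
    `periodic=False` treats the mesh as having open edges.
    """
    neighbours = []
    for axis, size in enumerate(dims):
        for step in (-1, +1):
            pos = coord[axis] + step
            if periodic:
                pos %= size
            elif not 0 <= pos < size:
                continue  # stepped off the edge of an open mesh
            neighbour = coord[:axis] + (pos,) + coord[axis + 1:]
            # Skip duplicates, which occur when an axis has length 1 or 2.
            if neighbour != coord and neighbour not in neighbours:
                neighbours.append(neighbour)
    return neighbours


# Hypothetical six-dimensional shape whose node count matches one machine.
DIMS = (4, 4, 4, 4, 4, 12)

if __name__ == "__main__":
    origin = (0, 0, 0, 0, 0, 0)
    print("nodes in mesh:", math.prod(DIMS))
    print("neighbours of origin:", len(mesh_neighbours(origin, DIMS)))
```

In a six-dimensional mesh each node has two neighbours per dimension, so a node in the interior talks directly to 12 others. A four-dimensional space-time lattice needs only eight of those directions, which leaves two mesh dimensions free for folding or for a fifth lattice dimension.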

The QCDOC chip holds a complete PowerPC processor with a double-precision floating-point unit, making the machine standards-compliant and easy to program. The six-dimensional network is fast with low latency, permitting a difficult physics problem to be finely divided among many nodes. Each QCDOC chip also carries 4 MB of on-chip memory, so when the problem size per node is small enough, all off-chip memory access can be avoided, giving extra efficiency. The modular design and low power (8 W per node) make these computers inexpensive to operate and easy to maintain.
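The per-node figures quoted in this article can be scaled up to a full 12,288-node machine with some simple arithmetic, sketched here in Python. The variable names are illustrative. The sketch assumes that 1 GB is 1024 MB, that the 10-teraflop figure is spread evenly over the nodes, and that the 8 W per node excludes cooling and other infrastructure, so the results are rough estimates only.

```python
# Figures quoted in the article for one large QCDOC machine.
NODES = 12_288
MEMORY_PER_NODE_MB = 128   # external memory module per node
ON_CHIP_MEMORY_MB = 4      # memory inside each QCDOC chip
POWER_PER_NODE_W = 8
MACHINE_TFLOPS = 10

# Aggregate estimates (assumption: 1 GB = 1024 MB).
total_memory_gb = NODES * MEMORY_PER_NODE_MB / 1024
total_on_chip_gb = NODES * ON_CHIP_MEMORY_MB / 1024

# Node power only; cooling and infrastructure are not included.
total_power_kw = NODES * POWER_PER_NODE_W / 1000

# Assumption: the machine's speed is divided evenly among the nodes.
gflops_per_node = MACHINE_TFLOPS * 1000 / NODES

if __name__ == "__main__":
    print(f"total external memory: {total_memory_gb:.0f} GB")
    print(f"total on-chip memory:  {total_on_chip_gb:.0f} GB")
    print(f"total node power:      {total_power_kw:.1f} kW")
    print(f"speed per node:        {gflops_per_node:.2f} gigaflops")
```

On these assumptions one machine has about 1.5 TB of memory, draws roughly 98 kW for its nodes and delivers somewhat less than one gigaflop per node, which indicates how finely a calculation must be divided to use the whole machine.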

Physicists in the US and UK are now putting the three QCDOC machines to work. The bulk of the machine time is devoted to generating large Monte Carlo ensembles that will be used to study the QCD particle spectrum, K-meson decay matrix elements, nucleon structure, and topics in heavy-quark physics central to determining the parameters of the Cabibbo-Kobayashi-Maskawa matrix from experiments (see CERN Courier July/August 2005 p13). Also under way are similar studies of high-temperature QCD, the physics being explored in heavy-ion collisions.

The US QCDOC will be operated as part of the US National Lattice QCD Computing Project funded by the DOE. This project will also build and operate lattice QCD supercomputers based on commodity computer clusters and high-performance networks, such as InfiniBand. These clusters will be housed at Fermilab and at the Thomas Jefferson National Accelerator Facility. On lattice QCD code, such clusters are now about as cost-effective as the QCDOC. The QCDOC will play the major role in generating gauge configurations for the next few years, while the commodity clusters will initially be devoted to analysing these gauge configurations.

Compiled by Hannelore Hämmerle and Nicole Crémel