The well-established governance, licensing and collaborative mechanisms of
open-source software set a standard for open-science movements.
Of all the movements to make science and technology more open, the oldest is “open source” software. It was here that the “open” ideals were articulated, and from which all later movements such as open-access publishing derive. Whilst it rightly stands on this pedestal, from another point of view open-source software was simply the natural extension of academic freedom and knowledge-sharing into the digital age.
Open-source has its roots in the free software movement, which grew in the 1980s in response to monopolising corporations and restrictions on proprietary software. The underlying ideal is open collaboration: peers freely, collectively and publicly build software solutions. A second ideal is recognition, in which credit for the contributions made by individuals and organisations worldwide is openly acknowledged. A third ideal concerns rights, specifically the so-called four freedoms granted to users: to use the software for any purpose; to study the source code to understand how it works; to share and redistribute the software; and to improve the software and share the improvements with the community. Users and developers therefore contribute to a virtuous circle in which software is continuously improved and shared towards a common good, minimising vendor lock-in for users.
Today, 20 years after the term “open source” was coined, and despite initial resistance from traditional software companies, many successful open-source business models exist. These mainly involve consultancy and support services for software released under an open-source licence and extend beyond science to suppliers of everyday tools such as the WordPress platform, Firefox browser and the Android operating system. A more recent and unfortunate business model adopted by some companies is “open core”, whereby essential features are deemed premium and sold as proprietary software on top of existing open-source components.
Open collaboration is one of CERN’s founding principles, so it was natural to extend the principle into its software. The web’s invention brought this into sharp focus. Having experienced first-hand its potential to connect physicists around the globe, in 1993 CERN released the web software into the public domain so that developers could collaborate and improve on it (see CERN’s ultimate act of openness). The following year, CERN released the next web-server version under an open-source licence with the explicit goal of preventing private companies from turning it into proprietary software. These were crucial steps in nurturing the universal adoption of the web as a way to share digital information, and exemplars of CERN’s best practice in open-source software.
Nowadays, open-source software can be found in pretty much every corner of CERN, as in other sciences and industry. Indico and Invenio – two of the largest open-source projects developed at CERN to promote open collaboration – rely on the open-source framework Python Flask. Experimental data are stored in CERN’s Exascale Open Storage system, and most of the servers in the CERN computing centre are running on Openstack – an open-source cloud infrastructure to which CERN is an active contributor. Of course, CERN also relies heavily on open-source GNU/Linux as both a server and desktop operating system. On the accelerator and physics analysis side, it’s all about open source. From C2MON, a system at the heart of accelerator monitoring and data acquisition, to ROOT, the main data-analysis framework used to analyse experimental data, the vast majority of the software components behind the science done at CERN are released under an open-source licence.
The success of the open-source model for software has inspired CERN engineers to create an analogous “open hardware” licence, enabling electronics designers to collaborate and use, study, share and improve the designs of hardware components used for physics experiments. This approach has become popular in many sciences, and has become a lifeline for teaching and research in developing countries.
Being a scientist in the digital age means being a software producer and a software consumer. As a result, collaborative software-development platforms such as GitHub and GitLab have become as important to the physics department as they are to the IT department. Until recently, the software underlying an analysis has not been easily shared. CERN has therefore been developing research data-management tools to enable the publication of software and data, forming the basis of an open-data portal (see Preserving the legacy of particle physics). Naturally, this software itself is open source and has also been used to create the worldwide open-data service Zenodo, which is connected to GitHub to make the publication of open-source software a standard part of the research cycle.
Interestingly, as with the early days of open source, many corners of the scientific community are hesitant about open science. Some people are concerned that their software and data are not of sufficient quality or interest to be shared, or that they will be helping others to the next discovery before them. To triumph over the sceptics, open science can learn from the open-source movement, adopting standard licences, credit systems, collaborative development techniques and shared governance. In this way, it too will be able to reap the benefits of open collaboration: transparency, efficiency, perpetuity and flexibility.