Gaël Varoquaux

Thu 19 September 2013


Publishing scientific software matters

Christophe Pradal, Hans Peter Langtangen, and I recently edited an issue of the Journal of Computational Science on scientific software, in particular software written in Python. We wrote an editorial defending the writing and publishing of open source scientific software, which I wish to summarize here. As always, the full-text preprint is openly available in my publications list. It includes, amongst other things, references.

Software is a central part of modern scientific discovery. Software turns a theoretical model into quantitative predictions; software controls an experiment; and software extracts from raw data the evidence supporting or rejecting a theory. As of today, scientific publications seldom discuss software in depth, maybe because it is both highly technical and a recent addition to scientific tools. But times are changing. More and more scientific investigators are developing software, and it is important to establish norms for the publication of this work. Producing scientific software is an important part of the landscape of research activities. Very visible scientific software is found in products developed by private companies, such as MathWorks’ MATLAB or Wolfram’s Mathematica, but let us not forget that these build upon code written by and for academics. Scientists writing software contribute to the advancement of science in several ways.

First, software developed in one field, if written in a sufficiently general way, can often be applied to advance a different field if the underlying mathematics is common. Modern scientific software development places a strong emphasis on generality and reusability by taking advantage of the general properties of the mathematical structures in the problem. This feature of modern software helps close the gap between fields and accelerates scientific discovery by packaging mathematical theories in a directly applicable way.

Second, the public availability of code is a cornerstone of the scientific method, as it is a requirement for reproducing scientific results: “if it’s not open and verifiable by others, it’s not science, or engineering, or whatever it is you call what we do.” (V. Stodden, The scientific method in practice). Emphasizing code to an extreme, Buckheit and Donoho have challenged the traditional view that a publication is the valuable outcome of scientific research: “an article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment […]”.

It is important to keep in mind that going beyond replication of results requires reusable software tools: code that is portable, comes with documentation, and, most of all, is maintained throughout the years. Indeed, software development is a major undertaking that must build upon best practices and a quality process. Reversing Buckheit and Donoho’s argument, publications about scientific software play an increasingly important part in the scientific methodology. First, in the publish-or-perish academic culture, such publications give an incentive for software production and maintenance, because good software can lead to highly cited papers. Second, the publication and review process is the de facto standard for ensuring quality in the scientific world. As software is becoming increasingly central to the scientific discovery process, it must be subject to this standard. We have found that writing an article on software leads the authors to better clarify the project vision, both technical and scientific, as well as the prior art and the contributions. Last but not least, scientists publishing new results based on a particular software package need an informed analysis of the validity of that software. Unfortunately, much of the current practice for adopting research software relies on the ease of use of the package and the reputation of its authors.

[…]

Today, software is to scientific research what Galileo’s telescope was to astronomy: a tool combining science and engineering. It lies outside the core competence of the researchers who rely on it. Like the telescope, it also builds upon scientific progress and shapes our scientific vision. Galileo’s telescope was a leap forward in optics, a field of investigation that is now well established, with its own high-impact journals and scholarly associations. Similarly, we hope that visibility and recognition of scientific software development will grow.
