Software code used for the COVID-19 simulation models
Dear Imperial College London,
I refer to the paper titled "Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand" published by the Imperial College COVID-19 Response Team on 16 March 2020.
Within the text of the paper, it states:
"We modified an individual-based simulation model developed to support pandemic influenza planning [Ferguson 2006], [Halloran 2008] to explore scenarios for COVID-19 in GB. The basic structure of the model remains as previously published. In brief, individuals reside in areas defined by high-resolution population density data. Contacts with other individuals in the population are made within the household, at school, in the workplace and in the wider community. Census data were used to define the age and household distribution size. Data on average class sizes and staff-student ratios were used to generate a synthetic population of schools distributed proportional to local population density. Data on the distribution of workplace size was used to generate workplaces with commuting distance data used to locate workplaces appropriately across the population. Individuals are assigned to each of these locations at the start of the simulation."
* https://www.imperial.ac.uk/mrc-global-in...
In the accompanying audio interview Professor Azra Ghani said:
[Minute 6:18] "Much of the computer code that we would use for this has been developed over a number of years, there have been pandemic planning models in the department for many many years now, and we've also built up a whole set of code that we've applied recently for example to the ebola outbreak in Africa, so a lot of the tools are there, but what we find is that every little bit about a new disease is slightly different, the data will have a slightly different format, or the statistics that we need to use will need to be modified. So whilst we have a lot of this machinery ready to go, it isn't really the case of just pressing the button, there is quite a lot of scientific input."
* https://www.imperial.ac.uk/news/196137/c...
I have searched very hard for where your code might be published, but have been unable to find it. Therefore, under the Freedom of Information Act 2000, please can I be sent copies of:
(1) The source code for the simulation models being used in the above paper;
(2) The documentation to compile and run the simulation models;
(3) The data used as input to the simulation models that is either public or you have permission to publish, or the anonymized synthetic data derived from the unpublishable data;
(4) The scrapers used to download the data.
I do hope that this request does not cause too much disruption during this very busy time. However, it is a very important matter and it should have been done long ago. (If the code is already published, and I was merely unable to find it, please accept my sincerest apologies for meddling in this matter.)
In case there are any problems, I would like to refer you to the following articles:
* From an editorial in the journal Nature, "The case for open computer programs" https://www.nature.com/articles/nature10...
"We argue that, with some exceptions, anything less than the release of source programs is intolerable for results that depend on computation."
* From the Galaxy and HyPhy development teams, a bioRxiv preprint:
"No more business as usual: agile and effective responses to emerging pathogen threats require open data and open analytics" https://www.biorxiv.org/content/10.1101/... :
"The current state of much of the Wuhan pneumonia virus (COVID-19) research shows a regrettable lack of data sharing and considerable analytical obfuscation. This impedes global research cooperation, which is essential for tackling public health emergencies, and requires unimpeded access to data, analysis tools, and computational infrastructure. Here we show that community efforts in developing open analytical software tools over the past ten years, combined with national investments into scientific computational infrastructure, can overcome these deficiencies and provide an accessible platform for tackling global health emergencies in an open and transparent manner."
* A column written by a professional software engineer in the journal Nature
"Publish your computer code: it is good enough" https://www.nature.com/news/2010/101013/... addressing the common concerns and excuses for not publishing scientific code.
Yours faithfully,
Julian Todd
Dear Mr Todd,
This is to acknowledge receipt of your request below, made under the Freedom of Information Act. The College will respond to your request by 16 April 2020.
Yours,
Freedom of Information Team
Imperial College London
Dear Mr Todd,
Imperial College is committed to fostering best practice in data
management and to facilitating free and timely open access to data so that
they are intelligible, assessable and usable by others.
Professor Ferguson has advised that his team are working towards making
further information, including the simulation model code used to produce
the recently published paper "Impact of non-pharmaceutical interventions
(NPIs) to reduce COVID-19 mortality and healthcare demand", publicly
available and they hope to do so shortly. I am sure you will appreciate
that the team is very busy at the moment, so we are not able to commit to
a specific date at this point. You may wish to follow Professor Ferguson’s
Twitter feed which currently states that the team are working with
Microsoft and GitHub to document, refactor and extend the code to allow
others to use it.
Information that is intended for future publication is exempt from the
Freedom of Information Act by virtue of Section 22 of the Act. It is a
qualified exemption that requires the College to weigh the public interest
in disclosing the information requested against the public interest in
maintaining the exemption. We recognise that there is a strong public
interest in the disclosure of information relating to the research being
done into Covid-19. However, the question to be decided is whether the
public interest will be better served by disclosing the information now in
response to your request, or in the near future as is planned. In the
present circumstances, our view is that the public interest is best served
by allowing our academics to set their own priorities at this very busy
time for them and to publish further information relating to the impact of
NPIs research in due course.
There is a further exemption at Section 22A applicable to information held
on an ongoing programme of research where there is an intention to publish
a report of the research. This also requires consideration of the public
interest, to which the same factors as outlined above would apply.
There is related information already in the public domain that may be of
interest. All the code and data for reproducing the paper “Estimates of
the Severity of COVID-19 disease” are available by following the links
below.
https://github.com/mrc-ide/COVID19_CFR_s...
The code is distributed under the MIT licence (LICENSE file within the
repository).
Similarly, the software that these scripts use to perform the analysis is
published under the MIT license, and can be found here:
https://github.com/mrc-ide/drjacoby
If you are unhappy with the way that we have handled your request, you can
ask us to conduct a review. Please make your representation in writing
within 40 days of the date you received this response. If you remain
dissatisfied with how Imperial College has handled your request, you may
then approach the Information Commissioner’s Office.
Yours,
Freedom of Information Team
Imperial College London
Dear Imperial College London,
Thank you for your reply on 16 April 2020 refusing my FOI request made on 17 March 2020 for the software code supporting the research paper titled "Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand" published by the Imperial College COVID-19 Response Team on 16 March 2020.
https://www.imperial.ac.uk/mrc-global-in...
I'm afraid that the exemptions given in Sections 22 and 22A of the Freedom of Information Act do not apply to this request.
Section 22(1)(b) states: "Information is exempt information if... the information was already held with a view to such publication at the time when the request for information was made"
According to their own statements, the team has been working on this code for over ten years. But at no time prior to 17 March 2020 have they given any indication in any of their scientific output that they held any view on publishing the code.
What we do have are Professor Ferguson's tweets on 22 March 2020, saying:
"I'm conscious that lots of people would like to see and run the pandemic simulation code we are using to model control measures against COVID-19. To explain the background - I wrote the code (thousands of lines of undocumented C) 13+ years ago to model flu pandemics...
"They are also working with us to develop a web-based front end to allow public health policy makers from around the world to make use of the model in planning. We hope to make v1 releases of both the source and front end in the next 7-10 days..."
https://twitter.com/neil_ferguson/status...
And we also have the oral evidence he gave to the House of Commons Science and Technology Committee on 25 March 2020:
"(Q32) On the question of modelling across the whole country, yes, we are intending to roll out models across the whole of Europe and allow policymakers to use our model in different settings, and indeed we will be releasing the open source code in the next week or so."
https://committees.parliament.uk/work/91...
Both of these statements were made after my request on 17 March.
Now, this might seem like a technical point, but the Act is written in such a way that you cannot back-date a promise to release the information on a timetable of your own choosing.
I am aware of the code that is being worked on at their GitHub account @mrc-ide, but none of the repositories fits the description above.
Unless you can provide evidence that there was ever any view to such publication prior to 17 March 2020, this exemption cannot stand.
Section 22A(1)(a) states that: "(1) Information obtained in the course of, or derived from, a programme of research is exempt information if (a) the programme is continuing with a view to the publication, by a public authority or any other person, of a report of the research (whether or not including a statement of that information)."
The problem with this argument is that the written part of the research was published on 16 March 2020.
While the issue is less clear-cut than in Section 22 in its ruling-out of an obvious ploy, we can't have a situation where the researcher can merely say that they might one day do another report sometime in the future, and therefore the programme is continuing, and so the information is exempt ad infinitum.
In this case Professor Ferguson's research reached a definitive break-point with the publication of this paper, because it led to a quick reversal of the government's "herd immunity" policy, and subsequently the national lockdown. I will have no problem arguing before the Information Commissioner that this outcome de facto draws a line under the research programme, and we have now entered Phase 2, managing the lockdown, and everything from Phase 1 must be published.
I understand that the team is under a lot of pressure right now, but I don't see why it is a burden to produce a copy of the code in the form it was in on 16 March 2020. I am not interested in having code that is in any way fixed-up, debugged, changed or made presentable from that which was used to produce the conclusions made back then. It is essential to archive and preserve it in the form that it was actually used so that proper lessons can be learned from the evidence about the way that scientists write and handle some of the most consequential software code in this nation's history.
Please pass this complaint on to the person in charge of conducting an internal review. Please also consider disclosure under The Environmental Information Regulations 2004 by virtue of the definition of "environmental information" given in Section 2(1)(f) "the state of human health and safety".
A full history of my FOI request and all correspondence is available on the Internet at this address: https://www.whatdotheyknow.com/request/s...
Yours faithfully,
Julian Todd
Dear Mr Todd,
This is to acknowledge receipt of your Internal Review request below, made under the Freedom of Information Act. The College will respond to your request by 19 May 2020.
Yours,
Freedom of Information Team
Imperial College London
Julian Todd left an annotation
Yes that's very good and was inevitable, but it's a bit late now. It has already been cleaned up by some professional software engineers.
https://twitter.com/ID_AA_Carmack/status...
I've decided that I won't stop until I get the original source code so it's available for future research into the way that we have been mishandling the most consequential software in the life of the nation.
I mean, the astronomers and physicists seem to have no problem hiring very good programmers to run their simulations of deep space black holes which, frankly, don't make any difference to life, so why can't epidemiologists seem to get it together?
Richard Taylor left an annotation
A critical review of the released derivative code has been published:
https://lockdownsceptics.org/code-review...
Dear Mr Todd,
Thank you for your Internal Review request received on 20 April. I
apologise for the delayed response.
A version of Professor Ferguson’s modelling code has been made available
since your Internal Review request was submitted.
In your Internal Review request, you challenged the College’s reliance on
Section 22 of the Freedom of Information Act, information intended for
future publication, on the grounds that the Professor and his team had not
indicated before 17 March 2020 that they intended to publish the code. You
stated that unless the College can provide evidence that there was a view
to such publication prior to 17 March 2020, this exemption cannot stand.
The exemption applies if there is an intention to publish at the time the
FOI request was received. On receipt of your request, we made enquiries of
the team on 18 March to ask whether the information requested was already
published or was intended for publication. The response (received the same
day) advised that the team were aware of the demand and were working on
releasing the code. That response indicates a pre-existing intention to
release the code: they were already working on releasing it. That is
sufficient for the exemption at S.22 to apply to the information in
question.
You have challenged the applicability of the exemption at S.22A of the
Act. You correctly state that for the S.22A exemption to apply, the
research has to be ongoing. You state that Professor Ferguson's research
had “reached a definitive break-point” because the publication of the
paper led to a change in government policy. The impact of the paper on
government policy is not a measure of whether or not the research is
ongoing.
You asked us to consider whether the applicable access to information
regime for this information is the Environmental Information Regulations,
rather than the Freedom of Information Act and make reference to Section
2(1)(f) "the state of human health and safety". I have copied the relevant
sections of the definition contained at Section 2 of the EIRs below for
ease of reference. Section 2(1)(f) states that information relating to
“the state of human health and safety” is environmental information
inasmuch as it is or may be “affected by the state of the elements of the
environment referred to in (a) or, through those elements, by any of the
matters referred to in (b) and (c)”.
While we appreciate that the list is not exhaustive, (f) gives some
indication of the type of health and safety considerations that might be
covered. The spread of disease is not referred to and is quite different
from the examples given.
For (f) to apply, the information must also relate to the impact on health
and safety of the environmental factors listed at (a). The College’s view
is that Professor Ferguson’s modelling code – while relating to the impact
on human safety of the virus – does not primarily concern itself with the
impact on human safety of the environmental factors listed at (a) below.
Looking at the examples given in (a) and the intention behind the EIRs, we
do not think that the COVID-19 virus is a state of the elements or a
factor likely to affect those elements as defined in the EIRs. We
therefore maintain that the correct access to information regime is the
Freedom of Information Act.
(a) the state of the elements of the environment, such as air and
atmosphere, water, soil, land, landscape and natural sites including
wetlands, coastal and marine areas, biological diversity and its
components, including genetically modified organisms, and the interaction
among these elements;
(b) factors, such as substances, energy, noise, radiation or waste,
including radioactive waste, emissions, discharges and other releases into
the environment, affecting or likely to affect the elements of the
environment referred to in (a);
(c) measures (including administrative measures), such as policies,
legislation, plans, programmes, environmental agreements, and activities
affecting or likely to affect the elements and factors referred to in (a)
and (b) as well as measures or activities designed to protect those
elements;
(f) the state of human health and safety, including the contamination
of the food chain, where relevant, conditions of human life, cultural
sites and built structures inasmuch as they are or may be affected by the
state of the elements of the environment referred to in (a) or, through
those elements, by any of the matters referred to in (b) and (c)
Having reviewed the College’s original response, your request for a review
and relevant guidance, I am satisfied that the College correctly applied
the exemption at S.22 of the Freedom of Information Act. If you are
unhappy with the outcome of my review of your request, you have the right
to complain to the Information Commissioner’s Office.
Yours,
Anita Hunt
Access to Information Manager
Central Secretariat
Imperial College London | South Kensington Campus | Faculty Building
Level 4 | London SW7 2AZ
Tel: +44 (0)20 7594 5107
Dear Anita Hunt,
Thank you for your response on 21 May 2020 turning down my requests to review the exemptions applied.
This is to let you know (and anyone else who is following this request) that I have submitted a complaint to the ICO.
The text of my complaint is as follows:
"""
A release of code has been made that was already cleaned up by experienced coders at: https://github.com/mrc-ide/covid-sim
However, this code does not match the description of the code that was actually used to inform government policy: a "single 15k line C file".
https://twitter.com/ID_AA_Carmack/status...
This work is so consequential that we must have the original code on the record in order to properly account for how it came to be in the state that it was and to inform policy measures on how academic code is managed.
It is arguable that, had experienced coders been permitted to see it at any time during its long development, Prof Ferguson would have been informed of methods of coding and debugging that would have substantially improved the efficacy of his team's work. This could have produced reliable scientific results sooner and convinced the government to lock down earlier.
I believe that in Section 2(1)(b) of the EIR a virus contamination should be read as a "factor", even though it is not explicitly listed in the "such as" clause.
http://www.legislation.gov.uk/uksi/2004/...
"""
Yours sincerely,
Julian Todd
Dear Mr Todd,
You may be interested to learn that on 1 June, Imperial’s COVID-19
Response Team published the script to reproduce its 16 March coronavirus
report (commonly referred to as [1]Report 9). The code, script and
documentation are [2]available on GitHub. All assumptions are documented in
the report and are available on GitHub. Further information about the
release and the code-check certification can be found on the [3]College’s
news pages.
Yours,
Freedom of Information Team
[4]Imperial College London
References
Visible links
1. https://www.imperial.ac.uk/mrc-global-in...
2. https://github.com/mrc-ide/covid-sim/tre...
3. https://www.imperial.ac.uk/news/197875/c...
4. http://www.imperial.ac.uk/
Dear IMPFOI,
Thank you for the update. That's great news to hear, and quite a relief.
However, I still believe we are due a copy of the original 15,000 line C file for its historical value as evidence for how we got to this place.
As I have commented elsewhere, this software ought to be running its simulations continually and with high granularity -- exactly as we do with storm predictions -- instead of once in a blue moon, which leaves most policy decisions in the dark most of the time.
Given the substantial progress that has been made since the code was belatedly published, maybe we could have had it this good had it been published 10 years ago. So I am doing what I can with this request to make sure there is enough clear evidence to ensure that this lesson never gets overlooked again.
Yours sincerely,
Julian Todd
Alistair Haimes left an annotation
Julian: did you ever hear from the Information Commissioner? I think it is vital that somebody outside Imperial is able to review the original code as it was presented to SAGE, not a cleaned up copy.
Jacob Halsey left an annotation
This was finally answered in a similar request:
https://www.whatdotheyknow.com/request/f...
https://www.whatdotheyknow.com/request/f...
https://drive.google.com/file/d/1cwTDgvU...
Julian Todd left an annotation
Jacob has it right, the code was made available in another FOI.
I was contacted by the ICO with an offer to send me the code, but -- entirely due to my own incompetence -- I didn't read it in time to reply and give them a place to send the code. I was very embarrassed. I'm glad someone else has got it.
The conclusion of those who assessed the code in detail was that it was perfectly fine, probably better than most academic software in the field. My opinion was that we were too late in making these requests and putting pressure on. If this had been worked on as open science for the past 20 years, it could have gotten a lot further ahead in its capability to use many more datasets (e.g. anonymized phone tracking), make fine-grained real-time predictions (as we do for weather forecasts), and immediately tell when government policies were working or not making any difference. That's the actual tragedy of the situation, not that there were any mistakes in the code, because there weren't.
Ganesh Sittampalam left an annotation
Looks like it's just been published here: https://github.com/mrc-ide/covid-sim