Sir,

As the initial shock of the SARS-CoV-2 pandemic subsides, attention turns to comparing the effectiveness of countries’ strategies in suppressing the spread of the virus. As is apparent from much of the ongoing political and scientific discussion, this task is more challenging than one might expect and needs careful handling to maintain public trust in the science presented. The difficulty stems from uncertainty in the data being compared, arising from a source that we rarely consider.

Most scientists and even most of the public are familiar with the concept of measurement uncertainty. Indeed, it is widely appreciated outside the metrology community that this uncertainty may arise from the variability of repeated measurements and other inputs such as the calibration of measuring equipment. When considering the differing effects of SARS-CoV-2, we unavoidably compare, between countries, the numbers (usually per head of population) of deaths, new infections, hospital admissions, diagnostic tests, etc. The uncertainty of ‘counting’ measurements such as these is a less familiar concept and, until relatively recently, counting was not even viewed as measurement. Nevertheless, uncertainty in counting is important, especially when we are unable to count every event or item individually, either because they are too numerous or because we can only access a sample of the target population. Examples include measuring the number concentration of respirable particles in ambient air, determining radioactive decay rates, and estimating employment statistics from information gathered from a sample of the whole population.

Deaths caused by SARS-CoV-2 may be counted in their entirety, and yet there is still uncertainty present in this apparently simple task, even if this is not expressed in the public presentation of data with error bars and confidence intervals. This is because uncertainty arises from a source rarely considered and certainly not understood by the public: the measurand, the ‘quantity intended to be measured’ [1]. Put simply, we need to describe in enough detail how we are doing the counting, what is included and what is not; otherwise it is unclear, or uncertain, what we mean by our measurement result. Behind the summary statistic ‘number of SARS-CoV-2 associated deaths’ presented to the public is a measurand needing significant qualification and explanation. For instance, does this figure include only deaths in hospital, only those accompanied by a positive diagnostic test (and then what is the cut-off for the time between testing and death [2]), those where SARS-CoV-2 is mentioned on the death certificate, or all excess deaths over and above the long-term average (and then what long-term average is this judged against)? There are very many options, all potentially credible metrics, but all giving different results.

In fact, uncertainty in the measurand is almost always present in measurement science and is often an unrecognised source of irreproducibility in science, but usually it is insignificant in comparison with the traditional measurement uncertainty contributions from repeatability and input calibrations. Where it is not insignificant, the metrology community refers to these measurands as ‘method defined’ or ‘operationally defined’ [3]. This situation is particularly common in chemistry and materials science. In these cases, we must give all the details required to reproduce the method in question adequately in order to reap the benefits that good metrology delivers: stability over time, to provide confidence in trends, and comparability between measurements made in different locations, to ensure the overall reproducibility of results and the robustness of the conclusions we draw from them. The same is true for metrics associated with the pandemic.

Of course, such method-defined measurands usually have their definitions and measurement processes described in documentary standards or procedures, agreed by committees of experts. For SARS-CoV-2, what is being counted is usually well documented within individual countries [4], even if what should be counted is often contested. Changes to these processes within countries may still cause discontinuities in trends, for instance, a decision to begin including deaths in settings other than hospitals. Importantly, however, consensus between countries on how to compile these statistics is currently lacking. International agreement, perhaps in the form of documentary standards, is needed to address this deficiency. Failing to agree these counting methods globally risks obscuring which virus suppression measures are most effective and confusing the communication of the data to the public; attempting to explain that not every death is equal is an unedifying and ultimately futile task. There is a danger that without a clear, universal message the public will lose confidence in the science because, from their point of view, the data keep changing for reasons that are at best unclear. We can restore this confidence by having clear agreement about what data should be presented.

International agreement on documentary standards requires the attention to detail and dedication to global comparability that are at the core of metrology and of the ongoing work of National Metrology Institutes and Designated Institutes as the point of highest reference in the measurement system. Of course, this approach is more widely applicable to other SARS-CoV-2 problems, for instance, agreed standardised methods to measure the false-positive or false-negative rates of antigen and antibody testing [5], especially when the marketplace is increasingly competitive.

Widespread adoption of metrology principles in these counting tasks is essential to retain public trust in the numbers presented and to ensure the global comparability and reproducibility that we all need in order to learn the lessons of the SARS-CoV-2 pandemic as quickly as possible. The metrology community must be at the forefront of these efforts.