Visualizing COVID-19 mortality in Europe and its neighborhood
A MOOD output
1 Framework
Goal
To provide an overview of COVID-19-related mortality in the most affected European countries, highlighting the public-health consequences of COVID-19 transmission dynamics, and in particular to help determine whether the daily mortality rate has peaked and started to decrease.
Scope
The geographical scope of the study is:
the European Union and the UK (28 countries),
plus countries in the European Free Trade Association (EFTA): Iceland, Liechtenstein, Norway and Switzerland;
plus candidate countries for EU membership: Albania, Montenegro, North Macedonia, Serbia and Turkey;
plus European, non-EU countries: Bosnia and Herzegovina and Kosovo;
plus countries in the European neighborhood: Algeria, Armenia, Azerbaijan, Belarus, Egypt, Georgia, Israel, Jordan, Lebanon, Libya, Moldova, Morocco, Syria, Tunisia and Ukraine.
2 Results
Daily death rate
- Daily death rates show large day-to-day variations for a given country. A part of these variations is related to delays in sub-national data collection, and subsequently, to the correction of these delays. Therefore, it is wiser to interpret the trends, rather than specific data points.
On fig. 2.1, countries are ordered from the highest cumulative death rates on the top left corner (Belgium), to the lowest on the bottom right corner (Finland) of each subset of plots (daily death rates on the left, and cumulative death rates on the right).
Belgium is the most severely hit country (cumulative death rate: 78.7 deaths \(10^{-5}\) inh.).
The 46-day post-lockdown limit has been reached by all countries (the fourth vertical, red, dashed line from the left, if any). This means that, given the maximum recorded delay between infection and death (46 days), all people who died after this 46-day limit were exposed to the virus after lockdown was implemented.
Periodic variations of daily mortality rates are observed in several countries: UK, Netherlands, Sweden… The cause is unclear, but the consequence is a high uncertainty in the latest estimated values of the trend.
Cumulative death rate
In the 54 countries accounted for\(^1\), the total number of deaths is now 166,302 for an overall population size of 943.1 million, representing a cumulative death rate of 17.6 deaths \(10^{-5}\) inhabitants (inh.).
\(^1\) List of countries (number of deaths in brackets): Belgium (9080), Spain (27709), Italy (32007), United Kingdom (34796), France (28239), Sweden (3698), Netherlands (5694), Ireland (1547), Switzerland (1602), Luxembourg (107), Portugal (1231), Germany (8007), Denmark (548), Austria (629), Romania (1107), Finland (300), Moldova (215), Slovenia (104), Turkey (4171), Estonia (64), Hungary (467), Norway (233), Bosnia and Herzegovina (132), Israel (272), Iceland (10), Czech Republic (297), Serbia (231), Liechtenstein (1), Poland (936), Croatia (95), Lithuania (59), Armenia (61), Belarus (171), Bulgaria (112), Greece (165), Kosovo (29), Montenegro (9), Cyprus (17), Malta (6), Algeria (555), Ukraine (535), Albania (31), Latvia (19), Egypt (645), Morocco (192), Slovakia (28), Azerbaijan (40), Tunisia (46), Lebanon (26), Georgia (12), Jordan (9), Libya (3) and Syria (3).
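As a check on the overall cumulative death rate quoted above, the value follows directly from the two totals. The symbols \(D\) (total deaths) and \(N\) (total population) are introduced here for illustration only:

\[
c = \frac{D}{N} \times 10^{5} = \frac{166{,}302}{943.1 \times 10^{6}} \times 10^{5} \approx 17.6 \ \text{deaths per 100,000 inh.}
\]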
By country
Spatial distribution
All the countries most affected by COVID-19 are located in south-western Europe, with the exception of Sweden (ranking 6 on fig. 2.3). Sweden is also one of the few EU countries where strict lockdown measures were not adopted.
The cumulative death rate (fig. 2.2) is contrasted between these countries\(^1\) (in red on the plot) and the others, with lower death rates in Germany (2.8 \(10^{-5}\) inh.), or even lower in Poland (0.4 \(10^{-5}\) inh.).
Though between-country comparisons should be done very carefully, and the epidemic is far from its end, low death rates are reported in Finland, the Baltic countries, Central Europe, and south-eastern Europe. These countries are heterogeneous in terms of socio-economic features, as well as connectivity and international movements.
\(^1\) Belgium, France, Ireland, Italy, Luxembourg, Portugal, Spain, The Netherlands, United Kingdom
Mortality growth rate
Time trend
Trends in daily mortality growth rates are shown on fig. 2.4:
The mortality growth rate is now negative in all the most affected countries but Switzerland (probably related to periodic variations of the daily mortality rate).
In addition, post-lockdown days 11 and 19 actually represent pivotal times for the trend:
Spain, Italy, the UK, and the Netherlands show an early response (11 days) to the lockdown measures,
Belgium, France, Sweden, and Ireland show a later response (19 days).
For the other, less affected countries, the daily mortality growth rate is slowly decreasing.
Mortality pattern
The set of indicators that were estimated is shown in tab. 2.1.
Country | d2peak (d) | ldpeak (d) | cumdr (/100 k) | maxdr (/100 k) | rgr (%) | T50 (d) |
---|---|---|---|---|---|---|
Austria | 104 | 24 | 7 | 0.19 | 17 | 26 |
Luxembourg | 102 | 26 | 17 | 0.60 | 26 | 28 |
Spain | 100 | 26 | 59 | 1.41 | 26 | 28 |
Switzerland | 104 | 27 | 19 | 0.45 | 20 | 29 |
France | 105 | 28 | 43 | 1.25 | 32 | 30 |
Ireland | 115 | 28 | 32 | 1.40 | 18 | 30 |
Italy | 97 | 28 | 53 | 1.06 | 19 | 30 |
United Kingdom | 111 | 28 | 52 | 1.30 | 21 | 30 |
Germany | 110 | 30 | 10 | 0.28 | 31 | 32 |
Netherlands | 106 | 30 | 33 | 0.79 | 28 | 32 |
Portugal | 46 | 31 | 12 | 0.28 | 28 | 33 |
Belgium | 109 | 32 | 79 | 2.38 | 73 | 34 |
Denmark | 105 | 34 | 9 | 0.21 | 93 | 36 |
Romania | 119 | 34 | 6 | 0.14 | 14 | 36 |
Sweden | 115 | 40 | 37 | 0.93 | 140 | 42 |
Finland | 116 | 41 | 5 | 0.25 | 432 | 43 |
Several of these indicators are highly correlated (fig. 2.5).
The time from lockdown to the peak (ldpeak) is strongly correlated with the relative increase in cumulative mortality between the lockdown and the peak (rgr), as well as with the time from the mortality peak to its reduction by 50% (T50%). We discard ldpeak from the rest of the analysis.
The cumulative mortality rate (cumdr) is highly correlated with the maximum daily mortality rate (maxdr). We keep the latter for subsequent steps.
Therefore, we are left with three indicators: (i) the time interval between the first death and the peak (d2peak), (ii) the time interval between the lockdown and the peak, and (iii) the maximum daily mortality rate (maxdr).
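The correlations mentioned above can be checked against the values printed in tab. 2.1. The sketch below is illustrative only: it is written in Python rather than the R used for this document, it assumes a Pearson correlation (the coefficient actually used for fig. 2.5 is not stated), and the helper name `pearson` is hypothetical. Note that in tab. 2.1 the T50 value exceeds ldpeak by exactly 2 days in every row, so these two columns are perfectly correlated.

```python
from math import sqrt

# Columns copied from tab. 2.1, in table row order (Austria ... Finland).
ldpeak = [24, 26, 26, 27, 28, 28, 28, 28, 30, 30, 31, 32, 34, 34, 40, 41]
t50 = [26, 28, 28, 29, 30, 30, 30, 30, 32, 32, 33, 34, 36, 36, 42, 43]
cumdr = [7, 17, 59, 19, 43, 32, 53, 52, 10, 33, 12, 79, 9, 6, 37, 5]
maxdr = [0.19, 0.60, 1.41, 0.45, 1.25, 1.40, 1.06, 1.30,
         0.28, 0.79, 0.28, 2.38, 0.21, 0.14, 0.93, 0.25]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r_ldpeak_t50 = pearson(ldpeak, t50)   # time to peak vs. time to 50% decay
r_cumdr_maxdr = pearson(cumdr, maxdr)  # cumulative vs. maximum daily rate

print(f"ldpeak vs T50:  r = {r_ldpeak_t50:.3f}")
print(f"cumdr vs maxdr: r = {r_cumdr_maxdr:.3f}")
```

A pair of indicators with a correlation close to 1 carries nearly the same information, which is the rationale for keeping only one member of each pair.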
We use a principal component analysis (PCA), followed by a hierarchical classification analysis on the PCA scores to identify clusters of countries sharing similar mortality patterns (fig. 2.6).
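A minimal sketch of this two-step procedure is given below, in Python rather than the R used for this document. Several choices are assumptions and not the authors' settings: the three input columns (d2peak, maxdr, T50 from tab. 2.1), the standardization of the variables, the Ward linkage, and the use of NumPy and SciPy. The resulting groups are therefore not guaranteed to match the classes shown in fig. 2.6.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

countries = ["Austria", "Luxembourg", "Spain", "Switzerland", "France",
             "Ireland", "Italy", "United Kingdom", "Germany", "Netherlands",
             "Portugal", "Belgium", "Denmark", "Romania", "Sweden", "Finland"]

# Columns: d2peak (d), maxdr (/100 k), T50 (d), copied from tab. 2.1.
X = np.array([
    [104, 0.19, 26], [102, 0.60, 28], [100, 1.41, 28], [104, 0.45, 29],
    [105, 1.25, 30], [115, 1.40, 30], [97, 1.06, 30], [111, 1.30, 30],
    [110, 0.28, 32], [106, 0.79, 32], [46, 0.28, 33], [109, 2.38, 34],
    [105, 0.21, 36], [119, 0.14, 36], [115, 0.93, 42], [116, 0.25, 43],
], dtype=float)


def pca(X):
    """PCA on standardized columns; returns scores and explained-variance ratios."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigen-decomposition of the correlation matrix (eigh returns ascending order).
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return Z @ eigvecs, eigvals / eigvals.sum()


scores, explained = pca(X)

# Agglomerative clustering on the PCA scores (Ward linkage is an assumption),
# cut so that at most 4 classes are formed.
tree = linkage(scores, method="ward")
labels = fcluster(tree, t=4, criterion="maxclust")

print("explained variance ratios:", np.round(explained, 2))
for name, label in zip(countries, labels):
    print(f"{name}: class {label}")
```

Because all three components are kept, clustering on the scores is equivalent to clustering on the standardized indicators; keeping only the first two axes would be an alternative design choice.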
Results of the PCA (fig. 2.6) show:
there is not a single prominent axis (A: bar plot of the PCA eigenvalues),
the number of days to the peak (d2peak) is strongly positively correlated with the first PCA axis, while the second axis is strongly and positively correlated with high values of T50%, the persistence of the daily mortality rate (slow decay after the peak). Countries with high values of maximum daily mortality rate are in the bottom part of the plane (B: PCA correlation circle).
We retain a partition of the countries into 4 classes (C: dendrogram of the hierarchical ascending classification).
The scatter plot of countries according to the PCA scores (fig. 2.7) reveals that class-C countries are located in the bottom-right part of the plane (high values of the cumulative death rate, and low values of the 50% persistence of post-peak mortality rate): Italy, France, the UK, Ireland, Spain, and Belgium.
3 Conclusion
This study categorizes the EU countries with respect to the shape of their daily death rate pattern, which should make between-country comparisons possible. The next step will be to use mobility and lockdown data, possibly split into sub-national data, to explain these differences.
Limitations
Because COVID-19 mortality is highly dependent on the age of infected patients [2], death rates standardized on the age structure of the target population are needed for safer country comparisons. Unstandardized rates are used here.
National epidemics, starting here by definition on the day of the first reported death, are not synchronized. Therefore, country comparisons for overall death rates must take this into account.
European countries are heterogeneous in areas and population densities. In the largest countries (e.g., France, Italy…), there are strong sub-national differences, including in disease transmission, both in time and incidence rate.
The case definition (what is a COVID-19-related death?) differs between countries. In addition, death notifications were initially based on deaths occurring in hospitals, ignoring those recorded in retirement homes (at least in Belgium, France, and Spain). Progress has been made in correcting these issues, but there is still room for improvement. Moreover, diagnostic tests were not always performed on deceased people without a prior COVID-19 diagnosis (both in hospitals and in retirement homes), thus introducing a downward bias in the estimates.
The smoothing method used in this document (Bayesian random walk model) is robust, but the amount of smoothing to apply is somewhat subjective because we are looking for stability rather than the best-fitting model. The statistical challenge is to provide robust estimates in the presence of strong outliers, which make the trend unstable at its edges. Simulation studies would be needed to address this question.
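To illustrate why the amount of smoothing is a choice rather than a given, the sketch below applies a second-order difference penalty to a short series with two penalty weights. It is not the INLA model used in this document: it is a Whittaker-type smoother, which coincides with the posterior mean of a Gaussian order-2 random walk model only when the noise and random-walk precisions are held fixed. The function name `rw2_smooth`, the weight `lam`, and the data are hypothetical, and the code is Python with NumPy rather than R.

```python
import numpy as np


def rw2_smooth(y, lam):
    """Minimize sum((y - z)^2) + lam * sum(second differences of z)^2."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # (n-2) x n matrix whose rows compute z[i] - 2*z[i+1] + z[i+2].
    D = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)


def roughness(z):
    """Sum of squared second differences: 0 for a straight line."""
    return float(np.sum(np.diff(z, n=2) ** 2))


# Hypothetical noisy daily rates (per 100,000 inh.).
y = np.array([0.2, 0.9, 0.4, 1.1, 0.7, 1.4, 0.6, 0.3])

light = rw2_smooth(y, lam=1.0)    # follows the data closely
heavy = rw2_smooth(y, lam=100.0)  # much stiffer trend

print("roughness, lam=1:  ", round(roughness(light), 4))
print("roughness, lam=100:", round(roughness(heavy), 4))
```

A larger weight yields a stiffer, more stable trend at the cost of fit, which is the trade-off described above.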
Also, the between-country comparison is limited to countries with a mortality peak. The absence of such a peak might be related to different causes:
the daily mortality rate is stable: there are no such countries in the subset studied here,
the daily mortality rate is increasing and has not peaked yet. This might be the case for Sweden.
a persisting low or very low daily death rate, a situation met in many countries of northern Europe (Finland, the Baltic countries), Central Europe (Poland, Czech Republic, Slovakia, Romania…), and south-eastern Europe (Greece, Cyprus).
Perspectives
Mortality
Define other indicators for the daily mortality growth rate, e.g. cumulative mortality rate at lockdown, and change in cumulative mortality rate between lockdown and peak. This would allow the identification of a size effect in the data (absolute value of mortality rate), and get rid of it using multivariate analysis (e.g., principal component analysis) before identification of country clusters with respect to daily mortality patterns.
Should this approach of pattern recognition look interesting for ECDC and/or national public health agencies, it might be possible to extend it to sub-national data (because national data have strong spatial heterogeneity).
It would also be very useful to use standardized mortality data, in particular on age, to overcome the under-declaration bias in the elderly.
In the longer term, it is of high importance to homogenize the definition of a (mortality) case of COVID-19. For example, Belgium has a very inclusive case definition, including clinical, unconfirmed suspicion.
Other indicators of virus transmission
- We might also consider using other indicators, such as the number of patients admitted to intensive care units (ICU), or of people testing positive in virological or serological assays. Regarding mortality, a complementary approach would be to use excess deaths, rather than the mortality itself.
Assessing the effect of control measures
- Whatever the response used (mortality, ICU, cases, mortality excess…), the next step will be to assess the effect of lockdown measures or their lift. Moreover, several MOOD teams are now engaged in providing methods and tools to monitor these measures.
4 Data and methods
4.1 Data
Population data are available from the United Nations’ World Population Prospects 2019 as a csv file. We use the 2019 estimates.
Mortality data are available from the European Centre for Disease Prevention and Control (ECDC) website, which provides daily updates on COVID-19-related mortality.
Lockdown data have been
4.2 Indicators
Mortality indicators:
The apparent daily mortality rate \(\pi_i\) is the daily death count \(m_i\) for a given country, scaled by the population size \(N\) of this country, expressed in units of 100,000. The population size was considered as constant during the COVID-19 epidemic. The index \(i\) is the number of days after the start of the epidemic. The start is set to the day of the first reported death related to COVID-19. In some countries, the counts \(m_i\) only refer to deaths occurring in hospitals. Therefore, there is a negative bias in death count: the actual count is higher than \(m_i\).
The cumulative mortality rate \(c_i\) is the cumulative death count, from the start of the epidemic (day of the first reported death) to the current day \(i\), scaled by the population size (considered as constant during the COVID-19 epidemic) and expressed in units of 100,000 inh.
The mortality growth rate \(g_i\) is the difference in death counts between two consecutive days \(i-1\) and \(i\): \(g_i = m_i - m_{i - 1}\), scaled by the population size and expressed in units of 100,000 inh. Because the apparent daily death rate shows strong variations, a smoothed mean and its 95% credible band are fitted with a Bayesian order-2 random walk model, and added to the plots to show the average trend.
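The three indicators defined above can be sketched as follows. This is an illustration in Python, not the R code behind this document; the function name `mortality_indicators` and the death counts are hypothetical, and the growth rate is left undefined on the first day because there is no previous count.

```python
from itertools import accumulate


def mortality_indicators(deaths, population):
    """Return daily, cumulative, and growth rates per 100,000 inhabitants.

    deaths[i] is the death count m_i on day i after the first reported death;
    the population size N is treated as constant.
    """
    scale = 100_000 / population
    daily = [m * scale for m in deaths]                       # pi_i
    cumulative = [c * scale for c in accumulate(deaths)]      # c_i
    growth = [None] + [(deaths[i] - deaths[i - 1]) * scale    # g_i
                       for i in range(1, len(deaths))]
    return daily, cumulative, growth


# Hypothetical counts for a country of one million inhabitants.
deaths = [1, 3, 6, 4]
daily, cumulative, growth = mortality_indicators(deaths, population=1_000_000)

print("daily:     ", daily)
print("cumulative:", cumulative)
print("growth:    ", growth)
```

The smoothed trend and its credible band are not reproduced here, since they depend on the Bayesian model fitted with INLA.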
To interpret the mortality indicators, we use the following features of COVID-19 infections [2]:
an incubation period of 5 days on average (the time interval between exposure to the virus and the onset of symptoms),
a median time of 14 days (ranging from 6 to 41 days) between the onset of symptoms and the death of infected patients.
Interpretation of changes in mortality indicators:
a decrease of mortality indicators should be observed between 11 and 46 days after the adoption of lockdown measures, with a maximum effect starting 19 days after this adoption.
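These three thresholds follow from adding the mean incubation period to the shortest, median, and longest times from symptom onset to death listed above. Written out (the symbols \(t_{\min}\), \(t_{\text{med}}\), and \(t_{\max}\) are introduced here for illustration only):

\[
\begin{aligned}
t_{\min} &= 5 + 6 = 11 \ \text{days},\\
t_{\text{med}} &= 5 + 14 = 19 \ \text{days},\\
t_{\max} &= 5 + 41 = 46 \ \text{days}.
\end{aligned}
\]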
To visualize these effects, vertical lines are added to the plots at days 11, 19, and 46 after the start of the implementation of lockdown measures. Under these assumptions, the effect on mortality might be visible from day 11 after lockdown. To quantify it, the slope of the empirical trend on day 19 is estimated as \(\frac{dg_{19}}{dt} = \frac{d^2\pi_{19}}{dt^2}\). An arrow with that slope, tangent to the trend line on day 19, is drawn on the plot (fig. 2.4). A strong, negative slope corresponds to a fast reduction in daily mortality rate.
Finally, the trend analysis described above can be used to estimate the time when the daily death rate starts to decrease, i.e. when the growth rate becomes negative. For this purpose, we use the fitted values of the trend (posteriors of the Bayesian model). Building on that, we can derive further indicators of the shape of the daily mortality rate pattern. For example:
the relative increase in cumulative mortality rate, i.e.: \(\Delta c_{\text{peak}} = 100 \times \frac{c_{\text{peak}} - c_{LD + 10}}{c_{LD + 10}}\), where \(c_{LD + 10}\) is the cumulative mortality rate from the start of the national outbreak to 10 days after the lockdown (thus capturing the mortality not influenced by the lockdown), and \(c_{\text{peak}}\) is the cumulative mortality rate up to the peak,
the time at 50% decay in mortality rate (\(t_{50\%}\)), i.e. the time when the daily mortality rate is reduced by 50% with respect to its peak value. To this end, we estimate the daily mortality rate at the peak \(\pi_{\text{peak}}\) by averaging the observed daily mortality rates 3 days before, and 3 days after the peak. Then we estimate the change in mortality rate between \(t\) and \(t+1\), with \(t_0\) the time of the peak (in days). For this purpose, we first approximate the trend in \(t\) by the tangent to the fitted trend in \(t\) (straight line with slope \(\hat{b}\)) and we compute the mortality change between \(t\) and \(t+1\) as \(\hat{\pi}_t \times \hat{b}\). Then we update \(\hat{\pi}\) and iterate the process until reaching \(\frac{1}{2} \, \hat{\pi}_{\text{peak}}\).
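The steps above can be sketched as follows, in Python rather than the R used for this document. All function names and input values are hypothetical. Two readings of the text are assumed and may differ from the authors' implementation: the averaging window for the peak includes the peak day itself, and the daily update is \(\hat{\pi}_{t+1} = \hat{\pi}_t + \hat{\pi}_t \times \hat{b}_t\), with \(\hat{b}_t\) supplied as a sequence of slopes taken from the fitted trend.

```python
def peak_day(growth_trend):
    """First day on which the fitted growth rate turns negative, or None."""
    for day in range(1, len(growth_trend)):
        if growth_trend[day - 1] >= 0 and growth_trend[day] < 0:
            return day
    return None  # no peak reached yet


def peak_rate(daily_rates, peak, half_window=3):
    """Mean observed daily rate from 3 days before to 3 days after the peak."""
    window = daily_rates[max(0, peak - half_window): peak + half_window + 1]
    return sum(window) / len(window)


def relative_increase(c_peak, c_ld10):
    """Relative increase (%) in cumulative rate between LD + 10 and the peak."""
    return 100.0 * (c_peak - c_ld10) / c_ld10


def days_to_half_peak(pi_peak, slopes):
    """Days after the peak until the rate falls to half its peak value.

    slopes[k] is the tangent slope used for the step from day k to day k + 1
    after the peak (negative while the rate decreases).
    """
    pi = pi_peak
    for day, b in enumerate(slopes, start=1):
        pi += pi * b  # mortality change between t and t + 1
        if pi <= 0.5 * pi_peak:
            return day
    return None  # half-peak level not reached within the supplied slopes


# Hypothetical inputs.
growth_trend = [0.05, 0.03, 0.01, -0.02, -0.04]
daily_rates = [0.2, 0.4, 0.6, 0.8, 1.0, 0.8, 0.6, 0.4, 0.2]

t_peak = peak_day(growth_trend)
pi_peak = peak_rate(daily_rates, peak=4)
delta_c = relative_increase(c_peak=30.0, c_ld10=20.0)
t_half = days_to_half_peak(1.0, [-0.1] * 20)

print(t_peak, round(pi_peak, 3), delta_c, t_half)
```

Returning `None` when no sign change or no halving is found keeps countries without a peak out of the comparison instead of assigning them an arbitrary value.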
5 Appendix
5.1 Software
The freely available R software [3] is used for data management and analysis, as well as additional packages: lattice [4], latticeExtra [5], ggplot2 [6], ggrepel [7], ggthemes [8], and RColorBrewer [9] for plotting; sp [10], raster [11], and cshapes [12] for mapping; INLA [13] for the smoothed mean on the growth rate plots.
This document is the output of an rmarkdown [14] source file compiled with Pandoc, a universal document converter (http://pandoc.org).
Disclaimer
Much more comprehensive information is available in dedicated COVID-19 web resources such as those of the European Centre for Disease Prevention and Control (ECDC) and the World Health Organization (WHO), as well as on the websites of many national public-health agencies. The website Our World in Data provides useful information on all aspects of COVID-19 [15].
The data analysis presented here has NOT been peer-reviewed, and thus, errors may exist. Comments and contributions are more than welcome.
References
1. Porta M. A dictionary of epidemiology. Oxford University Press; 2008. doi:10.1093/acref/9780195314496.001.0001
2. Rothan HA, Byrareddy SN. The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak. Journal of Autoimmunity. 2020; 102433. doi:10.1016/j.jaut.2020.102433
3. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2020. Available: https://www.R-project.org/
4. Sarkar D. Lattice: Multivariate Data Visualization with R. New York: Springer; 2008. Available: http://lmdvr.r-forge.r-project.org
5. Sarkar D, Andrews F. latticeExtra: Extra Graphical Utilities Based on Lattice. 2019. Available: https://CRAN.R-project.org/package=latticeExtra
6. Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York; 2016. Available: https://ggplot2.tidyverse.org
7. Slowikowski K. ggrepel: Automatically Position Non-Overlapping Text Labels with ’ggplot2’. 2020. Available: https://CRAN.R-project.org/package=ggrepel
8. Arnold JB. ggthemes: Extra Themes, Scales and Geoms for ’ggplot2’. 2019. Available: https://CRAN.R-project.org/package=ggthemes
9. Neuwirth E. RColorBrewer: ColorBrewer Palettes. 2014. Available: https://CRAN.R-project.org/package=RColorBrewer
10. Bivand RS, Pebesma E, Gomez-Rubio V. Applied spatial data analysis with R, second edition. Springer, NY; 2013. Available: https://asdar-book.org/
11. Hijmans RJ. Raster: Geographic data analysis and modeling. 2020. Available: https://CRAN.R-project.org/package=raster
12. Weidmann NB, Gleditsch KS. Cshapes: The cshapes dataset and utilities. 2016. Available: https://CRAN.R-project.org/package=cshapes
13. Blangiardo M, Cameletti M, Baio G, Rue H. Spatial and spatio-temporal models with R-INLA. Spatial and spatio-temporal epidemiology. 2013;4: 33–49. doi:10.1016/j.sste.2012.12.001
14. Xie Y, Allaire JJ, Grolemund G. R markdown: The definitive guide. Boca Raton, Florida: Chapman and Hall/CRC; 2018. Available: https://bookdown.org/yihui/rmarkdown
15. Roser M, Ritchie H, Ortiz-Ospina E. Coronavirus Disease (COVID-19) – Statistics and Research. Our World in Data. 2020.