Covid-19 and a Data Detective Story
One of the two burly men in uniform pointed his gun at me and asked me to get out of the car. Naturally, I complied without argument.
This could have happened in any city in any country, I suppose. But it actually happened in Lagos, Nigeria.
The Nigerian Paradox
My experience was far from exceptional. The Nigerian police has a very bad reputation for both brutality and corruption. According to a 102-page report by Human Rights Watch, Nigerians are more likely to encounter police demanding bribes than enforcing the law. Nigeria was recently rocked by anti-police agitations.
The year 2020 has seen a vast natural experiment play out, as different continents, countries and sub-national regions react to Covid-19 in different ways. It has become fairly clear, as a result, that an effective Covid response requires:
- A good healthcare system (in terms of both quality and access)
- Good governance
- A disciplined populace that trusts the government and follows necessary Covid prevention measures such as wearing masks and social distancing
None of the three seem to be much in evidence in Nigeria.
- A Lancet article of 2018 ranked Nigeria 142nd out of 195 countries in terms of healthcare access and quality. Nigerians themselves seem to have little faith in the healthcare system, and those who can afford it seek medical care abroad. According to a PWC report of 2016, Nigerians spend 1 billion US dollars annually on medical tourism. To put that in perspective, this was nearly as high as the federal health budget for the same year ($1.3 billion) and nearly 20% of the entire healthcare spending ($5.85 billion). Well-meaning friends in Nigeria strongly recommended whiskey, rather than a visit to the local clinic, as a cure for my persistent sore throat, which was induced by Harmattan, a dry, dusty wind that blows from the Sahara across Nigeria each winter, bringing respiratory problems in its wake.
- The Worldwide Governance Indicators project of the World Bank reports six governance indicators for over 200 countries and territories. For 5 out of 6 of these indicators, Nigeria ranks below the 20th percentile among all nations.
- The fabric of communal trust is weak in Nigeria. According to the Nigerian novelist Adaobi Tricia Nwaubani, many of her countrymen consider internet scammers as role models. The UK-based think tank Chatham House claims that “there is a yawning gap in trust and accountability between citizens and the state in Nigeria”. Coronavirus misinformation is rife, with many believing that the disease is a hoax, or at least that the government is greatly exaggerating the scale of the problem.
The population density of Africa is low overall, with its population being about the same as that of India, while its area is ten times larger. But Nigeria’s population density is fairly high, in fact higher than that of Italy. And its largest city, Lagos, a dense megalopolis of 20 million people living cheek by jowl, is just the kind of city where Covid-19 may be expected to spread fast.
All in all, Nigeria seems to have the makings of a Covid-19 disaster.
Yet data suggests that it is anything but. New Zealand and South Korea are generally both counted among the countries which have mounted the most successful Coronavirus responses. The Covid-19 death rate in Nigeria is about 6 per million, just above that of New Zealand and less than South Korea’s.
What explains Nigeria’s success? Is the Nigerian story too good to be true?
The X Factor
In fact Africa, as a whole, has been one of the few good news stories of the pandemic. It has done much better than expected when it comes to Covid-19 mortality: better, in fact, than all the other continents except Oceania. Its death rate is a bit lower than Asia’s and far lower than that of Europe and the Americas. While Africa has about the same population as India, it has only recorded approximately 60,000 confirmed Covid-19 deaths to India’s 146,000.
The African success has engendered considerable speculation among journalists, healthcare specialists and scientists. Many are looking for an as yet undiscovered “X factor” to explain Africa’s success. Possible suspects include cross-immunity due to previous infections, climatic factors and genetic predisposition.
Of course these are all plausible explanations. Several are proposed by credible experts. One or more of them might very well be true. Certainly all are worth researching.
Nevertheless, I find it inappropriate to invoke a hidden X Factor to explain Africa’s low Covid-19 mortality rate. It seems analogous to claiming that a UFO caused a car crash, as the British tabloid Mirror did on 25 Sep 2015.
While that claim could of course be true, we have no way to prove or disprove it. And does it really help us to understand the car crash better in any way?
Both X Factors and UFOs should be avoided as they violate the Principle of Parsimony, also sometimes called Occam’s Razor. This principle enjoins us to minimize the number of assumptions while modeling any phenomenon. One should avoid introducing hypothetical entities, whether X Factors or UFOs, till their existence is proved, unless it is absolutely necessary.
So What’s Behind the African Success Story?
In fact it is highly misleading to talk about an “African” success story.
Africa is a continent consisting of 54 countries. The confirmed Covid-19 death rate varies widely between countries in Africa as the above graph shows. In fact, it varies from over 400 per million for South Africa to practically zero for Burundi.
Before discussing why Covid-19 mortality rates are lower for Africa than other continents it may be a good idea to explore why they vary so much within the continent.
If we display the Covid-19 mortality data on a map this is what we see.
What stands out is that the Covid-19 mortality rate is much lower in the sub-Saharan countries closer to the equator than in the countries of north and south Africa.
If we take the latitude of the capital of each African country (expressed as a sexagesimal) as a proxy for the geographic location of the country, then we do see a relatively small but discernible correlation (0.34) between it and Covid-19 mortality rates. The relationship is statistically significant (p value = .015) at the 5% level.
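For readers who want to try this kind of check themselves, here is a minimal sketch of how a Pearson correlation and its t statistic can be computed. It is not the code I used for the numbers above. The function names are mine, and the sample size of 54 (one data point per African country) is an assumption, since the exact number of countries with usable data may be smaller.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Assumption: n = 54 data points. With about 50 degrees of freedom,
# |t| above roughly 2.0 is significant at the 5% level (two-sided).
print(round(t_statistic(0.34, 54), 2))
```

Turning the t statistic into an exact p value needs the t distribution, which a statistics library can supply. The rough threshold in the comment is enough to see why a correlation of 0.34 clears the 5% bar with this many countries.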
And this of course hints at the presence of a climatic or genetic factor or a local pattern of disease to explain the difference. Doesn’t it provide evidence for the presence of an X Factor?
But wait. Before we go hunting, let’s remember that the countries of sub-Saharan Africa are generally poorer than those of the north and south. Could it be that Covid-19 mortality rates are higher in richer African countries than in poorer ones? In fact that is the case. The correlation between Covid deaths per million and per capita GDP is about 0.42. The relationship is statistically significant (p value = .003).
Although possibly counter-intuitive, this is quite in line with global patterns. The reported Covid-19 death rate tends to be higher in many rich countries, primarily because they have older populations who are more vulnerable to Covid-19. Another possible reason for the higher death rates in richer countries is that these tend to have better Health Information Systems (HIS) than poorer ones. So, at least to an extent, the apparent higher mortality rate could simply be due to better reporting of deaths.
How well does the age distribution of the population explain the variation between Covid-19 mortality rates in different African countries?
Fairly well, it turns out. Median age explains about 36% of the variation in Covid-19 mortality rates. The result is highly significant statistically (p value = .000003). Younger countries have lower death rates, with every 1 year increase in median age associated with an approximate rise of 12 deaths per million in Covid-19 mortality.
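The regression itself is simple enough to sketch in a few lines. What follows is a minimal illustration of a one-variable least squares fit, not the script behind the figures above. The function name is mine and the toy inputs are placeholders rather than country data. In a one-variable fit, R squared is the square of the Pearson correlation, so "36% of the variation" corresponds to a correlation of about 0.6 between median age and mortality.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variance in y accounted for by the fitted line
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Placeholder inputs lying exactly on y = 1 + 2x, used only to check the function.
slope, intercept, r_squared = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, r_squared)
```

With real data, xs would hold the median age of each country and ys its Covid-19 deaths per million. The slope is then read as the change in deaths per million associated with one extra year of median age.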
That still leaves a good bit of the variation unexplained. I tried adding other variables such as geographical location, population density and per capita GDP to the mix, but that did not improve the model. The additional variables turned out to be statistically insignificant.
What else, other than age distribution of the population, is at play?
Forensic Data Science
Data science is a loose but convenient term that has gained popularity in the last 10 years. It is used to describe the use of mathematical and statistical methods, aided by software, to derive insights and build predictive and prescriptive models from data.
The starting point of data science is, well, data. However, forensic data science, an increasingly important sub-branch of data science, takes a completely different approach. It doesn’t take the data as the starting point of subsequent investigation. Instead it questions the given data and checks for the possibility of error or fraud. The forensic data scientist is a detective checking the data for evidence of crime.
Forensic data science is now a well-established multi-billion dollar industry. It is used to track down suspicious credit card transactions, fraudulent insurance claims, dubious financial statements and much else. While the area is hardly new, the recent explosive growth in big data and machine learning algorithms has given it a huge boost.
Data detectives investigating Covid-19 death figures have included public health experts, journalists and scientists. One of their major tools has been excess mortality calculations.
The South African Medical Research Council (SAMRC) reports the estimated excess mortality week by week. It is the difference between the number of deaths in a given week and the average number of deaths in the same week of 2018 and 2019. The excess mortality figures were much larger than the number of reported Covid-19 deaths in July and August. They then began to decrease in step with reported Covid-19 deaths. Of course, not all excess deaths can be attributed to Covid-19. Some are certainly due to lack of access to medical facilities during the lockdown. Still, this is fairly strong evidence that a significant proportion of Covid-19 deaths in South Africa have gone unreported. South Africa has had nearly 25,000 reported deaths to date, but researchers estimate that the real toll is much higher.
Few African countries have data infrastructure and HIS as good as South Africa’s. In Ghana, for example, each death is recorded by hand before being copied into a central death registry in Accra. Excess mortality figures for African countries other than South Africa are therefore hard to come by. But it seems likely that Covid-19 deaths are under-reported in other countries of Africa as well, quite possibly to a much greater extent than in South Africa.
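The excess mortality definition given above can be sketched as follows. This is only an illustration of the simple "observed minus average of earlier years" calculation as I have described it, not SAMRC's actual method. The function name is mine and the numbers in the example are invented placeholders.

```python
def weekly_excess_deaths(observed, baseline_years):
    """Excess deaths per week: observed deaths minus the mean of the
    same week in each baseline year.

    observed: list of weekly death counts for the year under study.
    baseline_years: list of lists, one per earlier year, aligned week by week.
    """
    excess = []
    for week, deaths in enumerate(observed):
        baseline = sum(year[week] for year in baseline_years) / len(baseline_years)
        excess.append(deaths - baseline)
    return excess


# Invented placeholder counts for two weeks, with two baseline years.
observed_2020 = [120, 150]
baseline = [[100, 100], [110, 90]]  # stand-ins for 2018 and 2019
print(weekly_excess_deaths(observed_2020, baseline))
```

Comparing this series with the officially reported Covid-19 deaths for the same weeks is what exposes a reporting gap. When the excess is far larger than the reported deaths, some deaths are probably going uncounted.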
The Outliers
Outliers, data points that differ in significant ways from their peers, offer important clues to data detectives.
Two blatant outliers in African Covid-19 data are Burundi and Tanzania.
Burundi appears not to have taken Covid-19 too seriously at first, with the former president Pierre Nkurunziza claiming that Burundians were “protected by God”. Schools and places of worship continued to be open in spite of the pandemic. Burundi was also one of the few countries that did not suspend its football season. The President did not impose strict social distancing rules, and in May Burundi expelled 4 WHO officials whom it had accused of interfering in the country’s coronavirus response.
The president may have paid a steep price for his casual attitude. He is widely believed to have been the first head of state to have died from Covid-19 though the official cause of his death was a heart attack.
Evariste Ndayishimiye, Nkurunziza’s successor, changed tack and launched a program of Covid testing in Bujumbura, the capital, in July. By late September 25,121 tests had been conducted and the government declared that the coronavirus outbreak was under control.
Among countries that have reported any Covid-19 deaths at all, not just in Africa but anywhere in the world, Burundi has the lowest number. With a population of 12 million, it has reported just 762 cases and 2 deaths to date. That is a better score than that of Taiwan, generally regarded as the Coronavirus superstar performer.
Most observers do not find the numbers credible. One western diplomat, speaking on condition of anonymity, described the Covid situation in Burundi as “awful” and “really, really bad”. A nurse in a private hospital reported that several people with Covid-19 symptoms had died without being tested or diagnosed.
Tanzania is another country whose Covid-19 data is extremely suspect. President John Magufuli of Tanzania refused to close places of worship during the pandemic, as he was convinced that the Coronavirus, being satanic, could not survive in the body of Christ. He recommended a herbal potion from Madagascar, which the international health community regards as unproven, as a Coronavirus remedy. On 29th April, when it had a recorded total of 509 cases and 21 deaths, Tanzania stopped reporting Covid data. Subsequently, President Magufuli declared Tanzania Covid free, thanks to the prayers of the citizens, and invited foreign tourists to visit the country.
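To see just how extreme these numbers are, it helps to put them on the same per-million scale used elsewhere in this article. Here is a minimal sketch using only the figures quoted in this paragraph. The helper name is mine, and the population of 12 million is the rounded figure given above.

```python
def per_million(count, population):
    """Convert a raw count into a rate per million inhabitants."""
    return count / population * 1_000_000


burundi_population = 12_000_000  # rounded figure quoted in the text
burundi_deaths_per_million = per_million(2, burundi_population)
burundi_cases_per_million = per_million(762, burundi_population)

print(round(burundi_deaths_per_million, 2))
print(round(burundi_cases_per_million, 1))
```

The reported death rate works out to well under one per million, against about 6 per million for Nigeria.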
Data from several other African countries, seriously affected by civil strife, are also probably extremely unreliable.
The Poster Child
Rwanda, unfortunately and unfairly mostly remembered for the genocide 26 years ago, has been one of the fastest growing nations in Africa for the last two decades. It is trying to position itself as a corruption-free and efficient “Singapore of Africa” to draw investment and tourists.
Rwanda’s Covid response — commended by WHO and described as “a model of what other low-income nations should do to respond better to health emergencies” — could not have been more different from that of its neighbors Burundi and Tanzania.
Rwanda’s innovative use of technology in its Covid war included the use of drones for monitoring social distancing measures, ferrying test samples and delivering supplies to affected families in remote locations during lockdown. Robots were used for temperature screening, reducing the exposure of healthcare workers to infection.
Rwanda combined an early lockdown with aggressive testing. It has one of the highest rates of Covid-19 testing in the African continent. Test results used to come back in a day, though with a second wave hitting Rwanda there is now more delay. The share of positive cases in test results has also remained fairly low throughout the pandemic.
Rwanda set up a team of 2,000 recruits for contact tracing in March. It now supplements them with technology in the form of mobile apps and wearable devices. While it is unlikely that it has detected every Covid-19 death, its data is likely to be quite accurate. It looks as if Rwanda’s low Covid-19 mortality rate of 5 per million is real. Of course, as in the case of other African countries, Rwanda’s low median age (20.3 years) helps, but its effective Covid response has also in all likelihood played a major role.
In Lieu of a Conclusion
An Agatha Christie whodunit always has a tidy ending, with definite answers and all loose ends nicely tied up. Data detective stories unfortunately don’t always have a similarly satisfactory ending.
We are left with some tentative (but evidence based) hypotheses at the end.
- The age structure of the population goes a long way towards explaining the variation in Covid mortality rates within the African continent. It probably plays an important role in explaining Africa’s overall low mortality rate as well.
- Unfortunately, generalization beyond this point is difficult. We don’t really have a single all-encompassing Covid narrative for Africa. The story of the African pandemic is the sum of many separate, and often very different, stories.
- Data from some African countries is extremely unreliable and their apparent low Covid mortality rate is probably largely a function of under-reporting.
- In others the low rate seems real and is likely due in large part to an effective Covid response.
- For the majority of African countries a combination of the two, under-reporting and an effective response, is probably at work.
Let’s look at Nigeria again. Nigeria certainly deserves our attention. It is the sixth most populous nation in the world and has a sixth of the African population. Inaccuracies in Nigerian data will therefore have a big impact on continental calculations.
Nigeria’s median age is particularly low (18.1 years) but that doesn’t seem to fully explain Nigeria’s exceptionally low Covid mortality rate.
Some of the credit for it should probably go to Nigeria’s Coronavirus response, which was relatively prompt. It was more or less in line with that of many lower and middle income countries such as India, the Philippines and Peru. Like them, it imposed a fairly early lockdown, which probably saved lives. After a few weeks they had to relax it because of the economic pain, even though the epidemic growth curve was far from flattened. The lockdown did, however, probably buy Nigeria some much needed time to strengthen its contact tracing, testing and healthcare facilities. In time the curve did flatten, as in many other low and middle income countries, though the reasons for this are complex and not yet fully understood. Nigeria, again like much of Africa, now faces a second wave.
However, there are valid grounds to question Nigeria’s data. Nigeria lacks South Africa’s data infrastructure and Rwanda’s administrative capacity. Its Covid testing rate (4.3 per 1,000) has been low and the share of test positivity has been fairly high throughout the pandemic.
Parts of northern and north-eastern Nigeria continue to be affected by extremist violence and one wonders how accurate data collected from these parts can be. All in all, under-reporting of deaths is probably also a very important reason for the reported low Covid-19 mortality rate in Nigeria.
And finally we should be careful and not rule out the X Factor altogether! It might turn out to play a big role, in Africa and elsewhere, after all!
Endnotes
Disclosure: I am the Director of Smart Consulting Solutions Pte Ltd, incorporated in Singapore and its subsidiary Radix Analytics Pvt Ltd, incorporated in India. I am also a Visiting Faculty Member at the Indian Institute of Management, Udaipur. The opinions expressed in this article are solely mine and not necessarily shared by any company or institution with which I am affiliated.
This blog was triggered by a BBC article that I found on Prof. Gautam Menon’s Facebook wall though he is of course not responsible for my biases. I also thank him for taking time to answer some of my questions.
I am, as always, grateful to my friend Dr. Ashish Kumar Dawn for his insightful comments.