Gilding the data lily; govt’s selective Covid-19 interpretations – Zach de Beer

Unpacking the government’s Covid-19 figures indicates that the ANC is underplaying how the epidemic is impacting – child’s play compared to when it peaks, according to this mechanical engineer’s data digging. He also uses the data to illustrate how taxes lost to the ban on cigarette sales would feed 73% of SA’s hungry via funded food parcels – and to argue that school-children are a lot safer from the virus than our union-fearing Minister of Education’s behaviour indicates. Yet, one has to take the travel infection risk of vulnerable, large, low-income populations into account when comparing South Africa and similar countries’ younger populations with global Covid-19 data. What emerges clearly though, is that the most alleged fiddling (or selective interpretation) of infection data is around our testing and tracing capacity, ability and application, never mind the crippling lab result delays. Then there are the scary projections of our ICU bed threshold, (due to be reached early this month or early next) and actual capacity. The optimistic projected national need for ICU treatment is 25,000 beds – we have 3,300. No attempt is made to quantify what this dramatic shortage of ICU capacity will do to the fatality rate used in the model. – Chris Bateman

By Zach de Beer*

The only way to understand large amounts and complex data is by using visualisation.

Mechanical engineer and data analyst Zachie de Beer.

We all are more or less aware of the Covid-19 virus. Some people are very scared and most people know where this will all end up. Once all is said and done politicians are in charge and here in South Africa they are making sure we understand that. The choice of the name for the group is as we very well know “The National Corona Command Council”. The name says it all. I associate the choice of the name with people like Idi Amin, Joseph Stalin and Adolf Hitler, and closer to home Robert Mugabe.

When I decided to study Covid-19 and publish my insights, it was early days. I naively did not expect politics to influence the mitigation of the pandemic to the degree that the list of essential items would exclude most clothes. It boiled down to food and petrol.

The most staggering decision the NCC made was the banning of cigarettes. Research has already indicated that recent quitters have no better chance of surviving and, worst of all, instead of the state collecting tax, criminals are making huge profits. With those profits the criminals could well start branching out to selling drugs, promoting prostitution, illegal gambling, human trafficking and the list goes on. We could soon have our own Mafia. We just don’t know what it will be called or, more likely, maybe an imported Mafia is already in charge.

The tax loss we make each month should be expressed in food parcel units. The numbers bandied around by the press vary but let’s say that R300,000,000 is lost and a food parcel costs R1,200. Then the tax loss would buy 250,000 parcels and, if I am correct, would feed one million people per month. Some estimates put the figure at R1.5bn and now we are looking at 1,250,000 parcel units – enough to feed five million people per month.

By the government’s own statistics, 6.8 million people in our country go hungry every day.

Enough said.
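The food parcel arithmetic above can be checked with a short sketch. The tax loss figures, the parcel cost and the 6.8 million hungry people come from the text; the figure of four people per parcel is an assumption inferred from "250,000 parcels ... would feed one million people", and the function names are illustrative only.

```python
# Figures quoted in the article.
PARCEL_COST_RAND = 1_200
HUNGRY_PEOPLE = 6_800_000
# Assumption: inferred from 250,000 parcels feeding one million people.
PEOPLE_PER_PARCEL = 4


def parcels_from_tax(tax_loss_rand: int) -> int:
    """Number of whole food parcels the lost monthly tax would buy."""
    return tax_loss_rand // PARCEL_COST_RAND


def people_fed(tax_loss_rand: int) -> int:
    """People fed per month at PEOPLE_PER_PARCEL people per parcel."""
    return parcels_from_tax(tax_loss_rand) * PEOPLE_PER_PARCEL


for label, loss in [("low estimate", 300_000_000),
                    ("high estimate", 1_500_000_000)]:
    fed = people_fed(loss)
    share = fed / HUNGRY_PEOPLE
    print(f"{label}: {parcels_from_tax(loss):,} parcels, "
          f"{fed:,} people fed, {share:.1%} of the hungry")
```

On the higher estimate this gives roughly 73% of the 6.8 million, which is where the figure in the introduction comes from.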

Who is dying?

Viral outbreaks don’t affect all age groups in the same way. The Spanish flu targeted the young, Covid-19 seemingly the aged.

If we look at more developed data from New York where the epidemic looks close to ending, we find comparable numbers.

The message that is not getting across in South Africa is that the sub-40-year-old group is almost completely safe from dying. Mr Mmusi Maimane is misguided and should not threaten the government with legal action – science and data prove that school-going people are safe from the virus. In my mind he is spreading fake news with his petition. Trade unions are making a meal of this too.

What do we know about Covid-19 in SA so far?

The government is very selective with the information that they share and the little that we can get is spread over a number of web pages and it is hard and tedious work to combine the data into one set in order to look at trends and correlations.

I am a mechanical engineer, educated in statistics, maths, physics, chemistry and many other subjects. I have no qualifications in medicine or anything related to that. I simply use the published data and apply statistics and mathematics in order to study the underlying phenomena, make correlations and comment on poor data or misguided interpretations.

I am not providing health advice or anything related to that.

On the one hand I am looking for insights and on the other hand I am looking for signs that the government is misleading us, providing erroneous or false information etc. I am also looking for bad decisions by the government. I do not trust the government with my life.

Testing and cases

Let’s start with the local data that we have and then we will take the next step. The information I am using now comes from Wikipedia where one or more committed people are spending a lot of time collecting and organising the data. We start with countrywide data.

The blue line represents the total number of infected people, the number we wait for every night, and the other line reflects the number of tests done. So we are close to 600,000 tests and 25,000 infections. The trends of tests and cases follow each other very closely. This is a well-established phenomenon all over the world. Comparing the two curves we can see that the confirmed cases are now climbing a steeper curve than the tests and that the lines have crossed.

The rate of testing cannot keep up with the growth of the disease – the Western Cape has a test result backlog of more than 12,000 cases and, if one can believe the press, tests are now taking more than 10 days to return a result. I estimate that in the WC you need fewer than 20 tests to find a case. If the labs are 10,000 tests in arrears one could expect that the Western Cape’s case figure is understated by at least 500.
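As a rough sketch of the backlog estimate above: dividing the number of unprocessed tests by the number of tests needed to find one case gives the cases still hidden in the backlog. The function name is illustrative, and the 20 tests per case is the upper figure from the text, so the result is a lower bound.

```python
def cases_hidden_in_backlog(backlog_tests: int, tests_per_case: float) -> float:
    """Expected confirmed cases sitting in unprocessed tests."""
    if tests_per_case <= 0:
        raise ValueError("tests_per_case must be positive")
    return backlog_tests / tests_per_case


# 10,000 tests in arrears at 20 tests per case.
print(cases_hidden_in_backlog(10_000, 20))
# The reported backlog of more than 12,000 at the same ratio.
print(cases_hidden_in_backlog(12_000, 20))
```

If fewer than 20 tests are needed per case, the hidden case count rises accordingly.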


The blue line in the graph to the left and the dots are the number of tests done nationally per day. It is jagged and this is not unique to South Africa.

The number of tests has risen substantially but we can see that the average number of daily tests is flattening – this is very bad news and a huge risk.

More interesting information is found if we calculate the numbers of tests required to find one case. Again looking to the left we see some erratic numbers and then a curve emerging from about 20 March.

The curve starts at about 28 tests per case and then climbs quite rapidly to 40 tests per case. At the same time the daily tests also rise. Approaching the beginning of May, the test ratio drops from 40 tests per case to 25 per case.

As much as the Minister of Health brags about testing and screening we will see below how far we are behind other countries.

Deaths mimic the infection curve but lag by two to three weeks. The infection rate is an early predictor of deaths. For interest’s sake I include the per capita death rates for a few countries.

There is good agreement between the UK, USA and France once the starting point of each curve is set to 0.1 deaths/million. Our SA deaths lag Iran by 60 days and France by about 60 days.

What we are seeing here is that late starters and/or those countries that don’t test fast enough get punished. On the other hand we are comparing ourselves to countries that are much richer than us. If we take our per capita GDP into account we look much better.


While testing and infection rates may fluctuate, and the number of positive tests depends very strongly on the test strategy and grid, deaths occur in spite of tests and test results. It is much easier to fiddle with tests than deaths.

On a semi log graph we compare deaths with cases we found:

Lockdown started on 26 March and three days later the trajectory of the infection count does a right turn and develops into a trend with just the slightest concavity. The kink in the infection curve is a mathematical discontinuity; I did not expect such a dramatic change in direction. I can only guess that testing moved to what we consider vulnerable communities and, as it soon turned out, most of these communities were virus free or almost virus free.

It amounts to leaving a big fire behind to go looking for small ones.

Comparing deaths per million people in a few countries we find the following on the graph below.

The horizontal scale starts when a country reaches 0.1 deaths per million, to remove the effect of each country’s starting date and line the curves up after a fashion. South Africa and Egypt don’t fit with the trends of the EU or even Iran. Egypt’s death rate is better than ours and our trajectory over the last 15 days or so is much steeper than before.
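The alignment described above can be sketched as follows: each country's series is cut so that day zero is the first day on which it reaches 0.1 deaths per million. The two series are hypothetical placeholders, not real country data, and the function name is illustrative.

```python
def align_at_threshold(deaths_per_million, threshold=0.1):
    """Drop the days before the series first reaches `threshold`."""
    for day, value in enumerate(deaths_per_million):
        if value >= threshold:
            return deaths_per_million[day:]
    return []  # the threshold was never reached


# Hypothetical cumulative deaths per million, by calendar day.
country_a = [0.0, 0.02, 0.1, 0.4, 1.1]
country_b = [0.0, 0.0, 0.0, 0.05, 0.12, 0.5]

aligned_a = align_at_threshold(country_a)
aligned_b = align_at_threshold(country_b)

# Index 0 is now "day the country reached 0.1 deaths per million".
for day, (a, b) in enumerate(zip(aligned_a, aligned_b)):
    print(day, a, b)
```

Once aligned this way, the curves can be compared by days since the threshold rather than by calendar date.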


Instead of explaining the three ways of expressing fatality please read it here.

In our graph above we can see that the median age alone is not a strong predictor of the case fatality rate. What we are seeing here has more to do with how good a country’s health system is and/or policy decisions made by governments during the pandemic.

Predicting the future 

Under public pressure the government eventually started providing some information related to their projections and in fact some, or maybe even all, of the projections.

As we can see, a big team of academics, academic organisations or groups and individuals worked together to produce projections for the country as well as each province. The team also had to do research in order to estimate, rather than just guess, the input parameters of the simulation.

The simulation model is called the SEIR (susceptible, exposed, infected, removed) model with some modifications or extensions. So far so good. SEIR is widely used.
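For readers unfamiliar with SEIR, here is a minimal textbook sketch. It is not the government team's model, which has modifications and extensions that are not described here. The population and the parameter values (`beta`, `sigma`, `gamma`) are assumptions chosen purely for illustration.

```python
def simulate_seir(population, initial_infected, beta, sigma, gamma,
                  days, dt=0.1):
    """Basic SEIR model integrated with simple Euler steps.

    beta:  transmission rate per day
    sigma: rate of moving from exposed to infected (1 / incubation days)
    gamma: rate of removal (1 / infectious days)
    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    history = [(s, e, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infected = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_removed
            r += new_removed
        history.append((s, e, i, r))
    return history


# Illustrative assumptions only: 1 million people, 10 initial cases.
history = simulate_seir(population=1_000_000, initial_infected=10,
                        beta=0.5, sigma=1 / 5, gamma=1 / 7, days=300)
peak_day, peak = max(enumerate(history), key=lambda item: item[1][2])
print(f"peak of active infections on day {peak_day}: {peak[2]:,.0f}")
```

The size and timing of the peak depend heavily on the input parameters, which is why the quality of the data feeding the model matters so much.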

My first problem is that the jargon used in the report is not defined. Secondly the total number of infections has been removed from the information provided, which makes it difficult to compare predictions with reality.

We will now step through the presentation.

It is not really clear what our learned friends are telling us but it is clear that they are papering over the deficiencies of their simplifications. Maybe they are telling us that they modelled the entire country in one simulation, knowing full well that the result will not be good or reflect reality.

Now let’s have a look at what this means:

The optimistic projected national need for ICU treatment is 25,000 beds but we only have 3,300 beds. No attempt is made to quantify what this dramatic shortage of ICU capacity will do to the fatality rate used in the model. 

One can at least expect that the learned team calculates an upper bound of lives lost due to ICU incapacity by using the same data that they are feeding into the model.
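A crude version of such an upper bound can be sketched from the two figures quoted above. The worst-case assumption that every patient who needs an ICU bed and cannot get one dies is mine, made only to bound the number; it covers the patients needing a bed at the peak, not the whole epidemic.

```python
ICU_BEDS_NEEDED = 25_000    # optimistic projection quoted in the text
ICU_BEDS_AVAILABLE = 3_300  # capacity quoted in the text


def icu_shortfall(needed: int, available: int) -> int:
    """Patients needing an ICU bed at the peak who cannot get one."""
    return max(needed - available, 0)


def excess_deaths_upper_bound(needed: int, available: int,
                              fatality_without_icu: float = 1.0) -> float:
    """Upper bound on deaths at the peak due to lack of ICU beds.

    fatality_without_icu is an assumption: 1.0 means every patient
    turned away dies, which is the worst case.
    """
    return icu_shortfall(needed, available) * fatality_without_icu


shortfall = icu_shortfall(ICU_BEDS_NEEDED, ICU_BEDS_AVAILABLE)
coverage = ICU_BEDS_AVAILABLE / ICU_BEDS_NEEDED
print(f"shortfall: {shortfall:,} beds, coverage: {coverage:.1%}")
print(excess_deaths_upper_bound(ICU_BEDS_NEEDED, ICU_BEDS_AVAILABLE))
```

Even with a much lower fatality assumption for untreated patients, the shortfall is large enough that it should be reflected in the fatality rate used in the model.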

The people in the red box are at high risk if they do not get into ICU. 

How frequently will new data be added to the model? 

We now have a look at how accurate our South African model is.

Considering the short period of projection it is not looking that good.

Above is a very crude comparison between actual confirmed cases and projected active cases. Forget about the numbers on the vertical scale. The projected active cases in GP peak at over 1.5 million and the country optimistically peaks at over 8 million – we are going to win the Covid World Cup for the worst projections in the world. In three months’ time we will have more cases than the rest of the world combined. The Western Cape is two months early, the EC is slowly waking up and the other provinces should go to level 1 if the government believes its own numbers.

Is it fraud or just a typo?

Because of the poor quality and formatting in the press release, I had to zoom in to make sure I am not seeing double or counting the zeros wrong. In line with the optimistic peak for active cases I found this on the vertical axis.

Look at the spacing of the zeros.

It is highly unlikely that some sort of software produced this spacing. The only explanation is that somebody had fingers in the till, and not a smart somebody. Lest we forget who approved the document.

Did somebody multiply by ten to scare us? 

Considering what we know now, we have to question the entire validity of the government’s press release and by implication the validity of the model used. 

Since we only have numbers for active cases rather than the total number of infections we have to do a bit of homework. If the funny number above that starts with an 8 is actually 8 million active cases at the peak of our infections, let’s compare the number of active infections to Italy – they had 110,000. Our 8 million is the optimistic case. We must be thankful, it could be 12 million people. 

Is this fake news and a criminal offence? 

Key model parameters 

Key parameters are based on old data and in some cases very small samples, e.g. symptomatic cases. The scientists know this. The May 9 report is three weeks old and the data two months or older in some cases. The fatality stats are three months old and one of the papers is a WHO/China concoction. Good enough at the time but not anymore.

What next?

It is days ago now that I started writing and researching almost everything I write down. I quote from page 3 of this paper:

On the one hand I am looking for insights and on the other hand I am looking for signs that the government is misleading us, providing erroneous or false information etc. I am also looking for bad decisions by the government. I do not trust the government with my life.

The scientists, the Minister and the press need to explain how fake news of this magnitude is spread under the name of the SA government.


  • Zach de Beer is a postgraduate mechanical engineer with a passion for healthcare. He spent most of his life in the manufacturing of parts from advanced materials, and his company Aerodyne served the aviation and automotive industries.