WORLDVIEW: OK, fine, PANDA, but what should government do?

The chaps at PANDA (Pandemics – Data & Analytics), a rag-tag band of actuaries who have been grabbing headlines with their criticism of government Covid-19 models, are very clear about what they don’t want, namely strict lockdowns. They’re a little less vocal when it comes to what they do want.

First, are PANDA any good?

My purpose here is not really to weigh in on any intensive debate about the quality of PANDA’s critique. They make some valid points, and they also make some obvious errors – some of which have already been pointed out. But I will say that their analysis is just as susceptible to criticism as the analysis that they critique.

Let’s take, for example, their paper on coronavirus mortality. None of the authors on that paper (Trevor Nel, Ian McGorian, and Nick Hudson) have a medical background – according to PANDA’s website, Nick is an actuary specializing in private equity, Trevor is a data scientist in the online casino industry, and Ian is a “chartered reward specialist” (no idea what that is). So, make of that what you will.

Looking over the paper fairly briefly, I immediately have several comments/questions:

  • The dependent variable (DV) is log cumulative deaths per million at time t. In other words, a static measure of deaths at a particular time (t) to which they apply a log transformation due to the skewness of the data. Sadly, there are no notes on where they got their numbers, although the picture mapping their data credits Wikipedia, Bing, and TomTom (the only source credit for the deaths data I could find). At any rate, there are over 100 countries in their study (as far as I could tell, they fitted two models with different numbers, and there are different numbers of observations here and there) and it’s a heroic assumption that the data from all of these are comparable and reliable. Interestingly, per the picture on page 7, South Africa is not one of the countries selected? But SA is later referenced?
  • They control for the duration of the outbreak with an independent variable measuring days since deaths passed 0.1 per million. In what I think is the final version of their model on page 43, most of the variation in death rates can be explained by three things: obesity, proportion of people over 70, and how long the epidemic has been going on. There is no mention here of the “hygiene hypothesis” – part of the “Panda hypothesis” that this is supposed to test – and comorbidity is not a significant variable. This paper doesn’t really break much new ground here, especially as comorbidity and age are highly correlated per the VIF stats.
  • Why are there iterative versions of the models if the goal was to test the “Panda hypothesis” presented on page 3 in a Popperian fashion? To me, it looks like the Panda Hypothesis is tested on page 31 and found wanting? Also, the iterative model-building looks like fitting theory to data, which is fine but is not Karl Popper’s hypothesis testing.
  • If they wanted to test lockdown stringency, why not include it in the model? Also, why are there 130-odd cases in the basic stringency model, 157 cases in the more detailed “curve-flattening” model, and 140-odd in the other models? It would be great to see more information on the datasets, or the actual data.
  • I’d like to see more on the relationships among hygiene, obesity, and population over 70. The paper says that both hygiene and obesity are mediated by poverty (true to a degree, although not in every country – SA, for example, has heroic obesity levels and a lot of poverty). But surely achieving old age is also mediated by wealth (seems to be the case when you test for health expenditure, a proxy for wealth if I ever saw one)? Should wealth be a control variable and if so, how would it affect the picture? Again, when you throw in health expenditure pretty much everything ceases to be explanatory, presumably because of multicollinearity again?
  • Under the discussion on age on page 19, it says adding age to the model improves the r-squared significantly compared to just time, but the improvement is just around 0.074. Time is the best predictor with an r-squared of 0.329; throwing everything else in only improves your r-squared by about 0.1 overall, which is negligible compared to time.
  • In the kitchen sink model on page 33, looks like if you put everything in, the only things that really matter are obesity (proxy for age, health, poverty etc.?) and time. In other words, the longer Covid-19 has been at work in a population, the more people are dead. In fact, time since the pandemic started is the best overall predictor of deaths. So, maybe SA has fewer deaths because the pandemic has been active for less time here? That’s what the model suggests.
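
To make the multicollinearity worry in the bullets above a bit more concrete, here is a minimal Python sketch on purely synthetic data. Nothing in it comes from PANDA's paper or dataset. It assumes a hypothetical "wealth" factor that drives both the share of over-70s and health spending, plus an unrelated outbreak-duration column, and computes variance inflation factors (VIFs) by hand. The variable names, the 140 made-up "countries", and the 0.9/0.3 weights are all my own assumptions for illustration.

```python
import numpy as np


def r_squared(X, y):
    """R-squared of an ordinary least squares fit of y on X (intercept added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()


def vif(X):
    """VIF for each column: 1 / (1 - R^2) from regressing that column on the others."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        factors.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return np.array(factors)


# Synthetic data only: every number below is invented for illustration.
rng = np.random.default_rng(0)
n = 140                                                 # made-up "countries"
wealth = rng.normal(size=n)                             # hypothetical hidden driver
over70 = 0.9 * wealth + 0.3 * rng.normal(size=n)        # tracks wealth
health_spend = 0.9 * wealth + 0.3 * rng.normal(size=n)  # also tracks wealth
days = rng.normal(size=n)                               # unrelated to wealth

X = np.column_stack([over70, health_spend, days])
vifs = vif(X)

for name, value in zip(["over70", "health_spend", "days"], vifs):
    print(f"{name:>12}: VIF = {value:.2f}")
```

On data built this way, the two wealth-driven columns come out with VIFs well above the independent duration column, which sits close to 1. When predictors overlap like that, a regression struggles to say which one is doing the work, which is one plausible reason coefficients stop being significant once a wealth proxy is added.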

As you can see, my questions are boring and technical, but the basic point is that PANDA’s work, like everyone else’s, is a best guess based on problematic/questionable data with a sprinkling of looking for what you want to find. That’s fine – lively debate is a good thing. But PANDA’s work is not especially sophisticated stuff, based on my initial quick look. The problem with these big, country-level regression analyses is that there is so much noise in the data and such a lack of real comparability in measures that it’s all just a best guess mess.

The bigger issue: What do we do?

Putting aside methodological questions, the basic thing that is going on here is that PANDA is saying that, so far, fewer South Africans have died than we expected and therefore we shouldn’t have a lockdown.

Obviously, it’s impossible to know how many people would have died if there had been no lockdown, because we don’t have a second South Africa that we can test that theory in.

But that’s neither here nor there. The bigger point is, what is PANDA suggesting? Should the South African government throw caution to the wind, hope that SA will be OK despite our high obesity rate (a strong predictor of mortality according to most versions of PANDA’s model), and let everyone do what they like? Are they suggesting a less rigorous lockdown, without the nutty booze rules?

It’s pretty clear that the chaps at PANDA bring an important alternative perspective to the table – their general point about balancing economic costs against perceived risks is a good one, although I think a lot of their modelling and data are pretty slapdash. But the key question is, what do we actually do, keeping in mind that the economic damage from the Covid-19 pandemic is affecting every country in the world regardless of how strict its lockdown was, from Sweden to China? It’s easy to throw bricks at people’s models, much harder to build a house.
