Tuesday 12 May 2020

 More comments on the models.

I've spent the last few days picking through Neil Ferguson's Covid model.

Thread 👇

It contains 450 different parameters, each of which is either a single float or a float matrix. The majority of these are not based on any ground truth data that I can see and are just...


... "magic numbers" presented in the model. Any one of them can change the outputs of the model in unpredictable ways.

The model is essentially a giant, complex state machine. It's stochastic, which basically just means it contains random elements, so no two runs...

... produce the same results.
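To make "stochastic" concrete, here is a toy sketch in Python. It has nothing to do with the real model's code: the function name, the parameters and every number in it are made up for illustration. It only shows how a state machine driven by random draws gives a different answer on each run unless the seed is fixed.

```python
import random


def run_toy_epidemic(seed, population=1000, initial_infected=5,
                     p_infect=0.3, p_recover=0.1, days=60):
    """Toy S -> I -> R state machine; all values here are illustrative assumptions."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    susceptible = population - initial_infected
    infected = initial_infected
    recovered = 0
    for _ in range(days):
        # Daily infection chance per susceptible person scales with prevalence.
        p_today = p_infect * infected / population
        new_infections = sum(1 for _ in range(susceptible) if rng.random() < p_today)
        new_recoveries = sum(1 for _ in range(infected) if rng.random() < p_recover)
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        recovered += new_recoveries
    # Everyone who was ever infected is either still infected or recovered.
    return infected + recovered


if __name__ == "__main__":
    for seed in (1, 2, 3):
        print(seed, run_toy_epidemic(seed))
```

Calling `run_toy_epidemic(1)` twice gives the same total both times; changing the seed generally changes the total.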

The code itself contains many special rules held in imperative C++. There is no explanation as to why these are present. For example, why are hotels excluded in the "place sweep"? Who knows, but they are.

The complexity is hence...

...the Cartesian product of the input parameters and the embedded code rules.
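A rough sense of scale, as a Python sketch. The big assumption is that each parameter can take only two candidate values, which is far too generous given that they are floats and float matrices; `grid_size` is a made-up helper name. The embedded code rules would multiply the count further.

```python
from itertools import product


def grid_size(num_parameters, values_per_parameter):
    """Number of combinations in the Cartesian product of the parameter values."""
    return values_per_parameter ** num_parameters


if __name__ == "__main__":
    # Small case, enumerated explicitly: 3 on/off parameters give 8 combinations.
    print(len(list(product([0, 1], repeat=3))))
    # 450 parameters with only two values each: far too many to enumerate.
    print(len(str(grid_size(450, 2))), "digits")
```

Even under that two-value assumption, the number of combinations for 450 parameters is 136 digits long.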

All this wouldn't matter if it delivered reasonable results. However, no amount of fiddling with the parameters delivers fewer than 90k Swedish deaths. The model just consistently overcounts the infected and...

... hence dead.

I am not even going to go into the bugs that are listed in the issues list, some of which result in +/- 80k deaths.

Here are just a few of the free parameters you can fiddle with in order to "make up" a result which suits your narrative. Remember there are...

...450 such cogs and switches to fiddle with.

All in all, this is not a clear, transparent model based on firm ground truth data. It's a fairly arbitrary Heath Robinson machine, which overcounts infections on a consistent basis.

Thanks all for your responses, questions and challenges. Too many to respond to them all I'm afraid. But all will be read.

Strange that many academics have jumped in with "get off my land" style comments. This is such a thread. Worth a read.

"fake news" is lazy rhetoric to shut down challenge. Perhaps point out the flaws in my analysis or present your own analysis. The model was a "secret implementation" until last week. Citing unpublished models is clearly not good science.

Also, it's said that the seed numbers generated won't produce repeat results like they're supposed to.

Yes, that's right: the seed numbers don't seem to generate deterministic outputs when running in multi-threaded mode, but do in single-threaded mode. 🤷‍♀️


I guess that is because the random number generators (there are many) need to be serialised over all active threads.
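For what it's worth, serialising is not the only way to get repeatable results. A common alternative is to give each thread its own generator, seeded from the base seed plus the thread's index, and to combine the results in a fixed order. The Python sketch below shows that general idea only. It is not a claim about how the model's code works or should be fixed, and `worker_total` and `run_parallel` are made-up names.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def worker_total(base_seed, worker_id, draws):
    """Sum of random draws from a generator owned by one worker only."""
    # Seeding from (base seed, worker id) makes each stream independent of
    # how the threads happen to be scheduled.
    rng = random.Random(f"{base_seed}:{worker_id}")
    return sum(rng.random() for _ in range(draws))


def run_parallel(base_seed, workers=4, draws=10_000):
    """Run the workers on a thread pool and combine their results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker_total, base_seed, i, draws)
                   for i in range(workers)]
        # Collect in worker order, not completion order, so the floating-point
        # additions always happen in the same sequence.
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    print(run_parallel(42) == run_parallel(42))
```

With a single generator shared across threads, the order in which threads grab numbers depends on scheduling, which is one plausible way to lose repeatability. Per-thread streams avoid that, at the cost of results that depend on the number of workers.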

Was Ferguson’s model independently peer reviewed by other scientists or mathematicians? How about by the excellent brains in our secret services? Surely it was verified and not just accepted? Surely? .....

Recall that many of these matrices of numbers will be derived from real datasets, and their derivation will reside in published papers, postdoc work, PhD (and sometimes Masters) theses. You can't intuit the origin of such numbers from the code itself.

"proportion of hotel guest who live locally", "chance shop will open in holiday", "likelihood key worker lives with key worker", "chance digital contact will ignore signal". Really? You think there is ground truth data for these?

No. I work on financial modelling, but did this at my employer's request. They got a much thicker and deeper analysis than I posted here.