To gather my various plots into one place; if there's enough interest I'll update each Friday. Let me know if you have any questions about methods etc; happy to answer but it's a lot of work to cover every plot on the off chance someone wants to know about one.
Plot 1 - Infections
Plot 2 - Infections and fatalities
Plot 3 - Estimating the IFR for England
Plot 4 - Test and Trace Efficiency
Plot 5 - Jitter in case data
Plots 6, 7, 8
Plot 9 - Characteristic times (UK level data)
Plot 10 - CFR estimates
Estimates of CFR as per Plot 3, but using UK level deaths and UK level cases. Convert to an IFR estimate using the ratio of cases to your favourite random sampling survey or nowcast.
Using the geographic data from the dashboard under "Age demographic of cases by specimen date". This comes with a pre-applied 7-day rolling sum. This is all cases/day per region (UTLA).
Plot 11 - UTLA case data
Plot 12 - Rate of change
Plot 13 - Rate of change and acceleration
Example traces for each of the 4 quadrants from Plot 13 above. Having looked at a few, I'm cautious about over-interpreting this plot as the individual UTLA data is quite noisy and there's a lot going on. I might look at doing more smoothing on the data...
Thanks for this. Too sleepy to read it now, but will do when I'm awake.
Excellent, many thanks for your hard work
You have a job? How the f*ck do you have the time to do this? And thanks very much.
Don't work for GCHQ do you?
It would be interesting to have UKC allow you to write an article. I think a lot of readers would be interested, especially in comparing the actual data you produce with what we get via media and government press releases.
Thanks for the work.
I recently dropped one job after some introspection brought about by lockdown, so I don’t have the same time pressures I used to...
The plots don’t take that much time - I’ve been adding and refining bits for a few months. It only takes a few minutes to drop a new day’s data through the plotting pipeline and there’s been maybe 18 hours put in to the codes over 5 months.
I did end up spending too long on plot 12 which I’m not yet happy with. Si dH has rightly pointed out that this stuff has to be considered at a local level, and trying to present all the different facets of local data for quick assimilation across the country isn’t the most obvious set of plots to make. Still, it’s not like I could have gone down to the pub instead...
In reply to Cwarby:
I don’t think UKC would want to lend their editorial identity to what I have to say. In terms of quality of understanding and insight, the press reporting of daily numbers isn’t just unilluminating; it veers into nonsense when speculating on causes behind what is nothing more than noise on the system, and swings between sensationalism and missing important signs of problems. What I have to say there pales into insignificance compared to my take on one of the organisations “professionally” presenting and interpreting evidence on the situation.
Thanks again for all this. On the UKC point they allow opinion pieces all the time. It would be fabulous to see an article summarising your views and concerns over the year and for the future.
> Thanks again for all this. On the UKC point they allow opinion pieces all the time. It would be fabulous to see an article summarising your views and concerns over the year and for the future.
I agree. I just need to get to a bigger screen to look at all this properly!
Working to combine all the information
To do this, I've re-assigned the quadrant colours from Plot 13 to work better, and made a map.
I'm not best pleased with the measurement of the rate of growth in cases and the acceleration in that; I might get the method re-worked for next week.
Plot 14a - fixing plot 14 which looked nice but was totally messed up (it was assigning colour based on rates in two different weeks, not rate in one week and the change between rates...)
Plot 15 - the cases/day summed for each of the four separate quadrants of Plot 13a and Plot 14.
The cumulative plot on the left shows how growth in "red" regions has taken over driving the total numbers for England as the T2/T3 regions in green and orange started to decay.
The sum of the "red" regions looks like it's tipping over in to decay, but not within the timescale I measure the rate on; in a few days those regions might start turning blue or green...
There's a lot of blue areas - cases are falling but the halving time of the cases is getting longer - this means that the fall is becoming less aggressive. Lots of interpretations of this; one is that household transmission still has to play out after the initial effect of closing hospitality kicks in. These rates I plot are based on the % day on day change so they normalise for the exponential mechanic.
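For anyone wanting to turn a day-on-day percentage change into a halving or doubling time, here is a minimal sketch. It assumes the percentage change stays constant, and the function name `characteristic_time` is made up for illustration; it is not taken from the plotting code behind these plots.

```python
import math


def characteristic_time(daily_pct_change):
    """Days for cases to double (positive result) or halve (negative result)
    if the given day-on-day percentage change stayed constant."""
    growth = 1.0 + daily_pct_change / 100.0
    if growth <= 0.0 or growth == 1.0:
        raise ValueError("need a non-zero change greater than -100%")
    return math.log(2.0) / math.log(growth)


# Hypothetical rates: a fall of 5% per day easing to a fall of 2% per day.
fast_fall = characteristic_time(-5.0)   # roughly -13.5, i.e. halving in ~13.5 days
slow_fall = characteristic_time(-2.0)   # roughly -34.3, i.e. halving in ~34.3 days
```

Cases are still falling in both hypothetical examples, but the halving time has lengthened, which is the "blue" behaviour described above.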
I like flipping between plots 13a and 14a so I've re attached 13a.
I think this analysis is quite "twitchy" to the variation in the data so it's not one to over interpret but it makes a nice snapshot. I look forwards to seeing a lot more green and blue next Friday.
For some reason a couple of regions are missing from these plots...
An erratum on my 22:15 plot with the individual regions: the y-axis is the 7-day rolling sum, not the average.
I like your map plots, thanks. The guardian do a reasonably good one showing direction of change that I tend to use alongside the dashboard map to get a quick snapshot each day of the latest developments, but I'm not aware of anywhere else showing acceleration/deceleration. I assume you only have that data by storing past data yourself?
Is it showing d(rate of change in cases per week)/per day or d(rate of change in cases per week)/per week?
I need some nomenclature to avoid writing things out.
Thanks. The data source for this is the joint demographic and geographic (UTLA) breakdown from the gov dashboard; this is a time series by day with 5-year age bins within each UTLA. It’s got a 7-day rolling sum applied inside each age bin presumably for privacy reasons. This only appeared online a few weeks ago.
The velocity - the rate of change - is the finite differences method over 7 days expressed as a percentage of the most recent case rate. So it’s
velocity[day X] = (cases[day X] - cases[day X-7]) x (100 / cases[day X])
The acceleration is just the difference in velocities over a week so the unit doesn’t change from that of velocity other than a /week term.
acceleration [day X] = velocity[day X] - velocity[day X-7]
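The two definitions above can be sketched in a few lines of Python. This is an illustration only: the function names and the made-up case counts are assumptions, and `cases` stands for the 7-day-sum series for a single UTLA with one entry per day.

```python
def velocity(cases, day):
    """Percentage change over 7 days, relative to the most recent value.

    Divides by cases[day], so it is undefined when that value is zero.
    """
    if day < 7:
        raise IndexError("need at least 7 earlier days")
    return (cases[day] - cases[day - 7]) * 100.0 / cases[day]


def acceleration(cases, day):
    """Change in velocity over a week (percentage points per week)."""
    return velocity(cases, day) - velocity(cases, day - 7)


# Hypothetical series with steady linear growth: 100, 110, ..., 240.
cases = [100 + 10 * d for d in range(15)]

v_prev = velocity(cases, 7)       # (170 - 100) * 100 / 170
v_now = velocity(cases, 14)       # (240 - 170) * 100 / 240
a_now = acceleration(cases, 14)   # v_now - v_prev
```

With linear growth the weekly rise is the same each week, but as a percentage of a larger number it shrinks, so the acceleration comes out negative.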
I should move to using an SG method to do the differentiation more holistically, or fit an exponential to the period and use that to measure the growth rate. Before I do I want to try and deconvolve the 7-day box filter as otherwise I’m compounding the blurring effect of the box filter with whatever I do, but that’s all a bit too much like hard work...
The other thing I thought would be interesting is to plot the average age vs time for each category, within the confines of the age bins.
I think the acceleration is an important concept to look at as it shows areas with rising cases that are tipping over - the orange areas. I’m less convinced that it tells us anything so profound about green vs blue regions.
Thanks, I haven't read it fully yet but have had a quick skim through. I appreciate now that the IFR is probably around half what it was in March. That is mainly down to improved treatment, and perhaps a higher proportion of cases in younger people, but not because of any mutation in the virus. I think I have been a bit stubborn to accept this. Most of my arguments are with people of the "let it rip" persuasion who often quote very low IFRs. However, I don't believe the virus is any less deadly, so if we "let it rip" now, the IFR would be the same (well, very similar) to what a "let it rip" IFR would have been in March.
Ok thanks. I can see why that would give a lot of noise. I had assumed it was using weekly averages or sums to produce the velocity. I think it would probably be fairly straightforward to do that?
Velocity (day X) = [sumcases(dayX : dayX-7) - sumcases(dayX-1 : dayX-8)] *100 / sumcases(dayX : dayX-7)
Yes, that should be straightforward, but I didn't do it because the data has already had a 7-day moving sum (or scaled average) applied to it as part of its release process. If I did, each velocity measurement would span 14 days, and so the acceleration measurement would become very delocalised, and delocalised in a bad way, using only a pair of box filters, which is really not very appropriate for exponential data.
The code for these plots needs a bit of a tidy up to make it more idiot proof (i.e. me) as it's looking a bit fragile and mistake prone. Then I'll have another look. I think deconvolving the box filter out will be simple: the data is all positive, which I think means there are no pathological/degenerate entries. I also don't care about the correct assignment of cases to days in the first week of the time series; those values are very small compared to values now, so the error from whatever approach is used for the initial 7 days will be small, and will anyhow not affect velocity. Getting back to the raw data seems like a much more appropriate starting point...
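One way the deconvolution could go is sketched below. It is an assumption about the approach, not the code used for the plots: it relies on the published series being a trailing 7-day sum with zero cases before the first day, so that sum[i] - sum[i-1] = daily[i] - daily[i-7]. The function names and the sample numbers are hypothetical.

```python
def rolling_sum(daily, window=7):
    """Trailing rolling sum; days before the series starts count as zero."""
    out = []
    running = 0
    for i, value in enumerate(daily):
        running += value
        if i >= window:
            running -= daily[i - window]
        out.append(running)
    return out


def deconvolve_rolling_sum(summed, window=7):
    """Recover daily values from a trailing rolling sum.

    Assumes the true series is zero before index 0 and uses
    summed[i] - summed[i-1] = daily[i] - daily[i-window].
    """
    daily = []
    for i, s in enumerate(summed):
        prev_sum = summed[i - 1] if i > 0 else 0
        dropped = daily[i - window] if i >= window else 0
        daily.append(s - prev_sum + dropped)
    return daily


# Made-up daily counts, summed as the dashboard would publish them.
true_daily = [0, 0, 1, 3, 2, 5, 8, 13, 9, 20, 31, 28, 40, 55, 47, 60]
published = rolling_sum(true_daily)
recovered = deconvolve_rolling_sum(published)
```

Because each recovered day is built on the value recovered 7 days earlier, any error in the first week is carried forward unchanged at 7-day intervals rather than growing, which is why a small, near-zero lead-in matters.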
What are your conclusions about where we currently are overall?
Mine - from much less analysis and thinking than you've done (orders of magnitude less) is that the 2nd wave might be peaking or just about have peaked with deaths at about 400-500/day, but we really need another week of figures to have confidence that this is really so.
I think you've got it right.
My main worry had been that small but rising cases in some areas can be masked by high but falling cases in other areas, giving premature or false hope. This has been happening since the very start of the pandemic both with geographic regions and demographic bulges (especially university outbreaks).
Figure 15 digs in to this and it says to me that even areas crudely classified as "rising" are starting to bend over into level or falling behaviour so we're past the point that "masking effect" could happen and things should keep falling - and falling faster as more of the classifications from Figure 15 tip over.
The most recent day feeding in to Figure 15 is November 15th - 10 days after lockdown started and so far enough in that the effects of lockdown are starting to show in cases.
So, I have confidence that the plateau of cases is real and the decay is here and growing. The plateau has already translated in to a plateau in hospitalisations and likely deaths (characteristic time plot up thread).
So - all that analysis effort and the only difference is I'm confident to call it a week before you... Not really worth it is it!
Still, the pucker factor remains very high over the hospital situation over the next few weeks; hospitalisations may be in a national level plateau but that's people going in to hospital; it takes them some time to go out again so the total level may keep rising. Further, winter is just around the corner and it remains to be seen what direct and indirect effects that will have on Covid and the - so far - absent influenza season.
Ah but in any argument you can back it up with numbers and analysis whereas mine isn't much more than gut feel.
Yup, with a bit of effort, the 7-day sum can be deconvolved out; a rather ugly plot below to show the verification of the method and note something of importance. The method is shown being tested in plot A.
Having recovered the actuals data for each day, an SG filter can be applied to smooth the random fluctuations whilst preserving a higher order shape than a 7-day box filter, and without introducing the same lag as the box filter. This is shown in plot B:
This is why I didn't want to do more averaging in the measurement of velocity and acceleration. The next step is to use the SG filtered, deconvolved data to get a more up-to-date and less noise sensitive map plot.
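To show why an SG (Savitzky-Golay) filter keeps more of the shape than a box filter, here is a small sketch using the standard 7-point quadratic/cubic smoothing weights. The window length and polynomial order are assumptions, as the post doesn't state which were used, and the box filter is centred here so that only the shape effect is compared, not the lag.

```python
# Standard 7-point quadratic/cubic Savitzky-Golay smoothing weights.
SG7_WEIGHTS = (-2, 3, 6, 7, 6, 3, -2)
SG7_NORM = 21.0


def smooth_interior(values, weights, norm):
    """Apply a symmetric kernel; points too close to either end
    are returned unsmoothed because the kernel does not fit there."""
    half = len(weights) // 2
    out = [float(v) for v in values]
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        out[i] = sum(w * v for w, v in zip(weights, window)) / norm
    return out


# Hypothetical curved peak: 100 at day 10, falling away quadratically.
peak = [100 - (d - 10) ** 2 for d in range(21)]

sg = smooth_interior(peak, SG7_WEIGHTS, SG7_NORM)
box = smooth_interior(peak, (1,) * 7, 7.0)
```

On this made-up peak the SG result at day 10 stays at 100, while the centred 7-day mean pulls it down to 96, so the box filter flattens curvature that the SG filter preserves.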
I was expecting a link that would allow us to eavesdrop on tonight's zoom call between Nigel Farage, Gupta, Chris Evans, the "Covid Recovery Group" MPs, etc, and all I got was your bloody graphs!
Seriously, thanks for amazing amount of unpaid work, it is extremely useful in trying to understand what's going on.
Mr Tree, that is excellent, thank you for taking the time to post it. It certainly looks to me that we have passed the peak of this second wave for the most part and are starting on the down slope again.
Someone mentioned the Guardian doing similar? I think I would rather trust yours with no political agenda.
> it is extremely useful in trying to understand what's going on.
That’s why I’ve been doing it; plotting stuff is often my best way of understanding something - not so much the outputs but the process of doing it.
> the "Covid Recovery Group"
I see they’re gathering momentum and just wrote to the PM stating that the post lockdown tiers could be a “cure worse than the disease”. Do you think they’ll also tell Johnson what their Doctor friend was telling them only the other day about hospitals being quiet?...
Thanks Dax. I'll need to dig out the Guardian map. I've not had problems with their reporting but I get a lot more out of making my own plots, mainly because it forces me to learn about the limitations of the data etc...
Having got the deconvolution working, I've redone my dashboard map where colour shows the direction of cases and its "acceleration" - is the case rate (characteristic time, which is invariant of the absolute number of cases and is a proxy for the R number) going up or down?
These maps are about 5 days more current than the previous ones, as I've downloaded data one day newer and the deconvolution removes the blurring, lag effect of the 7-day rolling average. I'm going to have to make a movie of this over time and put it on YouTube... This is also doing the measurement of the direction of the case rate in a much less noise sensitive way than before and a better measurement of the acceleration which has reduced the blue it seems...
> Thanks Dax. I'll need to dig out the Guardian map. I've not had problems with their reporting but I get a lot more out of making my own plots, mainly because it forces me to learn about the limitations of the data etc...
It's interesting how different people learn. I don't have anything like your skill at processing the data and doing quantified statistical analysis myself but I'm very attuned to looking at limitations and trends. I get most out of poring over the covid dashboard map every evening, it's an incredible source of insight into the pandemic if you look at it every day and spot the trends on a local and regional level (whether or not this is good for mental health is a different question). However the change in rate that the dashboard map shows can only be accessed by clicking on an individual LA or MSOA, which is time consuming if you are interested in more than a very small area. You can also move the slider bar too but then you lose any granularity by day. So I use the daily guardian 'on the rise' map too, which is just a visualisation of the same data nationally showing whether an LA is increasing/decreasing slow/fast since the previous week. It's nothing special but useful for me alongside the dashboard.
Really appreciate the time you spend to do some of the statistics. The map you posted most recently with the deconvolution of current data looks both very useful and very promising for the current situation if it is accurately removing the lag effect of the 7-day average. I don't really understand what you've done; would the method usually increase uncertainties much?
> What are your conclusions about where we currently are overall?
> Mine - from much less analysis and thinking than you've done (orders of magnitude less) is that the 2nd wave might be peaking or just about have peaked with deaths at about 400-500/day, but we really need another week of figures to have confidence that this is really so.
For what it's worth, although cases have almost certainly peaked now, I'm not as confident about deaths. On a national average basis, case rates went through a significant bump at the beginning of November before dropping again; hospitalisations then did the same but so far deaths have not, as far as I can tell. It might be that they don't because the variation in time to death smooths out the kink in the curve such that it becomes unnoticeable, but equally there could be another bump yet before deaths drop on a continuous basis.
You can see the effect I'm talking about in Wintertree's doubling time graph.
On a more local level there will be lots of areas where hospitalisations and deaths are still increasing; equally there are others that peaked a while ago. Liverpool hospitalisations have been going down for a month.
> I don't really understand what you've done, would the method usually increase uncertainties much?
In short, I think no it doesn't.
In terms of poring over data and mental health - turning this in to a hobby project lets me park it completely out of my head when I'm not working on it, and it leaves me pretty immune to the news. It has rather displaced my other hobby projects of trying to type up and edit some children's stories and getting a wild animal detector/classifier working on the CCTV to buzz me when the ******* rabbits are about.
Have you any recent data on excess deaths? I can only find info up to 6th Nov or thereabouts.
Can you not eliminate the lag of the moving average filter just by offsetting it in time? I agree that if you just take the last 7 points, sum them and divide by 7 then call that the current day's result then there is a lag. However if you relabel that result to be an earlier day then the lag goes away.
On a Monday, if the reported data are the mean of previous Tuesday-Monday results then just label that as previous Friday's result and the lag is gone.
I think the danger with deconvolution is that errors propagate and could skew the data.
> Can you not eliminate the lag of the moving average filter just by offsetting it in time? I agree that if you just take the last 7 points, sum them and divide by 7 then call that the current day's result then there is a lag. However if you relabel that result to be an earlier day then the lag goes away.
I agree with all of that. However, if I do as you suggest I then know the data up to day X - 3.5 (or so), whereas if I do the deconvolution I know the data up to day X. If this data was all some historic event it wouldn't matter one jot, for the reasons you give, but in this case I want to know what the current situation is, and the deconvolution approach gives insight 3-4 days closer to now than the moving average. I still need to deal with the "noise" (weekend sampling effect etc) in the data that had been smoothed by the 7-day filter, but I can use an implementation of SG filtering that has a symmetric kernel so as not to introduce lag, and that dovetails in to fitting polynomials at the edges of the window (where the lack of future data prevents a symmetric kernel being applied). This means the filtered data remains lag free up to and including the time of the final data point on day X.
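The edge handling described above can be sketched as follows. This is a guess at the idea rather than the implementation used for the plots: it fits a least-squares quadratic to each 7-point window, evaluates it at the window centre where a symmetric window fits, and evaluates the fit from the first or last full window at the off-centre positions near the ends. The window length, the polynomial order and both function names are assumptions. SciPy's `savgol_filter` with `mode="interp"` handles edges in a similar way.

```python
def quadratic_fit(window):
    """Least-squares a + b*x + c*x^2 through 7 points placed at x = -3..3."""
    xs = range(-3, 4)
    sy = sum(window)
    sxy = sum(x * y for x, y in zip(xs, window))
    sx2y = sum(x * x * y for x, y in zip(xs, window))
    a = (7 * sy - sx2y) / 21.0
    b = sxy / 28.0
    c = (sx2y - 4 * sy) / 84.0
    return a, b, c


def sg_smooth_to_edges(values):
    """Quadratic 7-point smoothing that stays defined up to the final point."""
    n = len(values)
    if n < 7:
        raise ValueError("need at least 7 points")
    out = []
    for i in range(n):
        # Symmetric window where possible; clamp it at either end.
        start = min(max(i - 3, 0), n - 7)
        a, b, c = quadratic_fit(values[start:start + 7])
        x = i - (start + 3)   # 0 in the interior, up to +/-3 at the edges
        out.append(a + b * x + c * x * x)
    return out


# Hypothetical smooth series: a pure quadratic should pass through unchanged.
series = [0.5 * d * d - 3 * d + 10 for d in range(12)]
smoothed = sg_smooth_to_edges(series)
```

The last smoothed value sits on day X itself, so nothing is lost to relabelling, although the end points rest on a one-sided fit and will be noisier than the interior.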
> I think the danger with deconvolution is that errors propagate and could skew the data.
You are right to be cautious. I check all the deconvolutions by running the result forwards again through a 7-day moving average and checking that this matches the input (see the plots on  for example) so if errors exceeded typical rounding errors, the code would flag it. Further, the lead-in period is always 0 cases/day which gives an unambiguous starting point for the deconvolution - this gets rid of most of the likely problems. The dynamic range of the data is small, so numerical instability from the finite precision FP maths also isn't a worry.
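The round-trip check described here might look like the sketch below. The function names, tolerance and sample numbers are all hypothetical; it uses a 7-day sum, but the same check works for an average after dividing by 7.

```python
def rolling_sum(daily, window=7):
    """Trailing rolling sum; days before the series starts count as zero."""
    out = []
    running = 0
    for i, value in enumerate(daily):
        running += value
        if i >= window:
            running -= daily[i - window]
        out.append(running)
    return out


def check_round_trip(recovered, published, window=7, tolerance=1e-6):
    """Re-apply the filter to the recovered data and compare with the input.

    Returns (passed, worst_absolute_mismatch).
    """
    if len(recovered) != len(published):
        raise ValueError("series lengths differ")
    resummed = rolling_sum(recovered, window)
    worst = max(abs(r - p) for r, p in zip(resummed, published))
    return worst <= tolerance, worst


# Made-up published 7-day sums and two candidate daily series.
published = [1, 3, 6, 10, 15, 21, 28, 34]
good = [1, 2, 3, 4, 5, 6, 7, 7]
bad = [1, 2, 3, 4, 5, 6, 7, 9]    # last day deliberately wrong

good_ok, good_err = check_round_trip(good, published)
bad_ok, bad_err = check_round_trip(bad, published)
```

A mismatch larger than rounding error, as with the deliberately wrong `bad` series, is what such a check would flag.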
Edit: obviously it would be nice if the dashboard just had the raw data instead of “helpfully” filtering it. I hope whoever prepares briefings for local councils and cabinet have access to the raw data and use a filtering approach that recognises the need for up to date information when managing a crisis...
> Have you any recent data on excess deaths? I can only find info up to 6th Nov or thereabouts.
I've stopped paying regular attention I'm afraid - it's wide open to interpretation however you want right now - the year is so different in so many ways that this bulk measure can't be interpreted in any particular way as long as there's a reasonable lid on the direct covid deaths - whereas in March/April it was an unambiguous sign things were going badly wrong.
That report for 6th was released on 17th. Next one is due on 24th
> This is also doing the measurement of the direction of the case rate in a much less noise sensitive way than before and a better measurement of the acceleration which has reduced the blue it seems...
My calcs show overall UK 'live cases' now dropping and the rate of drop increasing. Yesterday -2396, Saturday -1153, Friday -538. This comes against rises of 1908 (13th),1685,2590,3001,2881,2251 & 903 the previous 7 days. I'm looking forward to seeing how the rate of fall over the next week or so pans out.
I see from the BBC that there's going to be a fair bit of relaxing restrictions over xmas in conjunction with enhanced testing; it will be interesting to work out the likely rate of rise and date of consequent potential lockdown.
I think you're missing a trick not analysing excess death data. This is 'gold standard' information in some regards although as you say it has a number of input parameters. The naysayers are currently screaming that excess deaths are not rising (although this has started to change over the last 4 weeks or so).
> My calcs show overall UK 'live cases' now dropping and the rate of drop increasing.
Yes, it's surprising me just how fast cases appear to be dropping. The halving time for cases appears to be much shorter than the first time around but so much has changed about testing since the first lockdown that I'm wary of reading much in to it for now. It'll be very interesting to look at the halving time for hospitalisations for lockdowns 1 and 2 - unlike cases this won't be drastically changed by the roll out of pillar 2 testing.
> I see from the BBC that there's going to be a fair bit of relaxing restrictions over xmas in conjunction with enhanced testing; it will be interesting to work out the likely rate of rise and date of consequent potential lockdown.
Really hard to say.
I'd be surprised if cases bottom out at less than 10,000 per day after lockdown ends.
Maybe. Don't know what I'd do with the data though - there are no obvious questions to ask of it, no obvious ways to eke out more information by plotting it differently or combining it with anything. It is what it is, and it just lacks too much context to be useful right now.
> The naysayers are currently screaming that excess deaths are not rising (although this has started to change over the last 4 weeks or so).
They may well go back to being low soon, with the flu season apparently stalled before it gets started. Won't stop the loons from ignoring that they're low only because of the covid control measures...
I think the plan explained today is actually pretty sensible and cautious and will keep infections down outside of tier 1 areas. Tier 2 is learning from what Tier 3 did before, which mostly worked. Tier 3 will be stricter still. And they said more areas will be in the higher tiers than before. If they implement it properly* I'd be very surprised if infections rise significantly again. There'll be a bump at Christmas but if it's 3-4 days it won't change the long term course.
*The key here will be whether they can react to local data fast enough. We have seen recently how quickly a city in tier 1 can suddenly become a hotspot (Hull, Bristol, for example.) I'm not convinced the arrangements are in place to react fast enough to prevent that happening again. However, the impact should remain localised.
Agreed; stricter tiers, more readiness to use higher levels etc - it might hold this time round. On the caution side, we're still going in to winter.
For it to work this time, it does need local authorities not to resist T2/T3 classifications; last time round we had local politicians here taking the line "We need more time to see if T2 works" - the point being, by the time they found out it hadn't worked, enough future hospitalisations were locked in that the only remaining option was lockdown (Tier 4 in all but name...), I think.
Part of the problem last time I think was having less financial support for the hospitality industry under T3.
I agree on the need for rapid reaction - frustratingly there's not much sign of the various lags in the cases data decreasing; and from what I can tell that lag corresponds to the lag in entry of the data to the test and trace system as well as all the reporting outputs that go to geographic breakdowns. I really hope the UTLA and MSOA data is analysed by local authorities without the brain-dead 7-day moving average as this further compounds all the reporting lag and means you're looking ~8 days behind reality in total.
Any thoughts on the LFT rollout? Some aspects of the test appear as shady as the PCR.
> The naysayers are currently screaming that excess deaths are not rising (although this has started to change over the last 4 weeks or so).
They've been comparing the current excess deaths with 5 year average excess deaths values seen on graphs, however what they're not taking into account is the number of actual non-covid excess deaths (probably because no-one seems to be publishing this info). I believe non-covid excess deaths will be well below average - warm weather and less flu about due to covid restrictions. Thus the excess deaths total isn't particularly high due to fewer non-covid excess deaths than normal. Plus the delay in deaths resulting from infections - the infection curve didn't really get going until october, and deaths resulting from those infections are only really getting going now.
> Any thoughts on the LFT rollout? Some aspects of the test appear as shady as the PCR.
I'll say a couple of things on this. Being relatively local I followed the Liverpool trial information quite closely.
There were a few detractors of the Liverpool programme before it started. As far as I could tell these were not entirely rational, which was unexpected given who most of the writers were, so I think they were borne from either ignorance of the specific programme or having an axe to grind. The majority of the criticism came down to (1) test accuracy and population consent weren't good enough for a screening programme and (2) there are too many false positives for the test to be used in a population with low prevalence. However, in practice (1) doesn't matter if your aim is simply to find and take out as many asymptomatics as you can to reduce transmission rather than trying to screen everyone in order to confirm there are no positives around (clearly not the case at the moment) and (2) isn't a problem as long as you confirm all the positives with a follow-up PCR, which is what is done in Liverpool.
It's not yet clear whether the above will apply to future applications or not. If some places start using LFT without a confirmatory PCR for positive cases then you might get as many people isolating unnecessarily as you do actual positive cases in areas with low prevalence. I would say that confirmatory PCR should be considered an essential element of the programme but I don't know if it is. (Edit to say: from something I read yesterday, I understand that the student Christmas testing programme will include confirmatory PCR for positive cases.)
The data from Liverpool has been spun by the Government to make the trial look more effective than it has been and therefore support their policy. They are claiming the reduction of infection is down to the testing, whereas in practice the vast majority of the fall came before it started and it had no discernible impact on the trend. The trend in Liverpool has closely followed that in other adjacent areas (Sefton, Knowsley.) Having said that, finding and isolating 2000 positive cases can be no bad thing and outside a lockdown the effect would probably have been greater - but there is no evidence. In areas in the new tier 3 I suspect the effect will continue to be quite marginal. Their rates will fall anyway.
The local director of public health made an interesting point yesterday. Apparently take-up in affluent areas of Liverpool has been very high - some over 50% of population - but in most deprived areas has been low. That's where they believe there is more infection so they are going to move the testing locations and refocus their campaign specifically to target more deprived areas from December. This is something other cities will need to consider too.
One criticism of the Liverpool programme would be that in the first few days there were big queues outside test centres. This isn't a very good thing if you are trying to keep people apart. Other areas should probably phase their programmes a bit by ward or something so that they don't get all the enthusiastic people turning up in the first two days.
Final lesson from the trial for now, the data published mixed up the results of LFT and asymptomatic PCR with the positive case numbers in the test and trace output and the covid dashboard. To my knowledge there is no way of distinguishing the positive cases from asymptomatic PCR from those from the usual symptomatic PCR testing route. This has two impacts: (1) you lose the opportunity to compare LFT positivity against PCR positivity for a similar (asymptomatic) population and therefore get a better idea how well the LFT is working, (2) if areas with mass testing include asymptomatics in their published infection rate figures then their numbers will be increased slightly vs those without mass testing, particularly if they pick up lots of people in areas of higher prevalence. This might make it harder to make local policy decisions about restrictions. Of course, all the above data might be available and able to be used by people who are important, but it isn't there publicly, so we don't know. I tend towards scepticism because of how poorly run test and trace seems to be.
I really like the idea of using LFT to remove the need for close contact isolation if they can get it to work practicably without people needing to travel much. Potentially it could mean less people are discouraged from having a test by the prospect of their families having to isolate, which can't be a bad thing. And obviously there is an economic benefit of fewer people isolating.
Sorry, I'm off the topic of Wintertree's graphs!
I found this graph interesting: the "spike" in deaths this year will be similar to Spanish flu/WW1 and the start of WW2, rather than a "bad" flu year.
Seems to me that there was a lot of sex going on after both world wars 😁
Interesting stuff despite only providing data to 2014.
Two separate points caught my eye, both based on increasing trends to 2014:
27% of births had mothers born outside the UK
47.5% births were outside a marriage or a civil partnership.
I think Si dH gives a far better analysis of the situation around the LFTs than I can.
My only additions are...
Spot on. The evidence for a day 5-7 test is out there - Jersey have been doing 'test on arrival' since the summer and now have community seeding due to negative tests on arrival becoming infectious later on. I think a test on day 5-7 will detect about 85% of cases, rising to 95%+ at day 10 or so. Any government planning on using anything other than a day 5-7 test is not going to contain the spread.
> 47.5% births were outside a marriage or a civil partnership.
The last parents evening I did, only 5 of the 28 appointments were with Mr&Mrs "Child's Surname" I'm not sure many of those marriages last 12 years.