
Non-Political Coronavirus Thread

Watching people untrained in virology and public health play epidemiologist is almost as fun as watching people untrained in law and history play constitutional lawyer. In case you're wondering how tragedies happen... this is it.

It's almost like we should stop electing lawyers and B-list celebrities to run the fucking country.
 
Some formulas for 2&2 to work on, or for people who want a background in basic epi. Everyone has been hammered with R0, so the concept should be familiar: the bigger the R0, the more spread; R0 = 1 means stable transmission; R0 below 1 means declining transmission. R0 has been somewhat misconstrued in the media, because R0 in the traditional sense is a stable basic reproductive number, which is what matters when you think about removing social distancing. Social distancing artificially lowers the effective reproduction number, but the actual R0 is static, and for this virus it appears to be somewhat high, most likely in the 4-6 range. Once again, though, this is an average that will vary person to person: one person may transmit to 1 person, another to 10, based on a whole bunch of factors like viral load, immune response, respiratory capacity, movements, etc.
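
To make the distancing point concrete, here is a minimal sketch (Python). It assumes, simplistically, that cutting contacts scales transmission linearly; the R0 value comes from the 4-6 range above, and the reduction percentages are made up for illustration:

```python
# Minimal sketch: how distancing changes the effective reproduction
# number. Assumes (simplistically) that reducing contacts by a fraction
# scales transmission linearly; R0 = 5 is the midpoint of the 4-6 range
# from the post, and the contact reductions are made-up illustrations.

R0 = 5.0  # basic reproductive number (static property of the virus + population)

for contact_reduction in (0.0, 0.25, 0.50, 0.75, 0.85):
    r_eff = R0 * (1.0 - contact_reduction)  # effective reproduction number
    if r_eff > 1:
        trend = "growing"
    elif r_eff == 1:
        trend = "stable"
    else:
        trend = "declining"
    print(f"{contact_reduction:>4.0%} fewer contacts -> R_eff = {r_eff:.2f} ({trend})")
```

Anything that pushes the effective number under 1 turns growth into decline, which is the whole point of distancing; lift the distancing and you are back at the underlying R0.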

People may also be familiar with SIR models of disease; it's the most basic epi model there is. You have Susceptible, Infectious, and Recovered, and each is treated as its own bin with its own ordinary differential equation. You can make this model as simple or as complex as you want: you can add a birth rate into the susceptible bin, though that can probably be dropped for Covid-19, while you probably do want to subtract deaths out of the recovered bin. The whole process is cyclical: the susceptible population moves (at the infection rate) to the infected population, which moves (at the recovery rate) to the recovered population (minus deaths), and then decay of protection moves some of the recovered back into the susceptible population. Seems simple enough; the problem with an emerging disease is that it's extremely difficult to know what data to feed into the model. The original modeling coming out of China was an SIR model, with low transmission rates and no concept of asymptomatic spreaders.
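
For anyone who wants to poke at one, here is a bare-bones SIR model as a Python sketch: no births, no immunity decay, and every parameter is a made-up placeholder rather than a Covid-19 estimate:

```python
# Minimal SIR sketch along the lines described above. All parameter
# values are hypothetical placeholders, not estimates from the thread.
import numpy as np
from scipy.integrate import solve_ivp

def sir(t, y, beta, gamma):
    """dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I"""
    S, I, R = y
    N = S + I + R
    dS = -beta * S * I / N
    dI = beta * S * I / N - gamma * I
    dR = gamma * I
    return [dS, dI, dR]

N = 1_000_000           # population size (made up)
beta, gamma = 0.5, 0.1  # transmission and recovery rates (made up); R0 = beta/gamma = 5
y0 = [N - 10, 10, 0]    # start with 10 infectious people

sol = solve_ivp(sir, (0, 180), y0, args=(beta, gamma),
                t_eval=np.linspace(0, 180, 181))
peak_day = sol.t[np.argmax(sol.y[1])]
print(f"Peak infections around day {peak_day:.0f}: {sol.y[1].max():,.0f} people")
```

The structure is trivial; as the post says, the hard part is that every one of those numbers is a guess early in an outbreak.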

In early February the model was complicated by taking into account 2&2's favorite thing: asymptomatic spreaders. To put these into the model you end up with an SEIR model, or Susceptible, Exposed, Infectious, Recovered. You have a group of individuals that are exposed but not yet infectious, and then a group that's exposed and infectious; within the infectious group you then bin two different differential equations, one for symptomatic infections and one for asymptomatic infections. You then need to determine the infection rate for both bins: do they differ (probably)? Does an asymptomatic spreader have a lower infectious force per contact but the same overall transmission rate, because moving around more increases exposures? You are now adding equations on top of equations, and in an emerging disease you are making a lot of assumptions because the data is not great.
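
Structurally, the symptomatic/asymptomatic split looks something like this sketch (all rates, the latency period, and the asymptomatic fraction are hypothetical placeholders chosen only to show the extra bins):

```python
# Sketch of the SEIR variant described above: an Exposed compartment
# plus Infectious split into symptomatic and asymptomatic bins. All
# rates and the asymptomatic fraction are invented, to show structure.
import numpy as np
from scipy.integrate import solve_ivp

def seir(t, y, beta_s, beta_a, sigma, gamma, p_asym):
    S, E, Is, Ia, R = y
    N = S + E + Is + Ia + R
    infections = (beta_s * Is + beta_a * Ia) * S / N  # force of infection from both bins
    dS = -infections
    dE = infections - sigma * E                       # exposed, not yet infectious
    dIs = (1 - p_asym) * sigma * E - gamma * Is       # symptomatic infectious
    dIa = p_asym * sigma * E - gamma * Ia             # asymptomatic infectious
    dR = gamma * (Is + Ia)
    return [dS, dE, dIs, dIa, dR]

N = 1_000_000
params = dict(beta_s=0.5, beta_a=0.3,  # per-bin transmission rates (assumed to differ)
              sigma=1/5, gamma=1/10,   # 5-day latency, 10-day infectious period (made up)
              p_asym=0.4)              # fraction asymptomatic (made up)
y0 = [N - 10, 10, 0, 0, 0]
sol = solve_ivp(seir, (0, 240), y0, args=tuple(params.values()),
                t_eval=np.linspace(0, 240, 241))
print(f"Cumulative infected by day 240: {N - sol.y[0][-1]:,.0f}")
```

Every added bin is another parameter you have to estimate from bad early data, which is exactly the "equations on top of equations" problem.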

I guess the takeaway is that the models are fluid and are meant to be a tool for making informed decisions. They are complicated; you can't just extrapolate with simple denominator math or exponential growth, because it doesn't work like that. The models are only as good as the data fed into them, which is why the models are pretty bad despite our best efforts. It's also why the lack of testing is the biggest problem with figuring out what to do next. If anyone tells you they know what will happen, or what was going to happen, they are full of shit. Almost everything put out, even by the media, either doesn't have or fails to mention confidence intervals. So whenever something is printed, look at the range. Do you gain any information when your R0 estimate comes with a 95% CI of 0.5 to 2.5? That range covers both exponential growth and declining transmission. It tells you nothing.
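
To see just how useless that interval is, run both endpoints forward for a few transmission generations (the starting count and horizon are arbitrary illustrations):

```python
# Why a 95% CI of 0.5-2.5 on R tells you nothing: project new cases per
# transmission generation at each end of the interval. Starting count
# and number of generations are arbitrary.
for R in (0.5, 2.5):
    cases = 100  # current new cases per generation (made up)
    for generation in range(10):
        cases *= R
    print(f"R = {R}: ~{cases:,.0f} new cases after 10 generations")
# R = 0.5 -> the outbreak has essentially died out;
# R = 2.5 -> roughly 950,000 new cases per generation.
```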

Now for 2&2’s homework, solve
$$\frac{dE}{dt} = \frac{S(t)}{N}\left(\frac{R_0}{D_I}\,I(t) + z(t)\right) - \frac{E(t)}{D_E} - \left(\frac{L_{W,j}}{N} + \frac{L_{W,c}(t)}{N}\right)E(t)$$


The answer is dE/dt = OPEN UP, AMERICA FUCK YEAH!
 
You’re still misunderstanding how things work. The ratio of asymptomatic to symptomatic is not going to radically change, because the factors you listed above aren’t changing in the general population. If 20% of 1000 infected people are symptomatic, then it’s almost certain that 20% of 10000 infected people will be too.
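
For what the sampling math says here, a quick sketch (the 20% symptomatic figure is the hypothetical from this post, not data):

```python
# Sketch of the sampling intuition: if the true symptomatic fraction is
# 20%, how far can the observed fraction plausibly drift as the infected
# count grows? Uses the binomial standard error; 20% is hypothetical.
import math

p = 0.20
for n in (1_000, 10_000, 100_000):
    se = math.sqrt(p * (1 - p) / n)  # standard error of the observed fraction
    print(f"n = {n:>7,}: 20% +/- {1.96*se:.1%} (95% interval)")
# The interval tightens as n grows, so the ratio won't "radically change"
# by chance alone -- it would take a real shift in the underlying factors.
```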

You don't think those factors vary across different geographic segments of the population? Every area of the country does not have the same makeup as New York regarding those factors.
 
https://statmodeling.stat.columbia....-in-stanford-study-of-coronavirus-prevalence/

It turns out that Stanford study isn’t as airtight as advertised.

ETA: LOL 2&2

Summary

I think the authors of the above-linked paper owe us all an apology. We wasted time and effort discussing this paper whose main selling point was some numbers that were essentially the product of a statistical error.

I’m serious about the apology. Everyone makes mistakes. I don’t think the authors need to apologize just because they screwed up. I think they need to apologize because these were avoidable screw-ups. They’re the kind of screw-ups that happen if you want to leap out with an exciting finding and you don’t look too carefully at what you might have done wrong.

Look. A couple weeks ago I was involved in a survey regarding coronavirus symptoms and some other things. We took the data and ran some regressions and got some cool results. We were excited. That’s fine. But we didn’t then write up a damn preprint and set the publicity machine into action. We noticed a bunch of weird things with our data, lots of cases were excluded for one reason or another, then we realized there were some issues of imbalance so we couldn’t really trust the regression as is, at the very least we’d want to do some matching first . . . I don’t actually know what’s happening with that project right now. Fine. We better clean up the data if we want to say anything useful. Or we could release the raw data, whatever. The point is, if you’re gonna go to all this trouble collecting your data, be a bit more careful in the analysis! Careful not just in the details but in the process: get some outsiders involved who can have a fresh perspective and aren’t invested in the success of your project.

Also, remember that reputational inference goes both ways. The authors of this article put in a lot of work because they are concerned about public health and want to contribute to useful decision making. The study got attention and credibility in part because of the reputation of Stanford. Fair enough: Stanford’s a great institution. Amazing things are done at Stanford. But Stanford has also paid a small price for publicizing this work, because people will remember that “the Stanford study” was hyped but it had issues. So there is a cost here. The next study out of Stanford will have a little less of that credibility bank to borrow from. If I were a Stanford professor, I’d be kind of annoyed. So I think the authors of the study owe an apology not just to us, but to Stanford. Not to single out Stanford, though. There’s also Cornell, which is known as that place with the ESP professor and that goofy soup-bowl guy who faked his data. And I teach at Columbia; our most famous professor is . . . Dr. Oz.
 
The Stanford study is a pre-print, and the USC study hasn't even published its methods yet from what I can tell. They will be heavily revised before publication, as they seem to have made some pretty big errors. I guess a danger of pre-prints is that the media just runs with the results as if they were somehow final or definitive.
 
So your argument that the social distancing is overblown is that a TON of people have the disease so they should have been allowed to go to work and spread it around, infecting way more people?

No, my argument was that a ton of people have not socially distanced, so they already spread it around and infected more people; but those people are generally asymptomatic, so the symptomatic rate is much lower than generally portrayed. So the remaining question to me is: does the true symptomatic rate justify the response, given the effects of the response?
 
Now this would really suck:
https://www.jpost.com/HEALTH-SCIENC...t-30-different-strains-new-study-finds-625333

"More than 30 different mutations were detected, of which 19 were previously undiscovered.
“Sars-CoV-2 has acquired mutations capable of substantially changing its pathogenicity,” Li wrote in the paper."

Not peer-reviewed, so let's hope errors are found. First question that comes to mind: if there are 30 strains, does that mean if you get one, you can still get the other 29? Same question as to vaccines.
 
https://statmodeling.stat.columbia....-in-stanford-study-of-coronavirus-prevalence/

It turns out that Stanford study isn’t as airtight as advertised.

ETA: LOL 2&2

I don't understand why these researchers aren't consulting with statisticians. These data need a detectability parameter and a false-positive parameter. It's a not-too-complicated hierarchical binomial regression model, where the probability of infection is a function of, first, the probability of detecting the disease given that it is present (one minus the false-negative rate), and second, the probability of getting a positive detection when the disease is truly absent (the false-positive rate). To estimate those nuisance probabilities you'd need at least three blood samples from the same 3000 people and test each sample independently. If you get a lot of 111 and 000 results, the probability of false results is low; if you get a lot of 101, 100, 011, etc., the accuracy of the test is problematic. We do this stuff in auditory animal detection surveys all the time. Imagine doing a survey for birds where you rely on listening to their songs to determine presence: it's not too hard to confuse a chipping sparrow song with a pine warbler, and it's pretty easy to get a false positive when one is present and singing and not the other, so we use repeated surveys and a hierarchical model to estimate the probabilities of false-negative non-detections and false-positive detections.
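
The repeated-sampling logic can be sketched with a quick simulation; the prevalence, sensitivity, and specificity below are invented numbers, and a real analysis would fit the hierarchical model rather than eyeball the counts:

```python
# Sketch of the repeated-testing idea: simulate three independent tests
# per person and look at the result patterns. Prevalence, sensitivity,
# and specificity are invented -- the point is that mixed patterns
# (101, 100, 011, ...) are what reveal the error rates.
import random
from collections import Counter

random.seed(1)
prevalence, sensitivity, specificity = 0.03, 0.90, 0.97  # all made up
n_people, n_tests = 3000, 3

patterns = Counter()
for _ in range(n_people):
    infected = random.random() < prevalence
    p_positive = sensitivity if infected else 1 - specificity
    result = "".join("1" if random.random() < p_positive else "0"
                     for _ in range(n_tests))
    patterns[result] += 1

for pattern, count in sorted(patterns.items(), reverse=True):
    print(f"{pattern}: {count}")
# Mostly 000 and 111 -> the test is behaving. Lots of 100/010/101/etc.
# -> the false-positive/false-negative rates are doing real work, and a
# hierarchical model is needed to separate them from true prevalence.
```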
 
Well fuck, "Andrew" has a blog and something to say, case closed I guess. Congrats on your milkwich.

The Stanford study is a non-peer-reviewed pre-print, meaning it has been posted online while the study undergoes scientific peer review at a journal; it is about equivalent to a blog post.
 
The Stanford study is a non-peer-reviewed pre-print, meaning it has been posted online while the study undergoes scientific peer review at a journal; it is about equivalent to a blog post.

1) I’ve never encountered somebody quite so proud of his relative ignorance as 2&2. It’s like a badge of pride with this one. These are Stats 101 errors. My methods students could have poked holes in this crap.

2) What’s up with this (new) trend, birdman? I see it a lot in Econ, but almost never in Sociology and Political Science. It seems like all of the partisans are using non-peer-reviewed pre-prints to publicize points drawn from what peer review quickly reveals are deeply problematic findings. Do you see stuff like this happening in your field?
 
I don't understand why these researchers aren't consulting with statisticians. These data need a detectability parameter and a false-positive parameter. It's a not-too-complicated hierarchical binomial regression model, where the probability of infection is a function of, first, the probability of detecting the disease given that it is present (one minus the false-negative rate), and second, the probability of getting a positive detection when the disease is truly absent (the false-positive rate). To estimate those nuisance probabilities you'd need at least three blood samples from the same 3000 people and test each sample independently. If you get a lot of 111 and 000 results, the probability of false results is low; if you get a lot of 101, 100, 011, etc., the accuracy of the test is problematic. We do this stuff in auditory animal detection surveys all the time. Imagine doing a survey for birds where you rely on listening to their songs to determine presence: it's not too hard to confuse a chipping sparrow song with a pine warbler, and it's pretty easy to get a false positive when one is present and singing and not the other, so we use repeated surveys and a hierarchical model to estimate the probabilities of false-negative non-detections and false-positive detections.

This is interesting info. It doesn't exactly work this way with antibody testing, though. You can have 3 samples, all with antibodies detected, and it can still be a false-positive result, because the problem isn't with the samples; rather, it's with the test itself or with what antibody presence means. Someone can have positive antibodies, but it doesn't mean they had, or were exposed to, the disease.
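
That distinction matters quantitatively. A quick sketch of why person-level (correlated) errors defeat repeated sampling, with invented rates:

```python
# Sketch of the cross-reactivity point: if a false positive is driven by
# something about the *person* (e.g., antibodies to a related virus),
# the error repeats across all three samples, so repeated testing can't
# catch it. All rates are invented.
import random
random.seed(2)

n_uninfected, fp_rate, n_tests = 10_000, 0.03, 3

def triple_positive(correlated: bool) -> int:
    count = 0
    for _ in range(n_uninfected):
        if correlated:
            # person-level error: one draw decides all three results
            results = [random.random() < fp_rate] * n_tests
        else:
            # sample-level error: each test errs independently
            results = [random.random() < fp_rate for _ in range(n_tests)]
        count += all(results)
    return count

print("independent errors, all 3 positive:", triple_positive(False))  # ~ fp_rate**3, near 0
print("correlated errors,  all 3 positive:", triple_positive(True))   # ~ fp_rate, hundreds
```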
 
2) What’s up with this (new) trend, birdman? I see it a lot in Econ, but almost never in Sociology and Political Science. It seems like all of the partisans are using non-peer-reviewed pre-prints to publicize points drawn from what peer review quickly reveals are deeply problematic findings. Do you see stuff like this happening in your field?

This isn't usually an issue in medicine; I think the pandemic and sense of urgency has resulted in this issue. For most medical journals, it would be a violation of the terms of the journal to put the article on the web before it was peer-reviewed and published.
 