AE 08: Understanding Probabilities with COVID-19 Rapid Self-Administered Tests

Suggested answers


Goal

Learn to calculate and interpret the probability of having a disease given a positive test result using sensitivity, specificity, and prevalence data.

Scenario:

You are given data on COVID-19 rapid self-administered tests and population statistics from Pima County, Arizona, as presented in Lecture 12.

Understand the Definitions:

  • Sensitivity $P(T \mid D)$: Probability of a positive test given the person has the disease.

  • Specificity $P(T^c \mid D^c)$: Probability of a negative test given the person does not have the disease.

  • Prevalence $P(D)$: Probability that a randomly selected person has the disease.

Formulate Bayes’ Rule:

$$P(D \mid T) = \frac{P(T \text{ and } D)}{P(T)}$$

We know that:

$$P(T \text{ and } D) = P(T \mid D)\,P(D)$$

And:

$$P(T) = P(T \mid D)\,P(D) + P(T \mid D^c)\,P(D^c)$$

Where:

$$P(T \mid D^c) = 1 - P(T^c \mid D^c)$$
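
The relationships above can be combined into a single calculation. The following is a minimal Python sketch, not part of the original exercise; the function name `posterior_given_positive` and the example inputs are hypothetical and chosen only for illustration.

```python
def posterior_given_positive(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Return P(D|T) using Bayes' rule (hypothetical helper for illustration)."""
    p_t_given_not_d = 1.0 - specificity               # P(T|Dc) = 1 - P(Tc|Dc)
    p_not_d = 1.0 - prevalence                        # P(Dc) = 1 - P(D)
    p_t_and_d = sensitivity * prevalence              # P(T and D) = P(T|D) P(D)
    p_t = p_t_and_d + p_t_given_not_d * p_not_d       # P(T), law of total probability
    return p_t_and_d / p_t


# Illustrative inputs only; these are not the values from the exercise.
example = posterior_given_positive(sensitivity=0.9, specificity=0.95, prevalence=0.1)
print(round(example, 3))
```

Keeping the intermediate quantities as named variables mirrors the four formulas above, which may make it easier to check each step by hand.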

Exercises

Using the given data, calculate the probability that an individual has COVID-19 given a positive test result, $P(D \mid T)$.

  1. Substitute the Given Values:

    • Sensitivity: $P(T \mid D) = 0.642$
    • Specificity: $P(T^c \mid D^c) = 0.998$
    • Prevalence: $P(D) = 0.087$ among persons aged 10 years and older.
  2. Calculate the Complementary Probabilities

    • $P(T \mid D^c)$: Probability of a positive test given no disease.

      $$P(T \mid D^c) = 1 - P(T^c \mid D^c)$$

      $$1 - 0.998 = 0.002$$

    • $P(D^c)$: Probability of not having the disease.

      $$P(D^c) = 1 - P(D)$$

      $$1 - 0.087 = 0.913$$

  3. Calculate the Probability of a Positive Test P(T):

$$P(T) = P(T \mid D)\,P(D) + P(T \mid D^c)\,P(D^c)$$

$$P(T) = (0.642 \times 0.087) + (0.002 \times 0.913)$$

$$P(T) = 0.055854 + 0.001826 = 0.05768$$

  4. Calculate the Posterior Probability P(D|T):

$$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + (1 - P(T^c \mid D^c))(1 - P(D))}$$

$$P(D \mid T) = \frac{0.055854}{0.05768} \approx 0.968$$
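
One way to sanity-check the result above is to restate it as counts in a hypothetical cohort. The sketch below assumes a cohort of 100,000 people (an arbitrary size chosen for illustration, not a figure from the lecture) and uses the same sensitivity, specificity, and prevalence as the calculation.

```python
# Hypothetical cohort size, chosen only to give easy-to-read counts.
population = 100_000

# Values used in the worked calculation above.
sensitivity = 0.642   # P(T|D)
specificity = 0.998   # P(Tc|Dc)
prevalence = 0.087    # P(D)

diseased = population * prevalence                 # people with the disease
healthy = population - diseased                    # people without the disease
true_positives = diseased * sensitivity            # diseased people who test positive
false_positives = healthy * (1 - specificity)      # healthy people who test positive

# Share of all positive tests that are true positives, i.e. P(D|T).
posterior = true_positives / (true_positives + false_positives)

print(f"true positives:  {true_positives:.1f}")
print(f"false positives: {false_positives:.1f}")
print(f"P(D|T):          {posterior:.3f}")
```

Under these assumptions, true positives outnumber false positives by roughly 30 to 1, which is why the posterior probability is close to 1.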

Discussion Questions:

  1. Is this calculation surprising?

    • Considering the given sensitivity, specificity, and prevalence, is the high probability of having the disease given a positive test result unexpected? Why or why not?
  • No. Because specificity is very high (0.998), false positives are rare, so a positive result is likely to be a true positive.
  2. What is the explanation?

    • Explain why the probability of having the disease given a positive test result is so high. Consider the impact of sensitivity, specificity, and prevalence.
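  • A specificity of 0.998 means that only about 0.2% of people without the disease test positive. Even with moderate sensitivity (0.642) and a prevalence of 0.087, true positives (0.055854 of the population) far outnumber false positives (0.001826), which yields the high posterior probability.
  3. Was this calculation actually reasonable to perform?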

    • Discuss whether it is reasonable to calculate the probability of having the disease based on the given data. Are there any limitations or assumptions in this calculation?
  • Yes, but the result rests on assumptions, such as the prevalence estimate being accurate for the person tested and the sensitivity and specificity estimates being free of bias, which limit its real-world applicability.
  4. What if we tested in a different population, such as high-risk individuals?

    • How might the probability of having the disease given a positive test result change if the test was administered to a population with a higher prevalence of COVID-19?
  • The posterior probability $P(D \mid T)$ would increase: with higher prevalence, a larger share of positive tests are true positives.
  5. What if we were to test a random individual in a county where the prevalence of COVID-19 is approximately 25%?

    • Recalculate the probability of having the disease given a positive test result for a population with a 25% prevalence of COVID-19. How does this compare to the original calculation?
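  • If prevalence = 25%:

$$P(T) = (0.642 \times 0.25) + (0.002 \times 0.75) = 0.1605 + 0.0015 = 0.162$$

$$P(D \mid T) = \frac{0.1605}{0.162} \approx 0.991$$

  This is higher than the 0.968 obtained with a prevalence of 0.087.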
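
The effect of prevalence discussed in the last two questions can also be explored numerically. The following is a minimal Python sketch under stated assumptions: the helper name `posterior_given_positive` is hypothetical, the sensitivity (0.642) and specificity (0.998) are the ones used in the calculations above, and the prevalence values 0.01 and 0.50 are extra illustrative points that do not come from the exercise.

```python
def posterior_given_positive(sensitivity, specificity, prevalence):
    """Return P(D|T) from Bayes' rule (hypothetical helper for illustration)."""
    true_pos = sensitivity * prevalence                 # P(T|D) P(D)
    false_pos = (1 - specificity) * (1 - prevalence)    # P(T|Dc) P(Dc)
    return true_pos / (true_pos + false_pos)


# 0.087 and 0.25 come from the exercise; 0.01 and 0.50 are hypothetical extras.
prevalences = [0.01, 0.087, 0.25, 0.50]
posteriors = [posterior_given_positive(0.642, 0.998, p) for p in prevalences]

for p, post in zip(prevalences, posteriors):
    print(f"prevalence = {p:.3f} -> P(D|T) = {post:.3f}")
```

With sensitivity and specificity held fixed, the posterior probability rises as prevalence rises, consistent with the answers above.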