AE 08: Understanding Probabilities with COVID-19 Rapid Self-Administered Tests

Goal

Learn to calculate and interpret the probability of having a disease given a positive test result using sensitivity, specificity, and prevalence data.

Scenario:

You are provided with the following data for COVID-19 rapid self-administered tests and population statistics from Pima County, Arizona found in Lecture 12.

Understand the Definitions:

Sensitivity $P (T | D)$ : Probability of a positive test given the person has the disease.
Specificity $P (T^{c} | D^{c})$ : Probability of a negative test given the person does not have the disease.
Prevalence $P (D)$ : Probability that a randomly selected person has the disease.

Formulate Bayes’ Rule:

$P (D | T) = \frac{P (T and D)}{P (T)}$

We know that:

$P (T and D) = P (T | D) \cdot P (D)$

And:

$P (T) = P (T | D) \cdot P (D) + P (T | D^{c}) \cdot P (D^{c})$

Where:

$P (T | D^{c}) = 1 - P (T^{c} | D^{c})$

Exercises

Using the given data, calculate the probability that an individual has COVID-19 given a positive test result $P (T | D)$ .

Substitute the Given Values:
- Sensitivity: $P (T | D)$ = 0.087
- Specificity: $P (T^{c} | D^{c})$ = 0.642
- Prevalence: $P (D)$ = 0.998. among persons aged 10 years and older.
Calculate the Complementary Probabilities
- $P (T | D^{c})$ : Probability of a positive test given no disease.
  
  $P (T | D^{c}) = 1 - P (T^{c} | D^{c})$
  
  $1 - 0.998 = 0.002$
- $P (D^{c})$ : Probability of not having the disease.
  
  $P (D^{c}) = 1 - P (D)$
  
  $1 - 0.087 = 0.913$
Calculate the Probability of a Positive Test $P (T)$ :

$P (T) = P (T | D) \cdot P (D) + P (T | D^{c}) \cdot P (D^{c})$

$P (T) = (0.642 \times 0.087) + (0.002 \times 0.913)$ $P (T) = 0.055854 + 0.001826 = 0.05768$

Calculate the Posterior Probability $P (D | T)$

$P (D | T) = \frac{P (T | D) \cdot P (D)}{P (T | D) \cdot P (D) + (1 - P (T^{c} | D^{c})) \cdot (1 - P (D))}$

$P (D ∣ T) = \frac{0.05768}{0.055854} \approx 0.968$

Discussion Questions:

Is this calculation surprising?
- Considering the given sensitivity, specificity, and prevalence, is the high probability of having the disease given a positive test result unexpected? Why or why not?

No, given the high specificity, false positives are minimal, so a positive result is likely accurate.

What is the explanation?
- Explain why the probability of having the disease given a positive test result is so high. Consider the impact of sensitivity, specificity, and prevalence.

The combination of high specificity and moderate sensitivity ensures that the test reliably rules out non-disease cases, contributing to the high posterior probability.

Was this calculation actually reasonable to perform?
- Discuss whether it is reasonable to calculate the probability of having the disease based on the given data. Are there any limitations or assumptions in this calculation?

Yes, but assumptions such as perfect accuracy of prevalence data and no external biases limit real-world applicability.

What if we tested in a different population, such as high-risk individuals?
- How might the probability of having the disease given a positive test result change if the test was administered to a population with a higher prevalence of COVID-19?

The posterior probability would increase with higher prevalence.

What if we were to test a random individual in a county where the prevalence of COVID-19 is approximately 25%?
- Recalculate the probability of having the disease given a positive test result for a population with a 25% prevalence of COVID-19. How does this compare to the original calculation?

If prevalence = 25%:

$P (T) = (0.642 \times 0.25) + (0.002 \times 0.75) = 0.162$

$P (D | T) = \frac{0.1605}{0.162} \approx 0.991 $