Health warning: I am not a mathematician. That said, here is a mathematical question:
Suppose there is a 99% accurate test for a medical condition – say a symptomless infection. You take the test and get a positive result. What are your chances of having the infection?
The obvious answer might seem to be 99%. But the obvious answer is wrong. The accuracy of the test is only half the information you need to answer the question. You also need to know how common the infection is. Say it occurs once in every hundred people. On average, then, if you test a hundred people, one of whom has the infection, you will get about two positive results: one that is accurate and one that is inaccurate, i.e., a false positive. Under those circumstances, a positive result means that you have a ½, or 50%, chance of having the infection (see appendix for further discussion). Under other circumstances, namely when the infection is more common, a positive result on an 80% or 90% accurate test would mean that you have a higher chance of having the infection. Here’s a graphic to illustrate this apparent paradox:
The x-axis represents the infection rate per 10,000 of the population; the y-axis represents one’s chance of being infected given a positive result, from 0%, for no chance, to 100%, for complete certainty. The coloured curves represent tests of different accuracy: 1% accurate, for the bottom curve, and 99% accurate, for the uppermost curve. The curves between the two represent tests of 10% to 90% accuracy. Note how the curves mirror each other: the 99% accurate test rises towards certainty very quickly, but takes a long time to finally get there. The 1% accurate test stays near complete uncertainty for a long time, then finally rises rapidly towards certainty. In other words, a positive result on a 99% accurate test is equivalent to a negative result on a 1% accurate test, and vice versa. Ditto for the 90% and 10% accurate tests, and so on. But a positive (or negative) result on a 50% accurate test is useless, because it never tells you anything new: your chance of being infected, given a positive result, is the same as the rate of infection in the population. And when exactly half the population is infected, your chance of being infected, given a positive result, is the same as the accuracy of the test, whether it’s 1%, 50%, or 99%.
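The curves can be reproduced with a few lines of Python. This is only a sketch, not the code behind the graphic: the function names `chance_if_positive` and `chance_if_negative` are made up for illustration, and it assumes that "accuracy" means the test is right equally often for infected and non-infected people.

```python
def chance_if_positive(rate, accuracy):
    """Chance of being infected after a positive result.

    rate: fraction of the population infected, e.g. 0.01 for 1 in 100.
    accuracy: fraction of results that are correct, assumed here to be
    the same for infected and non-infected people.
    """
    true_positives = rate * accuracy
    false_positives = (1 - rate) * (1 - accuracy)
    return true_positives / (true_positives + false_positives)


def chance_if_negative(rate, accuracy):
    """Chance of being infected after a negative result (same assumptions)."""
    false_negatives = rate * (1 - accuracy)
    true_negatives = (1 - rate) * accuracy
    return false_negatives / (false_negatives + true_negatives)


# 1 in 100 infected, 99% accurate test: roughly 0.5
print(chance_if_positive(0.01, 0.99))

# A 50% accurate test hands back the infection rate: roughly 0.2
print(chance_if_positive(0.20, 0.50))

# Half the population infected: the answer matches the accuracy, roughly 0.8
print(chance_if_positive(0.50, 0.80))

# Positive on a 99% test versus negative on a 1% test: roughly equal
print(chance_if_positive(0.10, 0.99), chance_if_negative(0.10, 0.01))
```

The results are floating-point numbers, so the last few digits may not be exact. The functions divide by zero in degenerate cases, such as a perfectly accurate test in a population with no infection at all.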
Here is a table illustrating the same points:
Accuracy of test →
Infection rate ↓ | 1% | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | 99% |
1/100 | <0.1% | 0.1% | 0.3% | 0.4% | 0.7% | 1% | 1.5% | 2.3% | 3.9% | 8.3% | 50% |
10/100 | 0.1% | 1.2% | 2.7% | 4.5% | 6.9% | 10% | 14.3% | 20.6% | 30.8% | 50% | 91.7% |
20/100 | 0.3% | 2.7% | 5.9% | 9.7% | 14.3% | 20% | 27.3% | 36.8% | 50% | 69.2% | 96.1% |
30/100 | 0.4% | 4.5% | 9.7% | 15.5% | 22.2% | 30% | 39.1% | 50% | 63.2% | 79.4% | 97.7% |
40/100 | 0.7% | 6.9% | 14.3% | 22.2% | 30.8% | 40% | 50% | 60.9% | 72.7% | 85.7% | 98.5% |
50/100 | 1% | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | 99% |
60/100 | 1.5% | 14.3% | 27.3% | 39.1% | 50% | 60% | 69.2% | 77.8% | 85.7% | 93.1% | 99.3% |
70/100 | 2.3% | 20.6% | 36.8% | 50% | 60.9% | 70% | 77.8% | 84.5% | 90.3% | 95.5% | 99.6% |
80/100 | 3.9% | 30.8% | 50% | 63.2% | 72.7% | 80% | 85.7% | 90.3% | 94.1% | 97.3% | 99.7% |
90/100 | 8.3% | 50% | 69.2% | 79.4% | 85.7% | 90% | 93.1% | 95.5% | 97.3% | 98.8% | 99.9% |
99/100 | 50% | 91.7% | 96.1% | 97.7% | 98.5% | 99% | 99.3% | 99.6% | 99.7% | 99.9% | >99.9% |
100/100 | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
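For anyone who wants to check the table, here is a rough Python sketch that recomputes a row at a time. The names `ACCURACY_PERCENTS`, `chance_infected` and `table_row` are invented for this example, and the code assumes the test is equally accurate for infected and non-infected people. It uses exact fractions so that cells such as 50% come out exactly.

```python
from fractions import Fraction

# Column headings of the table, as whole-number percentages.
ACCURACY_PERCENTS = [1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 99]


def chance_infected(infected_per_100, accuracy_percent):
    """Exact chance of infection given a positive result, as a Fraction."""
    rate = Fraction(infected_per_100, 100)
    accuracy = Fraction(accuracy_percent, 100)
    true_positives = rate * accuracy
    false_positives = (1 - rate) * (1 - accuracy)
    return true_positives / (true_positives + false_positives)


def table_row(infected_per_100):
    """One table row, with every cell rounded to one decimal place."""
    cells = [
        f"{float(100 * chance_infected(infected_per_100, a)):.1f}%"
        for a in ACCURACY_PERCENTS
    ]
    return f"{infected_per_100}/100 | " + " | ".join(cells) + " |"


for infected_per_100 in (1, 10, 50, 99):
    print(table_row(infected_per_100))
```

The formatting differs slightly from the table above: every cell is printed to one decimal place, so 50% appears as 50.0%, and very small or very large values are rounded rather than shown as "less than" or "greater than".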
Appendix
We’ve seen that we have to take false positives into account, but what about false negatives? Suppose that the rate of infection is 1 in 100 and the accuracy of the test is 99%, for infected and non-infected people alike. If the population is 10,000, then 100 people will have the infection and 9,900 will not. If the whole population is tested, on average 100 × 99% = 99 of the infected people will get an accurate positive result and 100 × 1% = 1 will get an inaccurate negative result, i.e., a false negative. Similarly, 9,900 × 1% = 99 of the non-infected people will get a false positive. So there will be 99 + 99 = 198 positive results, of which 99 are accurate. 99/198 = 1/2 = 50%.
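The same count can be written as a single formula. The symbols are my own shorthand rather than anything standard to this article: p is the infection rate and a is the accuracy of the test, assumed to be the same for infected and non-infected people. The population size cancels out, which is why 10,000 could have been any number.

```latex
\[
P(\text{infected} \mid \text{positive})
  = \frac{p\,a}{p\,a + (1 - p)(1 - a)}
  = \frac{0.01 \times 0.99}{0.01 \times 0.99 + 0.99 \times 0.01}
  = \frac{0.0099}{0.0198}
  = \frac{1}{2}
\]
```

The false negative does not appear in the formula: it affects what a negative result means, not what a positive one means.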