Diagnostic tests: how to estimate the positive predictive value

Volume 2 Issue 4 December 2015

Journal Article Editor's Choice

Annette M. Molinaro

Department of Neurological Surgery, University of California, San Francisco, San Francisco, California; Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, California

Corresponding Author: Annette M. Molinaro, PhD, UCSF Department of Neurosurgery, 400 Parnassus Ave A850b, Room A 808, San Francisco CA 94143-0372 (annette.molinaro@ucsf.edu).

Neuro-Oncology Practice, Volume 2, Issue 4, December 2015, Pages 162–166, https://doi.org/10.1093/nop/npv030

Received: 12 May 2015

Published: 07 September 2015

Abstract

When a patient receives a positive test result from a diagnostic test they assume they have the disease. However, the positive predictive value (PPV), ie the probability that they have the disease given a positive test result, is rarely equal to one. To assist their patients, doctors must explain the chance that they do in fact have the disease. However, physicians frequently miscalculate the PPV as the sensitivity and/or misinterpret the PPV, which results in increased anxiety in patients and generates unnecessary tests and consultations. The reasons for this miscalculation as well as three ways to calculate the PPV are reviewed here.

Keywords: diagnostic tests, false positive rate, positive predictive value, sensitivity, statistics

Prevalence of glioma is 0.003%. A patient comes into the clinic complaining of headaches and memory loss. A new blood test for diagnosis of glioma is available. The patient tests positive. From the literature (see Table 1) you know that the sensitivity of the test is 96.7% and the false positive rate is 4%. What is the probability that this patient who tested positive actually has glioma?

This is an understandably difficult problem, since it involves conditional probabilities (sensitivity, specificity, and positive predictive value [PPV]) and varying reference populations (those with disease and those without). Nonetheless, an informed interpretation of diagnostic tests is increasingly important, especially as novel biomarkers are used in the detection of disease. Unfortunately, studies have shown that more than 75% of doctors answer questions similar to the one above incorrectly.1–5

Table 1. Fictional table from literature.

                          Disease Status
Test Result      Glioma Present   Glioma Absent   Total
Positive               29                2          31
Negative                1               48          49
Total                  30               50          80

In this data, the prevalence of disease is P(D) = 30/80 = 0.375; the sensitivity is P(Test positive | Glioma present) = 29/30 = 0.967; the false positive rate is P(Test positive | Glioma absent) = 2/50 = 0.04. See Table 2 for formulas.
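The quantities quoted for Table 1 can be checked with a few lines of arithmetic. The sketch below is only an illustration in Python; the variable names are our own and are not part of the article.

```python
# Counts from Table 1 (fictional case-control data).
true_pos, false_pos = 29, 2   # positive tests: glioma present, glioma absent
false_neg, true_neg = 1, 48   # negative tests: glioma present, glioma absent

n_disease = true_pos + false_neg       # 30 cases
n_no_disease = false_pos + true_neg    # 50 controls
n_total = n_disease + n_no_disease     # 80 subjects

sensitivity = true_pos / n_disease              # P(test positive | glioma present)
false_positive_rate = false_pos / n_no_disease  # P(test positive | glioma absent)
specificity = true_neg / n_no_disease           # P(test negative | glioma absent)
table_prevalence = n_disease / n_total          # prevalence in the table, not in the population

print(f"sensitivity         = {sensitivity:.3f}")
print(f"false positive rate = {false_positive_rate:.3f}")
print(f"specificity         = {specificity:.3f}")
print(f"table prevalence    = {table_prevalence:.3f}")
```

Note that the false positive rate and the specificity share the same denominator (the 50 controls) and therefore sum to one.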

The goal of this review is to ease the calculation of conditional probabilities (eg, the PPV in the example above) by explaining three ways to solve them: conditional probability equations, tree diagrams (with probabilities), and natural frequencies. You have the option of reviewing all three or just one or two of the approaches. Any of the three will get you to the correct answer. We begin with the calculation via conditional probabilities and follow with building tree diagrams for a visual representation. Subsequently, we illustrate a way to translate this information via natural frequencies for you and your patients so that they too understand the meaning of a positive or negative test result.

Approach 1: Conditional Probability Equations

Conditional probabilities are important in the interpretation of diagnostic tests because the test results influence our understanding of whether the patient has a disease. However, the test results are not synonymous with the presence or absence of disease. The conditional probabilities that we need to understand are sensitivity, specificity, PPV, and negative predictive value (NPV). These probabilities are defined by two events: the presence of disease and a positive test result.

Sensitivity is defined as the probability of a positive test result given the presence of disease, written as: P(positive test | disease present). The vertical line can be read as “given.” Specificity is defined as the probability of a negative test result given absence of disease, ie, P(negative test | disease absent). PPV is defined as the probability of the presence of disease given a positive test result, ie, P(disease present | positive test). NPV is defined as the probability of the absence of disease given a negative test result, ie, P(disease absent | negative test). Given the similarities in calculation between PPV and NPV we will only focus on the former here.

There are two important things to know about conditional probabilities. First, conditional probabilities are not reciprocal, ie,

$$P(\text{Event A} \mid \text{Event B}) \neq P(\text{Event B} \mid \text{Event A}).$$

This is one of the most common errors that doctors make when calculating PPV – they simply equate it with the test's sensitivity.

Second, you can write a conditional probability as:

$$P(\text{Event A} \mid \text{Event B}) = \frac{P(\text{Event A and Event B})}{P(\text{Event B})}.$$

The fraction on the right matters because it is how we will connect the sensitivity to the PPV; this will become clearer once we rewrite the numerator on the right-hand side. To do so, we need the multiplication rule, which gives the probability that both events occur, ie, P(Event A and Event B). This can be written as:

$$P(\text{Event A and Event B}) = P(\text{Event B}) \times P(\text{Event A} \mid \text{Event B})$$

or with our events as:

$$P(\text{disease present and positive test}) = P(\text{disease present}) \times P(\text{positive test} \mid \text{disease present})$$

which is equivalent to:

$$P(\text{True positive}) = \text{Prevalence} \times \text{Sensitivity}.$$

Similarly the probability of a false positive can be written as:

$$P(\text{False positive}) = P(\text{disease absent and positive test}) = P(\text{disease absent}) \times P(\text{positive test} \mid \text{disease absent}) = (1 - \text{Prevalence}) \times \text{False Positive Rate}$$

Now we can connect the PPV to the sensitivity:

$$\text{PPV} = P(\text{disease present} \mid \text{positive test})$$

Expressed as the other form of conditional probability, we can see this as:

$$\text{PPV} = \frac{P(\text{disease present and positive test})}{P(\text{positive test})}$$

And by applying the multiplication rule, we can rewrite this as:

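$$\text{PPV} = \frac{P(\text{disease present}) \times P(\text{positive test} \mid \text{disease present})}{P(\text{positive test})} = \frac{\text{Prevalence} \times \text{Sensitivity}}{P(\text{positive test})}$$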

In the denominator, a positive test can come from those patients with the presence of disease (true positives) and those with the absence of disease (false positives). Therefore we can write: P(positive test) = P(true positive) + P(false positive). The two probabilities on the right were defined above. We can continue the calculation to get the PPV:

$$\text{PPV} = \frac{\text{Prevalence} \times \text{Sensitivity}}{P(\text{true positive}) + P(\text{false positive})}$$

In the example of the test for glioma above, we would substitute the values for prevalence, sensitivity, and false positives, and calculate:

$$\text{PPV} = \frac{(0.00003)(0.967)}{(0.00003)(0.967) + (1 - 0.00003)(0.04)} = 0.000725$$

Thus, the chance that the patient has glioma given a positive test result is 0.07%.
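For readers who prefer to check the arithmetic by machine, the derivation above can be sketched as a small Python function. The function and argument names are illustrative choices of ours; the inputs are the prevalence, sensitivity, and false positive rate from the example.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """PPV = P(disease present | positive test), built from the multiplication rule."""
    p_true_positive = prevalence * sensitivity                  # P(disease present and positive test)
    p_false_positive = (1 - prevalence) * false_positive_rate   # P(disease absent and positive test)
    # P(positive test) is the sum of the two ways a positive test can arise.
    return p_true_positive / (p_true_positive + p_false_positive)


ppv = positive_predictive_value(
    prevalence=0.00003,        # 0.003% expressed as a proportion
    sensitivity=0.967,
    false_positive_rate=0.04,
)
print(f"PPV = {ppv:.6f} ({ppv:.2%})")  # prints "PPV = 0.000725 (0.07%)"
```

The prevalence must be entered as a proportion (0.00003), not as a percentage (0.003); mixing the two is an easy way to be off by a factor of 100.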

There are many similarities between a 2 × 2 table (Table 1) and conditional probabilities. You can see from Table 2 how to calculate sensitivity, specificity, and PPV from a 2 × 2 table. However, PPV can only be calculated from a 2 × 2 table if the prevalence [P(Disease present) = number of people with disease/number of people in population (or sample)] in the table is the same as that in the population. Typically the reason the prevalence in a 2 × 2 table does not reflect the population prevalence is because the table is based on case-control data in which a specified number of cases (patients with disease) and controls (patients without disease) are studied for the purpose of finding associations. For example, in Table 1 the hypothetical data are based on a case-control study with 30 cases and 50 controls and thus the prevalence of disease is (30/80)=37.5%. Using the same calculations as above but with a prevalence of 37.5%, the PPV equals 94%, which is incorrect, as we know the prevalence in the population is 0.003%. Thus, if the prevalence of the disease in a 2 × 2 table is not the same as in the population you cannot calculate the PPV (or NPV).
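The effect of using the wrong prevalence can be made concrete with a short Python sketch. It applies the same formula twice, once with the case-control prevalence from Table 1 and once with the population prevalence; the function name and default arguments are our own illustrative assumptions.

```python
def ppv_at(prevalence, sensitivity=0.967, false_positive_rate=0.04):
    """PPV for a given prevalence, holding the test characteristics fixed."""
    p_true_positive = prevalence * sensitivity
    p_false_positive = (1 - prevalence) * false_positive_rate
    return p_true_positive / (p_true_positive + p_false_positive)


scenarios = [
    ("case-control table (30/80)", 30 / 80),
    ("population (0.003%)", 0.00003),
]
for label, prevalence in scenarios:
    print(f"{label:28s} prevalence = {prevalence:.5f}  PPV = {ppv_at(prevalence):.4%}")
```

The test characteristics are identical in both rows; only the prevalence changes, and the PPV moves from roughly 94% to well under 1%.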

Table 2. A 2 × 2 table with test results in the rows and disease status in the columns

[Table 2 image not reproduced.]

Sensitivity, specificity, and false positive/negative rates can be calculated from any such 2 × 2 table. Positive and negative predictive values can only be calculated from a 2 × 2 table if the prevalence of disease in the table is the same as that in the population. It should be noted that the false positive rate is P(positive test | disease absent), while the false positive cell in the 2 × 2 table is P(positive test and disease absent).

Approach 2: Tree Diagrams

Another way to display the data is in a tree diagram3,6 (Fig. 1). Starting on the left at the “Individual,” the first split corresponds to disease status: the patient either has disease or does not. The top line going from “Individual” to “Disease” shows the prevalence of disease while the bottom line shows the probability of not having the disease, 1 − Prevalence. Similar to disease status, the test result can either be positive or negative. The line between “Disease” and “Positive test” displays the sensitivity, ie, P(positive test | disease present), whereas the line between “No Disease” and “Negative test” shows the specificity, ie, P(negative test | disease absent). The conditional probabilities associated with the other two lines, the false positive/negative rates, can be written similarly. Note that the two lines coming from the same box must sum to one, eg, prevalence + (1 − prevalence) = 1. That is also true for sensitivity and the false negative rate as well as the false positive rate and specificity. The four squares of the 2 × 2 table can also be calculated on the far right of the tree diagram by using the multiplication rule, eg,

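$$P(\text{true positive}) = P(\text{disease present and positive test}) = P(\text{disease present}) \times P(\text{positive test} \mid \text{disease present}) = \text{Prevalence} \times \text{Sensitivity}$$

$$P(\text{false positive}) = P(\text{disease absent and positive test}) = P(\text{disease absent}) \times P(\text{positive test} \mid \text{disease absent}) = (1 - \text{Prevalence}) \times \text{False positive rate}.$$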

We can display the information from the original question in a tree diagram to help calculate the PPV. In Fig. 2, the known information is in bold and the inferred information is in italic. Note that the people with a positive test are either true positives (disease present and a positive test) or false positives (no disease and a positive test). Because the prevalence in the tree diagram is considered in calculating true positives a simpler way of calculating the PPV is:

$$\text{PPV} = P(\text{Disease} \mid \text{Positive test})$$

Or, as expressed as the other form of conditional probability:

$$\text{PPV} = \frac{P(\text{Disease and Positive test})}{P(\text{Positive test})} = \frac{P(\text{True Positive})}{P(\text{True Positive}) + P(\text{False Positive})}$$

If we substitute numbers from the tree diagram, we can calculate:

$$\text{PPV} = \frac{0.000029}{0.000029 + 0.04} = 0.000725$$

Thus, the chance that the patient has glioma given a positive test result is 0.07%. This PPV should be clearly communicated to the patient. As it can be difficult to explain conditional probabilities to patients, we will explore an alternative option.
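The tree diagram can also be mirrored in code: each leaf on the far right is the product of the two branch probabilities leading to it. The following Python sketch uses a dictionary keyed by (disease status, test result); that data layout is our own choice for illustration, not something taken from the figures.

```python
prevalence, sensitivity, false_positive_rate = 0.00003, 0.967, 0.04

# First split: disease status. Second split: test result given disease status.
# Each leaf is the product of the probabilities along its two branches.
leaves = {
    ("disease", "positive"): prevalence * sensitivity,                          # true positive
    ("disease", "negative"): prevalence * (1 - sensitivity),                    # false negative
    ("no disease", "positive"): (1 - prevalence) * false_positive_rate,         # false positive
    ("no disease", "negative"): (1 - prevalence) * (1 - false_positive_rate),   # true negative
}

# The four leaves cover every possible outcome, so they should sum to one.
total = sum(leaves.values())

# PPV: among the leaves ending in a positive test, the share that has disease.
p_positive = leaves[("disease", "positive")] + leaves[("no disease", "positive")]
ppv = leaves[("disease", "positive")] / p_positive

for (status, result), probability in leaves.items():
    print(f"{status:10s} {result:8s} {probability:.8f}")
print(f"sum of leaves = {total:.6f}, PPV = {ppv:.6f}")
```

Checking that the leaves sum to one is a quick guard against a mislabelled branch, for example entering the specificity where the false positive rate belongs.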

Fig. 1. Tree diagram representing all possible outcomes of a diagnostic test. P(A) is the probability of Event A. P(B|A) is the conditional probability of Event B given Event A. FPR is the false positive rate = P(Positive test | Disease absent). FNR is the false negative rate = P(Negative test | Disease present).

Fig. 2. Tree diagram representing all possible outcomes and conditional probabilities given in the hypothetical diagnostic test example. Text in bold is given in the example. Text in italic is calculated from the given information in bold. FPR is the false positive rate = P(Positive test | Disease absent). FNR is the false negative rate = P(Negative test | Disease present).

Approach 3: Natural frequencies

To help patients understand conditional probabilities you can translate them to natural frequencies with or without the use of a tree diagram.1,3 Natural frequencies are the way most people are presented with statistics and, thus, make interpretation simpler. We can directly translate the original question into natural frequencies and illustrate the ease with which the question can be answered.

Three out of every 100 000 people have glioma. A patient comes into the clinic complaining of headaches and memory loss. A new blood test for diagnosis of glioma is available. She tests positive. From the literature you know that of the three people out of 100 000 with glioma, all three will likely have a positive blood test. Of the 99 997 people without glioma, 4000 will still have a positive blood test. Of the patients with a positive blood test, how many actually have glioma?

Now the answer is much more straightforward to calculate: it is 3/(3+4000)=0.0007. Again, this is the PPV, the chance that a patient with a positive test result actually has glioma.
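The translation from probabilities to natural frequencies can be sketched in Python as follows. The reference population of 100 000 and the rounding to whole people follow the example above; the variable names are our own.

```python
population = 100_000
prevalence, sensitivity, false_positive_rate = 0.00003, 0.967, 0.04

# Convert each probability into a whole number of people in the same reference group.
with_glioma = round(population * prevalence)                    # 3 people
without_glioma = population - with_glioma                       # 99 997 people
true_positives = round(with_glioma * sensitivity)               # 2.901 rounds to 3
false_positives = round(without_glioma * false_positive_rate)   # 3999.88 rounds to 4000

ppv = true_positives / (true_positives + false_positives)
print(f"{true_positives} of {true_positives + false_positives} positive tests "
      f"are true positives (PPV = {ppv:.4f})")
```

Because the counts are rounded to whole people, the result (3/4003) differs slightly from the exact value obtained with the conditional probability equations; this is the translation step where care is needed.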

One of the reasons natural frequencies make this problem easier to understand is that they use the same reference group. For example, three patients (with a positive blood test and glioma) and 4000 patients (with a positive blood test and no glioma) both refer to the same group of 100 000 people. In contrast, in the original question the sensitivity refers to the group of three people with glioma while the specificity (and the false positive rate) refers to the group of 99 997 people without glioma. A pitfall of using natural frequencies is that mistakes can be made in translating the conditional probabilities to frequencies, and thus caution must be used.

Conclusion

Positive predictive value is the probability that a person who receives a positive test result actually has the disease. This is what patients want to know. Nonetheless, physicians frequently miscalculate and/or misinterpret the PPV, which results in increased anxiety in patients and generates unnecessary tests and consultations. One of the reasons for miscalculation is that conditional probabilities are not reciprocal, meaning that P(B|A) ≠ P(A|B), or in our example that sensitivity does not equal PPV. A second reason is that the PPV relies on the prevalence of disease and therefore the PPV cannot be calculated from a data set that does not have the same prevalence as the population. Finally, conditional probabilities can be conceptual and many studies have shown that reframing the problem in natural frequencies (with or without tree diagrams) increases the ability of a physician to correctly calculate the PPV.1,3

Here we have shown three ways to calculate the PPV: conditional probabilities, tree diagrams, and natural frequencies. In all three, we show that the PPV of the hypothetical blood test equals 0.07%. The implication of this is crucial but often goes unnoticed. For any rare disease, such as glioma, the percent of false positives tends to be appreciable even though the sensitivity and specificity may be high. The ramification is that the vast majority of positive test results will be false positives. An advantage of a low prevalence of disease is that a patient with a negative test result is very unlikely to have the disease, ie, the negative predictive value (NPV) is large. In the hypothetical example the NPV can be calculated similarly to the PPV and shown to exceed 99.99%.
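The NPV calculation mentioned above mirrors the PPV calculation, with true negatives and false negatives in place of true positives and false positives. A minimal Python sketch, with function and variable names that are our own assumptions, is:

```python
def negative_predictive_value(prevalence, sensitivity, false_positive_rate):
    """NPV = P(disease absent | negative test)."""
    specificity = 1 - false_positive_rate
    p_true_negative = (1 - prevalence) * specificity     # P(disease absent and negative test)
    p_false_negative = prevalence * (1 - sensitivity)    # P(disease present and negative test)
    return p_true_negative / (p_true_negative + p_false_negative)


npv = negative_predictive_value(
    prevalence=0.00003, sensitivity=0.967, false_positive_rate=0.04
)
print(f"NPV = {npv:.6f}")
```

With the example's inputs the value comes out above 0.9999, in line with the figure of 99.99% quoted for the hypothetical test.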

Given the current focus on finding novel biomarkers to be used in the detection of disease, an informed interpretation of diagnostic tests is increasingly important. Equally important is the translation of this information to your patients. We hope these tools will be helpful in both understanding and relaying conditional probabilities to your patients.

Funding

This study was supported by R01 CA163687 (Annette M. Molinaro, Principal Investigator).

Acknowledgments

The author would like to thank Jennifer Clarke, David Elson, and Seunggu Han for their input and suggestions on presentation of this material.

Conflict of interest statement. None declared.

References

1. Gigerenzer G, Edwards A. Simple tools for understanding risks: from innumeracy to insight. Br Med J. 2003;327(7417):741–744.

2. Casscells W, Schoenberger A, Graboys TB. Interpretation by Physicians of Clinical Laboratory Results. N Engl J Med. 1978;299(18):999–1001.

3. Friederichs H, Ligges S, Weissenstein A. Using Tree Diagrams without Numerical Values in Addition to Relative Numbers Improves Students’ Numeracy Skills: A Randomized Study in Medical Education. Med Decis Making. 2014;34(2):253–257.

4. Manrai AK, Bhatia G, Strymish J, Kohane IS, Jain SH. Medicine's uncomfortable relationship with math: Calculating positive predictive value. JAMA Intern Med. 2014;174(6):991–993.

5. Eddy D. Probabilistic reasoning in clinical medicine: problems and opportunities. In: Kahneman D, Slovic P, Tversky A, eds. Judgment under uncertainty: Heuristics and Biases. Cambridge, UK: Cambridge University Press; 1982:249–267.

6. Baldi B, Moore DS. The Practice of Statistics in the Life Sciences, 2nd ed. New York, NY: W. H. Freeman; 2010.

© The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Neuro-Oncology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
