Accuracy of Diagnostic Tests

There are several characteristics that can be used to describe the quality and usefulness of a test. Accuracy is one characteristic.

Accuracy can be expressed through sensitivity and specificity, positive and negative predictive values, or positive and negative diagnostic likelihood ratios. Each measure of accuracy should be used in combination with its complementary measure:

  • Sensitivity complements Specificity
  • Positive predictive value complements Negative predictive value
  • Positive diagnostic likelihood ratio complements Negative diagnostic likelihood ratio

Confidence intervals can be calculated to reflect the statistical precision of each accuracy measure.
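
The interval method is not specified here, but as a rough sketch in Python, a normal-approximation (Wald) 95% confidence interval for a proportion such as sensitivity could be computed as follows; the counts used are hypothetical and only illustrate the arithmetic.

    # Sketch: normal-approximation (Wald) 95% confidence interval for an
    # estimated proportion such as sensitivity. Other interval methods exist;
    # this one is shown only because it is simple. Counts are hypothetical.
    import math

    def wald_ci(successes, n, z=1.96):
        """Return (estimate, lower, upper) for a proportion with an approximate 95% CI."""
        p = successes / n
        se = math.sqrt(p * (1 - p) / n)
        return p, max(0.0, p - z * se), min(1.0, p + z * se)

    # Example: 190 true positives among 200 infected specimens (hypothetical)
    sens, low, high = wald_ci(190, 200)
    print(f"sensitivity = {sens:.3f}, approximate 95% CI ({low:.3f}, {high:.3f})")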

Table 1
Calculations of Accuracy

                    Reference Test Result
                        +           -
New Test Result   +     TP          FP
                  -     FN          TN

TP = number of true positive specimens
FP = number of false positive specimens
FN = number of false negative specimens
TN = number of true negative specimens

Sensitivity

The sensitivity of a test is the probability that it will produce a true positive result when used on an infected population (as determined by a reference or "gold standard" test). After inserting the test results into a table set up like Table 1, the sensitivity of a test can be determined by calculating:

TP
————
TP+FN
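
As an illustrative sketch in Python, the same calculation looks like this; the 2x2 counts are hypothetical and simply follow the layout of Table 1.

    # Hypothetical counts laid out as in Table 1 (new test vs. reference test)
    TP, FP, FN, TN = 90, 5, 10, 95

    sensitivity = TP / (TP + FN)  # probability of a positive result among infected specimens
    print(f"sensitivity = {sensitivity:.2f}")  # 0.90 with these example counts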

Specificity

The specificity of a test is the probability that a test will produce a true negative result when used on a noninfected population (as determined by a reference or "gold standard"). After inserting the test results into a table set up like Table 1, the specificity of a test can be determined by calculating:

TN
————
TN+FP
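
Using the same hypothetical Table 1 counts as above, a minimal Python sketch of the specificity calculation:

    # Hypothetical counts laid out as in Table 1
    TP, FP, FN, TN = 90, 5, 10, 95

    specificity = TN / (TN + FP)  # probability of a negative result among noninfected specimens
    print(f"specificity = {specificity:.2f}")  # 0.95 with these example counts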

Positive Predictive Value

The positive predictive value of a test is the probability that a person is infected when a positive test result is observed. Because predictive values are inherently dependent upon the prevalence of infection, in practice they should only be calculated from cohort studies or other studies that accurately reflect how many people in the population are infected with the disease of interest at that time. After inserting results into a table set up like Table 1, the positive predictive value of a test can be determined by calculating:

TP
————
TP+FP
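
A brief Python sketch of the same calculation, again with hypothetical counts; in a real evaluation the counts would have to come from a study that reflects the true prevalence.

    # Hypothetical counts laid out as in Table 1; remember that the positive
    # predictive value depends on how common infection is in the tested population.
    TP, FP, FN, TN = 90, 5, 10, 95

    ppv = TP / (TP + FP)  # probability of infection given a positive result
    print(f"positive predictive value = {ppv:.2f}")  # about 0.95 with these example counts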

Negative Predictive Value

The negative predictive value of a test is the probability that a person is not infected when a negative test result is observed. This measure of accuracy should only be used if prevalence can be estimated from the data. (See the note in the positive predictive value definition.) After inserting test results into a table set up like Table 1, the negative predictive value of a test can be determined by calculating:

TN
————
TN+FN
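
The corresponding Python sketch, with the same hypothetical counts and the same caveat about prevalence:

    # Hypothetical counts laid out as in Table 1; like PPV, NPV depends on prevalence.
    TP, FP, FN, TN = 90, 5, 10, 95

    npv = TN / (TN + FN)  # probability of no infection given a negative result
    print(f"negative predictive value = {npv:.2f}")  # about 0.90 with these example counts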

Positive Diagnostic Likelihood Ratios

Diagnostic likelihood ratios (DLRs) are not yet commonly reported in peer-reviewed literature or in the marketing information provided by test manufacturers, but they can be a valuable tool for comparing the accuracy of several tests against the gold standard, and they are not dependent upon the prevalence of disease. See "Likelihood ratios: getting diagnostic testing into perspective" under Links to Additional Resources below for more information.

The positive DLR is the ratio of the probability that a positive test result will be observed in an infected population to the probability that the same result will be observed in a noninfected population. After inserting test results into a table set up like Table 1, the positive DLR of a test can be determined by calculating:

TP/(TP+FN)
——————
FP/(FP+TN)

Or, equivalently, it can be expressed in terms of sensitivity and specificity:

sensitivity
—————
1-specificity

Useful tests will, therefore, have larger positive DLRs, and less useful tests will have smaller positive DLRs. For example, a positive diagnostic likelihood ratio of 5.0 means that for every 1% of noninfected subjects who test positive, 5% of infected subjects will test positive.
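
As a Python sketch using the same hypothetical Table 1 counts, the count form and the sensitivity/specificity form of the positive DLR give the same value:

    # Hypothetical counts laid out as in Table 1
    TP, FP, FN, TN = 90, 5, 10, 95

    sensitivity = TP / (TP + FN)
    specificity = TN / (TN + FP)

    dlr_pos_from_counts = (TP / (TP + FN)) / (FP / (FP + TN))
    dlr_pos_from_sens_spec = sensitivity / (1 - specificity)
    print(f"{dlr_pos_from_counts:.1f}  {dlr_pos_from_sens_spec:.1f}")  # both about 18.0 here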

Negative Diagnostic Likelihood Ratios

The negative DLR is the ratio of the probability that a negative test result will be observed in an infected population to the probability that the same result will be observed in a noninfected population. After inserting the test results into a table set up like Table 1, the negative DLR for a test can be determined by calculating:

FN/(TP+FN)
——————
TN/(FP+TN)

Or, equivalently, as the false negative rate divided by the true negative rate:

1-sensitivity
——————
specificity

Useful tests will, therefore, have negative DLRs close to 0, and less useful tests will have higher negative DLRs. For example, a negative diagnostic likelihood ratio of 0.4 means that for every 1% of infected subjects who test negative (false negatives), 2.5% of noninfected subjects will test negative (true negatives).
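
A final Python sketch with the same hypothetical Table 1 counts, showing that the count form and the sensitivity/specificity form of the negative DLR agree:

    # Hypothetical counts laid out as in Table 1
    TP, FP, FN, TN = 90, 5, 10, 95

    sensitivity = TP / (TP + FN)
    specificity = TN / (TN + FP)

    dlr_neg_from_counts = (FN / (TP + FN)) / (TN / (FP + TN))
    dlr_neg_from_sens_spec = (1 - sensitivity) / specificity
    print(f"{dlr_neg_from_counts:.3f}  {dlr_neg_from_sens_spec:.3f}")  # both about 0.105 here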

Links to Additional Resources

These links provide information about measures of accuracy and the role of diagnostic tests from a general epidemiology perspective.

  • Diagnostic Effectiveness
    http://home.clara.net/sisa/diaghlp.htm
    This site is part of the Simple Interactive Statistical Analysis website. It includes an interactive table to calculate simple statistics and discusses indicators of diagnostic test effectiveness such as accuracy, sensitivity, specificity, positive likelihood, negative likelihood, diagnostic odds ratio, error odds ratio, prevalence, and predictive accuracy.
  • Designing Studies to Ensure that Estimates of Test Accuracy are Transferable
    http://bmj.com/cgi/reprint/324/7338/669.pdf
    Performance of a diagnostic test may vary from setting to setting. This paper explores the reasons for this variability and the implications for diagnostic test evaluation.
  • Likelihood ratios: getting diagnostic testing into perspective
    http://qjmed.oupjournals.org/cgi/reprint/91/4/247
    This article reviews how the performance of diagnostic tests can be summarized by likelihood ratios and compares them with the power of clinical assessment.
  • Communicating Accuracy of Tests to General Practitioners: a Controlled Study
    http://bmj.com/cgi/reprint/324/7341/824.pdf
    This paper discusses common mistakes clinicians make when using diagnostic test statistics and reinforces the importance of communicating sensitivity, specificity, and the positive likelihood ratio in simple language.
  • Standards for Reporting of Diagnostic Accuracy (STARD)
    http://bmj.com/cgi/reprint/326/7379/41.pdf
    http://www.clinchem.org/cgi/reprint/49/1/7.pdf
    These two articles aim to improve the reporting of diagnostic accuracy studies and to educate readers about the potential for bias in such studies; they provide a mission statement, a checklist, and a flowchart.
  • Sensitivity and Specificity of Human Immunodeficiency Virus Rapid Serologic Assays and Testing Algorithms in an Antenatal Clinic in Abidjan, Ivory Coast
    http://jcm.asm.org/cgi/reprint/39/5/1808
    This article describes the sensitivity and specificity of several rapid HIV diagnostic tests, both individually and in a testing algorithm using several tests.

General Epidemiology Sites

  • Supercourse: Epidemiology, the Internet, and Global Health
    http://www.pitt.edu/~super1/index.htm
    This is a general epidemiology website put together at the University of Pittsburgh as a "Supercourse" for medical and health students around the world. The topic sites consist of PowerPoint presentations and include some disease-specific lectures.
  • British Medical Journal
    http://www.bmj.com/epidem/epid.html
    This is the British Medical Journal's "Epi for the Uninitiated" website, a general introduction to epidemiology.