Accuracy of Diagnostic Tests
There are several characteristics that can be used to describe the quality and usefulness of a test. Accuracy is one characteristic.
Accuracy can be expressed through sensitivity and specificity, positive and negative predictive values, or positive and negative diagnostic likelihood ratios. Each measure of accuracy should be used in combination with its complementary measure:
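- Sensitivity complements specificity
- Positive predictive value complements negative predictive value
- Positive diagnostic likelihood ratio complements negative diagnostic likelihood ratio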
Confidence intervals can be calculated to show the precision of each estimated accuracy measure.
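As one possible illustration, the sketch below computes a confidence interval for a single proportion such as sensitivity. It assumes the Wilson score method, which is only one of several common choices and is not prescribed by this page; the function name `wilson_interval` and the counts (90 positives out of 100 infected specimens) are made up for the example.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion.

    z = 1.96 corresponds to roughly 95% coverage (an assumption of this sketch).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical data: 90 true positives among 100 infected specimens.
low, high = wilson_interval(90, 100)
print(f"sensitivity = 0.90, approx. 95% CI {low:.3f} to {high:.3f}")
```

The same function could be applied to specificity or to the predictive values, since each is also a simple proportion; likelihood ratios are ratios of two proportions and need a different interval method.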
Table 1

| | Reference Test + | Reference Test - |
|---|---|---|
| New Test + | TP | FP |
| New Test - | FN | TN |

TP = number of true positive specimens
FP = number of false positive specimens
FN = number of false negative specimens
TN = number of true negative specimens
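A table laid out like Table 1 can be tallied from paired results. The following is a minimal sketch, assuming each specimen has a boolean result from the new test and from the reference test; the names `TwoByTwo` and `tally` and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # new test positive, reference positive
    fp: int  # new test positive, reference negative
    fn: int  # new test negative, reference positive
    tn: int  # new test negative, reference negative


def tally(new_results: Iterable[bool], reference_results: Iterable[bool]) -> TwoByTwo:
    """Count the four cells of Table 1 from paired boolean results."""
    tp = fp = fn = tn = 0
    for new, ref in zip(new_results, reference_results):
        if new and ref:
            tp += 1
        elif new and not ref:
            fp += 1
        elif ref:
            fn += 1
        else:
            tn += 1
    return TwoByTwo(tp, fp, fn, tn)


# Made-up paired results for six specimens.
new = [True, True, False, False, True, False]
ref = [True, False, True, False, True, False]
table = tally(new, ref)
print(table)
```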
Sensitivity
The sensitivity of a test is the probability that it will produce a true positive result when used on an infected population (as compared to a reference or "gold standard"). After inserting the test results into a table set up like Table 1, the sensitivity of a test can be determined by calculating:
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
Specificity
The specificity of a test is the probability that a test will produce a true negative result when used on a noninfected population (as determined by a reference or "gold standard"). After inserting the test results into a table set up like Table 1, the specificity of a test can be determined by calculating:
$$\text{Specificity} = \frac{TN}{TN + FP}$$
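The sensitivity and specificity calculations can be sketched in a few lines of Python. The function names and the counts below (TP = 90, FN = 10, TN = 160, FP = 40) are illustrative assumptions, not data from any study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of reference-positive specimens that the new test calls positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of reference-negative specimens that the new test calls negative."""
    return tn / (tn + fp)


# Hypothetical counts laid out as in Table 1.
tp, fp, fn, tn = 90, 40, 10, 160
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Note that each measure uses only one column of Table 1, which is why neither depends on how many infected and noninfected specimens were included.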
Positive Predictive Value
The positive predictive value of a test is the probability that a person is infected when a positive test result is observed. In practice, predictive values should only be calculated from cohort studies or studies that legitimately reflect the number of people in that population who are infected with the disease of interest at that time. This is because predictive values are inherently dependent upon the prevalence of infection. After inserting results into a table set up like Table 1, the positive predictive value of a test can be determined by calculating:
$$\text{Positive predictive value} = \frac{TP}{TP + FP}$$
Negative Predictive Value
The negative predictive value of a test is the probability that a person is not infected when a negative test result is observed. This measure of accuracy should only be used if prevalence is available from the data. (See note in positive predictive value definition.) After inserting test results into a table set up like Table 1, the negative predictive value of a test can be determined by calculating:
$$\text{Negative predictive value} = \frac{TN}{TN + FN}$$
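To illustrate why predictive values depend on prevalence, the sketch below computes them two ways: directly from the counts, and from sensitivity, specificity, and an assumed prevalence using Bayes' theorem. The Bayes form is standard but is not given on this page; the function names and all numbers are hypothetical.

```python
def ppv_from_counts(tp: int, fp: int) -> float:
    return tp / (tp + fp)


def npv_from_counts(tn: int, fn: int) -> float:
    return tn / (tn + fn)


def ppv(sens: float, spec: float, prevalence: float) -> float:
    """P(infected | positive result), by Bayes' theorem."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sens: float, spec: float, prevalence: float) -> float:
    """P(not infected | negative result), by Bayes' theorem."""
    true_neg = spec * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    return true_neg / (true_neg + false_neg)


# Same hypothetical test (sensitivity 0.90, specificity 0.80) at three prevalences.
for prevalence in (0.01, 0.10, 0.50):
    print(
        f"prevalence {prevalence:.2f}: "
        f"PPV = {ppv(0.90, 0.80, prevalence):.3f}, "
        f"NPV = {npv(0.90, 0.80, prevalence):.3f}"
    )
```

With the test's sensitivity and specificity held fixed, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is why counts from a sample that does not reflect the true prevalence give misleading predictive values.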
Positive Diagnostic Likelihood Ratios
Diagnostic likelihood ratios (DLR) are not yet commonly reported in peer-reviewed literature or in marketing information provided by test manufacturers, but they can be a valuable tool for comparing the accuracy of several tests to the gold standard, and they are not dependent upon the prevalence of disease. Please see Likelihood ratios: getting diagnostic testing into perspective below for more information.
The positive DLR is the ratio of the probability that a positive test result will be observed in an infected population to the probability that the same result will be observed in a noninfected population. It is a ratio of two probabilities, not an odds ratio. After inserting test results into a table set up like Table 1, the positive DLR of a test can be determined by calculating:
$$\text{Positive DLR} = \frac{TP / (TP + FN)}{FP / (FP + TN)}$$
It can also be expressed in terms of sensitivity and specificity:
$$\text{Positive DLR} = \frac{\text{sensitivity}}{1 - \text{specificity}}$$
Useful tests will, therefore, have larger positive DLRs, and less useful tests will have smaller positive DLRs. For example, a positive DLR of 5.0 means that for every 1% of noninfected subjects who test positive, 5% of infected subjects test positive.
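A minimal sketch of the positive DLR calculation follows, showing that the count-based form and the sensitivity/specificity form agree. The function names and counts are hypothetical, and returning infinity when there are no false positives is a choice made for this sketch only.

```python
import math


def positive_dlr_from_counts(tp: int, fp: int, fn: int, tn: int) -> float:
    """(TP / (TP + FN)) / (FP / (FP + TN)), using the cells of Table 1."""
    true_positive_rate = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    if false_positive_rate == 0:
        return math.inf  # no false positives observed; ratio is unbounded
    return true_positive_rate / false_positive_rate


def positive_dlr(sens: float, spec: float) -> float:
    """sensitivity / (1 - specificity)."""
    if spec == 1:
        return math.inf
    return sens / (1 - spec)


# Hypothetical counts: sensitivity 0.90, specificity 0.80.
plr_counts = positive_dlr_from_counts(tp=90, fp=40, fn=10, tn=160)
plr_rates = positive_dlr(0.90, 0.80)
print(f"positive DLR = {plr_counts:.2f} (counts), {plr_rates:.2f} (rates)")
```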
Negative Diagnostic Likelihood Ratios
The negative DLR is the ratio of the probability that a negative test result will be observed in an infected population to the probability that the same result will be observed in a noninfected population. Like the positive DLR, it is a ratio of two probabilities, not an odds ratio. After inserting the test results into a table set up like Table 1, the negative DLR for a test can be determined by calculating:
$$\text{Negative DLR} = \frac{FN / (TP + FN)}{TN / (FP + TN)}$$
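Or, expressed in terms of rates:
$$\text{Negative DLR} = \frac{\text{false negative rate}}{\text{true negative rate}} = \frac{1 - \text{sensitivity}}{\text{specificity}}$$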
Useful tests will, therefore, have negative DLRs close to 0, and less useful tests will have higher negative DLRs. For example, a negative DLR of 0.4 means that for every 1% of noninfected subjects who test negative, 0.4% of infected subjects test negative; put the other way around, the true negative rate is 2.5 times the false negative rate.
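The negative DLR can be sketched the same way, again checking that the count-based and rate-based forms give the same value. The function names and counts are hypothetical.

```python
def negative_dlr_from_counts(tp: int, fp: int, fn: int, tn: int) -> float:
    """(FN / (TP + FN)) / (TN / (FP + TN)), using the cells of Table 1."""
    false_negative_rate = fn / (tp + fn)
    true_negative_rate = tn / (fp + tn)
    return false_negative_rate / true_negative_rate


def negative_dlr(sens: float, spec: float) -> float:
    """(1 - sensitivity) / specificity."""
    return (1 - sens) / spec


# Hypothetical counts: sensitivity 0.90, specificity 0.80.
nlr_counts = negative_dlr_from_counts(tp=90, fp=40, fn=10, tn=160)
nlr_rates = negative_dlr(0.90, 0.80)
print(f"negative DLR = {nlr_counts:.3f} (counts), {nlr_rates:.3f} (rates)")
```

A value well below 1, as in this made-up example, indicates that a negative result is much less likely in infected subjects than in noninfected subjects.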
Links to Additional Resources
These links provide information about measures of accuracy and the role of diagnostic tests from a general epidemiology perspective.
- Diagnostic Effectiveness
http://home.clara.net/sisa/diaghlp.htm
This site is part of the Simple Interactive Statistical Analysis website. It includes an interactive table to calculate simple statistics and discusses indicators of diagnostic test effectiveness such as accuracy, sensitivity, specificity, positive likelihood, negative likelihood, diagnostic odds ratio, error odds ratio, prevalence, and predictive accuracy.
- Designing Studies to Ensure that Estimates of Test Accuracy are Transferable
http://bmj.com/cgi/reprint/324/7338/669.pdf
Performance of a diagnostic test may vary from setting to setting. This paper explores the reasons for this variability and the implications for diagnostic test evaluation.
- Likelihood ratios: getting diagnostic testing into perspective
http://qjmed.oupjournals.org/cgi/reprint/91/4/247
This article reviews the performance of diagnostic tests by their likelihood ratio, and compares them to the power of clinical assessment.
- Communicating Accuracy of Tests to General Practitioners: a Controlled Study
http://bmj.com/cgi/reprint/324/7341/824.pdf
This paper discusses common mistakes made by clinicians in the use of diagnostic test statistics. The importance of including diagnostic test sensitivity and specificity as well as positive likelihood ratio in simple language is reinforced.
- Standards for Reporting of Diagnostic Accuracy (STARD)
http://bmj.com/cgi/reprint/326/7379/41.pdf
http://www.clinchem.org/cgi/reprint/49/1/7.pdf
With the mission to improve the reporting of diagnostic accuracy and to educate readers about the potential for bias in diagnostic evaluation studies, these two articles provide a mission statement, checklist, and flowchart to improve the reporting of diagnostic accuracy studies.
- Sensitivity and Specificity of Human Immunodeficiency Virus Rapid Serologic Assays and Testing Algorithms in an Antenatal Clinic in Abidjan, Ivory Coast
http://jcm.asm.org/cgi/reprint/39/5/1808
This article describes the sensitivity and specificity of several rapid HIV diagnostic tests, both individually and in a testing algorithm using several tests.
General Epidemiology Sites
- Supercourse: Epidemiology, the Internet, and Global Health
http://www.pitt.edu/~super1/index.htm
This is a general epidemiology web site put together at the University of Pittsburgh as a "Supercourse" for medical and health students around the world. The topic sites consist of PowerPoint presentations and include some disease-specific lectures.
- British Medical Journal
http://www.bmj.com/epidem/epid.html
This is the British Medical Journal's "Epi for the Uninitiated" web site, which is also a general epidemiology web site.