Article,

Validation data-based adjustments for outcome misclassification in logistic regression: an illustration.

R. Lyles, L. Tang, H. Superak, C. King, D. Celentano, Y. Lo, and J. Sobel.
Epidemiology (Cambridge, Mass.), 22 (4): 589-97 (July 2011).
DOI: 10.1097/EDE.0b013e3182117c85

Abstract

Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.

BibTeX key: Lyles2011
entry type: article
year: 2011
month: 7
journal: Epidemiology (Cambridge, Mass.)
number: 4
pages: 589-97
volume: 22
city: From the aDepartment of Biostatistics and Bioinformatics, The Rollins School of Public Health of Emory University, Atlanta, GA; bCenters for Disease Control and Prevention, Atlanta, GA; cDepartment of Epidemiology, Johns Hopkins Bloomberg School(TRUNCADO)
isbn: 1531-5487; 1044-3983
pmid: 21487295
issn: 1531-5487
DOI: 10.1097/EDE.0b013e3182117c85
url: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3454464&tool=pmcentrez&rendertype=abstract
note: 6257; JID: 9009644; ppublish;

Cite this publication

%0 Journal Article
%1 Lyles2011
%A Lyles, Robert H
%A Tang, Li
%A Superak, Hillary M
%A King, Caroline C
%A Celentano, David D
%A Lo, Yungtai
%A Sobel, Jack D
%D 2011
%J Epidemiology (Cambridge, Mass.)
%K Bias(Epidemiology) Case-ControlStudies Classification Classification:methods DataInterpretation LikelihoodFunctions LogisticModels OddsRatio ReproducibilityofResults SensitivityandSpecificity Statistical ValidationStudiesasTopic
%N 4
%P 589-97
%R 10.1097/EDE.0b013e3182117c85
%T Validation data-based adjustments for outcome misclassification in logistic regression: an illustration.
%U http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3454464&tool=pmcentrez&rendertype=abstract
%V 22
%X Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.
%@ 1531-5487; 1044-3983

@article{Lyles2011,
  abstract = {Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.},
  added-at = {2023-02-03T11:44:35.000+0100},
  author = {Lyles, Robert H and Tang, Li and Superak, Hillary M and King, Caroline C and Celentano, David D and Lo, Yungtai and Sobel, Jack D},
  biburl = {https://www.bibsonomy.org/bibtex/2430f3a00ef3f815b390dbb63d2084128/jepcastel},
  city = {From the aDepartment of Biostatistics and Bioinformatics, The Rollins School of Public Health of Emory University, Atlanta, GA; bCenters for Disease Control and Prevention, Atlanta, GA; cDepartment of Epidemiology, Johns Hopkins Bloomberg School(TRUNCADO)},
  doi = {10.1097/EDE.0b013e3182117c85},
  interhash = {0921da50b79d43bf9354a8f4c5a17ef8},
  intrahash = {430f3a00ef3f815b390dbb63d2084128},
  isbn = {1531-5487; 1044-3983},
  issn = {1531-5487},
  journal = {Epidemiology (Cambridge, Mass.)},
  keywords = {Bias(Epidemiology) Case-ControlStudies Classification Classification:methods DataInterpretation LikelihoodFunctions LogisticModels OddsRatio ReproducibilityofResults SensitivityandSpecificity Statistical ValidationStudiesasTopic},
  month = {7},
  note = {6257; JID: 9009644; ppublish;},
  number = 4,
  pages = {589-97},
  pmid = {21487295},
  timestamp = {2023-02-03T11:44:35.000+0100},
  title = {Validation data-based adjustments for outcome misclassification in logistic regression: an illustration.},
  url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3454464&tool=pmcentrez&rendertype=abstract},
  volume = 22,
  year = 2011
}
