Applied Logistic Regression
Third Edition
DAVID W. HOSMER, JR.
Professor of Biostatistics (Emeritus)
Division of Biostatistics and Epidemiology
Department of Public Health
School of Public Health and Health Sciences
University of Massachusetts
Amherst, Massachusetts
STANLEY LEMESHOW
Dean, College of Public Health
Professor of Biostatistics
College of Public Health
The Ohio State University
Columbus, Ohio
RODNEY X. STURDIVANT
Colonel, U.S. Army
Academy and Associate Professor
Department of Mathematical Sciences
United States Military Academy
West Point, New York
Copyright © 2013 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee
to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400,
fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission
should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street,
Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at
http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts
in preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be
suitable for your situation. You should consult with a professional where appropriate. Neither the
publisher nor author shall be liable for any loss of profit or any other commercial damages, including
but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our
Customer Care Department within the United States at (800) 762-2974, outside the United States at
(317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print
may not be available in electronic formats. For more information about Wiley products, visit our web
site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data Is Available
Hosmer, David W.
Applied Logistic Regression / David W. Hosmer, Jr., Stanley Lemeshow, Rodney X. Sturdivant. - 3rd ed.
Includes bibliographic references and index.
ISBN 978-0-470-58247-3 (cloth)
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
To our wives, Trina, Elaine, and Mandy,
and our sons, daughters,
and grandchildren
Contents
Preface to the Third Edition xiii
1 Introduction to the Logistic Regression Model 1
1.1 Introduction, 1
1.2 Fitting the Logistic Regression Model, 8
1.3 Testing for the Significance of the Coefficients, 10
1.4 Confidence Interval Estimation, 15
1.5 Other Estimation Methods, 20
1.6 Data Sets Used in Examples and Exercises, 22
1.6.1 The ICU Study, 22
1.6.2 The Low Birth Weight Study, 24
1.6.3 The Global Longitudinal Study of Osteoporosis in Women, 24
1.6.4 The Adolescent Placement Study, 26
1.6.5 The Burn Injury Study, 27
1.6.6 The Myopia Study, 29
1.6.7 The NHANES Study, 31
1.6.8 The Polypharmacy Study, 31
Exercises, 32
2 The Multiple Logistic Regression Model 35
2.1 Introduction, 35
2.2 The Multiple Logistic Regression Model, 35
2.3 Fitting the Multiple Logistic Regression Model, 37
2.4 Testing for the Significance of the Model, 39
2.5 Confidence Interval Estimation, 42
2.6 Other Estimation Methods, 45
Exercises, 46
3 Interpretation of the Fitted Logistic Regression Model 49
3.1 Introduction, 49
3.2 Dichotomous Independent Variable, 50
3.3 Polychotomous Independent Variable, 56
3.4 Continuous Independent Variable, 62
3.5 Multivariable Models, 64
3.6 Presentation and Interpretation of the Fitted Values, 77
3.7 A Comparison of Logistic Regression and Stratified Analysis for 2 × 2 Tables, 82
Exercises, 87
4 Model-Building Strategies and Methods for Logistic Regression 89
4.1 Introduction, 89
4.2 Purposeful Selection of Covariates, 89
4.2.1 Methods to Examine the Scale of a Continuous Covariate in the Logit, 94
4.2.2 Examples of Purposeful Selection, 107
4.3 Other Methods for Selecting Covariates, 124
4.3.1 Stepwise Selection of Covariates, 125
4.3.2 Best Subsets Logistic Regression, 133
4.3.3 Selecting Covariates and Checking their Scale Using Multivariable Fractional Polynomials, 139
4.4 Numerical Problems, 145
Exercises, 150
5 Assessing the Fit of the Model 153
5.1 Introduction, 153
5.2 Summary Measures of Goodness of Fit, 154
5.2.1 Pearson Chi-Square Statistic, Deviance, and Sum-of-Squares, 155
5.2.2 The Hosmer–Lemeshow Tests, 157
5.2.3 Classification Tables, 169
5.2.4 Area Under the Receiver Operating Characteristic Curve, 173
5.2.5 Other Summary Measures, 182
5.3 Logistic Regression Diagnostics, 186
5.4 Assessment of Fit via External Validation, 202
5.5 Interpretation and Presentation of the Results from a Fitted Logistic Regression Model, 212
Exercises, 223
6 Application of Logistic Regression with Different Sampling Models 227
6.1 Introduction, 227
6.2 Cohort Studies, 227
6.3 Case-Control Studies, 229
6.4 Fitting Logistic Regression Models to Data from Complex Sample Surveys, 233
Exercises, 242
7 Logistic Regression for Matched Case-Control Studies 243
7.1 Introduction, 243
7.2 Methods For Assessment of Fit in a 1–M Matched Study, 248
7.3 An Example Using the Logistic Regression Model in a 1–1 Matched Study, 251
7.4 An Example Using the Logistic Regression Model in a 1–M Matched Study, 260
Exercises, 267
8 Logistic Regression Models for Multinomial and Ordinal Outcomes 269
8.1 The Multinomial Logistic Regression Model, 269
8.1.1 Introduction to the Model and Estimation of Model Parameters, 269
8.1.2 Interpreting and Assessing the Significance of the Estimated Coefficients, 272
8.1.3 Model-Building Strategies for Multinomial Logistic Regression, 278
8.1.4 Assessment of Fit and Diagnostic Statistics for the Multinomial Logistic Regression Model, 283
8.2 Ordinal Logistic Regression Models, 289
8.2.1 Introduction to the Models, Methods for Fitting, and Interpretation of Model Parameters, 289
8.2.2 Model Building Strategies for Ordinal Logistic Regression Models, 305
Exercises, 310
9 Logistic Regression Models for the Analysis of Correlated Data 313
9.1 Introduction, 313
9.2 Logistic Regression Models for the Analysis of Correlated Data, 315
9.3 Estimation Methods for Correlated Data Logistic Regression Models, 318
9.4 Interpretation of Coefficients from Logistic Regression Models for the Analysis of Correlated Data, 323
9.4.1 Population Average Model, 324
9.4.2 Cluster-Specific Model, 326
9.4.3 Alternative Estimation Methods for the Cluster-Specific Model, 333
9.4.4 Comparison of Population Average and Cluster-Specific Model, 334
9.5 An Example of Logistic Regression Modeling with Correlated Data, 337
9.5.1 Choice of Model for Correlated Data Analysis, 338
9.5.2 Population Average Model, 339
9.5.3 Cluster-Specific Model, 344
9.5.4 Additional Points to Consider when Fitting Logistic Regression Models to Correlated Data, 351
9.6 Assessment of Model Fit, 354
9.6.1 Assessment of Population Average Model Fit, 354
9.6.2 Assessment of Cluster-Specific Model Fit, 365
9.6.3 Conclusions, 374
Exercises, 375
10 Special Topics 377
10.1 Introduction, 377
10.2 Application of Propensity Score Methods in Logistic Regression Modeling, 377
10.3 Exact Methods for Logistic Regression Models, 387
10.4 Missing Data, 395
10.5 Sample Size Issues when Fitting Logistic Regression Models, 401
10.6 Bayesian Methods for Logistic Regression, 408
10.6.1 The Bayesian Logistic Regression Model, 410
10.6.2 MCMC Simulation, 411
10.6.3 An Example of a Bayesian Analysis and Its Interpretation, 419
10.7 Other Link Functions for Binary Regression Models, 434
10.8 Mediation, 441
10.8.1 Distinguishing Mediators from Confounders, 441
10.8.2 Implications for the Interpretation of an Adjusted Logistic Regression Coefficient, 443
10.8.3 Why Adjust for a Mediator? 444
10.8.4 Using Logistic Regression to Assess Mediation: Assumptions, 445
10.9 More About Statistical Interaction, 448
10.9.1 Additive versus Multiplicative Scale–Risk Difference versus Odds Ratios, 448
10.9.2 Estimating and Testing Additive Interaction, 451
Exercises, 456
References 459
Index 479