Spatial Versus Non-Spatial Methods to Predict County-Level Primary Care Physician Supplies

Hollands, Simon

Research Objective: To compare spatial modeling techniques against standard regression methods in estimating associations between proximity to primary care residents, medical students, and hospitals and county-level supplies of primary care physicians (PCPs) per capita.

Study Design: Using data from the Area Health Resource File, the Association of American Medical Colleges, and the National Resident Matching Program, we tested for spatial autocorrelation in county-level PCPs per capita using Moran’s I and Local Indicators of Spatial Autocorrelation (LISA). We then fit cross-sectional ordinary least squares (OLS) regression models that predicted PCP supply as a function of the number of primary care residents within 200 miles, the number of medical students within 200 miles, and the number of hospitals in the same county, adjusting for county-level sociodemographic confounders, and assessed the spatial autocorrelation of the county-level residuals. To account for spatial non-independence of residuals, which violates an OLS regression assumption, we then fit similarly structured spatial lag and spatial error regression models, using inverse-distance and queen-contiguity rules to define spatial dependency, and reassessed the spatial dependency of the residuals.
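As an illustration of the global Moran’s I statistic with inverse-distance weights, the following is a minimal sketch and not the authors’ code. The function names, the row-standardization of the weights, and the use of planar (rather than great-circle) distances between county centroids are all assumptions made for this example.

```python
import numpy as np


def inverse_distance_weights(coords):
    """Row-standardized inverse-distance weights (hypothetical helper).

    coords: (n, 2) array of planar centroid coordinates. Planar distance
    is an assumption; real county data would likely need great-circle
    or projected distances.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = np.zeros_like(dist)
    mask = dist > 0              # leaves the diagonal (self-weights) at zero
    w[mask] = 1.0 / dist[mask]
    return w / w.sum(axis=1, keepdims=True)


def morans_i(values, w):
    """Global Moran's I for a 1-D array of values and weights matrix w."""
    z = values - values.mean()   # deviations from the mean
    n = values.shape[0]
    s0 = w.sum()                 # sum of all weights
    return (n / s0) * (z @ w @ z) / (z @ z)


# Toy data: six locations on a line with a smooth gradient in values.
coords = np.array([[float(i), 0.0] for i in range(6)])
gradient = np.arange(6, dtype=float)
w = inverse_distance_weights(coords)
print(morans_i(gradient, w))     # positive when neighbors have similar values
```

The same function can be applied to regression residuals instead of raw values, which is how residual spatial autocorrelation is typically checked after fitting a model.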

Population Studied: PCPs in all United States counties.

Principal Findings: In all models, PCP supply was negatively associated with residency positions and medical students (adjusted OLS for PCPs per 100,000, β: -0.000926 for residency positions, p<0.001; -0.0000879 for medical students, p<0.001) and positively associated with hospital counts (β: 0.574, p<0.001). Spatial autocorrelation was detected in county-level PCP supply and in OLS model residuals (Moran’s I for unadjusted OLS residuals: 0.087 for residency positions, 0.082 for medical students, 0.072 for hospitals; p<0.001 for all). Residual high-value clusters were found on both coasts and in the Upper Midwest; low-value clusters were found in the Dakotas, Texas, and through the Missouri belt. After adjustment, Moran’s I values for OLS model residuals were lower (0.042 for residency positions, 0.043 for medical students, 0.040 for hospitals; p<0.001 for all), suggesting that a substantial portion of the spatial autocorrelation originally detected was explained by observed covariates. Spatial lag models based on an inverse-distance function provided the best fit, reducing residual spatial autocorrelation (Moran’s I) by up to 37%. Adjusted spatial lag models changed the magnitude of regression coefficients by -26.34% for residency positions, -31.96% for medical students, and -9.5% for hospitals, and produced positive and statistically significant estimates of spatial dependency, with corresponding decreases in the spatial correlation of the residuals. In contrast, spatial error models did not reduce residual spatial autocorrelation.

Conclusions: Relative to OLS, spatial lag models had better fit, reduced spatial autocorrelation of regression residuals, and changed estimates of cross-sectional associations between counts of medical students, residents, hospitals, and PCPs by up to 32%. The favorable performance of spatial lag models suggests that PCPs are attracted disproportionately to areas with unexplained higher PCP counts.
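For reference, the two spatial specifications are commonly written as follows. This is the standard textbook form, not a specification taken from the study itself, and the symbols are assumptions introduced here: y is the vector of county-level PCP supplies, X the covariate matrix, W the spatial weights matrix (inverse-distance or queen-contiguity), and ε an independent error term.

```latex
% Spatial lag model: the outcome depends on a weighted average of neighboring outcomes
y = \rho W y + X\beta + \varepsilon

% Spatial error model: spatial dependence enters only through the error term
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
```

Under this notation, a positive and significant ρ means that a county’s PCP supply rises with the supply in nearby counties, which is the pattern the conclusion interprets as PCPs being drawn to areas with higher PCP counts.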

Implications for Policy or Practice: Analyses of geographic data should test for and account for spatial non-independence, which can produce biased coefficient estimates. As these analyses demonstrate, failure to do so can substantially alter conclusions.

Association for Public Policy Analysis & Management

Panel Paper: Spatial Versus Non-Spatial Methods to Predict County-Level Primary Care Physician Supplies
