Using the within-study comparison of Shadish, Clark & Steiner (2008) and the ECLS-K dataset, we investigate three research questions. First, how important is it to have a large and heterogeneous set of covariates? Second, how important is it to sample multiple items per domain? Third, because each dataset enables us to identify the likely true selection process, how much bias reduction is achieved when the most effective covariates are omitted from the set used to correct for selection bias? That is, can a large and heterogeneous set of covariates compensate for the absence of the most effective ones? These questions address what can be known about bias reduction when theory about selection is meager but the number and dimensionality of covariates vary independently.
The results from both studies, the within-study comparison and the ECLS-K dataset, indicate that bias reduction increases as the number of covariates per domain increases, though at a diminishing rate. Sampling covariates from multiple heterogeneous construct domains also increases bias reduction and is more important than having many measurements of only a few domains. Combining the maximal heterogeneity sampled with at least five items per domain removes almost all the bias in the two educational datasets examined. When the most effective single covariates are deliberately omitted (which no analyst would ever do in practice), bias reduction again increases as a joint function of the heterogeneity of domains and the number of items per domain, but it is open to debate whether the level of bias reduction achieved is acceptable. No such doubt arises when the covariates that turn out to be crucial are included among those sampled.