Abstract

A key assumption in species distribution modeling (SDM) with presence-background (PB) methods is that sampling of occurrence localities is unbiased, or that any sampling bias is proportional to the background distribution of environmental covariates. This assumption is rarely met when SDM practitioners rely on federated museum records from natural history collections for geolocated occurrences, because of the inherent sampling bias in these collections. We use a simulation approach to explore the effectiveness of three methods developed to account for sampling bias in PB frameworks. Two of the methods rely on careful filtering of observation data: geographic thinning (G-Filter) and environmental thinning (E-Filter). The third, FactorBiasOut, creates selection weights for background data that bias background locations toward areas where the observation dataset was sampled. While these methods have been assessed previously, evaluation has emphasized spatial predictions of habitat potential. Here, we examine these methods more deeply by exploring how sampling bias affects not only predictions of habitat potential but also our understanding of niche characteristics, such as which explanatory variables and response curves best represent species–environment relationships. We simulate 100 virtual species ranging from habitat generalists to specialists and introduce geographic and environmental bias at three intensity levels to measure the effectiveness of each correction method in (1) predicting the true probability of occurrence across a study area, (2) recovering true species–environment relationships, and (3) identifying true explanatory variables. We find that FactorBiasOut most often shows the greatest improvement in recreating known distributions, but it is no better than the G-Filter or E-Filter methods at correctly identifying environmental covariates or recreating species–environment relationships.
Species with narrow niches are most problematic for biased calibration datasets, such that correction methods can, in some cases, make predictions worse.
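To illustrate the geographic-thinning idea mentioned above, the sketch below greedily discards occurrence records that fall within a minimum great-circle distance of an already-kept record. This is a minimal illustration under our own assumptions (a greedy pass over the points and a haversine distance), not the paper's G-Filter implementation; function names are hypothetical.

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))  # Earth radius ~6371 km

def geographic_thin(points, min_dist_km):
    """Keep each point only if it is at least min_dist_km from all kept points."""
    kept = []
    for p in points:
        if all(haversine_km(p, q) >= min_dist_km for q in kept):
            kept.append(p)
    return kept
```

For example, with a 10 km threshold, two records a few hundred meters apart collapse to one kept record, reducing the spatial clustering that biased collecting produces. Environmental thinning follows the same logic but measures distance in covariate space rather than geographic space.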