Estimates the bootstrap simulation error, expressed as a "simulation coefficient of variation" (CV).
Value
A data frame with one row for each statistic.
The column STATISTIC gives the name of the statistic.
The column SIMULATION_CV gives the estimated simulation CV of the statistic.
The column N_REPLICATES gives the number of bootstrap replicates.
Statistical Details
Unlike other replication methods such as the jackknife or balanced repeated replication, the bootstrap allows the precision of its variance estimator to be improved by increasing the number of replicates. This is because using only a finite number of bootstrap replicates introduces simulation error into the variance estimation process. Simulation error can be measured as a "simulation coefficient of variation" (CV), which is the ratio of the standard error of a bootstrap estimator to the expectation of that bootstrap estimator, where the expectation and standard error are evaluated with respect to the bootstrapping process given the selected sample.
For a statistic \(\hat{\theta}\), the simulation CV of the bootstrap variance estimator
\(v_{B}(\hat{\theta})\) based on \(B\) replicate estimates \(\hat{\theta}^{\star}_1,\dots,\hat{\theta}^{\star}_B\) is defined as follows:
$$
CV_{\star}(v_{B}(\hat{\theta})) = \frac{\sqrt{var_{\star}(v_B(\hat{\theta}))}}{E_{\star}(v_B(\hat{\theta}))} = \frac{CV_{\star}(E_2)}{\sqrt{B}}
$$
where
$$
E_2 = (\hat{\theta}^{\star} - \hat{\theta})^2
$$
$$
CV_{\star}(E_2) = \frac{\sqrt{var_{\star}(E_2)}}{E_{\star}(E_2)}
$$
and \(var_{\star}\) and \(E_{\star}\) are evaluated with respect to
the bootstrapping process, given the selected sample.
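The second equality in the definition can be seen with a short derivation. This is a sketch under two assumptions, in the spirit of Beaumont and Patak (2012): that \(v_{B}(\hat{\theta})\) is the average of the squared deviations of the replicate estimates from \(\hat{\theta}\), and that the replicate estimates are independent and identically distributed given the selected sample.
$$
v_{B}(\hat{\theta}) = \frac{1}{B}\sum_{b=1}^{B}(\hat{\theta}^{\star}_b - \hat{\theta})^2
$$
$$
E_{\star}(v_B(\hat{\theta})) = E_{\star}(E_2), \qquad var_{\star}(v_B(\hat{\theta})) = \frac{var_{\star}(E_2)}{B}
$$
Taking the square root of the variance and dividing by the expectation yields the factor \(1/\sqrt{B}\).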
The simulation CV, denoted \(CV_{\star}(v_{B}(\hat{\theta}))\), is estimated for a given number of replicates \(B\)
by estimating \(CV_{\star}(E_2)\) from the observed replicate estimates and dividing the result by \(\sqrt{B}\). If the bootstrap errors
\(\hat{\theta}^{\star} - \hat{\theta}\) are assumed to be normally distributed with mean zero, then \(E_2\) is proportional to a chi-squared variable with one degree of freedom, so \(CV_{\star}(E_2)=\sqrt{2}\) and \(CV_{\star}(v_{B}(\hat{\theta})) = \sqrt{2/B}\) would not need to be estimated.
Using observed replicate estimates to estimate the simulation CV instead of assuming normality allows the simulation CV to be
used for a wide array of bootstrap methods.
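As a rough illustration of the estimation step described above, the following sketch computes a simulation CV by hand from a vector of replicate estimates. The helper name `sim_cv_by_hand` and the simulated inputs are hypothetical and are not part of svrep. The sketch uses simple moment estimates of \(var_{\star}(E_2)\) and \(E_{\star}(E_2)\), which may differ in detail from what `estimate_boot_sim_cv()` does internally.

```r
# Hypothetical helper (not part of svrep): plug-in estimate of the simulation CV
sim_cv_by_hand <- function(replicate_estimates, full_sample_estimate) {
  B <- length(replicate_estimates)
  # Squared bootstrap errors, one per replicate: E_2 = (theta*_b - theta_hat)^2
  squared_errors <- (replicate_estimates - full_sample_estimate)^2
  # Estimated CV of E_2 based on the observed squared errors
  cv_e2 <- sd(squared_errors) / mean(squared_errors)
  # Dividing by sqrt(B) gives the simulation CV of the variance estimator
  cv_e2 / sqrt(B)
}

# Simulated replicate estimates with normal errors (illustrative values only)
set.seed(2022)
theta_hat <- 650
replicates <- theta_hat + rnorm(n = 5000, mean = 0, sd = 10)

sim_cv <- sim_cv_by_hand(replicates, theta_hat)

# With normal errors, the estimate should be close to sqrt(2) / sqrt(B)
normal_benchmark <- sqrt(2) / sqrt(length(replicates))

# Replicates needed for a target simulation CV,
# found by solving CV(E_2) / sqrt(B) = target_cv for B
target_cv <- 0.05
cv_e2 <- sim_cv * sqrt(length(replicates))
replicates_needed <- ceiling((cv_e2 / target_cv)^2)
```

Because the simulated errors are normal, `sim_cv` and `normal_benchmark` should be similar here; with real replicate estimates from a skewed or heavy-tailed bootstrap distribution, the two can differ noticeably, which is the motivation for estimating \(CV_{\star}(E_2)\) from the data.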
References
See Section 3.3 and Section 8 of Beaumont and Patak (2012) for details and an example where the simulation CV is used to determine the number of bootstrap replicates needed for various alternative bootstrap methods in an empirical illustration.
Beaumont, J.-F. and Z. Patak (2012). "On the Generalized Bootstrap for Sample Surveys with Special Attention to Poisson Sampling." International Statistical Review, 80: 127-148. doi:10.1111/j.1751-5823.2011.00166.x.
See also
Use estimate_boot_reps_for_target_cv to help choose the number of bootstrap replicates.
Examples
if (FALSE) { # \dontrun{
set.seed(2022)
# Create an example bootstrap survey design object ----
library(survey)
data('api', package = 'survey')
boot_design <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                         data = apistrat, fpc = ~fpc) |>
  svrep::as_bootstrap_design(replicates = 5000)
# Calculate estimates of interest and retain estimates from each replicate ----
estimated_means_and_proportions <- svymean(x = ~ api00 + api99 + stype,
                                           design = boot_design,
                                           return.replicates = TRUE)
# Ratio of weighted totals, computed separately for each set of replicate weights
custom_statistic <- withReplicates(design = boot_design,
                                   return.replicates = TRUE,
                                   theta = function(wts, data) {
                                     numerator <- sum(data$api00 * wts)
                                     denominator <- sum(data$api99 * wts)
                                     statistic <- numerator / denominator
                                     return(statistic)
                                   })
# Estimate simulation CV of bootstrap estimates ----
estimate_boot_sim_cv(
svrepstat = estimated_means_and_proportions
)
estimate_boot_sim_cv(
svrepstat = custom_statistic
)
} # }