Rescale replicate factors. The main application of this rescaling is to ensure that all replicate weights are strictly positive.
Note that this rescaling has no impact on variance estimates for totals (or other linear statistics), but variance estimates for nonlinear statistics will be affected by the rescaling.
Arguments
- x
Either a replicate survey design object, or a numeric matrix of replicate weights.
- tau
Either a single positive number, or NULL. This is the rescaling constant \(\tau\) used in the transformation \(\frac{w + \tau - 1}{\tau}\), where \(w\) is the original weight.
If tau=NULL or is left unspecified, then the argument min_wgt should be used instead, in which case \(\tau\) is automatically set to the smallest value needed to rescale the replicate weights such that they are all at least min_wgt.
- min_wgt
Should only be used if tau=NULL or tau is left unspecified. Specifies the minimum acceptable value for the rescaled weights, which will be used to automatically determine the value \(\tau\) used in the transformation \(\frac{w + \tau - 1}{\tau}\), where \(w\) is the original weight. Must be at least zero and must be less than one.
- digits
Only used if the argument min_wgt is used. Specifies the number of decimal places to use for choosing tau. Using a smaller number of digits is useful simply for producing easier-to-read documentation.
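As a rough illustration of how min_wgt and digits could determine tau, the sketch below solves \(\frac{w + \tau - 1}{\tau} \geq\) min_wgt for \(\tau\) and rounds the result up. The helper name choose_tau is hypothetical, and both the rounding-up rule and the floor at \(\tau = 1\) are assumptions; the package's internal computation may differ.

```r
# Hypothetical helper (not part of the package): smallest tau, rounded up to
# `digits` decimal places, such that every rescaled weight is >= min_wgt.
choose_tau <- function(w, min_wgt = 0.01, digits = 2) {
  stopifnot(min_wgt >= 0, min_wgt < 1)
  # (w + tau - 1) / tau >= min_wgt  <=>  tau >= (1 - w) / (1 - min_wgt)
  tau_exact <- (1 - min(w)) / (1 - min_wgt)
  # Assumption: never go below tau = 1 (i.e. no rescaling needed)
  tau_exact <- max(tau_exact, 1)
  # Round up so the bound still holds after rounding
  ceiling(tau_exact * 10^digits) / 10^digits
}

w <- c(1.7, -0.25, 1.55)  # illustrative factors, one of them negative
tau <- choose_tau(w, min_wgt = 0.01, digits = 2)
rescaled <- (w + tau - 1) / tau
min(rescaled) >= 0.01     # check that the requested minimum is respected
```

Rounding up rather than to the nearest value is what keeps the rescaled minimum at or above min_wgt even when fewer digits are used.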
Value
If the input is a numeric matrix, returns the rescaled matrix. If the input is a replicate survey design object, returns an updated replicate survey design object.
For a replicate survey design object, results depend on whether the object has a matrix of replicate factors rather than a matrix of replicate weights (which are the product of replicate factors and sampling weights). If the design object has combined.weights=FALSE, then the replication factors are adjusted. If the design object has combined.weights=TRUE, then the replicate weights are adjusted. It is strongly recommended to only use the rescaling method for replication factors rather than the weights.
For a replicate survey design object, the scale element of the design object will be updated appropriately, and an element tau will also be added. If the input is a matrix instead of a survey design object, the result matrix will have an attribute named tau which can be retrieved using attr(x, 'tau').
Details
Let \(\mathbf{A} = \left[ \mathbf{a}^{(1)} \cdots \mathbf{a}^{(b)} \cdots \mathbf{a}^{(B)} \right]\) denote the \((n \times B)\) matrix of replicate adjustment factors. To eliminate negative adjustment factors, Beaumont and Patak (2012) propose forming a rescaled matrix of nonnegative replicate factors \(\mathbf{A}^S\) by rescaling each adjustment factor \(a_k^{(b)}\) as follows: $$ a_k^{S,(b)} = \frac{a_k^{(b)} + \tau - 1}{\tau} $$ where \(\tau \geq 1\) and \(\tau \geq 1 - a_k^{(b)}\) for all \(k\) in \(\left\{ 1,\ldots,n \right\}\) and all \(b\) in \(\left\{1, \ldots, B\right\}\).
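The rescaling can be sketched in a few lines of base R. The matrix values and the choice \(\tau = 2\) below are made up for illustration only.

```r
# Illustrative (made-up) matrix of replicate adjustment factors: n = 3, B = 2
A <- matrix(c(1.5, -0.2, 1.7,
              0.4,  1.3, 1.3), nrow = 3, ncol = 2)

# tau must be at least 1 - min(A) = 1.2 here; we pick tau = 2
tau <- 2
A_S <- (A + tau - 1) / tau

# All rescaled factors are nonnegative
all(A_S >= 0)

# Each factor's deviation from 1 shrinks by exactly a factor of tau
all.equal(A_S - 1, (A - 1) / tau)
```

The last check reflects the identity \(a_k^{S,(b)} - 1 = (a_k^{(b)} - 1)/\tau\): rescaling pulls every factor toward 1, which is why a compensating factor of \(\tau^2\) is needed when estimating variances.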
The value of \(\tau\) can be set based on the realized adjustment factor matrix \(\mathbf{A}\) or by choosing \(\tau\) prior to generating the adjustment factor matrix \(\mathbf{A}\) so that \(\tau\) is likely to be large enough to prevent negative adjustment factors.
If the adjustment factors are rescaled in this manner, it is important to adjust the scale factor used in estimating the variance with the replicates. For example, for bootstrap replicates, the scale factor becomes \(\frac{\tau^2}{B}\) instead of \(\frac{1}{B}\). $$ \textbf{Prior to rescaling: } v_B\left(\hat{T}_y\right) = \frac{1}{B}\sum_{b=1}^B\left(\hat{T}_y^{*(b)}-\hat{T}_y\right)^2 $$ $$ \textbf{After rescaling: } v_B\left(\hat{T}_y\right) = \frac{\tau^2}{B}\sum_{b=1}^B\left(\hat{T}_y^{S*(b)}-\hat{T}_y\right)^2 $$
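The following base R sketch checks numerically that the two formulas agree for a total, and shows that they need not agree for a nonlinear statistic such as a ratio. All data, weights, and replicate factors are simulated for illustration; none of the object names come from the package.

```r
set.seed(1)
n <- 6
B <- 50
w <- rep(10, n)           # made-up sampling weights
y <- c(3, 5, 2, 8, 6, 4)  # made-up outcome variable
x <- c(1, 2, 1, 3, 2, 2)  # made-up auxiliary variable for a ratio

# Simulated replicate factors centered at 1 (some may be negative)
A <- matrix(rnorm(n * B, mean = 1, sd = 0.8), nrow = n, ncol = B)
tau <- max(1, 1 - min(A))
A_S <- (A + tau - 1) / tau

# Linear statistic: estimated total of y
T_hat   <- sum(w * y)
T_rep   <- colSums(A   * w * y)  # replicate totals, original factors
T_rep_S <- colSums(A_S * w * y)  # replicate totals, rescaled factors
v_orig     <- (1 / B)     * sum((T_rep   - T_hat)^2)
v_rescaled <- (tau^2 / B) * sum((T_rep_S - T_hat)^2)
all.equal(v_orig, v_rescaled)    # identical up to floating-point error

# Nonlinear statistic: ratio of totals; the two estimates generally differ
R_hat   <- sum(w * y) / sum(w * x)
R_rep   <- colSums(A   * w * y) / colSums(A   * w * x)
R_rep_S <- colSums(A_S * w * y) / colSums(A_S * w * x)
vR_orig     <- (1 / B)     * sum((R_rep   - R_hat)^2)
vR_rescaled <- (tau^2 / B) * sum((R_rep_S - R_hat)^2)
c(original = vR_orig, rescaled = vR_rescaled)
```

The equality for the total follows because each rescaled replicate total deviates from \(\hat{T}_y\) by exactly \(1/\tau\) times the original deviation, so squaring and multiplying by \(\tau^2\) restores the original sum.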
References
This method was suggested by Fay (1989) for the specific application of creating replicate factors using his generalized replication method. Beaumont and Patak (2012) provided an extended discussion on this rescaling method in the context of rescaling generalized bootstrap replication factors to avoid negative replicate weights.
The notation used in this documentation is taken from Beaumont and Patak (2012).
- Beaumont, Jean-François, and Zdenek Patak. 2012.
"On the Generalized Bootstrap for Sample Surveys with Special Attention to Poisson Sampling: Generalized Bootstrap for Sample Surveys."
International Statistical Review 80 (1): 127–48.
https://doi.org/10.1111/j.1751-5823.2011.00166.x.
- Fay, Robert. 1989. "Theory And Application Of Replicate Weighting For Variance Calculations."
In, 495–500. Alexandria, VA: American Statistical Association.
http://www.asasrms.org/Proceedings/papers/1989_033.pdf
Examples
library(svrep) # provides rescale_reps() and as_gen_boot_design()
# Example 1: Rescaling a matrix of replicate weights to avoid negative weights
rep_wgts <- matrix(
  c(1.69742746694909, -0.230761178913411, 1.53333377634192,
    0.0495043413294782, 1.81820367441039, 1.13229198793703,
    1.62482013925955, 1.0866133494029, 0.28856654131668,
    0.581930729719006, 0.91827012312825, 1.49979905894482,
    1.26281337410693, 1.99327362761477, -0.25608700039304),
  nrow = 3, ncol = 5
)
rescaled_wgts <- rescale_reps(rep_wgts, min_wgt = 0.01)
print(rep_wgts)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.6974275 0.04950434 1.6248201 0.5819307 1.262813
#> [2,] -0.2307612 1.81820367 1.0866133 0.9182701 1.993274
#> [3,] 1.5333338 1.13229199 0.2885665 1.4997991 -0.256087
print(rescaled_wgts)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.54915549 0.2515782 1.4919844 0.6708116 1.20693966
#> [2,] 0.03089671 1.6442549 1.0681995 0.9356458 1.78210522
#> [3,] 1.41994786 1.1041669 0.4398162 1.3935426 0.01095512
#> attr(,"tau")
#> [1] 1.27
# Example 2: Rescaling replicate weights with a specified value of 'tau'
rescaled_wgts <- rescale_reps(rep_wgts, tau = 2)
print(rescaled_wgts)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.3487137 0.5247522 1.3124101 0.7909654 1.1314067
#> [2,] 0.3846194 1.4091018 1.0433067 0.9591351 1.4966368
#> [3,] 1.2666669 1.0661460 0.6442833 1.2498995 0.3719565
#> attr(,"tau")
#> [1] 2
# Example 3: Rescaling replicate weights of a survey design object
set.seed(2023)
library(survey)
data('mu284', package = 'survey')
## First create a bootstrap design object
svy_design_object <- svydesign(
  data = mu284,
  ids = ~ id1 + id2,
  fpc = ~ n1 + n2
)
boot_design <- as_gen_boot_design(
  design = svy_design_object,
  variance_estimator = "Stratified Multistage SRS",
  replicates = 5, tau = 1
)
## Rescale the weights
rescaled_boot_design <- boot_design |>
  rescale_reps(min_wgt = 0.01)
boot_wgts <- weights(boot_design, "analysis")
rescaled_boot_wgts <- weights(rescaled_boot_design, 'analysis')
print(boot_wgts)
#> REP_1 REP_2 REP_3 REP_4 REP_5
#> [1,] 34.071074 -3.352195 7.031013 35.4547244 18.681422
#> [2,] -3.271131 12.579037 57.474328 9.3992013 25.014379
#> [3,] 12.204302 16.611771 14.029208 6.9869038 -8.727739
#> [4,] 40.124053 62.587721 29.834150 31.6263955 10.057763
#> [5,] 6.857688 48.936835 5.029175 42.1974205 67.126670
#> [6,] 38.866284 -7.883877 6.363613 35.3323662 14.104502
#> [7,] -2.705981 5.310800 51.191780 -18.8838183 34.232137
#> [8,] 23.948409 19.740921 21.950039 0.8683187 -2.397135
#> [9,] 38.102201 56.396306 39.516036 39.6713936 31.130900
#> [10,] 7.987330 41.986885 8.545987 47.8769539 66.314653
#> [11,] 35.747939 -13.746937 9.901870 41.9315736 8.610797
#> [12,] 1.384506 2.579634 50.469377 -26.8411849 19.800463
#> [13,] 22.153736 11.250766 19.117806 0.9281634 -1.226728
#> [14,] 48.183146 68.452257 28.322524 31.3003310 12.972211
#> [15,] 7.066647 63.713091 11.462660 41.8092991 64.604278
#> attr(,"tau")
#> [1] 1
#> attr(,"scale")
#> [1] 0.2
#> attr(,"rscales")
#> [1] 1 1 1 1 1
print(rescaled_boot_wgts)
#> REP_1 REP_2 REP_3 REP_4 REP_5
#> [1,] 25.24027 6.805158 11.92004 25.9218675 17.659157
#> [2,] 11.91898 19.726948 41.84285 18.1605261 25.852732
#> [3,] 14.46846 16.639624 15.36743 11.8983106 4.157107
#> [4,] 34.98722 46.053065 29.91830 30.8011800 20.176238
#> [5,] 15.21725 35.945896 14.31651 32.6259871 44.906406
#> [6,] 27.60244 4.572803 11.59127 25.8615925 15.404516
#> [7,] 12.19738 16.146535 38.74800 4.2280041 30.393500
#> [8,] 20.25373 18.181078 19.26931 8.8842293 7.275631
#> [9,] 33.99123 43.003106 34.68770 34.7642333 30.557093
#> [10,] 15.77373 32.522275 16.04893 35.4237868 44.506397
#> [11,] 26.06631 1.684596 13.33425 29.1124336 12.698258
#> [12,] 14.21240 14.801133 38.39214 0.3081191 23.284300
#> [13,] 19.36966 13.998735 17.87412 8.9137094 7.852187
#> [14,] 38.95721 48.941999 29.17366 30.6405572 21.611926
#> [15,] 15.32019 43.224840 17.48571 32.4347943 43.663848
#> attr(,"tau")
#> [1] 2.03
#> attr(,"scale")
#> [1] 0.82418
#> attr(,"rscales")
#> [1] 1 1 1 1 1