Compute jackknife estimates of a parameter taking one or more given samples as parameters. In particular, E is the estimator to be jackknifed as a function name, handle, or inline function, and x is the sample for which the estimate is to be taken. The i-th entry of jackstat will contain the value of the estimator on the sample x with its i-th row omitted.
jackstat(i) = E(x(1 : i - 1, i + 1 : length(x)))
Depending on the number of samples to be used, the estimator must have the appropriate form:
If only one sample is used, then the estimator need not be concerned with cell arrays,
for example jackknifing the standard deviation of a sample can be performed with
jackstat = jackknife (@std, rand (100, 1))
.
If, however, more than one sample is to be used, the samples must all be of equal size,
and the estimator must address them as elements of a cell-array,
in which they are aggregated in their order of appearance:
jackstat = jackknife(@(x) std(x{1})/var(x{2}), rand (100, 1), randn (100, 1)
If all goes well, a theoretical value P for the parameter is already known,
n is the sample size,
t = n * E(x) - (n - 1) * mean(jackstat)
, and
v = sumsq(n * E(x) - (n - 1) * jackstat - t) / (n * (n - 1))
, then
(t-P)/sqrt(v)
should follow a t-distribution with n-1 degrees of freedom.
Jackknifing is a well known method to reduce bias; further details can be found in:
The following code
for k = 1:1000 x=rand(10,1); s(k)=std(x); jackstat=jackknife(@std,x); j(k)=10*std(x) - 9*mean(jackstat); end figure();hist([s',j'], 0:sqrt(1/12)/10:2*sqrt(1/12))
Produces the following figure
Figure 1 |
---|
The following code
for k = 1:1000 x=randn(1,50); y=rand(1,50); jackstat=jackknife(@(x) std(x{1})/std(x{2}),y,x); j(k)=50*std(y)/std(x) - 49*mean(jackstat); v(k)=sumsq((50*std(y)/std(x) - 49*jackstat) - j(k)) / (50 * 49); end t=(j-sqrt(1/12))./sqrt(v); figure();plot(sort(tcdf(t,49)),"-;Almost linear mapping indicates good fit with t-distribution.;")
Produces the following figure
Figure 1 |
---|
Package: statistics