cov.rob.Rd
Compute a multivariate location and scale estimate with a high breakdown point -- this can be thought of as estimating the mean and covariance of the "good" part of the data. cov.mve and cov.mcd are compatibility wrappers.
cov.rob(x, cor = FALSE, quantile.used = floor((n + p + 1)/2),
        method = c("mve", "mcd", "classical"),
        nsamp = "best", seed)
cov.mve(...)
cov.mcd(...)
Argument | Description |
---|---|
x | a matrix or data frame. |
cor | should the returned result include a correlation matrix? |
quantile.used | the minimum number of the data points regarded as "good" points. |
method | the method to be used -- minimum volume ellipsoid, minimum covariance determinant or classical product-moment. Using cov.mve or cov.mcd forces mve or mcd respectively. |
nsamp | the number of samples, or one of "best", "exact" or "sample". |
seed | the seed to be used for random sampling: see RNGkind. |
... | arguments to cov.rob other than method. |
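As a small illustration of the default for quantile.used, the sketch below evaluates floor((n + p + 1)/2) for the built-in stack.x data (21 rows, 3 columns). The variable names h and fit are illustrative only, and the call to cov.rob is guarded because it assumes the MASS package is installed.

```r
# Default size of the "good" subset: floor((n + p + 1) / 2),
# where n is the number of rows and p the number of columns of x.
x <- stack.x                 # built-in dataset: 21 observations, 3 variables
n <- nrow(x)
p <- ncol(x)
h <- floor((n + p + 1) / 2)
h

# Optional check against cov.rob itself (assumes MASS is available).
if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::cov.rob(x, method = "mcd", nsamp = "exact")
  length(fit$best)           # for MCD, the best subset has size quantile.used
}
```

For the MCD method the length of the best component should agree with h, since the best set has size quantile.used.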
A list with components

Component | Description |
---|---|
center | the final estimate of location. |
cov | the final estimate of scatter. |
cor | (only if cor = TRUE) the estimate of the correlation matrix. |
msg | message giving number of singular samples out of total. |
crit | the value of the criterion on log scale. For MCD this is the determinant, and for MVE it is proportional to the volume. |
best | the subset used. For MVE the best sample, for MCD the best set of size quantile.used. |
n.obs | total number of observations. |
For method "mve", an approximate search is made of a subset of size quantile.used with an enclosing ellipsoid of smallest volume; in method "mcd" it is the volume of the Gaussian confidence ellipsoid, equivalently the determinant of the classical covariance matrix, that is minimized. The mean of the subset provides a first estimate of the location, and the rescaled covariance matrix a first estimate of scatter. The Mahalanobis distances of all the points from the location estimate for this covariance matrix are calculated, and those points within the 97.5% point under Gaussian assumptions are declared to be "good". The final estimates are the mean and rescaled covariance of the "good" points.
The rescaling is by the appropriate percentile under Gaussian data; in addition the first covariance matrix has an ad hoc finite-sample correction given by Marazzi.
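The reweighting step described above can be sketched in base R. This is a simplified illustration, not the code used by cov.rob: it omits the percentile rescaling and the finite-sample correction, and it assumes the 97.5% point is taken from a chi-squared distribution with p degrees of freedom for the squared distances. The starting subset init is taken from the $best indices printed in the MCD example output on this page; all variable names are illustrative.

```r
# Sketch of the reweighting step (rescaling constants omitted).
x <- as.matrix(stack.x)
p <- ncol(x)

# Initial subset: in cov.rob this comes from the MVE/MCD search.
# Here it is copied from the $best component of the MCD example output.
init <- c(4:14, 20)
loc0 <- colMeans(x[init, , drop = FALSE])
cov0 <- cov(x[init, , drop = FALSE])

# Squared Mahalanobis distances of all points from the first estimate.
d2 <- mahalanobis(x, center = loc0, cov = cov0)

# Points inside the 97.5% chi-squared quantile are declared "good".
good <- d2 <= qchisq(0.975, df = p)

# Final (unrescaled) estimates: mean and covariance of the good points.
loc1 <- colMeans(x[good, , drop = FALSE])
cov1 <- cov(x[good, , drop = FALSE])
```

Because the covariance of the subset is not rescaled here, the distances are larger than they would be after rescaling, so fewer points pass the cut-off than in a full implementation.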
For method "mve" the search is made over ellipsoids determined by the covariance matrix of p of the data points. For method "mcd" an additional improvement step suggested by Rousseeuw and van Driessen (1999) is used, in which once a subset of size quantile.used is selected, an ellipsoid based on its covariance is tested (as this will have no larger a determinant, and may be smaller).
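One way to read the improvement step is as a single concentration step: take the mean and covariance of the current subset, compute distances for all points, and keep the quantile.used closest points. The sketch below is an assumption about how such a step could look in base R, not the code used by cov.rob; c_step and log_det are hypothetical helper names, and the starting subset 1:12 is arbitrary.

```r
# One concentration step: re-select the h points closest to the
# current subset's mean, measured in its own covariance metric.
c_step <- function(x, subset) {
  h <- length(subset)
  xs <- x[subset, , drop = FALSE]
  d2 <- mahalanobis(x, colMeans(xs), cov(xs))
  sort(order(d2)[seq_len(h)])      # indices of the h smallest distances
}

# Log-determinant of the classical covariance of a subset.
log_det <- function(x, subset) {
  s <- cov(x[subset, , drop = FALSE])
  as.numeric(determinant(s, logarithm = TRUE)$modulus)
}

x <- as.matrix(stack.x)
start <- 1:12                       # arbitrary starting subset (assumption)
nxt <- c_step(x, start)
c(before = log_det(x, start), after = log_det(x, nxt))
```

The second value should be no larger than the first, which is the property the text refers to when it says the new ellipsoid has no larger a determinant.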
P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection. Wiley.
A. Marazzi (1993) Algorithms, Routines and S Functions for Robust Statistics. Wadsworth and Brooks/Cole.
P. J. Rousseeuw and B. C. van Zomeren (1990) Unmasking multivariate outliers and leverage points, Journal of the American Statistical Association, 85, 633--639.
P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212--223.
P. Rousseeuw and M. Hubert (1997) Recent developments in PROGRESS. In L1-Statistical Procedures and Related Topics ed Y. Dodge, IMS Lecture Notes volume 31, pp. 201--214.
#> $center
#>   Air.Flow Water.Temp Acid.Conc. stack.loss
#>    56.3750    20.0000    85.4375    13.0625
#>
#> $cov
#>             Air.Flow Water.Temp Acid.Conc. stack.loss
#> Air.Flow   23.050000   6.666667  16.625000  19.308333
#> Water.Temp  6.666667   5.733333   5.333333   7.733333
#> Acid.Conc. 16.625000   5.333333  34.395833  13.837500
#> stack.loss 19.308333   7.733333  13.837500  18.462500
#>
#> $msg
#> [1] "20 singular samples of size 5 out of 2500"
#>
#> $crit
#> [1] 19.89056
#>
#> $best
#>  [1]  5  6  7  8  9 10 11 12 15 16 18 19 20
#>
#> $n.obs
#> [1] 21
#>
cov.rob(stack.x, method = "mcd", nsamp = "exact")
#> $center
#>   Air.Flow Water.Temp Acid.Conc.
#>   56.70588   20.23529   85.52941
#>
#> $cov
#>             Air.Flow Water.Temp Acid.Conc.
#> Air.Flow   23.470588   7.573529  16.102941
#> Water.Temp  7.573529   6.316176   5.367647
#> Acid.Conc. 16.102941   5.367647  32.389706
#>
#> $msg
#> [1] "266 singular samples of size 4 out of 5985"
#>
#> $crit
#> [1] 5.472581
#>
#> $best
#>  [1]  4  5  6  7  8  9 10 11 12 13 14 20
#>
#> $n.obs
#> [1] 21
#>