Name: QN SCALE
Many statistics are either resistant (changing a small part of the data, even by a large amount, does not change the estimate much) or have robustness of efficiency (high efficiency across a variety of distributions rather than just one). However, it can be difficult to find statistics that have both properties.

The most common estimate of scale, the standard deviation, is the most efficient estimate of scale if the data come from a normal distribution. However, the standard deviation has poor resistance: changing even one value can dramatically change its computed value. In addition, it does not have robustness of efficiency for non-normal data.

The median absolute deviation (MAD) and the interquartile range are the two most commonly used robust alternatives to the standard deviation. The MAD in particular is a very robust scale estimator. However, the MAD has the following limitations:

1. It has low efficiency for normal data (37%).
2. It implicitly assumes symmetry, because it measures distance from a central location (the median).
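The difference in resistance between the standard deviation and the MAD can be illustrated with a small sketch. The data values and the helper name `mad` are made up for illustration, and the MAD here is left unscaled (no normal-consistency factor).

```python
import statistics


def mad(x):
    """Unscaled median absolute deviation from the median."""
    center = statistics.median(x)
    return statistics.median([abs(v - center) for v in x])


# Hypothetical data: the second sample differs from the first in one value only.
clean = [8, 9, 10, 11, 12]
contaminated = [8, 9, 10, 11, 1000]

sd_clean = statistics.stdev(clean)
sd_contaminated = statistics.stdev(contaminated)

# The standard deviation changes by orders of magnitude,
# while the MAD is the same for both samples.
print("SD :", sd_clean, sd_contaminated)
print("MAD:", mad(clean), mad(contaminated))
```

A single wild value is enough to inflate the standard deviation, which is what poor resistance means in practice.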
Rousseeuw and Croux proposed the Qn estimate of scale as an alternative to the MAD. It shares desirable robustness properties with the MAD (50% breakdown point, bounded influence function). In addition, it has significantly better normal efficiency (82%) and it does not depend on symmetry.

The Qn scale estimate is motivated by the Hodges-Lehmann estimate of location:

\[ \hat{\mu} = \mathrm{med}\{ (x_i + x_j)/2 ;\; i < j \} \]
An analogous scale estimate can be obtained by replacing pairwise averages with pairwise distances:

\[ \mathrm{med}\{ |x_i - x_j| ;\; i < j \} \]
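This estimate has high efficiency for normal data (86%), but a breakdown point of only 29%. Rousseeuw and Croux proposed the following variation of this statistic:

\[ Q_n = d \, \{ |x_i - x_j| ;\; i < j \}_{(k)} \]

where d is a constant factor and $k = \binom{h}{2} \approx \binom{n}{2}/4$, with $h = \lfloor n/2 \rfloor + 1$ (roughly half the number of observations). In other words, Qn is d times the k-th order statistic of the $\binom{n}{2}$ pairwise distances.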
The Rousseeuw and Croux article (see the Reference section below) discusses the properties of the Qn estimate in detail.
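As a rough illustration of the definition (not Dataplot's implementation), Qn can be computed by sorting all pairwise distances and picking the k-th smallest, with k taken as h choose 2 for h = floor(n/2) + 1. The function name `qn_scale` is hypothetical, the default `d = 2.2219` is the normal-consistency constant usually quoted from Rousseeuw and Croux and is an assumption here, and small-sample correction factors are omitted. This naive version needs O(n^2) time and memory.

```python
import math
from itertools import combinations


def qn_scale(x, d=2.2219):
    """Naive Qn: d times the k-th smallest pairwise distance |x_i - x_j|, i < j.

    d = 2.2219 is an assumed consistency constant for normal data;
    no finite-sample correction is applied.
    """
    n = len(x)
    if n < 2:
        raise ValueError("Qn needs at least two observations")
    h = n // 2 + 1            # roughly half the observations
    k = math.comb(h, 2)       # order statistic to select (1-based)
    distances = sorted(abs(a - b) for a, b in combinations(x, 2))
    return d * distances[k - 1]


# Hypothetical data: replacing one value by an outlier leaves Qn unchanged here.
print(qn_scale([1, 2, 3, 4, 5]))
print(qn_scale([1, 2, 3, 4, 500]))
```

For large samples a faster algorithm than the full sort of all pairs would be preferable; the sketch favors readability over speed.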
Syntax:
LET <par> = QN SCALE <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable; <par> is a parameter where the computed Qn estimate is stored; and where the <SUBSET/EXCEPT/FOR qualification> is optional.
LET A = QN SCALE Y1 SUBSET TAG > 2
CROSS TABULATE QN SCALE PLOT Y X1 X2
BOOTSTRAP QN SCALE PLOT Y
JACKNIFE QN SCALE PLOT Y
DEX QN SCALE PLOT Y X1 ... XK
QN SCALE BLOCK PLOT Y X1 ... XK
QN SCALE INFLUENCE CURVE Y
QN SCALE INTERACTION PLOT Y X1 X2
TABULATE QN SCALE Y X
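Reference:
"Alternatives to the Median Absolute Deviation", Rousseeuw and Croux, Journal of the American Statistical Association, Vol. 88, 1993, pp. 1273-1283.
"Data Analysis and Regression: A Second Course in Statistics", Mosteller and Tukey, Addison-Wesley, 1977, pp. 203-209.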
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
MULTIPLOT SCALE FACTOR 2
X1LABEL DISPLACEMENT 12
. Generate 200 normal and 200 lognormal (sigma = 1) random numbers
LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 200
LET SIGMA = 1
LET Y2 = LOGNORMAL RANDOM NUMBERS FOR I = 1 1 200
. Bootstrap the Qn scale estimate for the normal sample (500 bootstrap samples)
BOOTSTRAP SAMPLES 500
BOOTSTRAP QN SCALE PLOT Y1
X1LABEL B025 = ^B025, B975=^B975
HISTOGRAM YPLOT
X1LABEL
. Repeat for the lognormal sample
BOOTSTRAP QN SCALE PLOT Y2
X1LABEL B025 = ^B025, B975=^B975
HISTOGRAM YPLOT
. Title each row of plots
END OF MULTIPLOT
JUSTIFICATION CENTER
MOVE 50 96
TEXT QN SCALE BOOTSTRAP: NORMAL
MOVE 50 46
TEXT QN SCALE BOOTSTRAP: LOGNORMAL
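For readers without Dataplot, the idea behind the program can be sketched in Python: resample the data with replacement, recompute Qn each time, and read off the 2.5th and 97.5th percentiles of the replicates. This assumes that B025 and B975 denote those percentiles; the function names, the constant `d = 2.2219`, the sample size, and the simple index-based percentile rule are all illustrative choices, not Dataplot's method.

```python
import math
import random
from itertools import combinations


def qn_scale(x, d=2.2219):
    """Naive Qn: d times the k-th smallest pairwise distance (assumed constant d)."""
    n = len(x)
    h = n // 2 + 1
    k = math.comb(h, 2)
    distances = sorted(abs(a - b) for a, b in combinations(x, 2))
    return d * distances[k - 1]


def bootstrap_percentiles(x, stat, n_boot=500, seed=0):
    """Return rough 2.5th and 97.5th percentiles of bootstrap replicates of stat."""
    rng = random.Random(seed)
    n = len(x)
    # Each replicate: draw n values with replacement and recompute the statistic.
    reps = sorted(stat(rng.choices(x, k=n)) for _ in range(n_boot))
    lo = reps[int(0.025 * (n_boot - 1))]
    hi = reps[int(0.975 * (n_boot - 1))]
    return lo, hi


# Small hypothetical example: 60 standard normal values, 200 bootstrap samples.
gen = random.Random(1)
sample = [gen.gauss(0.0, 1.0) for _ in range(60)]
lo, hi = bootstrap_percentiles(sample, qn_scale, n_boot=200)
print("Qn:", qn_scale(sample), "interval:", lo, hi)
```

The sample size and number of resamples are kept small so the naive O(n^2) Qn runs quickly; the Dataplot program above uses 200 observations and 500 bootstrap samples.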
Date created: 5/5/2003