Dataplot Vol 1 Auxiliary Chapter

SN SCALE

Name:
    SN SCALE (LET)
Type:
    Let Subcommand
Purpose:
    Compute the Sn scale estimate for a variable.
Description:
    Mosteller and Tukey (see Reference section below) define two types of robustness:

    1. Resistance means that changing a small part of the data, even by a large amount, does not cause a large change in the estimate.

    2. Robustness of efficiency means that the statistic has high efficiency in a variety of situations rather than in any one situation. Efficiency means that the estimate is close to the optimal estimate given that we know the distribution that the data come from. A useful measure of efficiency is:

        Efficiency = (lowest variance feasible)/ (actual variance)
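
    The efficiency ratio above can be estimated by simulation. The following Python sketch is not part of Dataplot; the function name, sample size, replication count, and seed are illustrative assumptions. It compares the variance of the sample mean (which attains the lowest feasible variance for normal data) with the variance of the sample median.

```python
import random
import statistics


def relative_efficiency(n=51, reps=4000, seed=1):
    """Monte Carlo estimate of var(mean) / var(median) for normal samples."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # For normal data the sample mean has the lowest feasible variance,
    # so its variance is the numerator of the efficiency ratio.
    return statistics.variance(means) / statistics.variance(medians)


eff = relative_efficiency()
print(f"estimated efficiency of the median: {eff:.2f}")
```

    The estimate should fall near the roughly 64% normal efficiency of the median, subject to simulation noise and the finite sample size.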

    Many statistics have one of these properties. However, it can be difficult to find statistics that are both resistant and have robustness of efficiency.

    The most common estimate of scale, the standard deviation, is the most efficient estimate of scale if the data come from a normal distribution. However, the standard deviation is not robust in the sense that changing even one value can dramatically change the computed value of the standard deviation (i.e., poor resistance). In addition, it does not have robustness of efficiency for non-normal data.

    The median absolute deviation (MAD) and interquartile range are the two most commonly used robust alternatives to the standard deviation. The MAD in particular is a very robust scale estimator. However, the MAD has the following limitations:

    1. It does not have particularly high efficiency for data that is in fact normal (37%). In comparison, the median has 64% efficiency for normal data.

    2. The MAD statistic also has an implicit assumption of symmetry. That is, it measures the distance from a measure of central location (the median).
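
    The contrast in resistance between the standard deviation and the MAD can be seen on a small made-up data set. This Python sketch is illustrative only; the helper mad and the data values are assumptions, and the MAD is left unscaled (no consistency constant is applied).

```python
import statistics


def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


clean = [1, 2, 3, 4, 5]
contaminated = [1, 2, 3, 4, 500]  # a single value changed by a large amount

sd_clean = statistics.stdev(clean)
sd_contaminated = statistics.stdev(contaminated)
mad_clean = mad(clean)
mad_contaminated = mad(contaminated)

print("standard deviation:", sd_clean, "->", sd_contaminated)
print("MAD:               ", mad_clean, "->", mad_contaminated)
```

    Changing one value inflates the standard deviation by two orders of magnitude, while the MAD is unchanged.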

    Rousseeuw and Croux proposed the Sn estimate of scale as an alternative to the MAD. It shares desirable robustness properties with MAD (50% breakdown point, bounded influence function). In addition, it has significantly better normal efficiency (58%) and it does not depend on symmetry.

    The Sn scale estimate is defined as:

      Sn = c*MEDIAN(i){MEDIAN(j)|x(i) - x(j)|}

    That is, for each i we compute the median of {|x(i) - x(j)|; j = 1, ..., n}. The median of these n numbers, multiplied by c, is the Sn estimate. The constant c = 1.1926 is the value needed to make Sn a consistent estimator of the standard deviation for normal data.
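
    The definition can be evaluated directly, as in the following Python sketch. It is a naive O(n^2) illustration of the formula as written here, not the code Dataplot uses. The function name sn_scale is an assumption, and ordinary medians are used throughout. The published Rousseeuw and Croux algorithm uses high and low medians and finite-sample correction factors, so its values can differ somewhat for small n.

```python
import statistics


def sn_scale(values, c=1.1926):
    """Naive Sn: c * median over i of (median over j of |x(i) - x(j)|)."""
    inner_medians = [
        statistics.median([abs(xi - xj) for xj in values])
        for xi in values
    ]
    return c * statistics.median(inner_medians)


data = [1, 2, 3, 4, 100]
sn = sn_scale(data)
print(f"Sn = {sn:.4f}")
```

    For this data set the inner medians are 2, 1, 1, 2, and 97. Their median is 2, so the sketch returns 1.1926 * 2 = 2.3852. The outlying value 100 affects only one of the inner medians and so does not change the result.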

    The Sn statistic measures typical distances between values, in contrast to the MAD and the standard deviation, which measure distances from a central location. This is why Sn is appropriate for asymmetric distributions as well as symmetric distributions.

    The Rousseeuw and Croux article (see the Reference section below) discusses the properties of the Sn estimate in detail.

Syntax:
    LET <par> = SN SCALE <y>             <SUBSET/EXCEPT/FOR qualification>
    where <y> is the response variable;
          <par> is a parameter where the computed Sn scale statistic is stored;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    LET A = SN SCALE Y1
    LET A = SN SCALE Y1 SUBSET TAG > 2
Note:
    Dataplot uses code provided by Rousseeuw and Croux to compute the Sn estimate. This algorithm runs in O(n log n) time, which avoids the O(n^2) cost of evaluating the definition directly.
Note:
    The Rousseeuw and Croux article also proposes the Qn scale estimate. The article discusses the properties of both estimators in detail.
Note:
    In addition, the Sn statistic is supported for the following plots and commands:

      SN SCALE PLOT Y X
      CROSS TABULATE SN SCALE PLOT Y X1 X2
      BOOTSTRAP SN SCALE PLOT Y
      JACKNIFE SN SCALE PLOT Y
      DEX SN SCALE PLOT Y X1 ... XK
      SN SCALE BLOCK PLOT Y X1 ... XK
      SN SCALE INFLUENCE CURVE Y
      SN SCALE INTERACTION PLOT Y X1 X2

      TABULATE SN SCALE Y X
      CROSS TABULATE SN Y X1 X2
      LET Z = CROSS TABULATE SN SCALE Y X1 X2
      LET Y = MATRIX COLUMN SN SCALE M
      LET Y = MATRIX ROW SN SCALE M

Default:
    None
Synonyms:
    None
Related Commands:
    QN SCALE = Compute the Qn scale estimate of a variable.
    MEDIAN ABSOLUTE DEVIATION = Compute the median absolute deviation of a variable.
    INTERQUARTILE RANGE = Compute the interquartile range of a variable.
    STANDARD DEVIATION = Compute the standard deviation of a variable.
    DIFFERENCE OF SN = Compute the difference of the Sn scale estimates between two variables.
    STATISTIC PLOT = Generate a statistic versus subset plot.
    CROSS TABULATE PLOT = Generate a statistic versus subset plot (two subset variables).
    BOOTSTRAP PLOT = Generate a bootstrap plot for a statistic.
Reference:
    "Alternatives to the Median Absolute Deviation", Peter J. Rousseeuw and Christophe Croux, Journal of the American Statistical Association, December, 1993, Vol. 88, No. 424, pp. 1273-1283.

    "Data Analysis and Regression: A Second Course in Statistics", Mosteller and Tukey, Addison-Wesley, 1977, pp. 203-209.

Applications:
    Data Analysis
Implementation Date:
    2003/4
Program:
    MULTIPLOT 2 2
    MULTIPLOT CORNER COORDINATES 0 0 100 100
    MULTIPLOT SCALE FACTOR 2
    X1LABEL DISPLACEMENT 12
    .
    LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 200
    LET SIGMA = 1
    LET Y2 = LOGNORMAL RANDOM NUMBERS FOR I = 1 1 200
    .
    BOOTSTRAP SAMPLES 500
    BOOTSTRAP SN SCALE PLOT Y1
    X1LABEL B025 = ^B025, B975=^B975
    HISTOGRAM YPLOT
    X1LABEL
    .
    BOOTSTRAP SN SCALE PLOT Y2
    X1LABEL B025 = ^B025, B975=^B975
    HISTOGRAM YPLOT
    .
    END OF MULTIPLOT
    JUSTIFICATION CENTER
    MOVE 50 96
    TEXT SN SCALE BOOTSTRAP: NORMAL
    MOVE 50 46
    TEXT SN SCALE BOOTSTRAP: LOGNORMAL
        

    plot generated by sample program
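
    For readers without Dataplot, the idea behind the sample program can be approximated in Python. This is a rough sketch under stated assumptions, not a reproduction of the BOOTSTRAP SN SCALE PLOT command. The names sn_scale and bootstrap_percentiles are hypothetical, the Sn computation is the naive one using ordinary medians, and the sample size and number of bootstrap samples are reduced from 200 and 500 to keep the run short. The variables b025 and b975 mirror the B025 and B975 parameters shown in the plot labels.

```python
import random
import statistics


def sn_scale(values, c=1.1926):
    """Naive O(n^2) Sn using ordinary medians (illustrative only)."""
    inner = [statistics.median([abs(xi - xj) for xj in values]) for xi in values]
    return c * statistics.median(inner)


def bootstrap_percentiles(values, n_boot=200, seed=0):
    """Return approximate 2.5 and 97.5 percentiles of bootstrapped Sn values."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute Sn for each bootstrap sample.
    stats = [sn_scale(rng.choices(values, k=n)) for _ in range(n_boot)]
    cuts = statistics.quantiles(stats, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]


rng = random.Random(42)
y1 = [rng.gauss(0.0, 1.0) for _ in range(100)]          # normal data
y2 = [rng.lognormvariate(0.0, 1.0) for _ in range(100)]  # lognormal, sigma = 1

for label, y in (("NORMAL", y1), ("LOGNORMAL", y2)):
    b025, b975 = bootstrap_percentiles(y)
    print(f"{label}: B025 = {b025:.3f}, B975 = {b975:.3f}")
```

    The two percentiles bound an approximate 95% bootstrap interval for Sn in each case.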

Date created: 5/5/2003
Last updated: 5/5/2003
Please email comments on this WWW page to alan.heckert@nist.gov.