
PERCENTAGE BEND CORRELATION

Name:
    PERCENTAGE BEND CORRELATION (LET)
Type:
    Let Subcommand
Purpose:
    Compute the percentage bend correlation between two variables.
Description:
    Mosteller and Tukey (see Reference section below) define two types of robustness:

    1. resistance means that changing a small part of the data, even by a large amount, does not cause a large change in the estimate

    2. robustness of efficiency means that the statistic has high efficiency in a variety of situations rather than in any one situation. Efficiency means that the estimate is close to the optimal estimate we could obtain if we knew the distribution that the data come from. A useful measure of efficiency is:

        Efficiency = (lowest variance feasible)/ (actual variance)
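    As an illustration of the efficiency ratio, the following Python sketch estimates the efficiency of the sample median for Gaussian data by simulation. It relies on the standard result that the sample mean attains the lowest feasible variance for the location of Gaussian data. The function name, sample size, number of trials, and seed are arbitrary choices made for this illustration and are not part of Dataplot.

```python
import random
import statistics


def estimator_variance(estimator, n, trials, rng):
    """Variance of an estimator over repeated standard-normal samples of size n."""
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.pvariance(estimates)


rng = random.Random(12345)  # fixed seed so the run is repeatable
n, trials = 25, 4000        # illustrative values only

var_mean = estimator_variance(statistics.fmean, n, trials, rng)
var_median = estimator_variance(statistics.median, n, trials, rng)

# Efficiency = (lowest variance feasible) / (actual variance).
# For Gaussian data the numerator is the variance of the sample mean.
efficiency_of_median = var_mean / var_median
print(f"estimated efficiency of the median: {efficiency_of_median:.2f}")
```

    A value well below 1 indicates that the median, although resistant, gives up efficiency when the data really are Gaussian.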

    Many statistics have one of these properties. However, it can be difficult to find statistics that are both resistant and have robustness of efficiency.

    The Pearson correlation coefficient is an optimal estimator for Gaussian data. However, it is not resistant and it does not have robustness of efficiency.
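    The lack of resistance can be seen in a small Python sketch: changing a single point out of ten by a large amount reverses the sign of the Pearson correlation. The data values are made up for this illustration.

```python
import math


def pearson(x, y):
    """Ordinary Pearson product-moment correlation."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up, nearly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2, 8.8, 10.1]

r_clean = pearson(x, y)

# Corrupt one of the ten points by a large amount.
y_bad = y[:-1] + [-100.0]
r_bad = pearson(x, y_bad)

print(f"Pearson, clean data:  {r_clean:.3f}")
print(f"Pearson, one outlier: {r_bad:.3f}")
```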

    The percentage bend correlation estimator, discussed in Shoemaker and Hettmansperger and also by Wilcox, is both resistant and robust in efficiency. The rationale and derivation for this estimate are given in these references.

    The percentage bend correlation between two variables X and Y is computed as follows:

    1. Set m = (1-\( \beta \))*n + 0.5. Round m down to the nearest integer.

    2. Let \( W_{i} = |X_{i} - M_{x}| \) for i = 1, ..., n where \( M_{x} \) is the median of X.

    3. Sort the \( W_{i} \) in ascending order.

    4. \( \hat{W}_{x} = W_{(m)} \) (i.e., the m-th order statistic). \( W_{(m)} \) is the estimate of the (1-\( \beta \)) quantile of W.

    5. Sort the X values, denoting the sorted values by \( X_{(1)} \le \ldots \le X_{(n)} \). Count the number of values of \( (X_{i} - M_{x})/\hat{W}_{x} \) that are less than -1 and the number that are greater than +1, and store these counts in \( i_{1} \) and \( i_{2} \) respectively. Then compute

        \( S_{x} = \sum_{i=i_{1}+1}^{n-i_{2}}{X_{(i)}} \)
        \( \hat{\phi}_{x} = \frac{\hat{W}_{x}(i_{2} - i_{1}) + S_{x}}{n - i_{1} - i_{2}} \)
        \( U_{i} = \frac{X_{i} - \hat{\phi}_{x}}{\hat{W}_{x}} \)

    6. Repeat the above calculations on the Y variable. Store the corresponding quantities in \( \hat{W}_{y} \), \( \hat{\phi}_{y} \), and \( V_{i} \).

    7. Define the function

        \( \Psi(x) = \max[-1, \min(1,x)] \)

    8. Compute

        \( A_{i} = \Psi(U_{i}) \)
        \( B_{i} = \Psi(V_{i}) \)

    9. Compute the percentage bend correlation

      \( \rho_{pb} = \frac{\sum_{i=1}^{n}{A_{i}B_{i}}} {\sqrt{\sum_{i=1}^{n}{A_{i}^2}\sum_{i=1}^{n}{B_{i}^2}}} \)

    The value of \( \beta \) is selected between 0 and 0.5. Higher values of \( \beta \) result in a higher breakdown point at the expense of lower efficiency.
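    The nine steps above can be sketched in Python as follows. This is a minimal illustration of the algorithm as described on this page, not Dataplot's own implementation. The function names `psi`, `bend_scores`, and `percentage_bend_correlation` are hypothetical, the data are made up, and the check for a zero scale estimate is an added safeguard that the steps do not mention.

```python
import math
import statistics


def psi(x):
    """Step 7: clip x to the interval [-1, 1]."""
    return max(-1.0, min(1.0, x))


def bend_scores(values, beta):
    """Steps 1-5 for one variable: return the standardized values U_i."""
    n = len(values)
    m = math.floor((1.0 - beta) * n + 0.5)            # step 1
    med = statistics.median(values)
    w = sorted(abs(v - med) for v in values)          # steps 2 and 3
    w_hat = w[m - 1]                                  # step 4 (m is 1-based)
    if w_hat == 0.0:                                  # safeguard, not in the steps
        raise ValueError("scale estimate is zero; too many tied values")
    xs = sorted(values)                               # step 5
    i1 = sum(1 for v in xs if (v - med) / w_hat < -1.0)
    i2 = sum(1 for v in xs if (v - med) / w_hat > 1.0)
    s = sum(xs[i1:n - i2])                            # X_(i1+1) + ... + X_(n-i2)
    phi = (w_hat * (i2 - i1) + s) / (n - i1 - i2)
    return [(v - phi) / w_hat for v in values]        # original order is kept


def percentage_bend_correlation(x, y, beta=0.1):
    """Steps 6-9: percentage bend correlation of the paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if not 0.0 < beta <= 0.5:
        raise ValueError("beta must satisfy 0 < beta <= 0.5")
    a = [psi(u) for u in bend_scores(x, beta)]        # step 8
    b = [psi(v) for v in bend_scores(y, beta)]
    num = sum(ai * bi for ai, bi in zip(a, b))        # step 9
    den = math.sqrt(sum(ai * ai for ai in a) * sum(bi * bi for bi in b))
    return num / den


# Made-up, nearly linear data; y_bad replaces one value with a gross outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2, 8.8, 10.1]
y_bad = y[:-1] + [-100.0]

r_clean = percentage_bend_correlation(x, y)
r_bad = percentage_bend_correlation(x, y_bad)
print(f"percentage bend, clean data:  {r_clean:.3f}")
print(f"percentage bend, one outlier: {r_bad:.3f}")
```

    Because \( \Psi \) limits every standardized value to the interval [-1, 1], the outlier can contribute at most one term of magnitude 1 to each sum, so the estimate is pulled down but remains positive for these data.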

Syntax:
    LET <par> = PERCENTAGE BEND CORRELATION <y1> <y2>
                                <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                  <y2> is the second response variable;
                  <par> is a parameter where the computed percentage bend correlation is stored;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    LET A = PERCENTAGE BEND CORRELATION Y1 Y2
    LET A = PERCENTAGE BEND CORRELATION Y1 Y2 SUBSET TAG > 2
Note:
    To set the value of \( \beta \), enter the command

      LET BETA = <value>

    where <value> is greater than 0 and less than or equal to 0.5. The default value for \( \beta \) is 0.1.

Note:
    Dataplot statistics can be used in a number of commands. For details, enter

Default:
    None
Synonyms:
    None
Related Commands:
References:
    Shoemaker and Hettmansperger (1982), "Robust Estimates of and Tests for the One- and Two-Sample Scale Models", Biometrika 69, pp. 47-54.

    Rand Wilcox (1997), "Introduction to Robust Estimation and Hypothesis Testing", Academic Press.

    Mosteller and Tukey (1977), "Data Analysis and Regression: A Second Course in Statistics", Addison-Wesley, pp. 203-209.

Applications:
    Robust Data Analysis
Implementation Date:
    2002/07
Program 1:
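    . Skip the first 25 lines of IRIS.DAT, then read the five variables
    SKIP 25
    READ IRIS.DAT Y1 Y2 Y3 Y4 X
    . Collect the four measurement variables into the matrix M
    LET M = CREATE MATRIX Y1 Y2 Y3 Y4
    . Use the percentage bend correlation when forming the correlation matrix
    SET CORRELATION TYPE PERCENTAGE BEND
    LET B = CORRELATION MATRIX M
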
Program 2:
     
    SKIP 25
    READ IRIS.DAT Y1 Y2 Y3 Y4 X
    .
    MULTIPLOT CORNER COORDINATES 0 0 100 95
    MULTIPLOT SCALE FACTOR 2
    MULTIPLOT 2 1
    BOOTSTRAP SAMPLES 500
    BOOTSTRAP PERCENTAGE BEND CORRELATION PLOT Y1 Y2
    X1LABEL DISPLACEMENT 12
    X1LABEL B025 = ^B025, B975=^B975
    HISTOGRAM YPLOT
    END OF MULTIPLOT
    MOVE 50 96
    JUSTIFICATION CENTER
    TEXT PERCENTAGE BEND CORRELATION BOOTSTRAP: IRIS DATA
        
    plot generated by sample program
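
    For readers without Dataplot, the idea behind Program 2 can be sketched in Python: resample the (Y1, Y2) pairs with replacement, recompute the percentage bend correlation for each resample, and report the 2.5 and 97.5 percentiles of the resampled values. This is a sketch under stated assumptions, not a description of Dataplot's internals: the helper names `pbcor` and `bootstrap_percentiles` are hypothetical, the percentiles use a simple order-statistic rule, and synthetic data stand in for IRIS.DAT, which is not reproduced here.

```python
import math
import random
import statistics


def pbcor(x, y, beta=0.1):
    """Percentage bend correlation, following the steps in the Description."""
    def bend(values):
        n = len(values)
        m = math.floor((1.0 - beta) * n + 0.5)
        med = statistics.median(values)
        w_hat = sorted(abs(v - med) for v in values)[m - 1]
        xs = sorted(values)
        i1 = sum(1 for v in xs if (v - med) / w_hat < -1.0)
        i2 = sum(1 for v in xs if (v - med) / w_hat > 1.0)
        phi = (w_hat * (i2 - i1) + sum(xs[i1:n - i2])) / (n - i1 - i2)
        return [max(-1.0, min(1.0, (v - phi) / w_hat)) for v in values]

    a, b = bend(x), bend(y)
    return sum(p * q for p, q in zip(a, b)) / math.sqrt(
        sum(p * p for p in a) * sum(q * q for q in b))


def bootstrap_percentiles(x, y, statistic, n_boot, rng):
    """Return the approximate 2.5 and 97.5 percentiles of a paired statistic."""
    n = len(x)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample pairs with replacement
        values.append(statistic([x[i] for i in idx], [y[i] for i in idx]))
    values.sort()
    # Simple order-statistic rule (an assumption of this sketch).
    return values[int(0.025 * n_boot)], values[int(0.975 * n_boot) - 1]


rng = random.Random(2002)  # fixed seed so the run is repeatable
# Synthetic positively correlated data standing in for two IRIS columns.
x = [rng.gauss(0.0, 1.0) for _ in range(50)]
y = [0.8 * xi + 0.6 * rng.gauss(0.0, 1.0) for xi in x]

estimate = pbcor(x, y)
b025, b975 = bootstrap_percentiles(x, y, pbcor, 500, rng)
print(f"estimate = {estimate:.3f}, B025 = {b025:.3f}, B975 = {b975:.3f}")
```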


Date created: 08/12/2002
Last updated: 10/07/2016

Please email comments on this WWW page to alan.heckert@nist.gov.