Dataplot Vol 1 Auxiliary Chapter

INFLUENCE CURVE

Name:
    ... INFLUENCE CURVE
Type:
    Graphics Command
Purpose:
    Generate an influence curve for a given statistic.
Description:
    Influence functions are an important concept in the study of robust statistics. The influence curve provides empirical insight into how strongly a single observation can affect the value of a statistic.

    Given a set of univariate data points, Y, and a specified statistic, the influence is studied by determining how the value of the statistic changes as a single point is added to Y. Specifically, we define a set of X values (typically over a fairly broad range relative to the original Y values). For each X value in turn, that single value is added to Y and the statistic is recomputed. The influence curve plots these recomputed statistic values against the corresponding X values.
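    The construction described above can be sketched outside of Dataplot. The following Python fragment is only an illustration under stated assumptions: the function name influence_curve and the small data set are hypothetical, and this is not how Dataplot implements the command internally.

```python
import statistics

def influence_curve(y, x_values, stat):
    """For each x, append x to a copy of y and recompute the statistic."""
    y = list(y)
    return [stat(y + [x]) for x in x_values]

# Hypothetical data: five Y values and three candidate X values.
y = [-1.2, -0.4, 0.0, 0.3, 1.1]
x_values = [-10.0, 0.0, 10.0]

curve = influence_curve(y, x_values, statistics.mean)

# Each pair (x, value) is one point on the mean influence curve.
for x, value in zip(x_values, curve):
    print(f"x = {x:6.1f}   mean with x added = {value:8.4f}")
```

    Plotting the values in curve against x_values gives the influence curve. Any function that maps a list of numbers to a single number can be passed as stat.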

    Several features are of interest in the influence curve:

    1. Is the curve "bounded" as the X values become extreme? Robust statistics should be bounded. That is, a robust statistic should not be unduly influenced by a single extreme point.

    2. What is the general behaviour as the X point becomes extreme? For example, is the added point smoothly down-weighted as its value becomes more extreme?

    3. What is the influence if the X point is in the "center" of the Y points?
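    The boundedness question in item 1 can be checked numerically. The Python sketch below is an illustration only, using a hypothetical five-point data set and a hypothetical helper influence_curve. It compares the mean and the median when the added point is moved far from the data.

```python
import statistics

def influence_curve(y, x_values, stat):
    """For each x, append x to a copy of y and recompute the statistic."""
    y = list(y)
    return [stat(y + [x]) for x in x_values]

# Hypothetical data and a grid of X values that includes extreme points.
y = [-1.2, -0.4, 0.0, 0.3, 1.1]
x_values = [-1000.0, -10.0, 0.0, 10.0, 1000.0]

mean_curve = influence_curve(y, x_values, statistics.mean)
median_curve = influence_curve(y, x_values, statistics.median)

# A simple boundedness check: does the statistic stay inside the range
# of the original data no matter how extreme the added point is?
low, high = min(y), max(y)
mean_bounded = all(low <= v <= high for v in mean_curve)
median_bounded = all(low <= v <= high for v in median_curve)

print("mean stays within data range:  ", mean_bounded)
print("median stays within data range:", median_bounded)
```

    For this data set the median never leaves the range of the original Y values, while the mean follows the added point without limit. Staying within the data range is just one convenient check for boundedness on a finite grid, not a formal definition.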
Syntax:
    <stat> INFLUENCE CURVE <y> <x>        <SUBSET/EXCEPT/FOR qualification>
    where <stat> is one of the following statistics:
                  MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN,
                  BIWEIGHT LOCATION, HODGES-LEHMAN,
                  GEOMETRIC MEAN, HARMONIC MEAN,
                  STANDARD DEVIATION, RELATIVE STANDARD DEVIATION,
                  STANDARD DEVIATION OF MEAN,
                  TRIMMED MEAN STANDARD ERROR,
                  RELATIVE VARIANCE (or COEFFICIENT OF VARIATION),
                  VARIANCE, VARIANCE OF THE MEAN,
                  RANGE, GEOMETRIC STANDARD DEVIATION,
                  AVERAGE ABSOLUTE DEVIATION (or AAD),
                  MEDIAN ABSOLUTE DEVIATION (or MAD),
                  INTERQUARTILE RANGE, PERCENTAGE BEND MIDVARIANCE,
                  BIWEIGHT SCALE, BIWEIGHT MIDVARIANCE,
                  WINSORIZED VARIANCE, WINSORIZED STANDARD DEVIATION,
                  MIDRANGE,
                  MAXIMUM, MINIMUM, EXTREME,
                  SKEWNESS, KURTOSIS,
                  AUTOCORRELATION, AUTOCOVARIANCE,
                  LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
                  <FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/
                  SEVENTH/EIGHTH/NINTH/TENTH> DECILE,
                  PERCENTILE, QUANTILE, QUANTILE STANDARD ERROR,
                  NORMAL PPCC,
                  SINE FREQUENCY, SINE AMPLITUDE,
                  CP, CPK, CNPK, CPM, CC, CPL, CPU,
                  EXPECTED LOSS, PERCENT DEFECTIVE,
                  TAGUCHI SN0 (or SN), TAGUCHI SN+ (or SNL),
                  TAGUCHI SN- (or SNS), TAGUCHI SN00 (or SN2);
           <y> is the response (= dependent) variable;
           <x> is a variable containing a sequence of X values;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    MEAN INFLUENCE CURVE Y X
    BIWEIGHT LOCATION INFLUENCE CURVE Y X1
Note:
    Although DATAPLOT supports this command for a large number of statistics, there may be cases where you want an influence curve for an unsupported statistic. The following example shows how to compute the mean influence curve by hand (the mean is a supported statistic; the example simply demonstrates the method):
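      LET Y = NORMAL RANDOM NUMBERS FOR I = 1 1 30
      LET XSEQ = SEQUENCE -10  0.1  10
      LET N = SIZE XSEQ
      LET NY = SIZE Y
      . Index of the extra element that holds the added point
      LET NTEMP = NY + 1
      LOOP FOR K = 1 1 N
          LET XTEMP = XSEQ(K)
          . Overwrite the extra element with the current X value
          LET Y(NTEMP) = XTEMP
          LET A = MEAN Y
          LET YNEW(K) = A
      END OF LOOP
      . Plot the recomputed statistic against the sequence of X values
      PLOT YNEW XSEQ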
          
    To adapt this to an unsupported statistic, replace the LET A = MEAN Y step with the computation of that statistic.
Default:
    None
Synonyms:
    None
Related Commands:
    CHARACTERS = Sets the type for plot characters.
    LINES = Sets the type for plot lines.
    STATISTIC PLOT = Plot a statistic for grouped data.
    BOOTSTRAP PLOT = Generate a bootstrap plot for a statistic.
References:
    Mosteller and Tukey (1977). "Data Analysis and Regression", Addison-Wesley.

    Rand Wilcox (1997). "Introduction to Robust Estimation and Hypothesis Testing", Academic Press.

Applications:
    Robust Data Analysis
Implementation Date:
    2002/7
Program:
    LET Y = NORMAL RANDOM NUMBERS FOR I = 1 1 50
    LET XSEQ = SEQUENCE -20 0.1 20
    .
    MULTIPLOT 2 2
    MULTIPLOT CORNER COORDINATES 0 0 100 100
    MULTIPLOT SCALE FACTOR 2
    .
    TITLE MEAN INFLUENCE CURVE
    MEAN INFLUENCE CURVE Y XSEQ
    LET P1 = 10
    LET P2 = 10
    TITLE TRIMMED MEAN INFLUENCE CURVE
    TRIMMED MEAN INFLUENCE CURVE Y XSEQ
    TITLE WINSORIZED MEAN INFLUENCE CURVE
    WINSORIZED MEAN INFLUENCE CURVE Y XSEQ
    TITLE BIWEIGHT LOCATION INFLUENCE CURVE
    BIWEIGHT LOCATION INFLUENCE CURVE Y XSEQ
    .
    END OF MULTIPLOT

    plot generated by sample program
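    The contrast between the mean and the trimmed and Winsorized means in the sample program can be reproduced numerically outside of Dataplot. The Python sketch below is an illustration only. It assumes that P1 and P2 are the percentages trimmed from the lower and upper tails, and the rule used here for converting a percentage into a count (truncation) is an assumption that may differ from Dataplot's. The functions trimmed_mean and winsorized_mean and the data are hypothetical. The biweight location is omitted for brevity.

```python
import statistics

def tail_counts(n, lower_pct, upper_pct):
    """Number of points affected in each tail (truncation is an assumption)."""
    return int(n * lower_pct / 100), int(n * upper_pct / 100)

def trimmed_mean(values, lower_pct=10, upper_pct=10):
    """Mean after dropping the given percentage of points from each tail."""
    s = sorted(values)
    lo, hi = tail_counts(len(s), lower_pct, upper_pct)
    return statistics.mean(s[lo:len(s) - hi])

def winsorized_mean(values, lower_pct=10, upper_pct=10):
    """Mean after replacing tail points with the nearest retained value."""
    s = sorted(values)
    lo, hi = tail_counts(len(s), lower_pct, upper_pct)
    core = s[lo:len(s) - hi]
    return statistics.mean([core[0]] * lo + core + [core[-1]] * hi)

def influence_curve(y, x_values, stat):
    """For each x, append x to a copy of y and recompute the statistic."""
    y = list(y)
    return [stat(y + [x]) for x in x_values]

# Hypothetical data: 19 evenly spaced values, so 20 points once x is added.
y = [i / 10 for i in range(-9, 10)]
x_values = [-100.0, 0.0, 100.0]

mean_curve = influence_curve(y, x_values, statistics.mean)
trimmed_curve = influence_curve(y, x_values, trimmed_mean)
winsorized_curve = influence_curve(y, x_values, winsorized_mean)

for x, m, t, w in zip(x_values, mean_curve, trimmed_curve, winsorized_curve):
    print(f"x = {x:7.1f}  mean = {m:7.3f}  trimmed = {t:7.3f}  winsorized = {w:7.3f}")
```

    With 10 percent handled in each tail, an extreme added point is either dropped or pulled back to a retained value, so both robust estimates stay close to the center of the data while the mean shifts in proportion to the added point.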

Date created: 7/18/2002
Last updated: 4/4/2003
Please email comments on this WWW page to alan.heckert@nist.gov.