
ANDERSON DARLING TEST

Name:
    ANDERSON DARLING TEST

    NOTE: This command has been replaced with the unified GOODNESS OF FIT command.

Type:
    Analysis Command
Purpose:
    Perform an Anderson-Darling goodness of fit test that a data set comes from a specified distribution. Currently, Dataplot supports the Anderson-Darling goodness of fit test for the normal, lognormal, Weibull, exponential, extreme value type 1, logistic, double exponential, uniform (0,1), and generalized Pareto distributions.
Description:
    The Anderson-Darling test (Stephens, 1974) is used to test if a sample of data comes from a specific distribution. It is a modification of the Kolmogorov-Smirnov (K-S) test and gives more weight to the tails than the K-S test. The K-S test is distribution free in the sense that the critical values do not depend on the specific distribution being tested. The Anderson-Darling test makes use of the specific distribution in calculating critical values. This has the advantage of allowing a more sensitive test and the disadvantage that critical values must be calculated for each distribution. Currently, Dataplot supports the Anderson-Darling test for the following distributions:

    1. normal
    2. lognormal
    3. exponential
    4. Weibull
    5. extreme value type 1
    6. logistic
    7. double exponential
    8. uniform (0,1)
    9. generalized Pareto

    Note that the uniform (0,1) case can be used for fully specified distributions (i.e., the shape, location, and scale parameters are not estimated from the data). Simply apply the appropriate CDF function to the data (this transforms it to a (0,1) interval) and apply the uniform (0,1) test to the transformed data.
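
    The transformation described above can be sketched in Python. This is an illustration only, not Dataplot code: the helper names normal_cdf and to_uniform, the sample values, and the fully specified normal distribution (mean 10, standard deviation 2) are all hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with known (not estimated) mu and sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def to_uniform(data, cdf):
    """Apply the hypothesized CDF to each observation.

    If the data really come from that distribution, the transformed
    values behave like a sample from the uniform (0,1) distribution.
    """
    return [cdf(x) for x in data]

# Hypothetical data and a fully specified N(10, 2) distribution.
mu, sigma = 10.0, 2.0
y = [7.1, 9.4, 10.0, 11.3, 13.8]
u = to_uniform(y, lambda x: normal_cdf(x, mu, sigma))
```

    The transformed values in u would then be the input to the uniform (0,1) test.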

    More formally, the test is defined as follows.

    H0: The data follow a specified distribution.
    Ha: The data do not follow the specified distribution.
    Test Statistic: The Anderson-Darling test statistic is defined as:

      A^2 = -N - S

    where

      S = \sum_{i=1}^{N} \frac{2i - 1}{N} \left[ \ln F(Y_i) + \ln\left(1 - F(Y_{N+1-i})\right) \right]

    where F is the cumulative distribution function of interest and Y_i are the data ordered from smallest to largest.
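
    A minimal Python sketch of the statistic defined above. The function name anderson_darling is illustrative and is not part of Dataplot. The sketch assumes that F evaluates strictly between 0 and 1 at every observation, since the logarithms are otherwise undefined.

```python
import math
from typing import Callable, Sequence

def anderson_darling(data: Sequence[float], cdf: Callable[[float], float]) -> float:
    """Return the Anderson-Darling statistic -N - S for data against cdf."""
    y = sorted(data)  # Y(1) <= Y(2) <= ... <= Y(N)
    n = len(y)
    s = 0.0
    for i in range(1, n + 1):
        f_low = cdf(y[i - 1])   # F(Y(i))
        f_high = cdf(y[n - i])  # F(Y(N+1-i))
        s += (2 * i - 1) / n * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - s

# Hypothetical example: three values tested against the uniform (0,1)
# distribution, whose CDF on (0,1) is the identity function.
a2 = anderson_darling([0.5, 0.9, 0.1], lambda x: x)
```

    Because the data are sorted inside the function, the result does not depend on the order in which the observations are supplied.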

    Significance Level: alpha
    Critical Region: The critical values for the Anderson-Darling test depend on the specific distribution being tested. Tabulated values and formulas have been published by Stephens and others for several specific distributions (normal, lognormal, exponential, Weibull, extreme value type 1, logistic, double exponential, uniform, generalized Pareto). The test is one-sided: the hypothesis that the distribution is of a specific form is rejected if the test statistic is greater than the critical value.

    Note that relevant parameters for the distribution being tested are estimated from the data.

Syntax:
    ANDERSON DARLING <DIST> TEST <y> <SUBSET/EXCEPT/FOR qualification>
    where <DIST> is NORMAL, LOGNORMAL, WEIBULL, EXPONENTIAL, EV1, LOGISTIC, DOUBLE EXPONENTIAL, UNIFORM, GENERALIZED PARETO;
              <y> is the response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    ANDERSON DARLING NORMAL TEST Y
    ANDERSON DARLING TEST Y
    ANDERSON DARLING LOGNORMAL TEST Y
    ANDERSON DARLING EXPONENTIAL TEST Y
    ANDERSON DARLING WEIBULL TEST Y
    ANDERSON DARLING NORMAL TEST Y SUBSET GROUP = 2 TO 4
Note:
    By default, the Anderson-Darling test computes maximum likelihood estimates of the parameters of the specified distribution from the data. It then uses these estimates in computing the Anderson-Darling test statistic.

    For some of the distributions, you can specify your own estimates of the parameters. Specifically,

      LET ALPHA = <value of location parameter>
      LET BETA = <value of scale parameter>
      ANDERSON DARLING LOGISTIC TEST Y

      LET GAMMA = <value of shape parameter>
      LET BETA = <value of scale parameter>
      ANDERSON DARLING WEIBULL TEST Y

      LET GAMMA = <value of shape parameter>
      LET A = <value of scale parameter>
      ANDERSON DARLING WEIBULL TEST Y

    The maximum likelihood estimation for the generalized Pareto distribution is still undergoing algorithmic development, so it is suggested that you estimate the parameters using some other method (e.g., PPCC PLOT) first and then apply the Anderson-Darling test.

Note:
    If the CAPTURE HTML or CAPTURE LATEX commands are used, the output from the Anderson-Darling test will be generated in HTML or LaTeX format, respectively. Enter HELP CAPTURE HTML or HELP CAPTURE LATEX for details.
Default:
    If no distribution is specified, then the data is tested for normality.
Synonyms:
    ANDERSON DARLING is a synonym for ANDERSON DARLING <DIST> TEST.
Related Commands:
    KOLMOGOROV SMIRNOV GOODNESS OF FIT TEST = Perform a Kolmogorov Smirnov goodness of fit test.
    CHI SQUARE GOODNESS OF FIT TEST = Perform a chi-square goodness of fit test.
    WILK SHAPIRO TEST = Perform a Wilk-Shapiro test for normality.
    PROBABILITY PLOT = Generate a probability plot.
    CAPTURE HTML = Generate Dataplot output in HTML format.
    CAPTURE LATEX = Generate Dataplot output in LaTeX format.
Reference:
    "EDF Statistics for Goodness of Fit and Some Comparisons", Stephens, M. A. (1974), Journal of the American Statistical Association, Vol. 69, pp. 730-737.

    "Asymptotic Results for Goodness-of-Fit Statistics with Unknown Parameters", Stephens, M. A. (1976), Annals of Statistics, Vol. 4, pp. 357-369.

    "Goodness of Fit for the Extreme Value Distribution", Stephens, M. A. (1977), Biometrika, Vol. 64, pp. 583-588.

    "Goodness of Fit with Special Reference to Tests for Exponentiality", Stephens, M. A. (1977), Technical Report No. 262, Department of Statistics, Stanford University, Stanford, CA.

    "Tests of Fit for the Logistic Distribution Based on the Empirical Distribution Function", Stephens, M. A. (1979), Biometrika, Vol. 66, pp. 591-595.

    "Goodness-of-Fit Tests for the Generalized Pareto Distribution", Choulakian, V. and Stephens, M. A. (2001), Technometrics, Vol. 43, No. 4, pp. 478-484.

    "MIL-HDBK-17 Volume 1: Guidelines for Characterization of Structural Materials", Department of Defense, chapter 8. The URL for MIL-HDBK-17 is http://mil-17.udel.edu/.

Applications:
    Reliability
Implementation Date:
    1998/5
    2000/10: Fix to estimate shape and scale parameters for Weibull
    2003/10: Support for CAPTURE HTML and CAPTURE LATEX
    2003/11: Support for logistic, double exponential, and uniform (0,1) distributions
    2004/4: Support for the generalized Pareto distribution
Program:
    SKIP 25
    READ VANGEL31.DAT Y
    ANDERSON DARLING EXPONENTIAL TEST Y

    The following output is generated:

               *******************************************
               **  anderson darling exponential test y  **
               *******************************************
          
          
                       ANDERSON DARLING 1-SAMPLE TEST
                       THAT THE DATA COME FROM A EXPONENTIAL          DISTRIBUTION
          
         1. STATISTICS:
               NUMBER OF OBSERVATIONS                =       38
               LOCATION PARAMETER                    =    185.7895
               SCALE PARAMETER                       =    18.59549
          
               ANDERSON DARLING TEST STATISTIC VALUE =    14.35715
          
         2. CRITICAL VALUES:
               90         % POINT    =    1.062000
               95         % POINT    =    1.321000
               97.5       % POINT    =    1.591000
               99         % POINT    =    1.959000
          
         3. CONCLUSION (AT THE 5% LEVEL):
               THE DATA DO NOT COME FROM A EXPONENTIAL         DISTRIBUTION.
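
    The conclusion in the output above follows from comparing the test statistic with the tabulated critical value. A minimal Python sketch of that one-sided decision rule, using only the numbers printed in the output; the names CRITICAL_VALUES and reject_h0 are illustrative and are not part of Dataplot.

```python
# Critical values copied from the sample output above, keyed by the
# significance level alpha (the 95 % point corresponds to alpha = 0.05).
CRITICAL_VALUES = {0.10: 1.062, 0.05: 1.321, 0.025: 1.591, 0.01: 1.959}

def reject_h0(statistic, alpha, critical_values=CRITICAL_VALUES):
    """One-sided rule: reject H0 when the statistic exceeds the critical value."""
    return statistic > critical_values[alpha]

# Test statistic reported in the sample output.
a2 = 14.35715
decisions = {alpha: reject_h0(a2, alpha) for alpha in CRITICAL_VALUES}
```

    With the statistic far above every listed critical value, the hypothesis of an exponential distribution is rejected at each of these levels, in agreement with the conclusion printed at the 5% level.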
        
Date created: 06/05/2001
Last updated: 12/04/2023
