PPCC PLOT
Name:
The PPCC plot is based on the following two ideas:
The PPCC plot is formed by selecting a value of the shape parameter, generating the probability plot (this probability plot is not actually graphed), and then computing the correlation coefficient of the resulting probability plot. The PPCC plot then consists of:
The value of the distributional parameter (on the horizontal axis) which corresponds to the maximum of the PPCC plot curve (on the vertical axis) is, of course, of interest since it indicates the best-fit member of the family. Some advantages of the PPCC plot as a fitting technique are:
Some disadvantages of the PPCC plot as a fitting technique are:
PPCC plots are available for the following continuous distributional families (with the distributional parameter in parentheses) with one shape parameter:
PPCC plots are available for the following continuous distributional families (with the distributional parameter in parentheses) with two shape parameters:
PPCC plots are available for the following discrete distributional families (with the distributional parameter in parentheses):
The use of the PPCC plot for discrete distributions is still experimental (see the Note below). The percent point function for a discrete distribution is a step function (since X is restricted to integer values). This can result in non-smooth ppcc and probability plots. For discrete distributions, the KS PLOT (which will plot the minimum value of the chi-square statistic) is recommended over the PPCC PLOT as long as the sample size is reasonably large.
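The step-function behavior can be seen in a minimal Python sketch (Python is used here only for illustration and is not part of Dataplot). The helper name `poisson_ppf` and the choice of a Poisson distribution with mean 3 are assumptions made for this example.

```python
import math


def poisson_ppf(p, lam):
    """Smallest integer k with P(X <= k) >= p for a Poisson(lam) variable.

    Hypothetical helper for illustration; requires 0 < p < 1.
    """
    k = 0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    while cdf < p:
        k += 1
        term *= lam / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return k


# Evenly spaced probabilities map onto only a few distinct integers,
# so the theoretical quantiles contain long runs of ties.
probs = [i / 20 for i in range(1, 20)]
quantiles = [poisson_ppf(p, 3.0) for p in probs]
print(quantiles)
```

Because many plotting positions share the same theoretical value, the probability plot becomes a set of horizontal runs, which is one reason the resulting ppcc curve can be jagged.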
where <y> is the variable of raw data values under analysis; <family> is one of the distributions listed above; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the raw data case.
<SUBSET/EXCEPT/FOR/qualification>
where <y> is the variable of raw data values under analysis; <x> is the censoring variable; <family> is one of the distributions listed above; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the raw data case where there is censoring.
<SUBSET/EXCEPT/FOR/qualification>
where <y> is the variable of raw data values under analysis; <groupid> is a group id variable; <family> is one of the distributions listed above; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the raw data case where there is grouped data (in the sense of batches).
<SUBSET/EXCEPT/FOR/qualification>
where <y> is the variable of raw data values under analysis; <x> is the censoring variable; <groupid> is a group id variable; <family> is one of the distributions listed above; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the raw data case where there is both grouped data (in the sense of batches) and censoring.
<SUBSET/EXCEPT/FOR/qualification>
where <y> is the variable of pre-computed frequencies; <x> is the variable of distinct values for the variable under analysis; <family> is one of the families listed above; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the binned data case where the bins are defined by the mid-points of each bin.
<SUBSET/EXCEPT/FOR/qualification>
where <y> is the variable of pre-computed frequencies; <xlow> is the variable containing the lower limits for the bins; <xhigh> is the variable containing the upper limits for the bins; <family> is one of the families listed above; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the binned data case where the bins are defined by the lower and upper limits of the bins (i.e., the bins can be of unequal width).
<SUBSET/EXCEPT/FOR/qualification>
where <y> is the variable of pre-computed frequencies; <x> is the variable of distinct values for the variable under analysis; <groupid> is a group id variable; <family> is one of the families listed above; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the binned data case where there are multiple batches of data and the bins are defined by the mid-points of each bin.
<SUBSET/EXCEPT/FOR/qualification>
where <y> is the variable of pre-computed frequencies; <xlow> is the variable containing the lower limits for the bins; <xhigh> is the variable containing the upper limits for the bins; <groupid> is a group id variable; <family> is one of the families listed above; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for the binned data case where there are multiple batches of data and the bins are defined by the lower and upper limits of the bins (i.e., the bins can be of unequal width).
    T PPCC PLOT X
    EXTREME VALUE TYPE 2 PPCC PLOT X
    POISSON PPCC PLOT X
    LAMBDA PPCC PLOT F X
    T PPCC PLOT F X
    EXTREME VALUE TYPE 2 PPCC PLOT F X
    POISSON PPCC PLOT F X
    LET GAMMA2 = 20
    WEIBULL PPCC PLOT Y

A common use of this is to refine the estimate of the shape parameter. That is, an initial iteration (typically with just the default values of the parameter) is used to identify the appropriate neighborhood of the optimal value of the shape parameter. Then a second iteration of the PPCC PLOT is generated with the parameter restricted to a much narrower range of values. Although this iteration can be repeated as many times as you like, for practical purposes two iterations are typically sufficient.
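The two-iteration refinement can be sketched outside Dataplot. The following Python example is a minimal illustration under stated assumptions, not Dataplot's implementation: it uses Filliben's approximation for the uniform order statistic medians, the standard (minimum) Weibull percent point function, and simulated data. The function names, grid sizes, and ranges are all hypothetical choices for this sketch.

```python
import math
import random


def uniform_order_medians(n):
    """Filliben's approximation to the medians of uniform order statistics."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def weibull_ppcc(data, shape):
    """Probability plot correlation coefficient for one Weibull shape value."""
    y = sorted(data)
    p = uniform_order_medians(len(y))
    # Standard Weibull percent point function: (-ln(1 - p)) ** (1 / shape)
    q = [(-math.log(1.0 - pi)) ** (1.0 / shape) for pi in p]
    return correlation(q, y)


def best_shape(data, low, high, points=50):
    """Shape value on an evenly spaced grid that maximizes the PPCC."""
    grid = [low + (high - low) * k / (points - 1) for k in range(points)]
    return max(grid, key=lambda g: weibull_ppcc(data, g))


random.seed(1)
data = [random.weibullvariate(1.0, 1.6) for _ in range(200)]  # true shape 1.6

# First iteration: wide range to locate the neighborhood of the maximum.
coarse = best_shape(data, 0.2, 20.0)
step = (20.0 - 0.2) / 49

# Second iteration: narrow range around the first estimate.
fine = best_shape(data, max(0.2, coarse - step), coarse + step, points=51)
print(coarse, fine)
```

The second grid spans only one coarse grid step on either side of the first estimate, so the same number of evaluations gives a much finer resolution for the shape parameter.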
In the case of two shape parameters, these are saved as SHAPE1 and SHAPE2.
before generating the ppcc plot. For the noncentral t and noncentral chi-square distributions, we can fix the value of the degrees of freedom parameter to a single value. In this case, the ppcc plot reverts to a one shape parameter plot. Enter the commands
    LET NU1 = <value>
    LET NU2 = <value>

where <value> is the same for NU1 and NU2.
A value of 1 or MIN specifies the minimum form of the distribution and a value of 2 or MAX specifies the maximum form of the distribution. Although earlier versions of Dataplot required that this parameter be explicitly entered, Dataplot will now choose a default form of the distribution if it has not been specified. For the Weibull, the minimum form is the default. For the Frechet and generalized extreme value distributions, the maximum form is the default. Note that if you enter an explicit SET MINMAX command, it applies to all 3 distributions.
For distributions that have percent point functions that can be computed with simple closed form formulas or that have relatively simple approximations, there is little to be gained by thinning the data since the ppcc plot in these cases will still be quite fast even for very large data sets. However, there are a number of distributions where the percent point function is computed by numerically inverting a cumulative distribution function (which may in turn be computed via a numerical integration). In these cases, using one of the binning techniques can make the method practical (although you will likely not obtain as accurate an estimate as the full data set would produce).
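To make the cost difference concrete, here is a minimal Python sketch (an illustration only, not how Dataplot computes percent point functions). It inverts a CDF by bisection and counts the CDF evaluations needed for a single percent point. The Weibull is used only because its closed-form percent point function gives a value to check against; the function names, starting bracket, and tolerance are assumptions.

```python
import math


def weibull_cdf(x, shape):
    """Standard (minimum) Weibull cumulative distribution function."""
    return 1.0 - math.exp(-(x ** shape))


def weibull_ppf(p, shape):
    """Closed-form percent point function: one evaluation per point."""
    return (-math.log(1.0 - p)) ** (1.0 / shape)


def ppf_by_bisection(cdf, p, tol=1e-10):
    """Invert an increasing CDF on [0, inf); return (x, number of CDF calls)."""
    evals = 0

    def f(x):
        nonlocal evals
        evals += 1
        return cdf(x)

    lo, hi = 0.0, 1.0
    while f(hi) < p:        # expand the bracket until it contains the answer
        hi *= 2.0
    while hi - lo > tol:    # halve the bracket until it is narrow enough
        mid = 0.5 * (lo + hi)
        if f(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), evals


x, evals = ppf_by_bisection(lambda t: weibull_cdf(t, 1.6), 0.9)
print(x, weibull_ppf(0.9, 1.6), evals)
```

Each percent point costs dozens of CDF evaluations here, and that cost is multiplied by the number of data points and by the number of shape values tried, which is why reducing the data to a modest number of bins can help.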
You can modify the number of values used for the shape parameters by entering the command
where <val1> is the number of values for the first shape parameter and <val2> is the number of values for the second shape parameter. There are two typical uses for this command:
    LET YLOW = MINIMUM Y
    LET YUPP = MAXIMUM Y
    LET YLOW = YLOW - 0.5
    CLASS LOWER YLOW
    LET YUPP = YUPP + 0.5
    CLASS UPPER YUPP
    CLASS WIDTH = 1
    LET Y2 X2 = BINNED Y
    POISSON PPCC PLOT Y2 X2
    POISSON KS PLOT Y2 X2

This will center the bins around the integer values and will cover the first and last class. In this case, the KS PLOT syntax will generate a plot that shows the minimum value of the chi-square statistic. It is usually recommended that the minimum bin frequency be at least 5 in order for the chi-square goodness of fit to generate accurate critical values. You can automatically combine bins with the command
    LET Y3 XLOW XHIGH = COMBINE FREQUENCY TABLE Y2 X2

Although the ppcc plot can also accept the unequal bin width syntax, there is typically less reason to use it for the ppcc plot. The primary reason to do so is that you want to compare the ppcc plot with the chi-square plot and you want to have comparable bins for both methods. Also, some data sets may be provided in a format with unequal bin widths (this is usually done to combine bins in the tails that have few points).
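The binning described above can be illustrated with a short Python sketch. It centers unit-width bins on the integers and then merges adjacent sparse bins until each has a frequency of at least 5. The merging rule shown (accumulate from the left, fold any sparse remainder into the last bin) is an assumption chosen for simplicity and is not necessarily what COMBINE FREQUENCY TABLE does; the function names and sample data are hypothetical.

```python
from collections import Counter


def bin_integers(values):
    """Unit-width bins centered on each integer from min to max."""
    counts = Counter(values)
    mids = list(range(min(values), max(values) + 1))
    freqs = [counts.get(m, 0) for m in mids]
    lower = [m - 0.5 for m in mids]
    upper = [m + 0.5 for m in mids]
    return freqs, mids, lower, upper


def combine_sparse_bins(freqs, lower, upper, min_count=5):
    """Merge adjacent bins so every bin has at least min_count observations."""
    out = []                      # each entry is [frequency, low, high]
    acc_f, acc_lo = 0, None
    for f, lo, hi in zip(freqs, lower, upper):
        if acc_lo is None:
            acc_lo = lo
        acc_f += f
        if acc_f >= min_count:
            out.append([acc_f, acc_lo, hi])
            acc_f, acc_lo = 0, None
    if acc_lo is not None:        # leftover sparse bins in the upper tail
        if out:
            out[-1][0] += acc_f
            out[-1][2] = upper[-1]
        else:
            out.append([acc_f, acc_lo, upper[-1]])
    return out


values = [0] * 2 + [1] * 6 + [2] * 9 + [3] * 7 + [4] * 3 + [6] * 1
freqs, mids, lower, upper = bin_integers(values)
combined = combine_sparse_bins(freqs, lower, upper)
print(combined)
```

The combined bins have unequal widths, so they correspond to the lower-limit and upper-limit form of the binned syntax rather than the mid-point form.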
LAMBDA PPCC PLOT and TUKEY PPCC PLOT are synonyms for TUKEY LAMBDA PPCC PLOT.
STUDENT T PPCC PLOT is a synonym for T PPCC PLOT.
The CHISQUARE term can be specified as CHISQUARE or CHI SQUARE.
FL PPCC PLOT, BRIN SAUNDERS PPCC PLOT, and SAUNDERS BRIN are synonyms for FATIGUE LIFE PPCC PLOT.
IG PPCC PLOT is a synonym for INVERSE GAUSSIAN PPCC PLOT.
RIG PPCC PLOT is a synonym for RECIPROCAL INVERSE GAUSSIAN PPCC PLOT.
GEP PPCC PLOT and GP PPCC PLOT are synonyms for GENERALIZED PARETO PPCC PLOT.
LOGNORMAL PPCC PLOT and LOG-NORMAL PPCC PLOT are synonyms for LOG NORMAL PPCC PLOT.
POWER LOG-NORMAL PPCC PLOT and POWER LOGNORMAL PPCC PLOT are synonyms for POWER LOG NORMAL PPCC PLOT.
VONMISES PPCC PLOT and VON-MISES PPCC PLOT are synonyms for VON MISES PPCC PLOT.
LOGLOGISTIC PPCC PLOT and LOG-LOGISTIC PPCC PLOT are synonyms for LOG LOGISTIC PPCC PLOT.
SKEW LAPLACE PPCC PLOT is a synonym for SKEW DOUBLE EXPONENTIAL PPCC PLOT.
ASYMMETRIC LAPLACE PPCC PLOT is a synonym for ASYMMETRIC DOUBLE EXPONENTIAL PPCC PLOT.
1990/5: Implemented IG, WALD, RIG, FL distributions.
1993/12: Implemented GENERALIZED PARETO distribution.
1995/5: Implemented LOGNORMAL, POWER NORMAL,
2002/5: Implemented TWO-SIDED POWER distribution.
2003/5: Implemented ERROR distribution.
2004/1: Implemented FOLDED T, SKEWED T, SKEWED NORMAL,
2004/5: Added support for the SET PPCC FORMAT command.
2004/5: Fixed a number of bugs in various distributions.
2004/6: Implemented SKEW DOUBLE EXPONENTIAL,
2004/9: Implemented GENERALIZED ASYMMETRIC LAPLACE,
2004/9: Implemented SET PPCC PLOT AXIS POINTS
2004/9: Implemented SET PPCC PLOT AXIS ORDER
2004/10: Implemented CENSORED case
2005/5: Implemented REPLICATION case
2005/5: Implemented binned case where bins are
    MULTIPLOT 2 2
    MULTIPLOT CORNER COORDINATES 0 0 100 100
    MULTIPLOT SCALE FACTOR 1.5
    TITLE AUTOMATIC
    X1LABEL THEORETICAL VALUE
    Y1LABEL DATA VALUE
    TITLE OFFSET 2
    X1LABEL DISPLACEMENT 10
    Y1LABEL DISPLACEMENT 14
    CHAR X
    LINE BLANK
    JUSTIFICATION RIGHT
    .
    LET LAMBDA = 1.5
    LET Y = TUKEY LAMBDA RANDOM NUMBERS FOR I = 1 1 100
    TUKEY LAMBDA PPCC PLOT Y
    MOVE 82 30
    TEXT LAMBDA = ^SHAPE
    MOVE 82 25
    TEXT PPCC = ^MAXPPCC
    .
    LET NU = 4
    LET Y = T RANDOM NUMBERS FOR I = 1 1 100
    T PPCC PLOT Y
    MOVE 82 30
    TEXT NU = ^SHAPE
    MOVE 82 25
    TEXT PPCC = ^MAXPPCC
    .
    LET GAMMA = 2.3
    LET Y = WALD RANDOM NUMBERS FOR I = 1 1 100
    WALD PPCC PLOT Y
    MOVE 82 30
    TEXT GAMMA = ^SHAPE
    MOVE 82 25
    TEXT PPCC = ^MAXPPCC
    .
    LET GAMMA = 1.6
    LET Y = WEIBULL RANDOM NUMBERS FOR I = 1 1 100
    SET PPCC PLOT AXIS POINTS 200
    LET GAMMA1 = 0.2
    LET GAMMA2 = 25
    LINE SOLID
    CHARACTER BLANK
    WEIBULL PPCC PLOT Y
    MOVE 82 30
    TEXT GAMMA = ^SHAPE
    MOVE 82 25
    TEXT PPCC = ^MAXPPCC
    .
    END OF MULTIPLOT
NIST is an agency of the U.S. Commerce Department.
Date created: 8/30/2005