BEST DISTRIBUTIONAL FIT

Name:
    BEST DISTRIBUTIONAL FIT
There are two steps in this process:
You can specify the fit method with the command

    SET BEST FIT METHOD <value>

where <value> is one of the following
The default method is maximum likelihood.

You can specify the goodness of fit criterion with the command

    SET BEST FIT CRITERION <value>

where <value> is one of the following
The default goodness of fit criterion is Anderson-Darling.

This command is intended strictly as a screening tool for identifying good candidate distributions. We do not recommend simply selecting the "best" distribution from the list. Once you have identified appropriate candidates, you should perform a more complete analysis of each one; you may also be able to improve the fit for certain distributions by fine-tuning the starting values. A simpler distribution that fits nearly as well as a more complicated one may be preferred, as may a distribution that has a more meaningful physical interpretation or established usage in a given area of work.

For performance reasons, not all possible distributions are included.
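As a rough illustration of what the Anderson-Darling and Kolmogorov-Smirnov criteria measure, the following Python sketch computes both statistics for a normal distribution fitted to the strength data used in the sample program. It is not Dataplot's implementation: the function names are hypothetical, the location and scale are assumed to be the sample mean and sample standard deviation, and no small-sample adjustment is applied.

```python
import math
from statistics import NormalDist, mean, stdev

# Strength data (ksi) from the sample program on this page.
Y = [18.830, 20.800, 21.657, 23.030, 23.230, 24.050, 24.321, 25.500,
     25.520, 25.800, 26.690, 26.770, 26.780, 27.050, 27.670, 29.900,
     31.110, 33.200, 33.730, 33.760, 33.890, 34.760, 35.750, 35.910,
     36.980, 37.080, 37.090, 39.580, 44.045, 45.290, 45.381]


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov D: largest gap between empirical and fitted CDF."""
    x = sorted(data)
    n = len(x)
    d = 0.0
    for i, xi in enumerate(x, start=1):
        f = cdf(xi)
        # The empirical CDF jumps from (i-1)/n to i/n at x_i; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ad_statistic(data, cdf):
    """Anderson-Darling A^2 for a continuous fitted CDF (no adjustment)."""
    x = sorted(data)
    n = len(x)
    total = 0.0
    for i in range(1, n + 1):
        f_low = cdf(x[i - 1])    # F at the i-th smallest value
        f_high = cdf(x[n - i])   # F at the i-th largest value
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n


# Assumption: fit the normal by the sample mean and sample SD.
fitted = NormalDist(mu=mean(Y), sigma=stdev(Y))
ks = ks_statistic(Y, fitted.cdf)
ad = ad_statistic(Y, fitted.cdf)
print(f"KS = {ks:.7f}  AD = {ad:.7f}")
```

For both statistics, smaller values indicate closer agreement between the fitted and empirical distribution functions. The printed values can be compared with the NORMAL rows in the sample output on this page; exact agreement with Dataplot is not guaranteed.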
    BEST DISTRIBUTIONAL FIT <y>
                <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional.

For this syntax, the response variable can be a matrix.
    MULTIPLE BEST DISTRIBUTIONAL FIT <y1> ... <yk>
                <SUBSET/EXCEPT/FOR qualification>
where <y1> ... <yk> is a list of 1 to 30 response variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a best distributional fit analysis for each listed response variable. These response variables can be matrices.
    REPLICATED BEST DISTRIBUTIONAL FIT <y> <x1> ... <xk>
                <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable; <x1> ... <xk> is a list of 1 to 6 group-id variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax performs a cross-tabulation of <x1> ... <xk> and performs the best distributional fit analysis for each unique combination of cross-tabulated values. For example, if X1 has 3 levels and X2 has 2 levels, a total of 6 best distributional fit analyses will be performed.
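The cross-tabulation step can be pictured with a short Python sketch. This is only an illustration of the grouping idea, not Dataplot code: the function name `split_by_groups` and the small data set are made up for the example, and it assumes that only combinations actually present in the data produce an analysis.

```python
from collections import defaultdict


def split_by_groups(y, *group_vars):
    """Collect the responses for each distinct combination of group ids."""
    cells = defaultdict(list)
    for row in zip(y, *group_vars):
        response, key = row[0], row[1:]
        cells[key].append(response)
    return dict(cells)


# Illustrative data: X1 has 3 levels and X2 has 2 levels.
y = [1.2, 3.4, 2.2, 5.1, 4.8, 0.9, 2.7, 3.3, 1.8, 4.1, 2.9, 3.6]
x1 = [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3]
x2 = [1, 2, 1, 2, 1, 2, 2, 1, 2, 1, 2, 1]

cells = split_by_groups(y, x1, x2)
for key in sorted(cells):
    # Each cell would receive its own best distributional fit analysis.
    print(key, cells[key])
print(len(cells), "cells")
```

With 3 levels of X1 and 2 levels of X2 all present, the sketch yields 6 cells, matching the count in the example above.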
    BEST DISTRIBUTIONAL FIT Y SUBSET TAG = 1
    MULTIPLE BEST DISTRIBUTIONAL FIT Y1 TO Y5
    REPLICATED BEST DISTRIBUTIONAL FIT Y X
Also, distributions that expect all positive (or all negative) numbers will be shifted appropriately before performing the maximum likelihood estimation. Since this command is intended as a quick screening method, not all distributions for which Dataplot supports maximum likelihood estimation are included.
This restriction is primarily for performance reasons.
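The AIC is computed as

\[ \mathrm{AIC} = 2k - 2\ln(L) \]

with k denoting the number of parameters being fit and L denoting the maximized value of the likelihood function. The AICc is computed as

\[ \mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1} \]

with n denoting the sample size. The AICc is recommended over the AIC when the sample size is small or k is large. Since AICc converges to AIC for large n, some analysts prefer to use AICc rather than AIC for all cases. The BIC is computed as

\[ \mathrm{BIC} = k\ln(n) - 2\ln(L) \]

The penalty term for extra parameters, k ln(n), is larger in the BIC than the corresponding 2k in the AIC whenever ln(n) > 2, that is, for n of 8 or more.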
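The three information criteria can be computed from a maximized log-likelihood in a few lines. The Python sketch below uses the standard textbook definitions (AIC = 2k - 2 ln L, AICc = AIC + 2k(k+1)/(n-k-1), BIC = k ln n - 2 ln L); the function name and the log-likelihood value are hypothetical and are not taken from Dataplot output.

```python
import math


def information_criteria(log_likelihood, k, n):
    """Return (AIC, AICc, BIC) for k fitted parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_likelihood
    return aic, aicc, bic


# Hypothetical example: a 2-parameter fit to n = 31 observations whose
# maximized log-likelihood is assumed to be -105.0.
aic, aicc, bic = information_criteria(-105.0, k=2, n=31)
print(f"AIC = {aic:.4f}  AICc = {aicc:.4f}  BIC = {bic:.4f}")
```

For all three criteria, smaller values indicate a better trade-off between fit and the number of parameters. Because ln(31) exceeds 2, the BIC in this example is larger than the AIC.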
So if we use a non-PPCC method to estimate the parameters and use the PPCC as a ranking method, there is an additional implicit estimate for the location and scale parameters. For this reason, the PPCC ranking method is only supported when a PPCC fitting method is used.
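The implicit location and scale estimate can be seen in a small Python sketch of a normal probability plot correlation coefficient. The sketch is an assumption-laden illustration rather than Dataplot's code: it uses Filliben's approximation for the uniform order statistic medians as plotting positions, and the function names are hypothetical. Because the PPCC is a correlation, it does not change when the data are shifted or rescaled, so the location and scale are in effect fitted by the probability plot itself.

```python
from statistics import NormalDist

# Strength data (ksi) from the sample program on this page.
Y = [18.830, 20.800, 21.657, 23.030, 23.230, 24.050, 24.321, 25.500,
     25.520, 25.800, 26.690, 26.770, 26.780, 27.050, 27.670, 29.900,
     31.110, 33.200, 33.730, 33.760, 33.890, 34.760, 35.750, 35.910,
     36.980, 37.080, 37.090, 39.580, 44.045, 45.290, 45.381]


def filliben_medians(n):
    """Approximate medians of the uniform order statistics (n >= 2)."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m


def pearson_r(a, b):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / (s_aa * s_bb) ** 0.5


def normal_ppcc(data):
    """Correlation between sorted data and standard normal quantiles."""
    x = sorted(data)
    q = [NormalDist().inv_cdf(p) for p in filliben_medians(len(x))]
    return pearson_r(x, q)


r = normal_ppcc(Y)
# Shifting and rescaling the data leaves the PPCC unchanged.
r_shifted = normal_ppcc([3.0 * v + 10.0 for v in Y])
print(f"PPCC = {r:.7f}  after shift/rescale = {r_shifted:.7f}")
```

Unlike the Anderson-Darling and Kolmogorov-Smirnov statistics, a larger PPCC indicates a better fit, which is why the PPCC output in the sample program is ranked in descending order.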
    AD is a synonym for ANDERSON DARLING
    KS is a synonym for KOLMOGOROV SMIRNOV
2012/10: Added AIC, AICC, and BIC ranking methods

    . Step 1:   Read the data
    .
    .           Following data from Jeffery Fong of the NIST
    .           Applied and Computational Mathematics Division.
    .           This is strength data in ksi units.
    .
    read y
    18.830
    20.800
    21.657
    23.030
    23.230
    24.050
    24.321
    25.500
    25.520
    25.800
    26.690
    26.770
    26.780
    27.050
    27.670
    29.900
    31.110
    33.200
    33.730
    33.760
    33.890
    34.760
    35.750
    35.910
    36.980
    37.080
    37.090
    39.580
    44.045
    45.290
    45.381
    end of data
    .
    set write decimals 5
    .
    . Step 2:   Apply goodness of fit tests for Weibull distribution
    .           based on ML estimates
    .
    .           Maximum likelihood method
    .
    set best fit method ml
    set best fit criterion anderson darling
    best distributional fit y
    .
    set best fit method ml
    set best fit criterion kolm smir
    best distributional fit y
    .
    .           PPCC method
    .
    set best fit method ppcc
    set best fit criterion ppcc
    best distributional fit y

The following output is generated.

                Best Distributional Fit

    Response Variable: Y
    Fit Method:        Maximum Likelihood
    Ranking Criterion: Anderson Darling

    Summary Statistics:
    Number of Observations:   31
    Sample Minimum:           18.83000
    Sample Maximum:           45.38100
    Sample Mean:              30.81142
    Sample SD:                7.253381

    Ranked List of Best Fit
    ----------------------------------------------------------------------------------------------------
                                Goodness       Estimate       Estimate       Estimate       Estimate
                                of Fit         of             of             of Shape       of Shape
    Distribution                Statistic      Location       Scale          Parameter 1    Parameter 2
    ----------------------------------------------------------------------------------------------------
    TRIANGULAR                  0.3332130      17.33848       49.26861       25.50000       **
    3-PAR WEIBULL (MINIMUM)     0.3380554      17.64420       14.83507       1.913580       **
    2-PAR LOGNORMAL             0.3888329      **             30.00134       0.2349026      **
    GUMBEL (MAXIMUM)            0.3980371      27.39966       5.986812       **             **
    2-PAR INVERTED GAMMA        0.4062290      **             555.0562       18.99880       **
    2-PAR GAMMA                 0.4386866      **             1.627518       18.93154       **
    2-PAR BURR TYPE 10          0.4464637      **             19.47685       7.276202       **
    2-PAR FRECHET (MAX)         0.4681383      **             26.74576       4.659741       **
    LOGISTIC EXPONENTIAL        0.4897890      **             43.44812       5.187883       **
    NORMAL                      0.5321921      30.81142       7.253381       **             **
    FOLDED NORMAL               0.5559204      30.81142       7.135432       **             **
    LOGISTIC                    0.5728510      30.44662       4.224463       **             **
    TWO-SIDED POWER             0.5794983      18.55532       45.96386       25.50000       1.269723
    2-PAR WEIBULL (MINIMUM)     0.5973435      **             33.67424       4.635390       **
    REFL GENE TOPP AND LEONE    0.8370549      18.80345       45.40755       0.5000000      0.7780750
    BIRNBAUM SAUNDERS           0.8477387      **             30.00279       0.3283453      **
    SLASH                       0.8526063      30.46421       3.523827       **             **
    DOUBLE EXPONENTIAL          0.8691080      29.90000       6.124452       **             **
    RAYLEIGH                    0.9356298      18.79377       9.882772       **             **
    GUMBEL (MININUM)            0.9867376      34.50269       7.278262       **             **
    CAUCHY                      1.200882       29.25895       5.093631       **             **
    1-PAR MAXWELL               2.618183       **             18.25977       **             **
    PARETO                      3.437671       0.000000       1.000000       2.077089       18.53756
    2-PAR BETA                  3.448169       18.83000       26.55100       0.3175827      0.3288633
    2-PAR WEIBULL (MAXIMUM)     3.657144       **             14.76277       1.088025       **
    EXPONENTIAL (2-PARAMETER)   4.013110       18.83000       11.98142       **             **
    UNIFORM                     5.244683       18.83000       26.55100       **             **
    TOPP AND LEONE              5.711260       18.83000       26.55100       1.920741       **
    POWER                       7.789723       18.83000       26.55100       0.5217915      **
    2-PAR FRECHET (MIN)         8.330182       **             0.9244886      0.1772339      **
    REFLECTED POWER             10.70867       18.83000       26.55100       0.5568036      **
    2-COMP NORMAL MIXTURE       23.99764       **             **             24.57355       35.74121
    2-PAR INVERTED WEIBULL      916.0192       **             0.3738909E-01  4.659730       **

                Best Distributional Fit

    Response Variable: Y
    Fit Method:        Maximum Likelihood
    Ranking Criterion: Kolmogorov Smirn

    Summary Statistics:
    Number of Observations:   31
    Sample Minimum:           18.83000
    Sample Maximum:           45.38100
    Sample Mean:              30.81142
    Sample SD:                7.253381

    Ranked List of Best Fit
    ----------------------------------------------------------------------------------------------------
                                Goodness       Estimate       Estimate       Estimate       Estimate
                                of Fit         of             of             of Shape       of Shape
    Distribution                Statistic      Location       Scale          Parameter 1    Parameter 2
    ----------------------------------------------------------------------------------------------------
    TRIANGULAR                  0.1113989      17.33848       49.26861       25.50000       **
    3-PAR WEIBULL (MINIMUM)     0.1170822      17.64420       14.83507       1.913580       **
    2-PAR LOGNORMAL             0.1219492      **             30.00134       0.2349026      **
    BIRNBAUM SAUNDERS           0.1297689      **             30.00279       0.3283453      **
    TWO-SIDED POWER             0.1314438      18.55532       45.96386       25.50000       1.269723
    2-PAR INVERTED GAMMA        0.1318310      **             555.0562       18.99880       **
    2-PAR BURR TYPE 10          0.1326033      **             19.47685       7.276202       **
    LOGISTIC EXPONENTIAL        0.1329623      **             43.44812       5.187883       **
    SLASH                       0.1342469      30.46421       3.523827       **             **
    2-PAR GAMMA                 0.1349165      **             1.627518       18.93154       **
    GUMBEL (MAXIMUM)            0.1358038      27.39966       5.986812       **             **
    LOGISTIC                    0.1425181      30.44662       4.224463       **             **
    2-PAR FRECHET (MAX)         0.1456718      **             26.74576       4.659741       **
    NORMAL                      0.1513989      30.81142       7.253381       **             **
    2-PAR WEIBULL (MINIMUM)     0.1525868      **             33.67424       4.635390       **
    FOLDED NORMAL               0.1539952      30.81142       7.135432       **             **
    RAYLEIGH                    0.1570339      18.79377       9.882772       **             **
    DOUBLE EXPONENTIAL          0.1598958      29.90000       6.124452       **             **
    GUMBEL (MININUM)            0.1601804      34.50269       7.278262       **             **
    CAUCHY                      0.1612234      29.25895       5.093631       **             **
    REFL GENE TOPP AND LEONE    0.1625837      18.80345       45.40755       0.5000000      0.7780750
    TOPP AND LEONE              0.1633070      18.83000       26.55100       1.920741       **
    UNIFORM                     0.1832347      18.83000       26.55100       **             **
    EXPONENTIAL (2-PARAMETER)   0.2010937      18.83000       11.98142       **             **
    1-PAR MAXWELL               0.2417328      **             18.25977       **             **
    PARETO                      0.2660602      0.000000       1.000000       2.077089       18.53756
    2-PAR BETA                  0.2714764      18.83000       26.55100       0.3175827      0.3288633
    2-PAR WEIBULL (MAXIMUM)     0.2845987      **             14.76277       1.088025       **
    POWER                       0.2852870      18.83000       26.55100       0.5217915      **
    REFLECTED POWER             0.3940263      18.83000       26.55100       0.5568036      **
    2-PAR FRECHET (MIN)         0.4239265      **             0.9244886      0.1772339      **
    2-COMP NORMAL MIXTURE       0.7554719      **             **             24.57355       35.74121
    2-PAR INVERTED WEIBULL      1.000000       **             0.3738909E-01  4.659730       **

                Best Distributional Fit

    Response Variable: Y
    Fit Method:        PPCC
    Ranking Criterion: PPCC

    Summary Statistics:
    Number of Observations:   31
    Sample Minimum:           18.83000
    Sample Maximum:           45.38100
    Sample Mean:              30.81142
    Sample SD:                7.253381

    Ranked List of Best Fit
    ----------------------------------------------------------------------------------------------------
                                Goodness       Estimate       Estimate       Estimate       Estimate
                                of Fit         of             of             of Shape       of Shape
    Distribution                Statistic      Location       Scale          Parameter 1    Parameter 2
    ----------------------------------------------------------------------------------------------------
    GENERALIZED PARETO (MAX)    0.9892061      20.18056       17.07700       -0.6000000     **
    REFLECTED POWER             0.9892025      20.24849       28.72833       1.708333       **
    3-PAR WEIBULL (MINIMUM)     0.9882787      16.08011       16.72055       2.104016       **
    GENERALIZED EXT VAL (MIN)   0.9882166      32.88338       7.894681       0.4509018      **
    RAYLEIGH                    0.9881071      16.73097       11.30290       **             **
    3-PAR BURR TYPE 10          0.9881069      16.71816       15.98673       1.001807       **
    2-PAR MAXWELL               0.9877088      13.32804       10.99731       **             **
    BRADFORD                    0.9871834      20.53644       24.96087       1.907258       **
    3-PAR WEIBULL (MAXIMUM)     0.9869599      74.55574       46.73858       6.913655       **
    GENERALIZED EXT VAL (MAX)   0.9869491      27.83139       6.788558       0.1503006      **
    3-PAR GAMMA                 0.9867522      5.208567       2.171618       11.82349       **
    WALD                        0.9865175      -8.640849      39.52321       27.95582       **
    BIRNBAUM SAUNDERS           0.9865080      -6.302986      36.45857       0.2002008      **
    G AND H                     0.9863477      30.17988       7.281429       0.1919192      0.000000
    DOUBLE GAMMA                0.9863375      30.81142       2.345620       2.705221       **
    3-PAR LOGNORMAL             0.9863089      -6.151920      36.30533       0.2002008      **
    3-PAR INVERTED GAMMA        0.9862202      -19.79904      2367.047       47.70000       **
    DOUBLE WEIBULL              0.9859972      30.81142       7.044318       1.703213       **
    3-PAR GEOM EXTREME EXPO     0.9856844      17.88092       4.973479       10.82149       **
    GENERALIZED PARETO (MIN)    0.9851244      44.69703       33.26714       -1.400000      **
    ERROR                       0.9830244      30.81142       12.18327       3.400000       **
    TUKEY-LAMBDA                0.9827152      30.81142       6.945565       0.4000000      **
    ANGLIT                      0.9826880      30.81142       21.03343       **             **
    COSINE                      0.9823948      30.81142       6.379694       **             **
    HALF-NORMAL                 0.9822081      21.15793       12.25805       **             **
    LOGISTIC EXPONENTIAL        0.9813604      2.311842       40.26116       4.823990       **
    GUMBEL (MAXIMUM)            0.9806594      27.52850       5.921301       **             **
    LOG LOGISTIC                0.9804096      -18.27834      48.59806       11.74677       **
    NORMAL                      0.9801995      30.81142       7.382224       **             **
    UNIFORM                     0.9788557      18.56336       24.49611       **             **
    3-PAR FRECHET (MAX)         0.9787403      -261.9942      289.4954       50.00000       **
    3-PAR INVERTED WEIBULL      0.9787403      -261.9942      289.4954       50.00000       **
    LOGISTIC                    0.9750343      30.81142       4.173573       **             **
    LOG GAMMA                   0.9748592      -85.60675      36.38708       25.00000       **
    HYPERBOLIC SECANT           0.9700100      30.81142       4.870712       **             **
    ASYMMETRIC DOUBLE EXPO      0.9686984      27.98396       7.188828       0.7532995      **
    ARCSINE                     0.9638854      21.02687       19.56909       **             **
    LOG DOUBLE EXPONENTIAL      0.9635815      -23.48322      53.86625       10.00000       **
    DOUBLE EXPONENTIAL          0.9590572      30.81142       5.438864       **             **
    EXPONENTIAL (2-PARAMETER)   0.9479883      23.50395       7.529710       **             **
    GUMBEL (MININUM)            0.9350657      33.94171       5.646002       **             **
    3-PAR FRECHET (MIN)         0.9300918      309.0630       275.1060       50.00000       **
    SLASH                       0.8225079      30.81142       1.106939       **             **
    CAUCHY                      0.8092121      30.81142       1.378849       **             **
NIST is an agency of the U.S.
Commerce Department.
Date created: 09/22/2011