Name: BEST DISTRIBUTIONAL FIT
There are two steps in this process:
You can specify the fit method with the command
SET BEST FIT METHOD <value>
where <value> is one of the following
The default method is maximum likelihood.
You can specify the goodness of fit criterion with the command
SET BEST FIT CRITERION <value>
where <value> is one of the following
The default goodness of fit criterion is Anderson-Darling.
Note that this command is intended strictly as a screening tool to identify good candidate distributions. You should perform a more complete analysis once you identify appropriate candidate distributions. Also, you may be able to improve the fit for certain distributions by fine-tuning the starting values.
We do not recommend simply selecting the "best" distribution from the list. Rather, this command is meant to identify good candidate models that should be examined more carefully. For example, a simpler distribution that provides nearly as good a fit as a more complicated distribution may be preferred. In some cases, a distribution that has a more meaningful physical interpretation or has established usage in a given area of work may be preferred.
For performance reasons, not all possible distributions are included.
BEST DISTRIBUTIONAL FIT <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional. For this syntax, the response variable can be a matrix.
MULTIPLE BEST DISTRIBUTIONAL FIT <y1> ... <yk>
<SUBSET/EXCEPT/FOR qualification>
where <y1> ... <yk> is a list of 1 to 30 response variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax generates a best distributional fit analysis for each listed response variable. These response variables can be matrices.
REPLICATED BEST DISTRIBUTIONAL FIT <y> <x1> ... <xk>
<SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable; <x1> ... <xk> is a list of 1 to 6 group-id variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax performs a cross-tabulation of <x1> ... <xk> and performs the best distributional fit analysis for each unique combination of cross-tabulated values. For example, if X1 has 3 levels and X2 has 2 levels, there will be a total of 6 best distributional fit analyses performed.
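The cross-tabulation described for the replicated syntax can be pictured with a small sketch. This is Python rather than Dataplot, and the helper name split_by_groups and the toy data are hypothetical; it only illustrates how a response is partitioned by unique combinations of group-id values, not how Dataplot implements it.

```python
from collections import defaultdict


def split_by_groups(y, *group_vars):
    """Partition y by each unique combination of the group-id variables.

    Hypothetical helper for illustration; only combinations that actually
    occur in the data produce a cell.
    """
    cells = defaultdict(list)
    for i, value in enumerate(y):
        key = tuple(g[i] for g in group_vars)
        cells[key].append(value)
    return dict(cells)


# Toy data: X1 has 3 levels, X2 has 2 levels, so 3 * 2 = 6 cells.
y = [float(v) for v in range(12)]
x1 = [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3]
x2 = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

cells = split_by_groups(y, x1, x2)
for key in sorted(cells):
    # Each cell would receive its own best distributional fit analysis.
    print(key, cells[key])
```

Each cell here corresponds to one analysis, which is why 3 levels crossed with 2 levels gives 6 analyses.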
BEST DISTRIBUTIONAL FIT Y SUBSET TAG = 1
MULTIPLE BEST DISTRIBUTIONAL FIT Y1 TO Y5
REPLICATED BEST DISTRIBUTIONAL FIT Y X
Also, distributions that expect all positive (or negative) numbers will be shifted appropriately before performing the maximum likelihood estimation. Since this command is intended as a quick screening method, not all distributions for which Dataplot supports maximum likelihood estimation are included.
This restriction is primarily for performance reasons.
The AIC is computed as
AIC = 2k - 2 ln(L)
with k denoting the number of parameters being fit and L denoting the maximized value of the likelihood function.
The AICc is computed as
AICc = AIC + 2k(k + 1)/(n - k - 1)
where n is the sample size. The AICc is recommended over the AIC when the sample size is small or k is large. Since AICc converges to AIC for large n, some analysts prefer to use AICc rather than AIC for all cases.
The BIC is computed as
BIC = k ln(n) - 2 ln(L)
The penalty term for extra parameters is larger in the BIC than it is for the AIC.
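A minimal Python sketch of the three information criteria follows, assuming the standard textbook definitions (AIC = 2k - 2 ln L, AICc = AIC + 2k(k+1)/(n-k-1), BIC = k ln n - 2 ln L). Dataplot's internal computation may differ in details, and the log-likelihood values below are hypothetical, chosen only to show how the penalties compare for n = 31.

```python
from math import log


def aic(log_lik, k):
    """Akaike information criterion from a maximized log-likelihood."""
    return 2 * k - 2 * log_lik


def aicc(log_lik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_lik, k) + 2 * k * (k + 1) / (n - k - 1)


def bic(log_lik, k, n):
    """Bayesian information criterion."""
    return k * log(n) - 2 * log_lik


n = 31  # sample size used in the example on this page
# Hypothetical maximized log-likelihoods for two candidate models.
candidates = {"two-parameter model": (-104.0, 2),
              "three-parameter model": (-103.6, 3)}

for name, (log_lik, k) in candidates.items():
    print(name,
          round(aic(log_lik, k), 3),
          round(aicc(log_lik, k, n), 3),
          round(bic(log_lik, k, n), 3))
```

Under these definitions the BIC penalty per parameter is ln(n), which exceeds the AIC penalty of 2 once n is 8 or more, and the AICc correction term shrinks toward zero as n grows.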
So if we use a non-PPCC method to estimate the parameters and use the PPCC as a ranking method, there is an additional implicit estimate for the location and scale parameters. For this reason, the PPCC ranking method is only supported when a PPCC fitting method is used.
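The implicit location and scale estimate can be seen in a small Python sketch. It computes a probability plot correlation coefficient for the normal distribution using plotting positions (i - 0.5)/n, which is an assumption made for simplicity and not necessarily the positions Dataplot uses. The point is only that the correlation is unchanged when the data are shifted and rescaled, so location and scale estimates from another fitting method never enter the PPCC value.

```python
from math import sqrt
from statistics import NormalDist, mean


def pearson(u, v):
    """Plain Pearson correlation coefficient."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def normal_ppcc(data):
    """Correlation between ordered data and standard normal quantiles.

    Uses assumed plotting positions (i - 0.5)/n.
    """
    n = len(data)
    ordered = sorted(data)
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return pearson(ordered, quantiles)


# Strength data (ksi) from the example program on this page.
y = [18.830, 20.800, 21.657, 23.030, 23.230, 24.050, 24.321, 25.500,
     25.520, 25.800, 26.690, 26.770, 26.780, 27.050, 27.670, 29.900,
     31.110, 33.200, 33.730, 33.760, 33.890, 34.760, 35.750, 35.910,
     36.980, 37.080, 37.090, 39.580, 44.045, 45.290, 45.381]

r_original = normal_ppcc(y)
# Shift and rescale the data: the PPCC does not change.
r_transformed = normal_ppcc([3.0 * v + 100.0 for v in y])
print(r_original, r_transformed)
```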
then two additional columns are given. The first additional column gives the PDF value at 0. The second additional column is set to 1 if the distribution has an infinite lower tail and it is set to 0 if the distribution has a bounded lower tail. For the first additional column, the following commands can also be used
SET BEST FIT FONG XVALUE <VALUE>
The SET BEST FIT FONG TYPE command is used to specify whether we want to print the CDF or the PDF value (the default is the PDF value). The SET BEST FIT FONG XVALUE command specifies at what value (the default is 0) we want to print the PDF (or CDF) value.
These options were added at the request of Jeffrey Fong. For lifetime data (and many other types of measurement data), we will only have positive measurements. So it is of interest whether we have non-negligible probability below some threshold value. It may be that some distributions indicate good fit, but they are not appropriate distributional models due to non-negligible probability for inadmissible values. Note that unbounded distributions may provide adequate distributional models for these cases if the cumulative probability below some threshold is practically, even if not theoretically, zero.
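As a rough illustration of the threshold check, the following Python sketch evaluates the PDF and CDF at 0 for a normal distribution, using the location and scale printed in the NORMAL row of the maximum likelihood sample output on this page. It is not Dataplot code, and it stands in for only one of the listed distributions.

```python
from statistics import NormalDist

# Location and scale taken from the NORMAL row of the sample output.
fitted = NormalDist(mu=30.81142, sigma=7.253381)

threshold = 0.0  # the default x value for the Fong columns

pdf_at_threshold = fitted.pdf(threshold)
cdf_below_threshold = fitted.cdf(threshold)

# The normal has an infinite lower tail, so some probability lies below 0,
# but here it is small enough to be negligible for many practical purposes.
print(pdf_at_threshold, cdf_below_threshold)
```

A distribution with an unbounded lower tail can still be a usable model for positive data when the cumulative probability below the threshold is this small.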
LET UPPLIMIT = <value>
These limits will apply to all of these distributions. There is currently no way to specify different lower and upper limits for the different distributions. If these values are not set, then default values based on the data will be used (typically based on the minimum and maximum values of the data).
SET BEST FIT DISTRIBUTION LOGISTIC <ON/OFF>
SET BEST FIT DISTRIBUTION LOG LOGISTIC <ON/OFF>
SET BEST FIT DISTRIBUTION HYPERBOLIC SECANT <ON/OFF>
SET BEST FIT DISTRIBUTION UNIFORM <ON/OFF>
SET BEST FIT DISTRIBUTION POWER <ON/OFF>
SET BEST FIT DISTRIBUTION ARCSINE <ON/OFF>
SET BEST FIT DISTRIBUTION TRIANGULAR <ON/OFF>
SET BEST FIT DISTRIBUTION ERROR <ON/OFF>
SET BEST FIT DISTRIBUTION SLASH <ON/OFF>
SET BEST FIT DISTRIBUTION CAUCHY <ON/OFF>
SET BEST FIT DISTRIBUTION COSINE <ON/OFF>
SET BEST FIT DISTRIBUTION BRADFORD <ON/OFF>
SET BEST FIT DISTRIBUTION ANGLIT <ON/OFF>
SET BEST FIT DISTRIBUTION RAYLEIGH <ON/OFF>
SET BEST FIT DISTRIBUTION FOLDED NORMAL <ON/OFF>
SET BEST FIT DISTRIBUTION TUKEY LAMBDA <ON/OFF>
SET BEST FIT DISTRIBUTION GENERALIZED TUKEY LAMBDA <ON/OFF>
SET BEST FIT DISTRIBUTION DOUBLE GAMMA <ON/OFF>
SET BEST FIT DISTRIBUTION DOUBLE WEIBULL <ON/OFF>
SET BEST FIT DISTRIBUTION REFLECTED POWER <ON/OFF>
SET BEST FIT DISTRIBUTION TWO SIDED POWER <ON/OFF>
SET BEST FIT DISTRIBUTION TOPP AND LEONE <ON/OFF>
SET BEST FIT DISTRIBUTION REFLECTED GENERALIZED TOPP AND LEONE <ON/OFF>
SET BEST FIT DISTRIBUTION GENERALIZED EXTREME VALUE MINIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION GENERALIZED EXTREME VALUE MAXIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION PARETO <ON/OFF>
SET BEST FIT DISTRIBUTION GENERALIZED PARETO MINIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION GENERALIZED PARETO MAXIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION G AND H <ON/OFF>
SET BEST FIT DISTRIBUTION G <ON/OFF>
SET BEST FIT DISTRIBUTION INVERTED WEIBULL <ON/OFF>
SET BEST FIT DISTRIBUTION GAMMA <ON/OFF>
SET BEST FIT DISTRIBUTION LOG GAMMA <ON/OFF>
SET BEST FIT DISTRIBUTION INVERTED GAMMA <ON/OFF>
SET BEST FIT DISTRIBUTION FATIGUE LIFE <ON/OFF>
SET BEST FIT DISTRIBUTION WALD <ON/OFF>
SET BEST FIT DISTRIBUTION LOGISTIC EXPONENTIAL <ON/OFF>
SET BEST FIT DISTRIBUTION GEOMETRIC EXTREME EXPONENTIAL <ON/OFF>
SET BEST FIT DISTRIBUTION LOG DOUBLE EXPONENTIAL <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER LOGNORMAL <ON/OFF>
SET BEST FIT DISTRIBUTION THREE PARAMETER LOGNORMAL <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER EXPONENTIAL <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER WEIBULL MINIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER WEIBULL MAXIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION THREE PARAMETER WEIBULL MINIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION THREE PARAMETER WEIBULL MAXIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION ONE PARAMETER EXPONENTIAL <ON/OFF>
SET BEST FIT DISTRIBUTION ASYMMETRIC DOUBLE EXPONENTIAL <ON/OFF>
SET BEST FIT DISTRIBUTION DOUBLE EXPONENTIAL <ON/OFF>
SET BEST FIT DISTRIBUTION BURR TYPE TEN <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER INVERSE GAUSSIAN <ON/OFF>
SET BEST FIT DISTRIBUTION THREE PARAMETER INVERSE GAUSSIAN <ON/OFF>
SET BEST FIT DISTRIBUTION THREE PARAMETER LOGNORMAL <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER EXPONENTIAL <ON/OFF>
SET BEST FIT DISTRIBUTION FRECHET MINIMUM <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER BETA <ON/OFF>
SET BEST FIT DISTRIBUTION FOUR PARAMETER BETA <ON/OFF>
SET BEST FIT DISTRIBUTION ONE PARAMETER HALF NORMAL <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER HALF NORMAL <ON/OFF>
SET BEST FIT DISTRIBUTION ONE PARAMETER HALF LOGISTIC <ON/OFF>
SET BEST FIT DISTRIBUTION TWO PARAMETER HALF LOGISTIC <ON/OFF>
SET BEST FIT DISTRIBUTION TWO COMPONENT NORMAL MIXTURE <ON/OFF>
The following resets the default list of distributions
SET BEST FIT DISTRIBUTION DEFAULT
The following turns off all the distributions. If you want only a small set of distributions, you can enter this command first and then use the above commands to turn on the specific distributions you want to include
SET BEST FIT DISTRIBUTION OFF
AD is a synonym for ANDERSON DARLING
KS is a synonym for KOLMOGOROV SMIRNOV
2012/10: Added AIC, AICC, and BIC ranking methods
2020/05: Added support for specifying which distributions to include

. Step 1:   Read the data
.
.           Following data from Jeffery Fong of the NIST
.           Applied and Computational Mathematics Division.
.           This is strength data in ksi units.
.
read y
18.830
20.800
21.657
23.030
23.230
24.050
24.321
25.500
25.520
25.800
26.690
26.770
26.780
27.050
27.670
29.900
31.110
33.200
33.730
33.760
33.890
34.760
35.750
35.910
36.980
37.080
37.090
39.580
44.045
45.290
45.381
end of data
.
set write decimals 5
.
. Step 2:   Apply goodness of fit tests for Weibull distribution
.           based on ML estimates
.
.           Maximum likelihood method
.
set best fit method ml
set best fit criterion anderson darling
best distributional fit y
.
set best fit method ml
set best fit criterion kolm smir
best distributional fit y
.
.           PPCC method
.
set best fit method ppcc
set best fit criterion ppcc
best distributional fit y

The following output is generated.

Best Distributional Fit

Response Variable: Y
Fit Method:        Maximum Likelihood
Ranking Criterion: Anderson Darling

Summary Statistics:
Number of Observations:  31
Sample Minimum:          18.83000
Sample Maximum:          45.38100
Sample Mean:             30.81142
Sample SD:               7.253381

Ranked List of Best Fit
----------------------------------------------------------------------------------------------------
                               Goodness    Estimate    Estimate    Estimate    Estimate
                                 of Fit          of          of    of Shape    of Shape
Distribution                  Statistic    Location       Scale Parameter 1 Parameter 2
----------------------------------------------------------------------------------------------------
*TRIANGULAR                   0.3332130    17.33848    49.26861    25.50000          **
3-PAR WEIBULL (MINIMUM)       0.3380554    17.64420    14.83507    1.913580          **
3-PAR INVERSE GAUSSIAN        0.3874339    6.764274    1.000000    255.1458    24.04715
2-PAR LOGNORMAL               0.3888329          **    30.00134   0.2349026          **
GUMBEL (MAXIMUM)              0.3980371    27.39966    5.986812          **          **
3-PAR LOGNORMAL               0.3984710    6.066865    23.72709   0.2917821          **
2-PAR INVERTED GAMMA          0.4062290          **    555.0562    18.99880          **
2-PAR INVERSE GAUSSIAN        0.4067194    0.000000    1.000000    563.9835    30.81142
*4-PAR BETA (MOMENTS)         0.4108797    18.80345    50.45846    1.307856    2.199731
2-PAR GAMMA                   0.4386866          **    1.627518    18.93154          **
2-PAR BURR TYPE 10            0.4464637          **    19.47685    7.276202          **
2-PAR FRECHET (MAX)           0.4681346          **    26.74577    4.659726          **
2-PAR INVERTED WEIBULL        0.4681357          **    26.74577    4.659730          **
LOGISTIC EXPONENTIAL          0.4897890          **    43.44812    5.187883          **
NORMAL                        0.5321921    30.81142    7.253381          **          **
FOLDED NORMAL                 0.5559204    30.81142    7.135432          **          **
LOGISTIC                      0.5728510    30.44662    4.224463          **          **
2-PAR WEIBULL (MINIMUM)       0.5973435          **    33.67424    4.635390          **
*REFLECTED POWER              0.7151471    18.56449    45.64651    1.110959          **
*2-PAR BETA                   0.7213293    18.56449    45.64651    1.021440    1.126894
*REFL GENE TOPP AND LEONE     0.8370549    18.80345    45.40755   0.5000000   0.7780750
BIRNBAUM SAUNDERS             0.8477387          **    30.00279   0.3283453          **
SLASH                         0.8526063    30.46421    3.523827          **          **
*POWER                        0.8635646    18.56449    45.64651   0.9463008          **
DOUBLE EXPONENTIAL            0.8691080    29.90000    6.124452          **          **
RAYLEIGH                      0.9356298    18.79377    9.882772          **          **
GUMBEL (MININUM)              0.9867376    34.50269    7.278262          **          **
CAUCHY                         1.200882    29.25895    5.093631          **          **
*TOPP AND LEONE                1.264385    18.56449    45.64651    1.573167          **
1-PAR MAXWELL                  2.618183          **    18.25977          **          **
PARETO                         3.437629    0.000000    1.000000    2.077101    18.53761
2-PAR WEIBULL (MAXIMUM)        3.657144          **    14.76277    1.088025          **
2-PAR EXPONENTIAL              4.013110    18.83000    11.98142          **          **
*UNIFORM                       5.244683    18.83000    45.38100          **          **
2-PAR FRECHET (MIN)            8.330173          **   0.9246548   0.1772251          **
2-COMP NORMAL MIXTURE          23.99764          **          **    24.57355    35.74121

* denotes lower/upper limit rather than location/scale

Best Distributional Fit

Response Variable: Y
Fit Method:        Maximum Likelihood
Ranking Criterion: Kolmogorov Smirn

Summary Statistics:
Number of Observations:  31
Sample Minimum:          18.83000
Sample Maximum:          45.38100
Sample Mean:             30.81142
Sample SD:               7.253381

Ranked List of Best Fit
----------------------------------------------------------------------------------------------------
                               Goodness    Estimate    Estimate    Estimate    Estimate
                                 of Fit          of          of    of Shape    of Shape
Distribution                  Statistic    Location       Scale Parameter 1 Parameter 2
----------------------------------------------------------------------------------------------------
*4-PAR BETA (MOMENTS)         0.1014130    18.80345    50.45846    1.307856    2.199731
*TRIANGULAR                   0.1113989    17.33848    49.26861    25.50000          **
3-PAR WEIBULL (MINIMUM)       0.1170822    17.64420    14.83507    1.913580          **
2-PAR LOGNORMAL               0.1219492          **    30.00134   0.2349026          **
2-PAR INVERSE GAUSSIAN        0.1236868    0.000000    1.000000    563.9835    30.81142
3-PAR LOGNORMAL               0.1287544    6.066865    23.72709   0.2917821          **
BIRNBAUM SAUNDERS             0.1297689          **    30.00279   0.3283453          **
3-PAR INVERSE GAUSSIAN        0.1298348    6.764274    1.000000    255.1458    24.04715
2-PAR INVERTED GAMMA          0.1318310          **    555.0562    18.99880          **
2-PAR BURR TYPE 10            0.1326033          **    19.47685    7.276202          **
LOGISTIC EXPONENTIAL          0.1329623          **    43.44812    5.187883          **
SLASH                         0.1342469    30.46421    3.523827          **          **
2-PAR GAMMA                   0.1349165          **    1.627518    18.93154          **
GUMBEL (MAXIMUM)              0.1358038    27.39966    5.986812          **          **
*TOPP AND LEONE               0.1400991    18.56449    45.64651    1.573167          **
LOGISTIC                      0.1425181    30.44662    4.224463          **          **
2-PAR FRECHET (MAX)           0.1456706          **    26.74577    4.659726          **
2-PAR INVERTED WEIBULL        0.1456709          **    26.74577    4.659730          **
*REFLECTED POWER              0.1489988    18.56449    45.64651    1.110959          **
*2-PAR BETA                   0.1491331    18.56449    45.64651    1.021440    1.126894
NORMAL                        0.1513989    30.81142    7.253381          **          **
2-PAR WEIBULL (MINIMUM)       0.1525868          **    33.67424    4.635390          **
FOLDED NORMAL                 0.1539952    30.81142    7.135432          **          **
RAYLEIGH                      0.1570339    18.79377    9.882772          **          **
DOUBLE EXPONENTIAL            0.1598958    29.90000    6.124452          **          **
GUMBEL (MININUM)              0.1601804    34.50269    7.278262          **          **
CAUCHY                        0.1612234    29.25895    5.093631          **          **
*REFL GENE TOPP AND LEONE     0.1625837    18.80345    45.40755   0.5000000   0.7780750
*POWER                        0.1728242    18.56449    45.64651   0.9463008          **
*UNIFORM                      0.1832347    18.83000    45.38100          **          **
2-PAR EXPONENTIAL             0.2010937    18.83000    11.98142          **          **
1-PAR MAXWELL                 0.2417328          **    18.25977          **          **
PARETO                        0.2660583    0.000000    1.000000    2.077101    18.53761
2-PAR WEIBULL (MAXIMUM)       0.2845987          **    14.76277    1.088025          **
2-PAR FRECHET (MIN)           0.4239459          **   0.9246548   0.1772251          **
2-COMP NORMAL MIXTURE         0.7554719          **          **    24.57355    35.74121

* denotes lower/upper limit rather than location/scale

Best Distributional Fit

Response Variable: Y
Fit Method:        PPCC
Ranking Criterion: PPCC

Summary Statistics:
Number of Observations:  31
Sample Minimum:          18.83000
Sample Maximum:          45.38100
Sample Mean:             30.81142
Sample SD:               7.253381

Ranked List of Best Fit
----------------------------------------------------------------------------------------------------
                               Goodness    Estimate    Estimate    Estimate    Estimate
                                 of Fit          of          of    of Shape    of Shape
Distribution                  Statistic    Location       Scale Parameter 1 Parameter 2
----------------------------------------------------------------------------------------------------
*TRIANGULAR                   0.9903866    33.78641    50.83147    24.96409          **
*TOPP AND LEONE               0.9898597    19.21407    64.69333    1.256061          **
GENE TUKEY LAMBDA             0.9898393    29.50906    7.489485   0.7000000   0.3000000
GENE PARETO (MAX)             0.9892061    20.18056    17.07700  -0.6000000          **
*REFLECTED POWER              0.9892025    20.24849    28.72833    1.708333          **
3-PAR WEIBULL (MINIMUM)       0.9882787    16.08011    16.72055    2.104016          **
GENE EXT VAL (MIN)            0.9882166    32.88338    7.894681   0.4509018          **
RAYLEIGH                      0.9881071    16.73097    11.30290          **          **
3-PAR BURR TYPE 10            0.9881069    16.71816    15.98673    1.001807          **
2-PAR MAXWELL                 0.9877088    13.32804    10.99731          **          **
BRADFORD                      0.9871834    20.53644    24.96087    1.907258          **
3-PAR WEIBULL (MAXIMUM)       0.9869599    74.55574    46.73858    6.913655          **
GENE EXT VAL (MAX)            0.9869491    27.83139    6.788558   0.1503006          **
3-PAR GAMMA                   0.9867522    5.208567    2.171618    11.82349          **
WALD                          0.9865174   -8.570647    39.45313    27.85562          **
BIRNBAUM SAUNDERS             0.9865080   -6.302986    36.45857   0.2002008          **
G                             0.9863602    30.19909    7.290451   0.1859296          **
G AND H                       0.9863597    30.21816    7.299022   0.1800000    0.000000
DOUBLE GAMMA                  0.9863375    30.81142    2.345620    2.705221          **
3-PAR LOGNORMAL               0.9863089   -6.151920    36.30533   0.2002008          **
3-PAR INVERTED GAMMA          0.9862202   -19.79904    2367.047    47.70000          **
DOUBLE WEIBULL                0.9859972    30.81142    7.044318    1.703213          **
3-PAR GEOM EXTREME EXPO       0.9856844    17.88092    4.973479    10.82149          **
*POWER                        0.9851532    21.03185    23.74664   0.7032828          **
GENE PARETO (MIN)             0.9851244    44.69703    33.26714   -1.400000          **
ERROR                         0.9830244    30.81142    12.18327    3.400000          **
TUKEY-LAMBDA                  0.9827152    30.81142    6.945565   0.4000000          **
ANGLIT                        0.9826880    30.81142    21.03343          **          **
COSINE                        0.9823948    30.81142    6.379693          **          **
HALF-NORMAL                   0.9822081    21.15793    12.25805          **          **
LOGISTIC EXPONENTIAL          0.9813604    2.311842    40.26116    4.823990          **
GUMBEL (MAXIMUM)              0.9806594    27.52850    5.921301          **          **
LOG LOGISTIC                  0.9804096   -18.27834    48.59806    11.74677          **
NORMAL                        0.9801995    30.81142    7.382224          **          **
*UNIFORM                      0.9788557    18.56336    24.49611          **          **
3-PAR FRECHET (MAX)           0.9787403   -261.9942    289.4954    50.00000          **
3-PAR INVERTED WEIBULL        0.9787403   -261.9942    289.4954    50.00000          **
LOGISTIC                      0.9750343    30.81142    4.173573          **          **
LOG GAMMA                     0.9748592   -85.60675    36.38708    25.00000          **
HYPERBOLIC SECANT             0.9700100    30.81142    4.870712          **          **
ASYMM DOUBLE EXPO             0.9686984    27.98396    7.188828   0.7532995          **
ARCSINE                       0.9638854    21.02687    19.56909          **          **
LOG DOUBLE EXPONENTIAL        0.9635815   -23.48322    53.86625    10.00000          **
DOUBLE EXPONENTIAL            0.9590572    30.81142    5.438864          **          **
2-PAR EXPONENTIAL             0.9479883    23.50395    7.529710          **          **
GUMBEL (MININUM)              0.9350657    33.94171    5.646002          **          **
3-PAR FRECHET (MIN)           0.9300918    309.0630    275.1060    50.00000          **
SLASH                         0.8225079    30.81142    1.106939          **          **
CAUCHY                        0.8092121    30.81142    1.378849          **          **

* denotes lower/upper limit rather than location/scale
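To make the ranking statistic in the Kolmogorov-Smirnov table more concrete, here is a minimal Python sketch that computes the usual one-sample KS statistic for the example data against a normal distribution, using the location and scale printed in the NORMAL row. It assumes the textbook definition of the statistic; Dataplot's own computation is not shown on this page, so treat this as an independent cross-check rather than a description of the implementation.

```python
from statistics import NormalDist

# Strength data (ksi) from the example program.
y = [18.830, 20.800, 21.657, 23.030, 23.230, 24.050, 24.321, 25.500,
     25.520, 25.800, 26.690, 26.770, 26.780, 27.050, 27.670, 29.900,
     31.110, 33.200, 33.730, 33.760, 33.890, 34.760, 35.750, 35.910,
     36.980, 37.080, 37.090, 39.580, 44.045, 45.290, 45.381]


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and a fitted CDF."""
    ordered = sorted(data)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # Check the gap just after and just before each step of the ECDF.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Location and scale from the NORMAL row of the sample output.
normal_fit = NormalDist(mu=30.81142, sigma=7.253381)
d_normal = ks_statistic(y, normal_fit.cdf)
print(round(d_normal, 4))
```

The result should come out close to the 0.1513989 reported for NORMAL in the Kolmogorov-Smirnov table; a smaller value indicates a closer match between the fitted and empirical distributions.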
Date created: 09/22/2011
Last updated: 12/11/2023