
MAXIMUM LIKELIHOOD
Name: MAXIMUM LIKELIHOOD
Maximum likelihood estimates are popular because they have good statistical properties. The primary drawback is that the likelihood equations have to be derived for each specific distribution (other approaches, such as least squares or PPCC plots, are more general). In some cases the maximum likelihood estimates are trivial, while in other cases they are quite complex and may require specialized methods to solve. Dataplot currently supports maximum likelihood estimates for the following continuous distributions:
Dataplot currently supports maximum likelihood estimates for the following discrete distributions:
Additional distributions may be added in the future. We do not give the likelihood equations for the various distributions here. Most of them can be found in the sources listed in the Reference section. For a given distribution, the maximum likelihood command will generate one or more of the following outputs:
A number of these commands generate method of moments estimates or other quantitative parameter estimates in addition to the maximum likelihood estimates. The Johnson SB and Johnson SU cases generate only method of moments or percentile estimates.
<SUBSET/EXCEPT/FOR qualification> where <DIST> is one of:
and where the <SUBSET/EXCEPT/FOR qualification> is optional. This generates maximum likelihood estimates for the full sample case.
where <DIST> is JOHNSON SB, JOHNSON SU, or UNIFORM; <y> is the response variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional.
where <DIST> is JOHNSON SB, JOHNSON SU, or JOHNSON; <y> is the response variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional.
<SUBSET/EXCEPT/FOR qualification> where <DIST> is one of:
<x> is the censoring variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional. The censoring variable should contain 1's and 0's where 1 indicates a failure time and 0 indicates a censoring time.
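As an illustration of the 1/0 coding described above, the following Python sketch computes the maximum likelihood estimate of the scale (mean) of a one-parameter exponential distribution from right-censored data. The exponential case is chosen only because its censored estimate has a simple closed form (total observed time divided by the number of failures); the function name and the sample values are hypothetical, and this is not Dataplot's own code.

```python
def censored_exponential_scale(times, status):
    """Exponential scale MLE under right censoring.

    times  : failure or censoring times (the <y> variable)
    status : 1 for a failure time, 0 for a censoring time (the <x> variable)
    """
    if len(times) != len(status):
        raise ValueError("times and status must have the same length")
    n_failures = sum(1 for s in status if s == 1)
    if n_failures == 0:
        raise ValueError("at least one failure time is required")
    # Total time on test divided by the number of observed failures.
    return sum(times) / n_failures


# Hypothetical data: three failures and two units censored at time 50.
y = [12.0, 30.0, 45.0, 50.0, 50.0]
x = [1, 1, 1, 0, 0]
print(censored_exponential_scale(y, x))
```

Censored units add to the total exposure time but not to the failure count, which is why the estimate is larger than the plain mean of the failure times alone.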
<SUBSET/EXCEPT/FOR qualification> where <DIST> is EXPONENTIAL; <y> is the frequency variable; <x> is the bin midpoints variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax is used for grouped data. In this case, the keyword GROUPED is required to distinguish this from the censored data case.
<SUBSET/EXCEPT/FOR qualification> where <DIST> is NORMAL MIXTURE; <y> is the frequency variable; <x> is the bin midpoints variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional. In this case, the keyword GROUPED is omitted since censored data is not supported for these distributions.
    EXPONENTIAL MAXIMUM LIKELIHOOD Y
    WEIBULL MAXIMUM LIKELIHOOD Y X
    WEIBULL MAXIMUM LIKELIHOOD Y SUBSET X > 5
    JOHNSON SB MOMENTS Y
You can enter your own values for these lower and upper limits with the commands
LET BETAUL = <value>
These distributions can also be estimated using the percentile method described by Slifker and Shapiro (see the Reference section below). The method matches percentiles of the data with theoretical percentiles, and it first determines whether the Johnson SB or Johnson SU distribution is more appropriate. It requires a tuning parameter that can be set with the following command:
The default value is 0.54. As the sample size gets larger, the value of Z can be set closer to 1; increasing Z uses more extreme percentiles in the estimation. Slifker and Shapiro do not give specific recommendations. However, using the default value for small to moderate size data sets (say a few hundred points or fewer) and a value of 0.8 for larger data sets should generate reasonable results. Alternatively, you can generate the estimates using several different values of Z between 0.5 and 1 and perform a Kolmogorov-Smirnov goodness of fit test with each set of estimates to see which value of Z results in the best fit.
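The family selection step of the percentile method can be sketched in Python as follows. This is a minimal sketch based on the published Slifker and Shapiro rule as commonly stated: take the sample percentiles corresponding to the standard normal values -3Z, -Z, Z, and 3Z, form m = x(3Z) - x(Z), n = x(-Z) - x(-3Z), and p = x(Z) - x(-Z), and choose SU when mn/p^2 exceeds 1 and SB when it is below 1. The function names, the percentile interpolation rule, and the tolerance are assumptions made for the illustration; Dataplot's implementation may differ in these details.

```python
import math
from statistics import NormalDist


def sample_quantile(sorted_data, p):
    """Linearly interpolated quantile at fractional rank n*p + 0.5.

    The rank convention is an assumption made for this sketch.
    """
    n = len(sorted_data)
    rank = min(max(n * p + 0.5, 1.0), float(n))  # clamp to [1, n]
    lower = int(rank)                            # floor, since rank >= 1
    upper = min(lower + 1, n)
    frac = rank - lower
    return sorted_data[lower - 1] + frac * (sorted_data[upper - 1] - sorted_data[lower - 1])


def johnson_family(data, z=0.54, tol=1e-9):
    """Return (family, ratio) where ratio = m*n / p**2."""
    xs = sorted(data)
    cdf = NormalDist().cdf
    # Percentiles matched to the normal values -3z, -z, z, 3z.
    x_m3z, x_mz, x_z, x_3z = (sample_quantile(xs, cdf(k * z)) for k in (-3, -1, 1, 3))
    m = x_3z - x_z
    n = x_mz - x_m3z
    p = x_z - x_mz      # assumed non-zero, i.e. the data are not degenerate
    ratio = m * n / p ** 2
    if abs(ratio - 1.0) <= tol:
        return "SL", ratio  # boundary case (lognormal)
    return ("SU" if ratio > 1.0 else "SB"), ratio


# Hypothetical test data: evenly spaced bounded values and heavy-tailed values.
bounded = [i / 1000 for i in range(1, 1000)]
heavy_tailed = [math.tan(math.pi * ((i - 0.5) / 999 - 0.5)) for i in range(1, 1000)]

print(johnson_family(bounded)[0])
print(johnson_family(heavy_tailed)[0])
```

A larger Z moves the outer percentiles further into the tails (3Z = 3 corresponds to roughly the 0.13 and 99.87 percentiles), which is consistent with reserving values near 1 for large samples.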
For the hypergeometric distribution, there are four quantities of interest:
There are two distinct cases to consider.
Formulas for the variance are also given in Johnson, Kotz, and Kemp.
HELP HERPDF
"Continuous Univariate Distributions: Volume II", 2nd. ed., Johnson, Kotz, and Balakrishnan, John Wiley and Sons, 1994.
"Univariate Discrete Distributions", 2nd. ed., Johnson, Kotz, and Kemp, John Wiley and Sons, 1994.
"Statistical Distributions in Engineering", Karl Bury, Cambridge University Press, 1999.
"Statistical Distributions", Third Edition, Evans, Hastings, and Peacock, 2000.
"Algorithm AS 99", Applied Statistics, 1976, Vol. 25, p. 180.
"Confidence Intervals for the Parameters of the Logistic Distribution", Charles Antle, Lawrence Klimko, and William Harkness, Biometrika, 1970, pp. 397-402.
"The Johnson System: Selection and Parameter Estimation", James F. Slifker and Samuel S. Shapiro, Technometrics, Vol. 22, No. 2, May 1980, pp. 239-246.
"Inferences for the Cauchy Distribution Based on Maximum Likelihood Estimators", Biometrika, 1970, pp. 403-407.
2003/10: Gumbel case supports both minimum and maximum cases
2003/11: Added support for logistic, uniform, and beta distributions
2004/5: Added confidence limits (Agresti and Coull approach) for binomial case
2004/5: Added confidence limits for lognormal case
2004/5: Added support for the following continuous distributions
    GEOMETRIC EXTREME EXPONENTIAL
    FOLDED NORMAL
    CAUCHY

    GEOMETRIC
    BETA BINOMIAL
    NEGATIVE BINOMIAL
    HYPERGEOMETRIC
    HERMITE
    YULE

    skip 25
    read vangel31.dat y
    exponential mle y
    weibull mle y
    lognormal mle y
    gamma mle y

The following output is generated.

    *************************
    **  exponential mle y  **
    *************************

    EXPONENTIAL MAXIMUM LIKELIHOOD ESTIMATION:
    FULL SAMPLE CASE

    ONE-PARAMETER MODEL (LOCATION = 0)
    NUMBER OF OBSERVATIONS                               = 38
    MINIMUM VALUE                                        = 147.0000
    ML ESTIMATE OF SCALE PARAMETER                       = 185.7895
    STANDARD ERROR OF SCALE PARAMETER                    = 30.13903

    CONFIDENCE INTERVAL FOR SCALE PARAMETER

    CONFIDENCE      LOWER       UPPER
    VALUE (%)       LIMIT       LIMIT
      50.000      168.269     209.615
      75.000      156.300     227.404
      90.000      145.042     248.068
      95.000      138.432     262.541
      99.000      126.642     294.188
      99.900      114.602     337.472

    THE MINIMUM VALUE WILL BE SAVED AS THE INTERNAL PARAMETER U1
    THE SCALE PARAMETER WILL BE SAVED AS THE INTERNAL PARAMETER B1

    TWO-PARAMETER MODEL (LOCATION UNKNOWN)
    NUMBER OF OBSERVATIONS                               = 38
    ESTIMATE OF LOCATION PARAMETER                       = 147.0000
    STANDARD ERROR OF LOCATION PARAMETER                 = 1.034478
    BIAS CORRECTED ESTIMATE OF LOCATION PARAMETER        = 145.9516
    STANDARD ERROR OF BIAS CORRECTED LOCATION PARAMETER  = 1.062437
    ESTIMATE OF SCALE PARAMETER                          = 38.78947
    STANDARD ERROR OF SCALE PARAMETER                    = 6.376950
    BIAS CORRECTED ESTIMATE OF SCALE PARAMETER           = 39.83784
    STANDARD ERROR OF BIAS CORRECTED SCALE PARAMETER     = 6.549300

    CONFIDENCE INTERVAL FOR LOCATION PARAMETER

    CONFIDENCE      LOWER       UPPER
    VALUE (%)       LIMIT       LIMIT
      50.000      145.519     146.697
      75.000      144.758     146.860
      90.000      143.729     146.946
      95.000      142.933     146.973
      99.000      141.028     146.995
      99.900      138.154     146.999

    CONFIDENCE INTERVAL FOR SCALE PARAMETER

    CONFIDENCE      LOWER       UPPER
    VALUE (%)       LIMIT       LIMIT
      50.000      36.0380     45.0267
      75.000      33.4428     48.9045
      90.000      31.0050     53.4162
      95.000      29.5751     56.5804
      99.000      27.0274     63.5113
      99.900      24.4298     73.0145

    THE LOCATION PARAMETER WILL BE SAVED AS THE INTERNAL PARAMETER U2
    THE SCALE PARAMETER WILL BE SAVED AS THE INTERNAL PARAMETER B2

    *********************
    **  weibull mle y  **
    *********************

    WEIBULL MAXIMUM LIKELIHOOD ESTIMATION:
    FULL SAMPLE CASE

    TWO-PARAMETER MODEL (LOCATION = 0)
    NUMBER OF OBSERVATIONS                               = 38
    MINIMUM VALUE                                        = 147.0000
    SAMPLE MEAN VALUE                                    = 185.7895
    SAMPLE STANDARD DEVIATION VALUE                      = 18.59549
    ESTIMATE OF SCALE PARAMETER                          = 194.2046
    STANDARD ERROR OF SCALE PARAMETER                    = 3.137330
    ESTIMATE OF SHAPE PARAMETER                          = 10.57322
    STANDARD ERROR OF SHAPE PARAMETER                    = 1.337343
    BIAS CORRECTED ESTIMATE OF SHAPE PARAMETER           = 10.20502
    STANDARD ERROR OF BIAS CORRECTED SHAPE PARAMETER     = 1.290772
    STANDARD ERROR OF SHAPE/SCALE COVARIANCE             = 1.146094
    STD ERR OF BIAS CORRECTED SHAPE/SCALE COVARIANCE     = 1.125962

    CONFIDENCE INTERVAL FOR SCALE PARAMETER

                  NORMAL APPROXIMATION     LIKELIHOOD RATIO
    CONFIDENCE      LOWER       UPPER       LOWER       UPPER
    VALUE (%)       LIMIT       LIMIT       LIMIT       LIMIT
      50.000      192.089     196.321     192.063     196.335
      75.000      190.596     197.814     190.529     197.852
      90.000      189.044     199.365     188.901     199.457
      95.000      188.056     200.354     187.840     200.504
      99.000      186.123     202.286     185.704     202.633
      99.900      183.881     204.528     183.090     205.301

    CONFIDENCE INTERVAL FOR SHAPE PARAMETER
    (BASED ON NO BIAS CORRECTION ESTIMATES)

                  NORMAL APPROXIMATION     LIKELIHOOD RATIO
    CONFIDENCE      LOWER       UPPER       LOWER       UPPER
    VALUE (%)       LIMIT       LIMIT       LIMIT       LIMIT
      50.000      9.67119     11.4752     9.73924     11.4358
      75.000      9.03480     12.1116     9.16853     12.0612
      90.000      8.37348     12.7729     8.59130     12.7256
      95.000      7.95207     13.1944     8.23208     13.1567
      99.000      7.12845     14.0180     7.54982     14.0162
      99.900      6.17267     14.9738     6.79193     15.0417

    THE FOLLOWING INTERNAL PARAMETERS ARE SAVED:
    ALPHAML, ALPHASE, GAMMAML, GAMMASE, CAMMABC, GAMMABCSE,COVSE,COVBCSE

    ***********************
    **  lognormal mle y  **
    ***********************

    LOGNORMAL MAXIMUM LIKELIHOOD ESTIMATION:
    FULL SAMPLE CASE

    TWO-PARAMETER MODEL (LOCATION = 0)
    NUMBER OF OBSERVATIONS                               = 38
    SAMPLE MINIMUM                                       = 147.0000
    SAMPLE MEAN                                          = 185.7895
    SAMPLE MEDIAN                                        = 185.5000
    SAMPLE STANDARD DEVIATION                            = 18.59549
    ML ESTIMATE OF SHAPE PARAMETER (SIGMA)               = 0.1002546
    STANDARD ERROR OF SHAPE PARAMETER                    = 0.1165436E-01
    ML ESTIMATE OF SCALE PARAMETER                       = 184.8847
    ML ESTIMATE OF MU (= LOG(SCALE))                     = 5.219732
    STANDARD ERROR OF SCALE/MU                           = 0.1626344E-01

    CONFIDENCE INTERVAL FOR SCALE PARAMETER

                    SCALE PARAMETER           MU PARAMETER
    CONFIDENCE      LOWER       UPPER       LOWER       UPPER
    VALUE (%)       LIMIT       LIMIT       LIMIT       LIMIT
      50.000      184.874     184.896     5.20865     5.23081
      75.000      184.866     184.904     5.20073     5.23874
      90.000      184.857     184.912     5.19229     5.24717
      95.000      184.852     184.918     5.18678     5.25269
      99.000      184.841     184.929     5.17557     5.26389
      99.900      184.827     184.943     5.16161     5.27785

    CONFIDENCE INTERVAL FOR SHAPE PARAMETER

    CONFIDENCE        LOWER         UPPER
    VALUE (%)         LIMIT         LIMIT
      50.000      0.936715E-01    0.109717
      75.000      0.889265E-01    0.116492
      90.000      0.844115E-01    0.124286
      95.000      0.817339E-01    0.129704
      99.000      0.769019E-01    0.141454
      99.900      0.718825E-01    0.157350

    THE FOLLOWING INTERNAL PARAMETERS ARE SAVED:
    SIGMAML, SIGMASE, SCALEML, UHATML, UHATSE

    *******************
    **  gamma mle y  **
    *******************

    GAMMA MAXIMUM LIKELIHOOD ESTIMATION:
    FULL SAMPLE CASE

    TWO-PARAMETER MODEL (LOCATION = 0)
    NUMBER OF OBSERVATIONS                               = 38
    MINIMUM VALUE                                        = 147.0000
    SAMPLE MEAN VALUE                                    = 185.7895
    SAMPLE STANDARD DEVIATION VALUE                      = 18.59549
    SAMPLE GEOMETRIC MEAN VALUE                          = 184.8847
    MOMENT ESTIMATE OF SCALE PARAMETER                   = 1.861205
    MOMENT ESTIMATE OF SHAPE PARAMETER                   = 99.82214
    ML ESTIMATE OF SCALE PARAMETER                       = 1.811020
    STANDARD ERROR OF SCALE PARAMETER                    = 0.4158159
    ML ESTIMATE OF SHAPE PARAMETER                       = 102.5883
    STANDARD ERROR OF SHAPE PARAMETER                    = 23.49724
    COVARIANCE OF THE SHAPE AND SCALE PARAMETERS         = 9.746725

    CONFIDENCE INTERVAL FOR SCALE PARAMETER

                  NORMAL APPROXIMATION     LIKELIHOOD RATIO
    CONFIDENCE      LOWER       UPPER       LOWER       UPPER
    VALUE (%)       LIMIT       LIMIT       LIMIT       LIMIT
      50.000      1.53056     2.09148     1.55723     2.12304
      75.000      1.33269     2.28935     1.40621     2.38733
      90.000      1.12706     2.49498     1.26945     2.70996
      95.000      0.996035    2.62600     1.19156     2.94588
      99.000      0.739949    2.88209     1.05707     3.49022
      99.900      0.442772    3.17927     0.925650    4.29764

    CONFIDENCE INTERVAL FOR SHAPE PARAMETER

                  NORMAL APPROXIMATION     LIKELIHOOD RATIO
    CONFIDENCE      LOWER       UPPER       LOWER       UPPER
    VALUE (%)       LIMIT       LIMIT       LIMIT       LIMIT
      50.000      86.7397     118.437     87.5479     119.267
      75.000      75.5583     129.618     77.8833     132.049
      90.000      63.9388     141.238     68.6408     146.247
      95.000      56.5346     148.642     63.1638     155.791
      99.000      42.0635     163.113     53.3517     175.581
      99.900      25.2704     179.906     43.3751     200.473

    THE FOLLOWING INTERNAL PARAMETERS ARE SAVED:
    GAMMAML, GAMMASE, SCALEML, SCALESE, GAMMAMOM, SCALEMOM,COVSE
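Several of the point estimates in the output above have closed forms, so they can be checked directly from the summary statistics printed in the listing. The Python sketch below does this for the exponential estimates, the lognormal scale, and the gamma moment estimates. It uses standard textbook formulas that happen to reproduce the listed values; the variable names are hypothetical, and nothing here is taken from Dataplot's source code.

```python
import math

# Summary statistics copied from the output listing.
n = 38
sample_mean = 185.7895
sample_min = 147.0
sample_sd = 18.59549
mu_hat = 5.219732          # lognormal ML estimate of MU

# One-parameter exponential (location = 0): the scale MLE is the sample mean.
exp1_scale = sample_mean
exp1_scale_se = exp1_scale / math.sqrt(n)        # listing: 30.13903

# Two-parameter exponential: location MLE is the minimum,
# scale MLE is the mean minus the minimum.
exp2_location = sample_min
exp2_scale = sample_mean - sample_min            # listing: 38.78947
exp2_scale_bc = exp2_scale * n / (n - 1)         # listing: 39.83784
exp2_location_bc = sample_min - exp2_scale_bc / n  # listing: 145.9516

# Lognormal: the scale estimate is exp(MU).
lognormal_scale = math.exp(mu_hat)               # listing: 184.8847

# Gamma method of moments estimates.
gamma_mom_scale = sample_sd ** 2 / sample_mean   # listing: 1.861205
gamma_mom_shape = sample_mean ** 2 / sample_sd ** 2  # listing: 99.82214

for name, value in [
    ("exp1_scale_se", exp1_scale_se),
    ("exp2_scale", exp2_scale),
    ("exp2_scale_bc", exp2_scale_bc),
    ("exp2_location_bc", exp2_location_bc),
    ("lognormal_scale", lognormal_scale),
    ("gamma_mom_scale", gamma_mom_scale),
    ("gamma_mom_shape", gamma_mom_shape),
]:
    print(f"{name:18s} {value:.5f}")
```

Because the inputs are the rounded values shown in the listing, the results agree with the listing only to about four or five significant digits. The Weibull and gamma maximum likelihood estimates have no closed form and require an iterative solution, so they are not reproduced here.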
Date created: 6/5/2001 