|
PPCC and Probability Plots
|
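The PPCC (probability plot correlation coefficient) plot can be used
to estimate the shape parameter of a distribution with a single shape
parameter. Once the best value of the shape parameter has been found,
the probability plot for that value can be used to
estimate the location and scale parameters of the distribution:
the intercept and slope of the line fitted to the probability plot
are the respective estimates.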
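
The two-step procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Dataplot's implementation: the helper names (`filliben_positions`, `weibull_ppf`, `line_fit`, `ppcc_fit`) are hypothetical, the plotting positions are Filliben's approximation to the uniform order statistic medians, the shape parameter is found by a simple grid search, and the data are simulated from a Weibull distribution with shape 2, scale 3, and location 10.

```python
import math
import random


def filliben_positions(n):
    """Filliben's approximation to the uniform order statistic medians."""
    top = 0.5 ** (1.0 / n)
    p = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    p[0], p[-1] = 1.0 - top, top
    return p


def weibull_ppf(p, shape):
    """Percent point function of the standard Weibull (location 0, scale 1)."""
    return (-math.log(1.0 - p)) ** (1.0 / shape)


def line_fit(x, y):
    """Least-squares slope and intercept of y on x, plus their correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)


def ppcc_fit(data, shapes):
    """Pick the shape with the highest probability plot correlation, then
    read location (intercept) and scale (slope) off that probability plot."""
    ordered = sorted(data)
    probs = filliben_positions(len(ordered))
    best = None
    for shape in shapes:
        quantiles = [weibull_ppf(p, shape) for p in probs]
        slope, intercept, r = line_fit(quantiles, ordered)
        if best is None or r > best["ppcc"]:
            best = {"shape": shape, "location": intercept,
                    "scale": slope, "ppcc": r}
    return best


# Simulated data (an assumption for illustration only).
# random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
rng = random.Random(0)
sample = [10.0 + rng.weibullvariate(3.0, 2.0) for _ in range(2000)]

grid = [0.5 + 0.05 * k for k in range(91)]  # candidate shapes 0.5 .. 5.0
fit = ppcc_fit(sample, grid)
print(fit)
```

Only the percent point function is specific to the Weibull here, so the same sketch should carry over to other single-shape-parameter families by swapping `weibull_ppf`.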
|
Advantages
|
The advantages of this method are:
- It is based on two well-understood concepts:
  - The linearity (i.e., straightness) of the probability
    plot is a good measure of the adequacy of the
    distributional fit.
  - The correlation coefficient between the points on the
    probability plot is a good measure of the linearity
    of the probability plot.
- It is an easy technique to implement for a wide variety
of distributions with a single shape parameter. The basic
requirement is to be able to compute the
percent point function,
which is needed in the computation of both the probability
plot and the PPCC plot.
- The PPCC plot provides insight into how sensitive the fit is to
the shape parameter. That is, if the PPCC plot is relatively
flat in the neighborhood of the optimal value of the shape
parameter, this is a strong indication that the fitted model
will not be sensitive to small deviations, or even large
deviations in some cases, in the value of the shape
parameter.
- The maximum correlation value provides a method for comparing
across distributions as well as identifying the best value
of the shape parameter for a given distribution. For example,
we could generate PPCC and probability plot fits for the Weibull,
lognormal, and possibly several other distributions.
Comparing the maximum correlation coefficient achieved for
each distribution can then help in selecting the best
distribution to use.
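
The last two advantages can be illustrated together. The sketch below, again with hypothetical helper names and simulated data, computes the PPCC curve for a Weibull and a lognormal family, reports the maximum correlation for each, and lists the range of shape values whose correlation lies within a small tolerance of the maximum. The tolerance of 0.001 is an arbitrary choice for illustration; a wide range suggests a flat PPCC plot and therefore a fit that is insensitive to the shape parameter.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def positions(n):
    """Filliben's approximation to the uniform order statistic medians."""
    top = 0.5 ** (1.0 / n)
    p = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    p[0], p[-1] = 1.0 - top, top
    return p


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def weibull_ppf(p, shape):
    """Standard Weibull percent point function."""
    return (-math.log(1.0 - p)) ** (1.0 / shape)


def lognormal_ppf(p, shape):
    """Standard lognormal percent point function (shape = sigma)."""
    return math.exp(shape * STD_NORMAL.inv_cdf(p))


def ppcc_curve(data, ppf, shapes):
    """Return (shape, correlation) pairs: the points of a PPCC plot."""
    ordered = sorted(data)
    probs = positions(len(ordered))
    return [(s, correlation([ppf(p, s) for p in probs], ordered))
            for s in shapes]


def summarize(curve, tolerance=0.001):
    """Best shape, its correlation, and the shape range within tolerance."""
    best_shape, best_r = max(curve, key=lambda pair: pair[1])
    near = [s for s, r in curve if best_r - r <= tolerance]
    return best_shape, best_r, (min(near), max(near))


# Simulated Weibull data with shape 1.5 (an assumption for illustration).
rng = random.Random(1)
sample = [rng.weibullvariate(1.0, 1.5) for _ in range(1000)]

shapes = [0.1 + 0.05 * k for k in range(99)]  # candidate shapes 0.1 .. 5.0
families = {"weibull": weibull_ppf, "lognormal": lognormal_ppf}
results = {name: summarize(ppcc_curve(sample, ppf, shapes))
           for name, ppf in families.items()}

for name, (shape, r, near_range) in results.items():
    print(name, "best shape:", round(shape, 2), "max PPCC:", round(r, 4),
          "near-optimal shapes:", near_range)
```

The family with the larger maximum correlation is the better candidate, although differences that are small relative to sampling variability should not be over-interpreted.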
|
Disadvantages
|
The disadvantages of this method are:
- It is limited to distributions with a single shape parameter.
- PPCC plots are not widely available in statistical software
packages. Dataplot is an exception and provides PPCC plots
for 40+ distributions. Probability plots are generally
available, but many statistical software packages
provide them for only a limited number of distributions.
- Significance levels for the correlation coefficient
(i.e., critical values such that, if the maximum correlation
exceeds the critical value, the distribution provides an adequate
fit for the data at a given confidence level) have been worked
out for only a limited number of distributions.
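
Where tabulated critical values are not available, one possible workaround is to approximate them by simulation. The sketch below is an assumption-laden illustration rather than a published procedure: the function names are hypothetical, and it uses the normal distribution (which has no shape parameter) only to keep the example short. It draws many samples of size n from the hypothesized distribution, computes the probability plot correlation coefficient for each, and takes a lower percentile of those values as the critical value. For a family with a shape parameter, the simulated statistic would presumably need to be the maximized PPCC.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def positions(n):
    """Filliben's approximation to the uniform order statistic medians."""
    top = 0.5 ** (1.0 / n)
    p = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    p[0], p[-1] = 1.0 - top, top
    return p


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def ppcc(data, ppf):
    """Probability plot correlation coefficient for a fixed distribution."""
    ordered = sorted(data)
    quantiles = [ppf(p) for p in positions(len(ordered))]
    return correlation(quantiles, ordered)


def simulated_critical_value(sampler, ppf, n, level=0.05, trials=2000, seed=0):
    """Approximate the lower `level` percentile of the PPCC under the
    hypothesized distribution by Monte Carlo simulation."""
    rng = random.Random(seed)
    values = sorted(ppcc([sampler(rng) for _ in range(n)], ppf)
                    for _ in range(trials))
    return values[int(level * trials)]


critical = simulated_critical_value(
    sampler=lambda rng: rng.gauss(0.0, 1.0),  # draw from the null distribution
    ppf=STD_NORMAL.inv_cdf,                   # its percent point function
    n=20,
)
print("approximate 5% critical value for n = 20:", round(critical, 3))
```

A sample whose PPCC falls below the simulated critical value would be judged inconsistent with the hypothesized distribution at roughly the chosen level; the precision of the critical value depends on the number of trials.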
|
Other Graphical Methods
|
For reliability applications, the
hazard plot and the
Weibull plot are alternative graphical
methods that are commonly used to estimate parameters.
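
As a brief illustration of the Weibull plot only (the hazard plot is not covered here), the sketch below assumes a two-parameter Weibull distribution (location zero), complete (uncensored) data, and median-rank plotting positions (i - 0.3)/(n + 0.4). It relies on the linearization ln(-ln(1 - F(t))) = shape * ln(t) - shape * ln(scale), so the slope of the fitted line estimates the shape and the intercept yields the scale. The function name `weibull_plot_fit` and the simulated data are hypothetical.

```python
import math
import random


def weibull_plot_fit(times):
    """Estimate Weibull shape and scale from the line fitted to a Weibull plot.

    Assumes positive, uncensored failure times and a location of zero.
    """
    ordered = sorted(times)
    n = len(ordered)
    x = [math.log(t) for t in ordered]
    # Median-rank plotting positions on the Weibull-plot vertical axis.
    y = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
         for i in range(1, n + 1)]
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    shape = slope
    scale = math.exp(-intercept / slope)
    return shape, scale


# Simulated failure times with shape 2 and scale 3 (illustration only).
rng = random.Random(2)
failure_times = [rng.weibullvariate(3.0, 2.0) for _ in range(1000)]

shape_hat, scale_hat = weibull_plot_fit(failure_times)
print("shape:", round(shape_hat, 2), "scale:", round(scale_hat, 2))
```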
|