
PREDICTION BOUNDS

Name:
    PREDICTION BOUNDS
Type:
    Analysis Command
Purpose:
    Generates prediction bounds for all m new observations given a previous sample.
Description:
    Given a sample of n observations with mean \( \bar{x} \) and standard deviation s, the two-sided prediction interval that contains all m new independent, identically distributed observations with confidence \( 1 - \alpha \) is

      \( \bar{x} \pm r_{(1 - \alpha;m,n)} s \)

    A conservative approximation for \( r_{(1 - \alpha;m,n)} \) is

      \( \sqrt{1 + \frac{1}{n}} t_{(1 - \alpha/(2m);n-1)} \)

    where \( t_{(p;\nu)} \) denotes the t percent point function evaluated at probability p with \( \nu \) degrees of freedom. Dataplot uses the tabulated values given in Table A.13 of Hahn and Meeker when n and m are both less than or equal to 10; otherwise, it uses the approximation above.
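
    The approximation has the Bonferroni form: each of the m new observations receives its own two-sided t prediction interval at level \( 1 - \alpha/m \), which guarantees joint coverage of at least \( 1 - \alpha \). As a rough cross-check, the following Python sketch evaluates it with the standard library only. The function names (t_cdf, t_ppf, two_sided_factor) are illustrative and are not Dataplot commands, and the numerical method for the t percent point function (Simpson's rule plus bisection) is simply a choice made here to keep the example self-contained.

```python
import math


def t_cdf(t, df, steps=2000):
    """Student-t CDF, integrated with Simpson's rule after the
    substitution t = sqrt(df) * tan(theta)."""
    theta = math.atan(abs(t) / math.sqrt(df))
    h = theta / steps  # steps must be even for Simpson's rule
    total = 1.0 + math.cos(theta) ** (df - 1)  # the two endpoints
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(math.pi))
    half = math.exp(log_c) * total * h / 3.0  # P(0 < T < |t|)
    return 0.5 + half if t >= 0 else 0.5 - half


def t_ppf(p, df):
    """t percent point function (quantile) for 0 < p < 1, by bisection."""
    if p < 0.5:
        return -t_ppf(1.0 - p, df)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def two_sided_factor(n, m, alpha):
    """Conservative approximation to r(1-alpha; m, n)."""
    return math.sqrt(1.0 + 1.0 / n) * t_ppf(1.0 - alpha / (2.0 * m), n - 1)


# Summary statistics as printed in Program 1 (rounded to 5 decimals).
xbar, s, n, m = 9.26146, 0.02278, 195, 5
r = two_sided_factor(n, m, alpha=0.05)
print(f"r = {r:.4f}, bounds = ({xbar - r * s:.5f}, {xbar + r * s:.5f})")
```

    With the rounded Program 1 summary statistics, the 95% bounds from this sketch should agree with the listed 9.20202 and 9.32089 to roughly four decimal places; small differences are expected because the inputs are rounded.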

    The corresponding one-sided interval is

      \( \mbox{lower limit} = \bar{x} - r'_{(1 - \alpha;m,n)} s \)

      \( \mbox{upper limit} = \bar{x} + r'_{(1 - \alpha;m,n)} s \)

    A conservative approximation for \( r'_{(1 - \alpha;m,n)} \) is

      \( \sqrt{1 + \frac{1}{n}} t_{(1 - \alpha/m;n-1)} \)

    where \( t_{(p;\nu)} \) denotes the t percent point function evaluated at probability p with \( \nu \) degrees of freedom. Dataplot uses the tabulated values given in Table A.14 of Hahn and Meeker when n and m are both less than or equal to 10; otherwise, it uses the approximation above.
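
    When the approximations (rather than the tabulated values) are in use, the one-sided and two-sided factors are directly related: substituting \( 2\alpha \) for \( \alpha \) in the two-sided approximation gives the one-sided approximation,

      \( \sqrt{1 + \frac{1}{n}} t_{(1 - (2\alpha)/(2m);n-1)} = \sqrt{1 + \frac{1}{n}} t_{(1 - \alpha/m;n-1)} \)

    so an approximate one-sided bound at confidence \( 1 - \alpha \) equals the corresponding limit of the approximate two-sided interval at confidence \( 1 - 2\alpha \). This can be seen in the output of Program 1 (n = 195), where the two-sided 90% lower limit and the one-sided 95% lower limit are both 9.20786, and the two-sided 80% and one-sided 90% lower limits are both 9.21422. The relationship is not claimed for the tabulated values.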

    In the formulas above, the new observations enter only through their number, m, so the bounds can be computed before the new data are collected. The value of m is specified with the command

      LET NNEW = <value>

    If NNEW is not defined, then a value of 1 is used.

    The difference between the PREDICTION BOUNDS command and the PREDICTION LIMITS command is that the PREDICTION LIMITS command generates a prediction interval for the mean of m new observations while the PREDICTION BOUNDS command generates a prediction interval to contain all of the new observations.
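
    For comparison, the usual normal-theory prediction interval for the mean of m new observations, as given in standard references such as Hahn and Meeker, is shown below. It is included here only to illustrate the contrast; the exact form used by Dataplot is documented under the PREDICTION LIMITS command.

      \( \bar{x} \pm t_{(1 - \alpha/2;n-1)} s \sqrt{\frac{1}{m} + \frac{1}{n}} \)

    As m grows, this interval for the mean narrows (the 1/m term shrinks), whereas the approximate factor for the bounds on all m observations widens (the probability \( 1 - \alpha/(2m) \) moves further into the tail). For m = 1 the two expressions coincide.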

    These prediction bounds assume that the underlying data are approximately normally distributed. Because of the central limit theorem, prediction limits for the mean are fairly robust against non-normality. However, the central limit theorem does not apply to an interval that must cover every one of the new observations, so the PREDICTION BOUNDS command is much more sensitive to non-normality than the PREDICTION LIMITS command.

Syntax 1:
    <LOWER/UPPER> <LOGNORMAL/BOXCOX> PREDICTION BOUNDS <y>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y> is the response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    If LOWER is specified, a one-sided lower prediction limit is returned. If UPPER is specified, a one-sided upper prediction limit is returned. If neither is specified, a two-sided limit is returned.

    If the keyword LOGNORMAL is present, the log of the data will be taken, then the normal prediction bounds will be computed, and then the computed normal lower and upper limits will be exponentiated to obtain the lognormal prediction bounds.

    Similarly, if the keyword BOXCOX is present, a Box-Cox transformation to normality will be applied to the data before computing the normal prediction bounds. The computed lower and upper limits will then be transformed back to the original scale.
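
    The transform, compute, back-transform sequence described for LOGNORMAL and BOXCOX can be sketched in Python as follows. This is an illustration under stated assumptions, not Dataplot's implementation: the factor r and the Box-Cox parameter lam are supplied by the caller (how Dataplot selects the Box-Cox parameter is not described here), the sample standard deviation uses the n - 1 divisor, and the function names are hypothetical.

```python
import math
import statistics


def boxcox(y, lam):
    """Box-Cox transform of a positive value; lam = 0 is the log transform."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def inv_boxcox(z, lam):
    """Inverse of boxcox(); for lam != 0 it requires lam * z + 1 > 0."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)


def transformed_bounds(data, r, lam=0.0):
    """Two-sided bounds computed on the transformed scale and mapped back.

    r   -- prediction-bound factor, supplied by the caller (assumption)
    lam -- Box-Cox parameter, supplied by the caller (assumption);
           lam = 0 corresponds to the LOGNORMAL case
    """
    z = [boxcox(y, lam) for y in data]
    zbar = statistics.mean(z)
    s = statistics.stdev(z)  # sample standard deviation (n - 1 divisor)
    return inv_boxcox(zbar - r * s, lam), inv_boxcox(zbar + r * s, lam)


# Made-up data whose logs are 1, 2, 3 (mean 2, standard deviation 1).
data = [math.exp(1), math.exp(2), math.exp(3)]
lower, upper = transformed_bounds(data, r=2.0, lam=0.0)
print(lower, upper)
```

    Because the back-transformation is nonlinear, the resulting bounds are generally not symmetric about the mean of the original data.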

    This syntax supports matrix arguments for the response variable.

Syntax 2:
    MULTIPLE <LOWER/UPPER> <LOGNORMAL/BOXCOX>
                            PREDICTION BOUNDS <y1> ... <yk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> ... <yk> is a list of 1 to 30 response variables;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax will generate a prediction interval for each of the response variables.

    If LOWER is specified, a one-sided lower prediction limit is returned. If UPPER is specified, a one-sided upper prediction limit is returned. If neither is specified, a two-sided limit is returned.

    If the keyword LOGNORMAL is present, the log of the data will be taken, then the normal prediction bounds will be computed, and then the computed normal lower and upper limits will be exponentiated to obtain the lognormal prediction bounds.

    Similarly, if the keyword BOXCOX is present, a Box-Cox transformation to normality will be applied to the data before computing the normal prediction bounds. The computed lower and upper limits will then be transformed back to the original scale.

    This syntax supports matrix arguments for the response variables.

Syntax 3:
    REPLICATED <LOWER/UPPER> <LOGNORMAL/BOXCOX>
                            PREDICTION BOUNDS <y> <x1> ... <xk>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y> is the response variable;
                <x1> ... <xk> is a list of 1 to 6 group-id variables;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax performs a cross-tabulation of the <x1> ... <xk> and generates a prediction interval for each unique combination of the cross-tabulated values. For example, if X1 has 3 levels and X2 has 2 levels, six prediction intervals will be generated.
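
    The cross-tabulation step can be pictured with the following Python sketch, which groups the response by each observed combination of group-id values and returns the summary statistics that the bounds are built from. The data and the function name group_summaries are made up for illustration; only combinations that actually occur in the data produce a cell here, which is an assumption about how empty cells are treated.

```python
from collections import defaultdict
import statistics


def group_summaries(y, *factors):
    """Return {level combination: (n, mean, sd)} for each observed cell
    of the cross-tabulation of the group-id variables."""
    cells = defaultdict(list)
    for i, value in enumerate(y):
        key = tuple(f[i] for f in factors)
        cells[key].append(value)
    return {
        key: (len(v), statistics.mean(v), statistics.stdev(v))
        for key, v in sorted(cells.items())
    }


# Made-up data: X1 has 3 levels, X2 has 2 levels, 2 observations per cell.
x1 = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
x2 = [1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2]
y = [9.8, 10.2, 10.1, 9.9, 10.4, 10.0, 9.7, 10.3, 10.2, 9.6, 10.0, 10.4]

summaries = group_summaries(y, x1, x2)
for key, (n, mean, sd) in summaries.items():
    print(key, n, round(mean, 3), round(sd, 3))
```

    Each cell's n, mean, and standard deviation would then be used in the prediction bounds formula separately, giving one interval per cell (six in this example).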

    If LOWER is specified, a one-sided lower prediction limit is returned. If UPPER is specified, a one-sided upper prediction limit is returned. If neither is specified, a two-sided limit is returned.

    If the keyword LOGNORMAL is present, the log of the data will be taken, then the normal prediction bounds will be computed, and then the computed normal lower and upper limits will be exponentiated to obtain the lognormal prediction bounds.

    Similarly, if the keyword BOXCOX is present, a Box-Cox transformation to normality will be applied to the data before computing the normal prediction bounds. The computed lower and upper limits will then be transformed back to the original scale.

    This syntax does not support matrix arguments.

Examples:
    PREDICTION BOUNDS Y1
    PREDICTION BOUNDS Y1 SUBSET TAG > 2
    MULTIPLE PREDICTION BOUNDS Y1 TO Y5
    REPLICATED PREDICTION BOUNDS Y X
Note:
    A table of prediction bounds is printed for confidence levels of 50.0, 80.0, 90.0, 95.0, 99.0, and 99.9 percent.
Note:
    In addition to the PREDICTION BOUNDS command, the following commands can also be used:

      LET ALPHA = 0.05
      LET NNEW = <value>

      LET A = LOWER PREDICTION BOUNDS Y
      LET A = UPPER PREDICTION BOUNDS Y
      LET A = ONE SIDED LOWER PREDICTION BOUNDS Y
      LET A = ONE SIDED UPPER PREDICTION BOUNDS Y

      LET A = SUMMARY LOWER PREDICTION BOUNDS YMEAN YSD N
      LET A = SUMMARY UPPER PREDICTION BOUNDS YMEAN YSD N
      LET A = SUMMARY ONE SIDED LOWER PREDICTION BOUNDS YMEAN YSD N
      LET A = SUMMARY ONE SIDED UPPER PREDICTION BOUNDS YMEAN YSD N

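    The first two commands specify the significance level and the number of new observations. The next four commands are used when you have raw data. The last four commands are used when only summary data (mean, standard deviation, sample size) are available. In both groups, the LOWER and UPPER forms return the limits of the two-sided interval, while the ONE SIDED LOWER and ONE SIDED UPPER forms return the one-sided bounds (compare the values returned in Program 3).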

    In addition to the above LET commands, built-in statistics are supported for about 20 different commands (enter HELP STATISTICS for details).

Default:
    None
Synonyms:
    None
Related Commands:
    PREDICTION LIMITS 	= 	Generate a prediction interval for the mean of new observations.
Reference:
    Hahn and Meeker (1991), "Statistical Intervals: A Guide for Practitioners," Wiley, pp. 62-63.
Applications:
    Confirmatory Data Analysis
Implementation Date:
    2013/04
Program 1:
     
    SKIP 25
    READ ZARR13.DAT Y
    SET WRITE DECIMALS 5
    LET NNEW = 5
    .
    PREDICTION BOUNDS Y
    LOWER PREDICTION BOUNDS Y
    UPPER PREDICTION BOUNDS Y
        
    The following output is generated:
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
     
    Summary Statistics:
    Number of Observations:                             195
    Sample Mean:                                    9.26146
    Sample Standard Deviation:                      0.02278
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            50.0        9.22370        9.29922
            80.0        9.21422        9.30870
            90.0        9.20786        9.31505
            95.0        9.20202        9.32089
            99.0        9.18988        9.33303
            99.9        9.17483        9.34808
     
     
                One-Sided Lower Prediction Bounds for All Observations
     
    Response Variable: Y
     
    Summary Statistics:
    Number of Observations:                             195
    Sample Mean:                                    9.26146
    Sample Standard Deviation:                      0.02278
    Number of New Observations:                           5
     
     
     
    One-Sided Lower Prediction Bounds for All Observations
    ---------------------------
      Confidence          Lower
       Value (%)          Limit
    ---------------------------
            50.0        9.23208
            80.0        9.22125
            90.0        9.21422
            95.0        9.20786
            99.0        9.19490
            99.9        9.17914
     
     
                One-Sided Upper Prediction Bounds for All Observations
     
    Response Variable: Y
     
    Summary Statistics:
    Number of Observations:                             195
    Sample Mean:                                    9.26146
    Sample Standard Deviation:                      0.02278
    Number of New Observations:                           5
     
     
     
    One-Sided Upper Prediction Bounds for All Observations
    ---------------------------
      Confidence          Upper
       Value (%)          Limit
    ---------------------------
            50.0        9.29084
            80.0        9.30166
            90.0        9.30870
            95.0        9.31505
            99.0        9.32801
            99.9        9.34377
        
Program 2:
     
    SKIP 25
    READ GEAR.DAT Y X
    SET WRITE DECIMALS 5
    LET NNEW = 5
    .
    REPLICATED PREDICTION BOUNDS Y X
     
        
    The following output is generated:
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            1.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    0.99800
    Sample Standard Deviation:                      0.00434
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.98560        1.01039
            95.0        0.98356        1.01243
            99.0        0.97869        1.01730
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            2.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    0.99910
    Sample Standard Deviation:                      0.00521
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.98422        1.01397
            95.0        0.98177        1.01642
            99.0        0.97592        1.02227
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            3.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    0.99540
    Sample Standard Deviation:                      0.00397
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.98405        1.00674
            95.0        0.98219        1.00860
            99.0        0.97773        1.01306
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            4.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    0.99820
    Sample Standard Deviation:                      0.00385
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.98721        1.00918
            95.0        0.98540        1.01099
            99.0        0.98108        1.01531
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            5.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    0.99190
    Sample Standard Deviation:                      0.00757
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.97029        1.01350
            95.0        0.96673        1.01706
            99.0        0.95823        1.02556
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            6.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    0.99879
    Sample Standard Deviation:                      0.00988
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.97061        1.02698
            95.0        0.96596        1.03163
            99.0        0.95488        1.04271
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            7.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    1.00150
    Sample Standard Deviation:                      0.00787
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.97904        1.02395
            95.0        0.97533        1.02766
            99.0        0.96650        1.03649
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            8.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    1.00039
    Sample Standard Deviation:                      0.00362
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.99005        1.01074
            95.0        0.98835        1.01244
            99.0        0.98428        1.01651
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                            9.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    0.99829
    Sample Standard Deviation:                      0.00413
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.98650        1.01009
            95.0        0.98455        1.01204
            99.0        0.97991        1.01668
     
     
                Two-Sided Prediction Bounds for All Observations
     
    Response Variable: Y
    Factor Variable 1: X                           10.00000
     
    Summary Statistics:
    Number of Observations:                              10
    Sample Mean:                                    0.99479
    Sample Standard Deviation:                      0.00532
    Number of New Observations:                           5
     
     
     
    Two-Sided Prediction Bounds for All Observations
    ------------------------------------------
      Confidence          Lower          Upper
       Value (%)          Limit          Limit
    ------------------------------------------
            90.0        0.97960        1.00999
            95.0        0.97710        1.01249
            99.0        0.97112        1.01847
        
Program 3:
     
    .  Following example from Hahn and Meeker's book.
    .
    let ymean = 50.10
    let ysd   = 1.31
    let n1    = 5
    let nnew  = 3
    let alpha = 0.05
    .
    set write decimals 5
    let slow1 = summary lower prediction bounds ymean ysd n1
    let supp1 = summary upper prediction bounds ymean ysd n1
    let slow2 = summary one sided lower prediction bounds ymean ysd n1
    let supp2 = summary one sided upper prediction bounds ymean ysd n1
    print slow1 supp1 slow2 supp2
        
    The following output is generated:
     PARAMETERS AND CONSTANTS--
    
        SLOW1   --       44.74603
        SUPP1   --       55.45397
        SLOW2   --       45.75080
        SUPP2   --       54.44920
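
    Because the limits have the form \( \bar{x} \pm r s \), the factors behind this output can be recovered by simple arithmetic. Here n = 5 and m = 3 are both at most 10, so these would be the tabulated values rather than the approximation. SLOW1 and SUPP1 are the limits of the two-sided interval, and SLOW2 and SUPP2 are the one-sided bounds:

      \( r_{(0.95;3,5)} = (55.45397 - 50.10)/1.31 = 4.087 \)

      \( r'_{(0.95;3,5)} = (54.44920 - 50.10)/1.31 = 3.320 \)

    The lower limits are consistent with the same factors: 50.10 - 4.087(1.31) = 44.74603 and 50.10 - 3.320(1.31) = 45.75080.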
        
Date created: 04/15/2013
Last updated: 12/11/2023

Please email comments on this WWW page to alan.heckert@nist.gov.