# SD CONFIDENCE LIMITS

Name:
SD CONFIDENCE LIMITS
Type:
Analysis Command
Purpose:
Generates a confidence interval for the standard deviation.
Description:
Given a sample of n observations with standard deviation s, the two-sided confidence interval for the standard deviation is

$$\mbox{lower confidence limit} = s \sqrt{\frac{n-1}{\chi^{2}_{(1-\alpha/2;n-1)}}}$$

$$\mbox{upper confidence limit} = s \sqrt{\frac{n-1}{\chi^{2}_{(\alpha/2;n-1)}}}$$

with $$\chi^{2}_{(p;\nu)}$$ denoting the percent point function of the chi-square distribution evaluated at cumulative probability p with $$\nu$$ degrees of freedom. In these formulas, $$\alpha$$ is less than 0.5 (e.g., for a 95% confidence interval, $$\alpha$$ = 0.05).

The one-sided lower confidence limit is

$$\mbox{lower confidence limit} = s \sqrt{\frac{n-1}{\chi^{2}_{(1-\alpha;n-1)}}}$$

The one-sided upper confidence limit is

$$\mbox{upper confidence limit} = s \sqrt{\frac{n-1}{\chi^{2}_{(\alpha;n-1)}}}$$
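The four formulas above differ only in the probability passed to the chi-square percent point function, which the following Python sketch tries to make explicit. It is not Dataplot code: the helper names `chi2_cdf`, `chi2_ppf`, and `sd_limit` are illustrative assumptions, and the percent point function is approximated with a series and bisection so that only the standard library is needed.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0
    k = 1
    # Sum z^k / ((a+1)(a+2)...(a+k)) until the terms stop contributing.
    while term > 1e-17 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))


def chi2_ppf(p, df):
    """Chi-square percent point function (inverse CDF), found by bisection."""
    lo, hi = 0.0, df + 10.0
    while chi2_cdf(hi, df) < p:  # widen the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sd_limit(s, n, p):
    """s * sqrt((n-1) / chi2_ppf(p, n-1)): the common form of all four limits."""
    return s * math.sqrt((n - 1) / chi2_ppf(p, n - 1))


# Summary statistics (standard deviation, sample size) and significance level.
s, n, alpha = 1.31, 5, 0.05

two_sided_lower = sd_limit(s, n, 1.0 - alpha / 2.0)
two_sided_upper = sd_limit(s, n, alpha / 2.0)
one_sided_lower = sd_limit(s, n, 1.0 - alpha)
one_sided_upper = sd_limit(s, n, alpha)

print(two_sided_lower, two_sided_upper, one_sided_lower, one_sided_upper)
```

The inputs are the summary statistics from the Hahn and Meeker example (Program 3 on this page), so the printed values should agree with that program's output to about four decimal places.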

This confidence interval is based on the assumption that the underlying data is approximately normally distributed. The confidence interval for the standard deviation is highly sensitive to non-normality in the data. It is recommended that the original data be tested for normality before using these normal based intervals. If the data is not approximately normal, an alternative is to use the command

BOOTSTRAP STANDARD DEVIATION PLOT Y
Syntax 1:
<LOWER/UPPER> SD CONFIDENCE LIMITS <y>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

If LOWER is specified, a one-sided lower confidence limit is returned. If UPPER is specified, a one-sided upper confidence limit is returned. If neither is specified, a two-sided limit is returned.

This syntax supports matrix arguments for the response variable.

Syntax 2:
MULTIPLE <LOWER/UPPER> SD CONFIDENCE LIMITS <y1> ... <yk>
<SUBSET/EXCEPT/FOR qualification>
where <y1> .... <yk> is a list of 1 to 30 response variables;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax will generate a confidence interval for each of the response variables. The word MULTIPLE is optional. That is,

MULTIPLE SD CONFIDENCE LIMITS Y1 Y2 Y3

is equivalent to

SD CONFIDENCE LIMITS Y1 Y2 Y3

If LOWER is specified, a one-sided lower confidence limit is returned. If UPPER is specified, a one-sided upper confidence limit is returned. If neither is specified, a two-sided limit is returned.

This syntax supports matrix arguments for the response variables.

Syntax 3:
REPLICATED <LOWER/UPPER> SD CONFIDENCE LIMITS <y> <x1> ... <xk>
<SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> .... <xk> is a list of 1 to 6 group-id variables;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax performs a cross-tabulation of the <x1> ... <xk> and generates a confidence interval for each unique combination of the cross-tabulated values. For example, if X1 has 3 levels and X2 has 2 levels, six confidence intervals will be generated.

If LOWER is specified, a one-sided lower confidence limit is returned. If UPPER is specified, a one-sided upper confidence limit is returned. If neither is specified, a two-sided limit is returned.

This syntax does not support matrix arguments.
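The cross-tabulation performed by the REPLICATED syntax can be pictured with a short Python sketch. It is not Dataplot code, and the data values are made up for illustration. It groups the response by each unique combination of two group-id variables (3 levels by 2 levels) and collects the sample size and standard deviation that each cell's interval would be based on.

```python
from collections import defaultdict
from statistics import stdev

# Hypothetical data: y is the response, x1 has 3 levels, x2 has 2 levels.
y = [9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.2, 10.6, 9.5, 10.0, 10.1]
x1 = [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3]
x2 = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]

# Collect the responses that fall in each (x1, x2) cell.
cells = defaultdict(list)
for yi, key in zip(y, zip(x1, x2)):
    cells[key].append(yi)

# One (n, s) pair per cell; each pair would feed one confidence interval.
summary = {key: (len(vals), stdev(vals)) for key, vals in sorted(cells.items())}

for key, (n, s) in summary.items():
    print(key, n, round(s, 4))
```

With 3 levels of `x1` and 2 levels of `x2`, the loop prints six rows, one per interval.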

Examples:
SD CONFIDENCE LIMITS Y1
SD CONFIDENCE LIMITS Y1 SUBSET TAG > 2
SD CONFIDENCE LIMITS Y1 TO Y5
REPLICATED SD CONFIDENCE LIMITS Y X
Note:
A table of confidence limits is printed for confidence levels of 50.0, 80.0, 90.0, 95.0, 99.0, and 99.9 percent.
Note:
As noted above, the confidence interval for the standard deviation is very sensitive to non-normality in the data. Bonett (2006) has proposed an interval that is nearly exact when the data is normally distributed and provides good performance for moderately non-normal data. The interval for the variance (for the standard deviation, take the square root of these values) is

$$\exp(\log(c s^2) - z_{\alpha/2} se) < \sigma^2 < \exp(\log(c s^2) + z_{\alpha/2} se)$$

where

s = sample standard deviation

c = $$\frac{n}{n - z_{\alpha/2}}$$

z = the normal percent point function

se = the standard error = $$c \sqrt{\frac{\hat{\gamma}_{4} - (n-3)/n}{n-1}}$$

$$\hat{\gamma}_{4}$$ = an adjusted estimate of kurtosis = $$\frac{n \sum_{i=1}^{n}{(X_{i} - m)^4}} {(\sum_{i=1}^{n}{(X_{i} - \bar{X})^2})^2}$$

m = trimmed mean with trim proportion equal to $$\frac{1}{2 \sqrt{n-4}}$$

The use of the trimmed mean reduces the bias of the kurtosis estimate for heavy tailed and skewed data.

The justification and derivation of this interval are given in Bonett's paper.
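As a rough illustration of how the pieces of the Bonett interval fit together, here is a Python sketch using only the standard library. It is not Dataplot's implementation. The function name `bonett_sd_limits` is hypothetical, and two points are assumptions: the trimmed mean drops floor(p·n) observations from each tail, and $$z_{\alpha/2}$$ is taken as the positive upper critical value of the standard normal distribution.

```python
import math
from statistics import NormalDist, mean


def bonett_sd_limits(data, alpha=0.05):
    """Two-sided Bonett limits for the standard deviation (requires n > 4)."""
    x = sorted(data)
    n = len(x)
    # ASSUMPTION: z is the positive (upper-tail) normal critical value.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)

    # Trimmed mean with trim proportion 1 / (2 * sqrt(n - 4)).
    # ASSUMPTION: floor(p * n) observations are dropped from each tail.
    p = 1.0 / (2.0 * math.sqrt(n - 4))
    g = int(p * n)
    m = mean(x[g:n - g])

    xbar = mean(x)
    ss = sum((xi - xbar) ** 2 for xi in x)  # sum of squares about the mean
    s2 = ss / (n - 1)                       # sample variance

    # Adjusted kurtosis: fourth powers about the trimmed mean.
    gamma4 = n * sum((xi - m) ** 4 for xi in x) / ss ** 2

    c = n / (n - z)
    se = c * math.sqrt((gamma4 - (n - 3) / n) / (n - 1))

    # Limits for the variance on the log scale, then square roots for the SD.
    center = math.log(c * s2)
    lower = math.sqrt(math.exp(center - z * se))
    upper = math.sqrt(math.exp(center + z * se))
    return lower, upper


# Data from the Bonett example used in the last program on this page.
y = [15.83, 16.01, 16.24, 16.42, 15.33, 15.44, 16.88, 16.31]
lcl, ucl = bonett_sd_limits(y, alpha=0.05)
print(round(lcl, 4), round(ucl, 4))
```

Under these assumptions the sketch should land close to the LCL and UCL values that Dataplot reports for the same data in the Bonett example program; small differences in the last digit are possible because Dataplot's trimming rule is not spelled out here.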

To request that the Bonett interval be generated, enter

SET BONETT STANDARD DEVIATION CONFIDENCE LIMITS ON

Based on simulation studies by Bonett, this interval results in greatly improved coverage properties for moderately non-normal data. For more extreme non-normality, large sample sizes may be required to obtain good coverage properties. Often a transformation to reduce skewness (which may reduce the heavy tailedness as well), such as the LOG or square root, can significantly reduce the sample size required to obtain good coverage properties.

Bonett also suggests that improved estimates for the kurtosis can significantly improve the coverage properties. Bonett's example is quality control, where extensive historical data are frequently available. If a prior estimate of kurtosis is available, it is pooled with the kurtosis estimate from the data using

$$\hat{\gamma}_{4}^{*} = \frac{n_{0} \tilde{\gamma}_{4} + n \hat{\gamma}_{4}} {n_{0} + n}$$

with $$\tilde{\gamma}_{4}$$ and $$n_{0}$$ denoting the prior estimate of kurtosis and the associated sample size, respectively. Bonett gives guidance on pooling multiple estimates of kurtosis based on several small samples (it is often the case in quality control applications that a large number of small samples are available).
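The pooling formula is a sample-size-weighted average of the two kurtosis estimates, as the small Python sketch below shows. The function name and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def pooled_kurtosis(prior_kurtosis, n0, sample_kurtosis, n):
    """Weight the prior and current kurtosis estimates by their sample sizes."""
    return (n0 * prior_kurtosis + n * sample_kurtosis) / (n0 + n)


# Hypothetical values: a prior estimate of 3.0 from 100 historical observations
# and a current estimate of 4.2 from 20 new observations.
pooled = pooled_kurtosis(prior_kurtosis=3.0, n0=100, sample_kurtosis=4.2, n=20)
print(pooled)
```

Here the pooled value is (300 + 84) / 120, about 3.2, which stays much closer to the prior estimate because the prior rests on five times as many observations.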

In Dataplot, you can specify a prior estimate of kurtosis by entering the commands

LET KURTOSIS = <value>
LET N0 = <value>

Niwitpong and Kirdwichai (2008) suggested two modifications to Bonett's interval to improve the coverage for data that are skewed and heavy tailed. Specifically, the following modifications are made to Bonett's interval

1. Use the median instead of the trimmed mean to compute the sample kurtosis.

2. Use the t percent point function rather than the normal percent point function.

This interval will be more conservative than the original Bonett interval. Based on simulation studies in their paper, these adjustments can improve the coverage for data that are skewed. However, the adjusted interval may be overly conservative for data that are only moderately non-normal.

To use the Niwitpong and Kirdwichai adjusted interval, enter

SET BONETT STANDARD DEVIATION CONFIDENCE LIMITS ...
Note:
In addition to the STANDARD DEVIATION CONFIDENCE LIMIT command, the following commands can also be used:

LET ALPHA = 0.05

LET A = LOWER STANDARD DEVIATION CONFIDENCE LIMIT Y
LET A = UPPER STANDARD DEVIATION CONFIDENCE LIMIT Y
LET A = LOWER BONETT STANDARD DEVIATION CONFIDENCE LIMIT Y
LET A = UPPER BONETT STANDARD DEVIATION CONFIDENCE LIMIT Y
LET A = ONE SIDED LOWER STANDARD DEVIATION CONFIDENCE LIMIT Y
LET A = ONE SIDED UPPER STANDARD DEVIATION CONFIDENCE LIMIT Y

LET A = SUMMARY LOWER STANDARD DEVIATION CONFIDENCE LIMIT YSD N
LET A = SUMMARY UPPER STANDARD DEVIATION CONFIDENCE LIMIT YSD N
LET A = SUMMARY ONE SIDED LOWER STANDARD DEVIATION CONFIDENCE LIMIT YSD N
LET A = SUMMARY ONE SIDED UPPER STANDARD DEVIATION CONFIDENCE LIMIT YSD N

The first command specifies the significance level. The next six commands are used when you have raw data. The last four commands are used when only summary data (standard deviation, sample size) is available.

In addition to the above LET commands, built-in statistics are supported for about 20 different commands (enter HELP STATISTICS for details).

Default:
None
Synonyms:
STANDARD DEVIATION CONFIDENCE INTERVAL is a synonym for STANDARD DEVIATION CONFIDENCE LIMITS

SD CONFIDENCE LIMIT is a synonym for STANDARD DEVIATION CONFIDENCE LIMIT

Related Commands:
SD PREDICTION LIMITS = Generate a prediction limit for the standard deviation.
CONFIDENCE LIMITS = Generate a confidence limit for the mean.
PREDICTION LIMITS = Generate prediction limits for the mean.
PREDICTION BOUNDS = Generate prediction limits to cover all new observations.
TOLERANCE LIMITS = Generate a tolerance limit.
References:
Hahn and Meeker (1991), "Statistical Intervals: A Guide for Practitioners," Wiley, pp. 55-56.

Bonett (2006), "Approximate Confidence Interval for Standard Deviation of Nonnormal Distributions", Computational Statistics and Data Analysis, Vol. 50, pp. 775 - 782.

Niwitpong and Kirdwichai (2008), "Adjusted Bonett Confidence Interval for Standard Deviation of Non-Normal Distributions", Thailand Statistician, Vol. 6, No. 1, pp. 1-6.

Applications:
Confirmatory Data Analysis
Implementation Date:
2013/04
2017/12: Support for Bonett intervals
Program 1:

SKIP 25
SET WRITE DECIMALS 5
.
SD CONFIDENCE LIMITS Y
LOWER SD CONFIDENCE LIMITS Y
UPPER SD CONFIDENCE LIMITS Y

The following output is generated.
            Two-Sided Confidence Limits for the SD

Response Variable: Y

Summary Statistics:
Number of Observations:                             195
Sample Mean:                                    9.26146
Sample Standard Deviation:                      0.02278

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.02206        0.02363
80.0        0.02141        0.02440
90.0        0.02104        0.02487
95.0        0.02072        0.02530
99.0        0.02013        0.02617
99.9        0.01948        0.02725

One-Sided Lower Confidence Limits for the SD

Response Variable: Y

Summary Statistics:
Number of Observations:                             195
Sample Mean:                                    9.26146
Sample Standard Deviation:                      0.02278

One-Sided Lower Confidence Limits for the SD
---------------------------
Confidence          Lower
Value (%)          Limit
---------------------------
50.0        0.02282
80.0        0.02188
90.0        0.02141
95.0        0.02104
99.0        0.02037
99.9        0.01966

One-Sided Upper Confidence Limits for the SD

Response Variable: Y

Summary Statistics:
Number of Observations:                             195
Sample Mean:                                    9.26146
Sample Standard Deviation:                      0.02278

One-Sided Upper Confidence Limits for the SD
---------------------------
Confidence          Upper
Value (%)          Limit
---------------------------
50.0        0.02282
80.0        0.02384
90.0        0.02440
95.0        0.02487
99.0        0.02581
99.9        0.02694

Program 2:

SKIP 25
.
SET WRITE DECIMALS 5
REPLICATED SD CONFIDENCE LIMITS Y X

The following output is generated.
            Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     1.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             0.99800
Sample Standard Deviation:               0.00435

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00386        0.00537
80.0        0.00340        0.00639
90.0        0.00317        0.00715
95.0        0.00299        0.00793
99.0        0.00268        0.00990
99.9        0.00239        0.01323

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     2.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             0.99910
Sample Standard Deviation:               0.00522

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00464        0.00644
80.0        0.00408        0.00767
90.0        0.00380        0.00858
95.0        0.00359        0.00952
99.0        0.00322        0.01188
99.9        0.00287        0.01588

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     3.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             0.99540
Sample Standard Deviation:               0.00398

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00354        0.00491
80.0        0.00311        0.00584
90.0        0.00290        0.00654
95.0        0.00274        0.00726
99.0        0.00246        0.00906
99.9        0.00219        0.01211

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     4.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             0.99820
Sample Standard Deviation:               0.00385

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00343        0.00476
80.0        0.00302        0.00566
90.0        0.00281        0.00634
95.0        0.00265        0.00703
99.0        0.00238        0.00878
99.9        0.00212        0.01173

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     5.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             0.99190
Sample Standard Deviation:               0.00758

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00674        0.00936
80.0        0.00593        0.01114
90.0        0.00553        0.01247
95.0        0.00521        0.01384
99.0        0.00468        0.01726
99.9        0.00417        0.02306

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     6.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             0.99880
Sample Standard Deviation:               0.00989

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00879        0.01221
80.0        0.00774        0.01453
90.0        0.00721        0.01626
95.0        0.00680        0.01805
99.0        0.00611        0.02252
99.9        0.00545        0.03009

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     7.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             1.00150
Sample Standard Deviation:               0.00788

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00700        0.00973
80.0        0.00617        0.01158
90.0        0.00575        0.01296
95.0        0.00542        0.01438
99.0        0.00487        0.01794
99.9        0.00434        0.02397

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     8.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             1.00040
Sample Standard Deviation:               0.00363

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00322        0.00448
80.0        0.00284        0.00533
90.0        0.00265        0.00597
95.0        0.00249        0.00662
99.0        0.00224        0.00826
99.9        0.00200        0.01104

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     9.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             0.99830
Sample Standard Deviation:               0.00414

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00368        0.00511
80.0        0.00324        0.00608
90.0        0.00302        0.00681
95.0        0.00285        0.00755
99.0        0.00256        0.00942
99.9        0.00228        0.01259

Two-Sided Confidence Limits for the SD

Response Variable: Y
Factor Variable 1: X                     10.00000

Summary Statistics:
Number of Observations:                  10
Sample Mean:                             0.99480
Sample Standard Deviation:               0.00533

Two-Sided Confidence Limits for the SD
------------------------------------------
Confidence          Lower          Upper
Value (%)          Limit          Limit
------------------------------------------
50.0        0.00474        0.00658
80.0        0.00417        0.00783
90.0        0.00389        0.00877
95.0        0.00367        0.00973
99.0        0.00329        0.01214
99.9        0.00294        0.01622

Program 3:

.  Following example from Hahn and Meeker's book.
.
let ymean = 50.10
let ysd   = 1.31
let n1    = 5
let alpha = 0.05
.
set write decimals 5
let slow1 = summary lower sd confidence limits ysd n1
let supp1 = summary upper sd confidence limits ysd n1
let slow2 = summary one sided lower sd confidence limits ysd n1
let supp2 = summary one sided upper sd confidence limits ysd n1
print slow1 supp1 slow2 supp2

The following output is generated.
 PARAMETERS AND CONSTANTS--

SLOW1   --        0.78486
SUPP1   --        3.76436
SLOW2   --        0.85059
SUPP2   --        3.10779

Program 4:

. Step 1:   Read the data (example from Bonett paper)
.
let y = data 15.83 16.01 16.24 16.42 15.33 15.44 16.88 16.31
.
. Step 2:   Compute the statistics
.
set write decimals 4
let ysd = standard deviation y
let lcl = lower bonett standard deviation confidence limit y
let ucl = upper bonett standard deviation confidence limit y
print ysd lcl ucl
.
set bonett standard deviation confidence limit on
standard deviation confidence limits y

The following output is generated.

PARAMETERS AND CONSTANTS--

YSD     --         0.5168
LCL     --         0.3263
UCL     --         1.0841

Two-Sided Confidence Limits for the SD

Response Variable: Y

Summary Statistics:
Number of Observations:                  8
Sample Mean:                             16.0575
Sample Standard Deviation:               0.5168

---------------------------------------------------------
Confidence       Standard          Lower          Upper
Value (%)      Deviation          Limit          Limit
---------------------------------------------------------
50.0         0.5168         0.4548         0.6629
80.0         0.5168         0.3944         0.8123
90.0         0.5168         0.3646         0.9288
95.0         0.5168         0.3417         1.0518
99.0         0.5168         0.3036         1.3747
99.9         0.5168         0.2681         1.9636

Two-Sided Confidence Limits for the SD
Bonett Interval for Non-Normality

Response Variable: Y

Summary Statistics:
Number of Observations:                  8
Sample Mean:                             16.0575
Sample Standard Deviation:               0.5168

---------------------------------------------------------
Confidence       Standard          Lower          Upper
Value (%)      Deviation          Limit          Limit
---------------------------------------------------------
50.0         0.5168         0.4555         0.6404
80.0         0.5168         0.3963         0.8026
90.0         0.5168         0.3592         0.9359
95.0         0.5168         0.3263         1.0841
99.0         0.5168         0.2607         1.5109
99.9         0.5168         0.1849         2.4533



NIST is an agency of the U.S. Commerce Department.

Date created: 04/15/2013
Last updated: 11/05/2015