COCHRAN VARIANCE OUTLIER TEST
The Levene and Bartlett tests are widely used for assessing the homogeneity of variances in the one-factor (with k levels) case. The Cochran variance outlier test is an alternative for assessing the homogeneity of variances. Although the Cochran test has a similar purpose to the Levene and Bartlett tests, it tends to be used in a somewhat different context.

The Levene and Bartlett tests are used to assess overall homogeneity and are typically used in the context of deciding whether a specific test (e.g., an F test) is appropriate for a given set of data. These tests do not identify which variances are different.

On the other hand, the Cochran variance outlier test tends to be used in the context of proficiency testing. In this case, we are primarily interested in identifying laboratories that are "different". For example, a laboratory with an unusually large variance may indicate the need for close examination of that laboratory's practices. Cochran's test is essentially an outlier test.

Cochran's original test statistic is defined as

\[ C = \frac{s_{\max}^{2}}{\sum_{i=1}^{k} s_{i}^{2}} \]

where \( s_{i}^{2} \) is the sample variance of the i-th group and \( s_{\max}^{2} \) is the largest of the k variances.
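That is, it is the ratio of the largest variance to the sum of the variances. This is an upper-tailed test for the maximum variance. The critical values can be computed from

\[ C_{UL}(\alpha, n, k) = \left[ 1 + \frac{k - 1}{F_{c}(\alpha/k, \, n - 1, \, (k - 1)(n - 1))} \right]^{-1} \]

where

    \( C_{UL} \) = the upper limit critical value
    \( \alpha \) = the significance level
    n = the number of observations in each group (assumed equal)
    k = the number of groups
    \( F_{c} \) = the critical value of the F distribution with upper tail probability \( \alpha/k \) and (n - 1, (k - 1)(n - 1)) degrees of freedom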
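As a rough illustration of the C statistic and its upper critical value, the following Python sketch uses only the standard library. It assumes the commonly cited form \( C_{UL} = [1 + (k-1)/F_{c}(\alpha/k, n-1, (k-1)(n-1))]^{-1} \) with equal group sizes. The function names (`cochran_c`, `cochran_c_critical`, and the F distribution helpers) are illustrative and are not part of Dataplot.

```python
import math
from statistics import variance


def _betacf(a, b, x, max_iter=300, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd coefficients of the continued fraction for step m.
        for aa in (
            m * (b - m) * x / ((qam + m2) * (a + m2)),
            -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log1p(-x)
    )
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_cdf(f, d1, d2):
    """CDF of the F distribution with (d1, d2) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    return _reg_inc_beta(d1 / 2.0, d2 / 2.0, d1 * f / (d1 * f + d2))


def f_upper_critical(p, d1, d2):
    """F value whose upper-tail probability is p, located by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - f_cdf(hi, d1, d2) > p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 - f_cdf(mid, d1, d2) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cochran_c(groups):
    """Ratio of the largest sample variance to the sum of the sample variances."""
    variances = [variance(g) for g in groups]
    return max(variances) / sum(variances)


def cochran_c_critical(alpha, n, k):
    """Upper critical value for k groups of n observations each (assumed formula)."""
    fc = f_upper_critical(alpha / k, n - 1, (k - 1) * (n - 1))
    return 1.0 / (1.0 + (k - 1) / fc)


if __name__ == "__main__":
    print(cochran_c([[1, 2, 3], [1, 3, 5]]))   # variances 1 and 4
    print(cochran_c_critical(0.05, 10, 10))    # 10 groups of 10 observations
```

For 10 groups of 10 observations at the 5% level, the computed limit should come out close to the 0.24388 critical value that Dataplot reports for the GEAR.DAT example on this page.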
Some comments on this test.
't Lam (2010) has extended the Cochran test to support unequal sample sizes and tests for the minimum variance. He refers to this as the G statistic. Dataplot generates the G statistic rather than the C statistic for this test. When the sample sizes are equal, the G statistic for the maximum variance is equivalent to the Cochran C statistic. The G statistic for the j-th group is

\[ G_{j} = \frac{\nu_{j} s_{j}^{2}}{\sum_{i=1}^{k} \nu_{i} s_{i}^{2}} \]
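where \( \nu_{i} = n_{i} - 1 \) with \( n_{i} \) denoting the sample size of the i-th group. The critical value for testing the maximum variance is

\[ G_{UL}(\alpha, \nu_{j}, \nu_{pool}, k) = \left[ 1 + \frac{\nu_{pool}/\nu_{j} - 1}{F_{c}(\alpha/k, \, \nu_{j}, \, \nu_{pool} - \nu_{j})} \right]^{-1} \]

where

    \( \nu_{j} \) = the degrees of freedom of the group with the maximum variance
    \( \nu_{pool} \) = \( \sum_{i=1}^{k} \nu_{i} \), the pooled degrees of freedom
    k = the number of groups
    \( F_{c} \) = the critical value of the F distribution with upper tail probability \( \alpha/k \) and \( (\nu_{j}, \nu_{pool} - \nu_{j}) \) degrees of freedom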
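Reject the null hypothesis that the maximum variance is not an outlier if the test statistic is greater than the critical value. The critical value for testing the minimum variance is

\[ G_{LL}(\alpha, \nu_{j}, \nu_{pool}, k) = \left[ 1 + \frac{\nu_{pool}/\nu_{j} - 1}{F_{c}(1 - \alpha/k, \, \nu_{j}, \, \nu_{pool} - \nu_{j})} \right]^{-1} \]

Here \( \nu_{pool} = \sum_{i} \nu_{i} \) is the pooled degrees of freedom, k is the number of groups, and \( F_{c}(p, \cdot, \cdot) \) is the F critical value with upper tail probability p.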
In this case, \( \nu_{j} \) corresponds to the minimum variance. Reject the null hypothesis that the minimum variance is not an outlier if the test statistic is less than the critical value.

A two-sided test can also be performed by using \( \alpha/2 \) in place of \( \alpha \) in the above formulas. Although the 't Lam article provides a method for determining whether the maximum or minimum variance is more extreme, Dataplot simply returns the test statistic and critical values for both the maximum and the minimum cases.

Note that with the G statistic, we are actually testing the maximum (or minimum) value of the G statistic rather than the maximum (or minimum) variance. That is, we are testing the maximum \( \nu_{j} s_{j}^{2} \) rather than the maximum \( s_{j}^{2} \). If the sample sizes are equal (or at least approximately equal), this should be equivalent. However, if there is a large difference in sample sizes, this may not be the case.

If there are potentially multiple outliers in the variances, the recommended procedure is to perform the test sequentially until all outlying variances are removed. That is, if the test indicates the maximum variance is an outlier, remove that group of data and perform the test again. Repeat until the test indicates that the maximum variance is not an outlier.
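The distinction between the largest G statistic and the largest variance, and the sequential removal procedure, can be sketched in Python as follows. This is only an illustration: `g_statistics` and `remove_variance_outliers` are hypothetical names, and the critical value is supplied by the caller as a function of \( (\nu_{j}, \nu_{pool}, k) \) rather than computed here.

```python
from statistics import variance


def g_statistics(groups):
    """Return G_j = nu_j * s_j^2 / sum_i(nu_i * s_i^2) for each group."""
    weighted = [(len(g) - 1) * variance(g) for g in groups]
    total = sum(weighted)
    return [w / total for w in weighted]


def remove_variance_outliers(groups, critical_value):
    """Drop the group with the largest G until it no longer exceeds the limit.

    groups: dict mapping a group label to a list of observations.
    critical_value: caller-supplied function (an assumption of this sketch)
        taking (nu_j, nu_pool, k) and returning the upper critical value.
    Returns (removed_labels, remaining_groups).
    """
    remaining = dict(groups)
    removed = []
    while len(remaining) > 1:
        labels = list(remaining)
        g = g_statistics([remaining[label] for label in labels])
        j = max(range(len(labels)), key=g.__getitem__)
        nu_j = len(remaining[labels[j]]) - 1
        nu_pool = sum(len(v) - 1 for v in remaining.values())
        if g[j] <= critical_value(nu_j, nu_pool, len(labels)):
            break  # largest G is not flagged, so stop
        removed.append(labels[j])
        del remaining[labels[j]]
    return removed, remaining


if __name__ == "__main__":
    # Unequal sample sizes: the first group has the larger variance (8 vs 28/6),
    # but the second group has the larger G statistic (28/36 vs 8/36).
    print(g_statistics([[1, 5], [0, 1, 2, 3, 4, 5, 6]]))

    labs = {
        "a": [10.0, 10.1, 9.9, 10.0],
        "b": [10.0, 10.2, 9.8, 10.1],
        "c": [5.0, 15.0, 10.0, 12.0],
        "d": [9.9, 10.0, 10.1, 10.05],
    }
    # A constant limit is used purely for demonstration; a real limit
    # depends on nu_j, nu_pool, k and the significance level.
    removed, kept = remove_variance_outliers(labs, lambda nu_j, nu_pool, k: 0.9)
    print(removed, sorted(kept))
```

With the made-up data above, only group "c" is removed: its G statistic exceeds the demonstration limit on the first pass, and no remaining group exceeds it on the second pass.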
COCHRAN VARIANCE OUTLIER TEST <y> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable; <tag> is a factor identifier variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax computes the test for the maximum variance.

COCHRAN MINIMUM VARIANCE OUTLIER TEST <y> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable; <tag> is a factor identifier variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax computes the test for the minimum variance.

COCHRAN TWO-SIDED VARIANCE OUTLIER TEST <y> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable; <tag> is a factor identifier variable; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax computes the two-sided test (i.e., both the minimum and maximum variance).
<SUBSET/EXCEPT/FOR qualification> where <y1> ... <yk> is a list of two to 30 response variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax computes the test for the maximum variance.
<SUBSET/EXCEPT/FOR qualification> where <y1> ... <yk> is a list of two to 30 response variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax computes the test for the minimum variance.
<SUBSET/EXCEPT/FOR qualification> where <y1> ... <yk> is a list of two to 30 response variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax computes the two-sided test.
COCHRAN VARIANCE OUTLIER TEST Y X SUBSET X <> 5
COCHRAN MINIMUM VARIANCE OUTLIER TEST Y X
COCHRAN TWO-SIDED VARIANCE OUTLIER TEST Y X
P-values are truncated at a minimum of 0.001 and a maximum of 99.999. P-values and CDF statistics are not currently computed for the two-sided case.
The ISO 5725 standard proposes Cochran's variance outlier test as an alternative to Mandel's k consistency statistic.
LET C = COCHRAN VARIANCE OUTLIER TEST Y X

Enter HELP STATISTICS to see what commands can use these statistics.
Ruben U.E. 't Lam (2010), "Scrutiny of Variance Results for Outliers: Cochran's Test Optimized", Analytica Chimica Acta, Vol. 659, No. 1-2, pp. 68-84.

Kanji (2006), "100 Statistical Tests", SAGE Publications, p. 75.

ISO Standard 5725–2:1994, "Accuracy (trueness and precision) of measurement methods and results – Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method", International Organization for Standardization, Geneva, Switzerland, 1994.
. Step 1:   Read the data
.
dimension 40 columns
skip 25
read gear.dat y x
set write decimals 5
.
. Step 2:   Generate a variance plot
.
label case asis
title case asis
title offset 2
xlimits 1 10
major x1tic mark number 10
x1tic mark offset 0.5 0.5
x1label Batch
y1label Variance
line blank solid
character circle blank
character hw 1 0.75
character fill on
title Variance Plot for GEAR.DAT
variance plot y x
.
. Step 3:   Perform the test
.
cochran variance outlier test y x
let c     = cochran variance outlier test y x
let cv95  = cochran variance outlier cv95 y x
let cv99  = cochran variance outlier cv99 y x
let ccdf  = cochran variance outlier cdf y x
let cpval = cochran variance outlier pvalue y x
print c cv95 cv99 ccdf cpval
.
cochran minimum variance outlier test y x
let cm     = cochran minimum variance outlier test y x
let cmv05  = cochran minimum variance outlier cv05 y x
let cmv01  = cochran minimum variance outlier cv01 y x
let cmcdf  = cochran minimum variance outlier cdf y x
let cmpval = cochran minimum variance outlier pvalue y x
print cm cmv05 cmv01 cmcdf cmpval
.
cochran two-sided variance outlier test y x

The following output is generated.
            Cochran Variance Outlier Test

Response Variable: Y
Group-ID Variable: X

H0: Largest Variance is Not an Outlier
Ha: Largest Variance is an Outlier

Summary Statistics:
Total Number of Observations:                 100
Number of Groups:                              10
Number of Groups with Positive Variance:       10
Group with Largest Variance:                    6
Largest Variance:                         0.00010
Sum of Variance:                          0.00317

Cochran Test Statistic Value:             0.27713
CDF of Test Statistic:                    0.98790
P-Value:                                  0.01210

Percent Points of the Reference Distribution
-----------------------------------
  Percent Point               Value
-----------------------------------
            0.1    =        0.15970
            0.5    =        0.15983
            1.0    =        0.16000
            2.5    =        0.16051
            5.0    =        0.16137
           10.0    =        0.16315
           25.0    =        0.16905
           50.0    =        0.18164
           75.0    =        0.20180
           90.0    =        0.22643
           95.0    =        0.24388
           97.5    =        0.26050
           99.0    =        0.28139
           99.5    =        0.29648
           99.9    =        0.32953

Conclusions (Upper 1-Tailed Test)
----------------------------------------------
  Alpha     CDF   Critical Value   Conclusion
----------------------------------------------
    10%     90%          0.22643   Reject H0
     5%     95%          0.24388   Reject H0
   2.5%   97.5%          0.26050   Reject H0
     1%     99%          0.28139   Accept H0

PARAMETERS AND CONSTANTS--

   C       --        0.27713
   CV95    --        0.24388
   CV99    --        0.28139
   CCDF    --        0.98790
   CPVAL   --        0.01210

            Cochran Variance Outlier Test

Response Variable: Y
Group-ID Variable: X

H0: Smallest Variance is Not an Outlier
Ha: Smallest Variance is an Outlier

Summary Statistics:
Total Number of Observations:                 100
Number of Groups:                              10
Number of Groups with Positive Variance:       10
Group with Smallest Variance:                   8
Smallest Variance:                        0.00001
Sum of Variance:                          0.00317

Cochran Test Statistic Value:             0.03730
CDF of Test Statistic:                    0.44640
P-Value:                                  0.44640

Percent Points of the Reference Distribution
-----------------------------------
  Percent Point               Value
-----------------------------------
            0.1    =        0.00779
            0.5    =        0.01144
            1.0    =        0.01355
            2.5    =        0.01702
            5.0    =        0.02033
           10.0    =        0.02442
           25.0    =        0.03147
           50.0    =        0.03861
           75.0    =        0.04383
           90.0    =        0.04650
           95.0    =        0.04734
           97.5    =        0.04775
           99.0    =        0.04800
           99.5    =        0.04808
           99.9    =        0.04814

Conclusions (Lower 1-Tailed Test)
----------------------------------------------
  Alpha     CDF   Critical Value   Conclusion
----------------------------------------------
     1%      1%          0.01355   Accept H0
   2.5%    2.5%          0.01702   Accept H0
     5%      5%          0.02033   Accept H0
    10%     10%          0.02442   Accept H0

PARAMETERS AND CONSTANTS--

   CM      --        0.03730
   CMV05   --        0.02033
   CMV01   --        0.01355
   CMCDF   --        0.44640
   CMPVAL  --        0.44640

            Cochran Variance Outlier Test

Response Variable: Y
Group-ID Variable: X

H0: Extreme Variance is Not an Outlier
Ha: Extreme Variance is an Outlier

Summary Statistics:
Total Number of Observations:                 100
Number of Groups:                              10
Number of Groups with Positive Variance:       10
Group with Largest Variance:                    6
Largest Variance:                         0.00010
Sum of Variance:                          0.00317

Cochran Test Statistic Value (upper):     0.27713
Cochran Test Statistic Value (lower):     0.03730

Conclusions (Two-Tailed Test)
-----------------------------------------------------------------------
         Significance        Lower            Upper
  Alpha         Level   Critical Value   Critical Value   Conclusion
-----------------------------------------------------------------------
    10%           90%          0.02033          0.24388   Reject H0
     5%           95%          0.01702          0.26050   Reject H0
     1%           99%          0.01144          0.29648   Accept H0
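The conclusions in the two-tailed table follow from comparing the upper statistic with the upper critical value and the lower statistic with the lower critical value. A small Python sketch of that decision rule, using the values printed above, is shown below. The function name `two_sided_conclusion` is illustrative only, and the rule is inferred from the one-tailed rejection criteria rather than taken from Dataplot's source.

```python
def two_sided_conclusion(g_lower, g_upper, lower_cv, upper_cv):
    """Reject H0 if either extreme statistic falls outside its critical value."""
    if g_upper > upper_cv or g_lower < lower_cv:
        return "Reject H0"
    return "Accept H0"


# Statistics and critical values copied from the two-tailed output above.
G_UPPER = 0.27713
G_LOWER = 0.03730
CRITICAL_VALUES = {
    "10%": (0.02033, 0.24388),
    "5%": (0.01702, 0.26050),
    "1%": (0.01144, 0.29648),
}

if __name__ == "__main__":
    for alpha, (lower_cv, upper_cv) in CRITICAL_VALUES.items():
        print(alpha, two_sided_conclusion(G_LOWER, G_UPPER, lower_cv, upper_cv))
```

Note that each two-sided row reuses the one-sided critical values at half the significance level (for example, the 10% row uses the 5% lower and 95% upper percent points), consistent with substituting \( \alpha/2 \) for \( \alpha \).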
Date created: 05/05/2015
Last updated: 12/11/2023
Please email comments on this WWW page to alan.heckert@nist.gov.