ONE WAY ANOVA
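The basic ANOVA model for one factor is

\( y_{ij} = \mu + \tau_{i} + \epsilon_{ij} \)

where \( y_{ij} \) is the j-th observation at the i-th level of the factor, \( \mu \) is the overall mean, \( \tau_{i} \) is the effect of the i-th level, and \( \epsilon_{ij} \) is the random error.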
The errors \( \epsilon_{ij} \) are assumed to be independent and normally distributed with mean 0 and standard deviation \( \sigma \), and \( \mu \) is assumed to be a fixed parameter.

In the fixed effects ANOVA model, the levels of the factor are assumed to be fixed and the \( \tau_{i} \) are also assumed to be fixed parameters. In the random effects ANOVA model, the levels of the factor are assumed to be a random sample from a population of possible level values, and the \( \tau_{i} \) are assumed to be random variables that are normally distributed with mean zero and standard deviation \( \sigma_{\tau} \). The \( \tau_{i} \) and \( \epsilon_{ij} \) are assumed to be independent for all \( i \) and \( j \).

Although the basic ANOVA calculations are the same for the fixed and random effects models, the hypothesis being tested is different. In a fixed effects model, ANOVA is used to test the hypothesis that \( \tau_1 = \tau_2 = \ldots = \tau_k \). In a random effects model, ANOVA is used to test the hypothesis that \( \sigma_{\tau} = 0 \). Note that testing for the equality of the \( \tau_{i} \) is not meaningful in the context of a random effects model.

For example, suppose the factor is the operator of a machine. In a fixed effects model, we would be testing whether the average performance varies between the specific operators tested. For a random effects model, we assume that the specific operators tested were randomly selected from a larger pool of available operators. In this case, we are not interested in the average performance of the specific operators tested. Rather, we are interested in the variability between operators.

The basic computations for the one factor fixed effects ANOVA are given here (and in most introductory statistics textbooks). With \( k \) levels, \( n_i \) observations at level \( i \), \( ntotal = \sum_{i=1}^{k} n_i \), level means \( \bar{y_{i}} \) and grand mean \( \bar{y} \):

\( \mbox{SSTR} = \sum_{i=1}^{k} n_i (\bar{y_{i}} - \bar{y})^2 \) with \( k - 1 \) degrees of freedom

\( \mbox{SSE} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y_{i}})^2 \) with \( ntotal - k \) degrees of freedom

\( \mbox{MSTR} = \mbox{SSTR}/(k - 1) \), \( \mbox{MSE} = \mbox{SSE}/(ntotal - k) \), \( F = \mbox{MSTR}/\mbox{MSE} \)
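The fixed effects computations can be sketched in a few lines of Python. This is only an illustration, not Dataplot's implementation: the helper name `one_way_anova` is hypothetical, and the data are the Program 1 values listed later on this page. The results can be compared with the ANOVA table in the Program 1 output.

```python
from statistics import mean

# Program 1 data from this page: response values grouped by factor level.
groups = {
    1: [6.9, 5.4, 5.8, 4.6, 4.0],
    2: [8.3, 6.8, 7.8, 9.2, 6.5],
    3: [8.0, 10.5, 8.1, 6.9, 9.3],
    4: [5.8, 3.8, 6.1, 5.6, 6.2],
}


def one_way_anova(groups):
    """Return the entries of a one-factor ANOVA table (hypothetical helper)."""
    k = len(groups)
    ntotal = sum(len(values) for values in groups.values())
    grand_mean = sum(sum(values) for values in groups.values()) / ntotal

    # Treatment (between) sum of squares: level means versus the grand mean.
    sstr = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Error (within) sum of squares: observations versus their level mean.
    sse = sum((y - mean(v)) ** 2 for v in groups.values() for y in v)

    mstr = sstr / (k - 1)
    mse = sse / (ntotal - k)
    return {
        "df_treatment": k - 1,
        "df_error": ntotal - k,
        "sstr": sstr,
        "sse": sse,
        "mstr": mstr,
        "mse": mse,
        "f": mstr / mse,
    }


table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name:>12}: {value:.5f}")
```

The F statistic is then compared with the F distribution with `df_treatment` and `df_error` degrees of freedom; that lookup is left out here to keep the sketch free of third-party libraries.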
Basically, the total sum of squares is decomposed into a "treatment" sum of squares and an "error" sum of squares. Essentially, we are comparing the between treatment variability with the within treatment variability through their ratio. For the fixed effects case, if the between variability is small compared to the within variability, we conclude the treatment means are essentially equal. Similarly, for the random effects model, if the between variability is small compared to the within variability, we conclude the variance between treatments is essentially zero.

The Dataplot ANOVA command performs an analysis of variance for up to 10 factors. However, this command is intended for the fixed effects case with balanced data (i.e., all cells have the same sample size).

The ONE WAY ANOVA command specifically handles the case of a single factor variable. However, it handles a broader range of cases than the regular ANOVA command and provides some additional output. Specifically, it accepts either raw data or summary data (treatment means, standard deviations, and sample sizes), and it supports both the fixed effects and the random effects models.
Syntax 1:
    ONE WAY ANOVA <y> <x>
        <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
      <x> is the factor (= independent) variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a fixed effects ANOVA based on raw data.

Syntax 2:
    ONE WAY SUMMARY ANOVA <ymean> <ysd> <yn>
        <SUBSET/EXCEPT/FOR qualification>
where <ymean> is a variable containing the treatment means;
      <ysd> is a variable containing the treatment standard deviations;
      <yn> is a variable containing the treatment sample sizes;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a fixed effects ANOVA based on summary data.

Syntax 3:
    ONE WAY RANDOM EFFECTS ANOVA <y> <x>
        <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
      <x> is the factor (= independent) variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a random effects ANOVA based on raw data.

Syntax 4:
    ONE WAY SUMMARY RANDOM EFFECTS ANOVA <ymean> <ysd> <yn>
        <SUBSET/EXCEPT/FOR qualification>
where <ymean> is a variable containing the treatment means;
      <ysd> is a variable containing the treatment standard deviations;
      <yn> is a variable containing the treatment sample sizes;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a random effects ANOVA based on summary data.
Examples:
    ONE WAY SUMMARY ANOVA YMEAN YSD YN
    ONE WAY RANDOM EFFECTS ANOVA Y X
    ONE WAY SUMMARY RANDOM EFFECTS ANOVA YMEAN YSD YN
    ONE WAY ANOVA Y X SUBSET Y > 0
    ONE WAY ANOVA Y X SUBSET X = 1 TO 6
The predicted values are the estimated grand mean plus the estimated treatment effect. The residuals are the response variable minus the predicted values. The PRED and RES variables will not be saved if summary data is used.
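As a small illustration of how the predicted values and residuals are defined, the following Python sketch reproduces them for the Program 1 data. It is not Dataplot code, and the Python names `pred` and `res` are used here only to mirror the PRED and RES variables.

```python
from statistics import mean

# Program 1 data from this page: factor levels (x) and responses (y).
x = [1] * 5 + [2] * 5 + [3] * 5 + [4] * 5
y = [6.9, 5.4, 5.8, 4.6, 4.0,
     8.3, 6.8, 7.8, 9.2, 6.5,
     8.0, 10.5, 8.1, 6.9, 9.3,
     5.8, 3.8, 6.1, 5.6, 6.2]

grand_mean = mean(y)

# Estimated treatment effect for each level: level mean minus grand mean.
effects = {
    level: mean(yi for xi, yi in zip(x, y) if xi == level) - grand_mean
    for level in sorted(set(x))
}

# Predicted value = estimated grand mean + estimated treatment effect.
pred = [grand_mean + effects[xi] for xi in x]
# Residual = response minus predicted value.
res = [yi - pi for yi, pi in zip(y, pred)]

for xi, yi, pi, ri in zip(x, y, pred, res):
    print(f"{xi}  {yi:6.2f}  {pi:8.4f}  {ri:8.4f}")
```

Because the predicted value for an observation is simply its level mean, the residuals within each level sum to zero (up to rounding).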
The variance component table is used to specify the percentage of the variance that can be attributed to the treatment (between) and how much to the error (within). The variance component for the treatment is computed as

\( \hat{\sigma}_{\tau}^{2} = \frac{\mbox{MSTR} - \mbox{MSE}}{n_0} \)

where MSTR denotes the mean square of the treatment and MSE denotes the mean square error and

\( n_0 = \frac{1}{k-1} \left( ntotal - \frac{\sum_{i=1}^{k} n_i^{2}}{ntotal} \right) \)

with \( k \) denoting the number of levels, \( n_i \) the sample size of the i-th level, and \( ntotal \) the total number of observations.

The MSTR and MSE values are obtained from the standard ANOVA table. In the balanced case, \( n_0 \) is equal to the level sample size. The variance component for the error is simply the MSE.

The variance components are often given as a proportion of the total variance. In this case, the variance component is divided by the sum of the treatment variance component and the error variance component. For the treatment component, this proportion is also referred to as the intraclass correlation coefficient. Dataplot prints this expressed as a percentage rather than a proportion.

The confidence interval for the treatment variance component is computed as
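\( \mbox{Lower CL} = \mbox{SSTR} \frac{\left(1 - \frac{F_u}{F_0}\right)} {n_0 \chi^{2}_{(\alpha/2,k-1)}} \)

\( \mbox{Upper CL} = \mbox{SSTR} \frac{\left(1 - \frac{F_l}{F_0}\right)} {n_0 \chi^{2}_{(1-(\alpha/2),k-1)}} \)

where SSTR is the treatment sum of squares, \( F_0 = \mbox{MSTR}/\mbox{MSE} \) is the F statistic from the ANOVA table, and \( F_l \) and \( F_u \) are the lower and upper \( \alpha/2 \) percent points of the F distribution with \( k - 1 \) and \( ntotal - k \) degrees of freedom. In these formulas the first subscript of \( \chi^{2} \) is read as an upper tail area, so \( \chi^{2}_{(1-(\alpha/2),\nu)} \) is the smaller of the two chi-square percent points. This reading reproduces the limits printed in the Program 2 output.

The confidence interval for the error variance component is computed as

\( \mbox{Lower CL} = (ntotal-k) \frac{\mbox{MSE}} {\chi^{2}_{(\alpha/2,ntotal-k)}} \)

\( \mbox{Upper CL} = (ntotal-k) \frac{\mbox{MSE}} {\chi^{2}_{(1-(\alpha/2),ntotal-k)}} \)

Approximate confidence intervals for the intraclass correlation coefficient are computed as

\( \mbox{Lower CL} = \frac{\mbox{L}} {1 + \mbox{L}} \)

\( \mbox{Upper CL} = \frac{\mbox{U}} {1 + \mbox{U}} \)

where

\( \mbox{L} = \frac{1}{n_0} (\frac{F_{0}}{F_{u}} - 1) \)

\( \mbox{U} = \frac{1}{n_0} (\frac{F_{0}}{F_{l}} - 1) \)

and \( F_0 \), \( F_l \) and \( F_u \) are as defined above.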
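The point estimates in the variance components table can be checked with a short Python sketch. It uses the Program 2 data from this page. The formula used for `n0` is an assumption (the usual definition for unequal sample sizes, which reduces to the level sample size in the balanced case), and the function name `variance_components` is hypothetical. Confidence limits are omitted because they need F and chi-square percent points.

```python
from statistics import mean

# Program 2 data from this page: response values grouped by factor level.
groups = {
    1: [74, 76, 75],
    2: [68, 71, 72],
    3: [75, 77, 77],
    4: [72, 74, 73],
    5: [79, 81, 79],
}


def variance_components(groups):
    """Estimate treatment and error variance components (hypothetical helper)."""
    k = len(groups)
    sizes = [len(v) for v in groups.values()]
    ntotal = sum(sizes)
    grand_mean = sum(sum(v) for v in groups.values()) / ntotal

    mstr = sum(len(v) * (mean(v) - grand_mean) ** 2
               for v in groups.values()) / (k - 1)
    mse = sum((y - mean(v)) ** 2
              for v in groups.values() for y in v) / (ntotal - k)

    # Assumed definition of n0; equals the common level size when balanced.
    n0 = (ntotal - sum(n * n for n in sizes) / ntotal) / (k - 1)

    var_treatment = (mstr - mse) / n0
    var_error = mse
    # Proportion of total variance due to treatment (intraclass correlation).
    icc = var_treatment / (var_treatment + var_error)
    return n0, var_treatment, var_error, icc


n0, var_treatment, var_error, icc = variance_components(groups)
print(f"n0 = {n0:.4f}")
print(f"treatment component = {var_treatment:.6f} ({100 * icc:.2f}%)")
print(f"error component     = {var_error:.6f} ({100 * (1 - icc):.2f}%)")
```

Note that the treatment estimate can be negative when MSTR is smaller than MSE; this sketch does not truncate it at zero.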
Confidence intervals for the treatment means are based on

\( \bar{y_{i}} \pm t \sqrt{\mbox{MSE}/n_{i}} \)

with \( \bar{y_{i}} \) denoting the mean of the i-th level, \( n_{i} \) denoting the sample size of the i-th level, and t denoting the t percent point function.

If the ANOVA shows a statistically significant difference in the level means, a follow-up step is to compare the pairwise differences in treatment means to determine which differences are statistically significant. This is a subset of possible multiple comparisons. There are a number of different ways to perform these pairwise comparisons.

The first table generates confidence intervals based on Fisher's protected Least Significant Difference (LSD)

\( \bar{y_{i}} - \bar{y_{j}} \pm t \sqrt{\mbox{MSE} \left( \frac{1}{n_{i}} + \frac{1}{n_{j}} \right)} \)

This is just the standard confidence interval for the difference between means. The protected part is based on the idea that these differences are only analyzed if the F test from the ANOVA is statistically significant. The last column in the table provides p-values (p-values less than alpha indicate statistical significance).

The second table generates confidence intervals based on Bonferroni's adjustment to alpha. The LSD method provides pairwise statistical significance, but does not provide simultaneous statistical significance. Bonferroni addresses this with the following correction to alpha

\( \alpha^{*} = \frac{\alpha}{k(k-1)/2} \)

where \( k(k-1)/2 \) is the number of pairwise comparisons among the \( k \) levels.

The table header prints the adjusted alpha. This can be compared to the p-values in the last column of the table to determine statistical significance.

The third table generates Tukey-Kramer honest significant difference (HSD) confidence intervals. These confidence intervals are based on
\( \bar{y_{i}} - \bar{y_{j}} \pm q \sqrt{\frac{\mbox{MSE}}{2} \left( \frac{1}{n_{i}} + \frac{1}{n_{j}} \right)} \)

where q denotes the studentized range percent point function. Like the Bonferroni intervals, these intervals provide simultaneous statistical significance.

The generated tables are based on pairwise differences between level means. More general comparisons are based on contrasts and linear combinations. A contrast is a linear combination of two or more factor level means with coefficients that sum to zero

\( L = \sum_{i=1}^{r} c_{i} \mu_{i} \)

with \( c_{1}, c_{2}, \dots , c_{r} \) denoting the contrast coefficients and \( \mu_{1}, \mu_{2}, \dots , \mu_{r} \) denoting the factor level means and where

\( \sum_{i=1}^{r} c_{i} = 0 \)

For example, the pairwise difference between two level means has coefficients \( c_1 \) = 1 and \( c_2 \) = -1. To compare levels 1 and 2 with level 3, the coefficients are \( c_1 \) = 1, \( c_2 \) = 1, and \( c_3 \) = -2.

Two contrasts are orthogonal if the sum of the products of corresponding coefficients (i.e., coefficients for the same means) is also zero. That is

\( \sum_{i=1}^{r} c_{i} d_{i} = 0 \)

where \( c \) and \( d \) are coefficients for two contrasts. For the unbalanced case, this is

\( \sum_{i=1}^{r} \frac{c_{i} d_{i}}{n_{i}} = 0 \)

A linear combination is similar to a contrast, but it does not require that the coefficients sum to zero.

A contrast is estimated by

\( \hat{L} = \sum_{i=1}^{r} c_{i} \bar{y_{i}} \)

The variance of the contrast is

\( s^{2}(\hat{L}) = \mbox{MSE} \sum_{i=1}^{r} \frac{c_{i}^{2}}{n_{i}} \)

and a confidence interval is given by

\( \hat{L} \pm t \, s(\hat{L}) \)

where t is the t percent point function with \( ntotal - k \) degrees of freedom.

Estimates and confidence intervals for linear combinations are generated using the same formulas as contrasts.

Montgomery (pp. 60-62) shows how the treatment sum of squares can be partitioned into contrast sums of squares for orthogonal contrasts. Note that contrasts should be chosen before the data are collected.
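To make the contrast calculations concrete, here is a minimal Python sketch using the level means, sample sizes, and MSE printed in the Program 1 output. The helper names are hypothetical. The variance formula (MSE times the sum of squared coefficients over sample sizes) and the sample-size weighting in the orthogonality check are standard results assumed here, not taken from Dataplot's source.

```python
# Level means, level sample sizes, and MSE from the Program 1 output.
level_means = [5.34, 7.72, 8.56, 5.50]
level_sizes = [5, 5, 5, 5]
mse = 1.33075


def contrast_estimate(coefs, means):
    """Estimate a contrast; the coefficients must sum to zero."""
    if abs(sum(coefs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coefs, means))


def contrast_variance(coefs, sizes, mse):
    """Assumed variance formula: MSE * sum(c_i^2 / n_i)."""
    return mse * sum(c * c / n for c, n in zip(coefs, sizes))


def are_orthogonal(c, d, sizes):
    """Orthogonality check weighted by sample size (assumed form).

    With equal sample sizes this is the same as sum(c_i * d_i) == 0.
    """
    return abs(sum(ci * di / n for ci, di, n in zip(c, d, sizes))) < 1e-12


# Levels 1 and 2 versus level 3 (level 4 is not involved).
c = [1, 1, -2, 0]
# Level 1 versus level 2.
d = [1, -1, 0, 0]

estimate = contrast_estimate(c, level_means)
variance = contrast_variance(c, level_sizes, mse)
print(f"estimate = {estimate:.4f}, variance = {variance:.4f}")
print("c and d orthogonal:", are_orthogonal(c, d, level_sizes))
```

A confidence interval then follows by adding and subtracting the appropriate t percent point times the square root of `variance`.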
Typical values for alpha are 0.01, 0.05 and 0.10.
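The effect of the choice of alpha on the Bonferroni comparison can be sketched as follows. The sketch assumes the adjustment divides alpha by the number of pairwise comparisons, which is consistent with the adjusted alpha of 0.0083 printed for four levels in the Program 1 output. The p-values are copied from that output, and the function name `bonferroni_alpha` is hypothetical.

```python
from itertools import combinations


def bonferroni_alpha(alpha, k):
    """Alpha divided by the number of pairwise comparisons among k levels."""
    n_pairs = k * (k - 1) // 2
    return alpha / n_pairs


# Pairwise p-values copied from the Program 1 output on this page.
p_values = {
    (1, 2): 0.00489, (1, 3): 0.00043, (1, 4): 0.82919,
    (2, 3): 0.26651, (2, 4): 0.00775, (3, 4): 0.00069,
}

k = 4
adjusted = bonferroni_alpha(0.05, k)
significant = [pair for pair in combinations(range(1, k + 1), 2)
               if p_values[pair] < adjusted]

print(f"adjusted alpha = {adjusted:.4f}")
print("pairs significant after adjustment:", significant)

# The same adjustment for the other typical alpha values.
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f} -> adjusted alpha = {bonferroni_alpha(alpha, k):.5f}")
```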
1 WAY is a synonym for ONE WAY
RANDOM EFFECTS SUMMARY is a synonym for SUMMARY RANDOM EFFECTS
Neter, Wasserman and Kutner (1990), "Applied Linear Statistical Models," 3rd ed., Irwin.

Searle, Casella and McCulloch (1992), "Variance Components," Wiley, Chapter 3.

Montgomery (1976), "Design and Analysis of Experiments," 2nd ed., Wiley, Chapters 3 and 4.
Program 1:

    . Fixed Effects Case
    .
    . Step 1:   Create the data
    .
    read x y
    1  6.9
    1  5.4
    1  5.8
    1  4.6
    1  4.0
    2  8.3
    2  6.8
    2  7.8
    2  9.2
    2  6.5
    3  8.0
    3 10.5
    3  8.1
    3  6.9
    3  9.3
    4  5.8
    4  3.8
    4  6.1
    4  5.6
    4  6.2
    end of data
    .
    set let cross tabulate collapse
    let ymean = cross tabulate mean y x
    let ysd   = cross tabulate sd   y x
    let yn    = cross tabulate size x
    .
    . Step 2:   Run standard anova command and summary anova command
    .
    echo on
    capture 1way.out
    anova y x
    one way summary anova ymean ysd yn
    one way anova y x
    end of capture
    echo off

The following output is generated in 1way.out

    *****************
    **  anova y x  **
    *****************

    1-Way Fixed Effects Analysis of Variance--Balanced Case

    Response Variable: Y
    Factor 1 Variable: X

    Summary Statistics:
    Number of Observations:                    20
    Number of Factors:                          1
    Number of Levels for Factor 1:              4
    Number of Distinct Cells:                   4
    Grand Mean:                           6.78000
    Grand Standard Deviation:             1.77870

    Residual Standard Deviation:          1.15358
    Residual Degrees of Freedom:               16

    Replication Case:
    Replication Standard Deviation:       1.15358
    Replication Degrees of Freedom:            16

    Lack of Fit F Test cannot be done because there are 0 degrees of
    freedom in the numerator of the F ratio. This happens when the number
    of parameters fitted is identical to the number of distinct subsets.

    ANOVA Table
    --------------------------------------------------------------------------------
    Source               DF   Sum of Squares    Mean Square   F Statistic    F CDF  Sig
    --------------------------------------------------------------------------------
    Total (Corrected)    19         60.11200        3.16379
    --------------------------------------------------------------------------------
    Factor 1              3         38.82000       12.94000        9.7238  99.932%   **
    --------------------------------------------------------------------------------
    Residual             16         21.29200        1.33075

    Estimation
    ------------------------------------------------------------
                Level-ID    NI        Mean      Effect  SD(Effect)
    ------------------------------------------------------------
    Factor 1           1     5     5.34000    -1.44000     0.44678
                       2     5     7.72000     0.94000     0.44678
                       3     5     8.56000     1.78000     0.44678
                       4     5     5.50000    -1.28000     0.44678

    Models
    ---------------------------------------------------------
    Model                          Residual Standard Deviation
    ---------------------------------------------------------
    Constant Only--                                    1.77870
    Constant and Factor 1 Only--                       1.15358
    Constant and All 2 Factors--                       1.15358

    ******************************************
    **  one way summary anova ymean ysd yn  **
    ******************************************

    One-Way Fixed Effects Analysis of Variance: Summary Data

    H0: Effect(1) = Effect(2) = ...
        - Effect(k) = 0
    Means Variable:       YMEAN
    SD Variable:          YSD
    Sample Size Variable: YN

    Summary Statistics:
    Total Number of Observations               20
    Number of Levels:                           4
    Grand Mean:                           6.78000

    ANOVA Table: Fixed Effects
    --------------------------------------------------------------------------------------------------
    Source                 DF   Sum of Squares    Mean Square   F Statistic   F CDF (%)    P-Value  Sig
    --------------------------------------------------------------------------------------------------
    Total (Corrected)      19         60.11200
    --------------------------------------------------------------------------------------------------
    Treatment (Between)     3         38.82000       12.94000        9.7238     99.9316   0.000684   **
    Residual (Within)      16         21.29200        1.33075

    Estimation: Grand Mean = 0.6780000E+01
    -------------------------------------------------------
    Level ID    N(i)        Mean      Effect  SD(Effect)
    -------------------------------------------------------
           1       5     5.34000    -1.44000     0.44678
           2       5     7.72000     0.94000     0.44678
           3       5     8.56000     1.78000     0.44678
           4       5     5.50000    -1.28000     0.44678

    95% Confidence Intervals for Treatment Means
    --------------------------------------------------------------------------------
                                    Treatment        Lower        Upper
         Treatment    Treatment        Sample   Confidence   Confidence
      I       Mean           SD          Size        Limit        Limit
    --------------------------------------------------------------------------------
      1    5.34000      1.11714             5      4.43930      6.24070
      2    7.72000      1.10318             5      6.81930      8.62070
      3    8.56000      1.37768             5      7.65930      9.46070
      4    5.50000      0.97980             5      4.59930      6.40070

    95% Confidence Intervals for the Difference of Treatment Means
    ----------------------------------------------------------------------
              Difference of        Lower        Upper     Fisher
                  Treatment   Confidence   Confidence        LSD
      I    J          Means        Limit        Limit    P-Value
    ----------------------------------------------------------------------
      1    2       -2.38000     -3.65378     -1.10622    0.00489
      1    3       -3.22000     -4.49378     -1.94622    0.00043
      1    4       -0.16000     -1.43378      1.11378    0.82919
      2    3       -0.84000     -2.11378      0.43378    0.26651
      2    4        2.22000      0.94622      3.49378    0.00775
      3    4        3.06000      1.78622      4.33378    0.00069

    95% Bonferroni Intervals for the Difference of Treatment Means
    Bonferroni Adjusted Alpha = 0.0083
    ----------------------------------------------------------------------
      I    J  Difference of        Lower        Upper
                  Treatment   Confidence   Confidence
                      Means        Limit        Limit    P-Value
    ----------------------------------------------------------------------
      1    2       -2.38000     -4.33021     -0.42979    0.00489
      1    3       -3.22000     -5.17021     -1.26979    0.00043
      1    4       -0.16000     -2.11021      1.79021    0.82919
      2    3       -0.84000     -2.79021      1.11021    0.26651
      2    4        2.22000      0.26979      4.17021    0.00775
      3    4        3.06000      1.10979      5.01021    0.00069

    Tukey-Kramer Honest Significant Difference (HSD)
    Multiple Comparisons: alpha = 0.05
    ----------------------------------------------------------------------
               |Difference|                     Lower        Upper
               of Treatment     Standard   Confidence   Confidence
      I    J          Means        Error        Limit        Limit
    ----------------------------------------------------------------------
      1    2        2.38000      0.51590      0.28038      4.47962
      1    3        3.22000      0.51590      1.12038      5.31962
      1    4        0.16000      0.51590     -1.93962      2.25962
      2    3        0.84000      0.51590     -1.25962      2.93962
      2    4        2.22000      0.51590      0.12038      4.31962
      3    4        3.06000      0.51590      0.96038      5.15962

    *********************************
    **  one way anova y x          **
    *********************************

    One-Way Fixed Effects Analysis of Variance: Raw Data

    H0: Effect(1) = Effect(2) = ...
        - Effect(k) = 0
    Group-ID Variable: Y

    Summary Statistics:
    Total Number of Observations               20
    Number of Levels:                           4
    Grand Mean:                           6.78000
    Grand Standard Deviation:             1.77870

    ANOVA Table: Fixed Effects
    --------------------------------------------------------------------------------------------------
    Source                 DF   Sum of Squares    Mean Square   F Statistic   F CDF (%)    P-Value  Sig
    --------------------------------------------------------------------------------------------------
    Total (Corrected)      19         60.11200
    --------------------------------------------------------------------------------------------------
    Treatment (Between)     3         38.82000       12.94000        9.7238     99.9316   0.000684   **
    Residual (Within)      16         21.29200        1.33075

    Estimation: Grand Mean = 0.6780000E+01
    -------------------------------------------------------
    Level ID    N(i)        Mean      Effect  SD(Effect)
    -------------------------------------------------------
           1       5     5.34000    -1.44000     0.44678
           2       5     7.72000     0.94000     0.44678
           3       5     8.56000     1.78000     0.44678
           4       5     5.50000    -1.28000     0.44678

    Models
    -------------------------------------------------------
    Model                     Residual Standard Deviation
    -------------------------------------------------------
    Constant                                      1.77870
    Constant + Factor                             1.15358

    95% Confidence Intervals for Treatment Means
    --------------------------------------------------------------------------------
                                    Treatment        Lower        Upper
         Treatment    Treatment        Sample   Confidence   Confidence
      I       Mean           SD          Size        Limit        Limit
    --------------------------------------------------------------------------------
      1    5.34000      1.11714             5      4.43930      6.24070
      2    7.72000      1.10318             5      6.81930      8.62070
      3    8.56000      1.37768             5      7.65930      9.46070
      4    5.50000      0.97980             5      4.59930      6.40070

    95% Confidence Intervals for the Difference of Treatment Means
    ----------------------------------------------------------------------
              Difference of        Lower        Upper     Fisher
                  Treatment   Confidence   Confidence        LSD
      I    J          Means        Limit        Limit    P-Value
    ----------------------------------------------------------------------
      1    2       -2.38000     -3.65378     -1.10622    0.00489
      1    3       -3.22000     -4.49378     -1.94622    0.00043
      1    4       -0.16000     -1.43378      1.11378    0.82919
      2    3       -0.84000     -2.11378      0.43378    0.26651
      2    4        2.22000      0.94622      3.49378    0.00775
      3    4        3.06000      1.78622      4.33378    0.00069

    95% Bonferroni Intervals for the Difference of Treatment Means
    Bonferroni Adjusted Alpha = 0.0083
    ----------------------------------------------------------------------
      I    J  Difference of        Lower        Upper
                  Treatment   Confidence   Confidence
                      Means        Limit        Limit    P-Value
    ----------------------------------------------------------------------
      1    2       -2.38000     -4.33021     -0.42979    0.00489
      1    3       -3.22000     -5.17021     -1.26979    0.00043
      1    4       -0.16000     -2.11021      1.79021    0.82919
      2    3       -0.84000     -2.79021      1.11021    0.26651
      2    4        2.22000      0.26979      4.17021    0.00775
      3    4        3.06000      1.10979      5.01021    0.00069

    Tukey-Kramer Honest Significant Difference (HSD)
    Multiple Comparisons: alpha = 0.05
    ----------------------------------------------------------------------
               |Difference|                     Lower        Upper
               of Treatment     Standard   Confidence   Confidence
      I    J          Means        Error        Limit        Limit
    ----------------------------------------------------------------------
      1    2        2.38000      0.51590      0.28038      4.47962
      1    3        3.22000      0.51590      1.12038      5.31962
      1    4        0.16000      0.51590     -1.93962      2.25962
      2    3        0.84000      0.51590     -1.25962      2.93962
      2    4        2.22000      0.51590      0.12038      4.31962
      3    4        3.06000      0.51590      0.96038      5.15962

Program 2:

    . Random Effects Case
    .
    . Step 1:   Create the data
    .
    read x y
    1 74
    1 76
    1 75
    2 68
    2 71
    2 72
    3 75
    3 77
    3 77
    4 72
    4 74
    4 73
    5 79
    5 81
    5 79
    end of data
    .
    set let cross tabulate collapse
    let ymean = cross tabulate mean y x
    let ysd   = cross tabulate sd   y x
    let yn    = cross tabulate size x
    .
    . Step 2:   Run random effects ANOVA
    .
    echo on
    capture ONEWAYAN.OUT
    set write decimals 6
    one way random effects anova y x
    one way summary random effects anova ymean ysd yn
    end of capture
    echo off

The following output is generated in ONEWAYAN.OUT

    ****************************************
    **  one way random effects anova y x  **
    ****************************************

    One-Way Random Effects Analysis of Variance: Raw Data

    H0: Variance of Effects = 0
    Group-ID Variable: Y

    Summary Statistics:
    Total Number of Observations               15
    Number of Levels:                           5
    Grand Mean:                         74.866667
    Grand Standard Deviation:            3.440653

    ANOVA Table: Fixed Effects
    --------------------------------------------------------------------------------------------------
    Source                 DF   Sum of Squares    Mean Square   F Statistic   F CDF (%)    P-Value  Sig
    --------------------------------------------------------------------------------------------------
    Total (Corrected)      14       165.733333
    --------------------------------------------------------------------------------------------------
    Treatment (Between)     4       147.733333      36.933333       20.5185     99.9918   0.000082   **
    Residual (Within)      10        18.000000       1.800000

    Variance Components Table: Random Effects
    ----------------------------------------------------------------------------------------------------------------
                                            Lower        Upper        Intraclass        Lower        Upper
                            Variance   Confidence   Confidence       Correlation   Confidence   Confidence
    Source                 Component        Limit        Limit   Coefficient (%)        Limit        Limit
    ----------------------------------------------------------------------------------------------------------------
    Treatment (Between)    11.711111     3.456828   101.096603             86.68    54.490128    98.364796
    Residual (Within)       1.800000     0.878770     5.543625             13.32

    *********************************************************
    **  one way summary random effects anova ymean ysd yn  **
    *********************************************************

    One-Way Random Effects Analysis of Variance: Summary Data

    H0: Variance of Effects = 0
    Means Variable:       YMEAN
    SD Variable:          YSD
    Sample Size Variable: YN

    Summary Statistics:
    Total Number of Observations               15
    Number of Levels:                           5
    Grand Mean:                         74.866667

    ANOVA Table: Fixed Effects
    --------------------------------------------------------------------------------------------------
    Source                 DF   Sum of Squares    Mean Square   F Statistic   F CDF (%)    P-Value  Sig
    --------------------------------------------------------------------------------------------------
    Total (Corrected)      14       165.733333
    --------------------------------------------------------------------------------------------------
    Treatment (Between)     4       147.733333      36.933333       20.5185     99.9918   0.000082   **
    Residual (Within)      10        18.000000       1.800000

    Variance Components Table: Random Effects
    ----------------------------------------------------------------------------------------------------------------
                                            Lower        Upper        Intraclass        Lower        Upper
                            Variance   Confidence   Confidence       Correlation   Confidence   Confidence
    Source                 Component        Limit        Limit   Coefficient (%)        Limit        Limit
    ----------------------------------------------------------------------------------------------------------------
    Treatment (Between)    11.711111     3.456828   101.096603             86.68    54.490128    98.364796
    Residual (Within)       1.800000     0.878770     5.543625             13.32
Date created: 09/27/2023
Last updated: 09/27/2023