
# CONSENSUS MEAN

Name:
CONSENSUS MEAN
Type:
Analysis Command
Purpose:
Compute an estimate of a consensus mean, and the associated uncertainty, based on the data from multiple laboratories or multiple methods.
Overview:
The problem of determining a consensus mean based on data from two or more laboratories (or from two or more methods from the same laboratory) is a common one for measurement laboratories. A few specific applications of consensus means are Standard Reference Materials (SRM's), interlaboratory studies, and key comparisons.

There are a number of approaches to this problem. The Dataplot CONSENSUS MEANS command computes estimates for a variety of methods and does not specify which is the most appropriate method for a given data set. Consult with a statistician for guidance on which method is most appropriate for your data.

Grand Mean Model:
The simplest model is to assume that there is no lab effect (i.e., we treat the data as if it all came from the same lab).

In this case, the consensus mean is simply the grand mean of all the data and a confidence interval for the consensus mean is simply the standard t-based confidence interval:

$$\bar{x} \pm \frac{t_{(1-\alpha/2,n-1)}s} {\sqrt{n}}$$

where $$\bar{x}$$ is the overall mean, t is the percent point function for the t distribution, s is the standard deviation of all the points, and n is the total number of points.

The assumption of no lab effect is unrealistic in almost all cases. However, we include the grand mean method as a reference point as it gives an indication of how including the lab effects changes the estimate of the consensus mean and its uncertainty.

Mean of Means Model:
The mean of means model was originally recommended by Churchill Eisenhart and was one of the earliest methods used for SRM's.

For this method, we compute the mean for each of the k laboratories. Then we compute $$\bar{x}$$ and s as the mean and standard deviation of these k means. The estimate of the consensus mean is simply $$\bar{x}$$ and we compute the following confidence interval for the consensus mean:

$$\bar{x} \pm \frac{t_{(1-\alpha/2,k-1)}s} {\sqrt{k}}$$

The limitations of this method are discussed in the "An ISO GUM Approach to Combining Results from Multiple Methods" paper (see the Reference section).

For this method, the consensus mean estimate is an equi-weighted mean of the lab means. The advantages of this method are that it is robust and simple to compute. The primary disadvantage is that no consideration is given to possible differences in the within-lab variation and sample sizes.
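
The grand mean and mean of means estimates described above can be compared with a short Python sketch. This is only an illustration of the two formulas, not Dataplot's implementation; the function name and the example data are hypothetical.

```python
import math
import statistics

def grand_mean_and_mean_of_means(labs):
    """labs is a list of lists, one list of raw measurements per lab."""
    # Grand mean model: pool every measurement and ignore the lab labels.
    pooled = [x for lab in labs for x in lab]
    grand_mean = statistics.fmean(pooled)
    grand_se = statistics.stdev(pooled) / math.sqrt(len(pooled))

    # Mean of means model: reduce each lab to its mean, then treat the
    # k lab means as the sample.
    lab_means = [statistics.fmean(lab) for lab in labs]
    k = len(lab_means)
    mean_of_means = statistics.fmean(lab_means)
    mean_of_means_se = statistics.stdev(lab_means) / math.sqrt(k)

    return {
        "grand_mean": grand_mean,
        "grand_se": grand_se,
        "mean_of_means": mean_of_means,
        "mean_of_means_se": mean_of_means_se,
    }

# Hypothetical data for three labs with unequal sample sizes.
example_labs = [[10.1, 10.3, 9.9], [10.6, 10.8], [9.7, 9.8, 9.6, 9.9]]
result = grand_mean_and_mean_of_means(example_labs)
```

The half-width of each confidence interval is the corresponding standard error multiplied by the t percent point value, with n - 1 degrees of freedom for the grand mean and k - 1 for the mean of means. The t value is left to a table or a statistics library here.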

If the laboratory means are not normally distributed (e.g., due to the presence of outliers), this can distort the mean of means estimates. Two more robust procedures are available.

The median of means estimate takes the median of the laboratory means. The associated uncertainty is

$$u(\tilde{x}) = \sqrt{\frac{\pi}{2 k}} \, \mbox{MADe}$$

with $$\tilde{x}$$, k, and MADe denoting the median of the laboratory means, the number of laboratories, and the scaled median absolute deviation of the laboratory means (the median absolute deviation divided by 0.67449), respectively. It is recommended that at least five laboratories be available for this uncertainty to be reliable.
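
As a rough illustration, the median of means estimate can be sketched in Python as follows. The sketch assumes the uncertainty is the square root of pi/(2k) multiplied by MADe (about 1.25 MADe divided by the square root of k); the function name and data are hypothetical, and this is not Dataplot's code.

```python
import math
import statistics

def median_of_means(lab_means):
    """Return the median of the lab means, MADe, and the uncertainty."""
    k = len(lab_means)
    med = statistics.median(lab_means)
    # Median absolute deviation of the lab means about their median.
    mad = statistics.median([abs(x - med) for x in lab_means])
    # Scale the MAD so that it estimates sigma for normal data.
    made = mad / 0.67449
    # Assumed form of the uncertainty: sqrt(pi / (2k)) * MADe.
    u = math.sqrt(math.pi / (2.0 * k)) * made
    return med, made, u

# Hypothetical lab means for five laboratories.
med, made, u = median_of_means([10.1, 10.7, 9.75, 10.2, 10.0])
```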

The Huber mean of means is based on Huber's H15 robust mean of the laboratory means. The associated uncertainty is

$$u(\hat{\mu}_{\tiny H15}) = \frac{1}{e} \hat{\sigma}_{\tiny H15}$$

with $$\hat{\sigma}_{\tiny H15}$$ denoting the H15 estimate of scale. The e parameter is a tuning constant that is set to 0.95. The details of the H15 location and scale estimators are given elsewhere.

These robust estimators are discussed in the CCQM Guidance Note (see References below).

These robust estimators are more commonly used in the context of interlaboratory studies rather than for certifying reference materials. For certified reference materials, laboratories and methods are carefully chosen so outliers are less often a problem. Interlaboratory studies typically involve a greater range of laboratories with a wider range of capabilities and outliers are more likely to be an issue.

Common Mean Model:
Assume there are k laboratories, each measuring the unknown underlying (nonrandom) value $$\mu$$ common to all laboratories. The measurements $$x_{ij}$$, $$i = 1, \ldots, k$$; $$j = 1, \ldots, n_i$$, are of the form

$$x_{ij} = \mu + e_{ij}$$

with independent Gaussian errors $$e_{ij} \sim N(0,\kappa_{i}^{2})$$. All parameters $$\mu$$, $$\kappa_{i}^{2}$$, $$i = 1, \ldots, k$$, are unknown and the goal is to estimate $$\mu$$, determine its standard error, and provide a confidence interval for $$\mu$$.

Unbiased estimates of the within lab means and of the variances of those means, $$\sigma_{i}^{2} = \kappa_{i}^{2}/n_{i}$$, are

$$x_i = \bar{x_{i}} = \sum_{j=1}^{n_i}{\frac{x_{ij}}{n_{i}}}$$

$$s_{i}^2 = \sum_{j=1}^{n_i}{\frac{(x_{ij} - x_{i})^2}{n_{i}(n_{i} - 1)}}$$

When the variances $$\sigma_{i}^2$$ are known, the best (in terms of mean squared error) unbiased estimator of the reference value $$\mu$$ is the weighted mean statistic

$$\tilde{x} = \frac{\sum_{i=1}^{k} {w_{i}x_{i}} } {\sum_{i=1}^{k} {w_i}}$$

with $$w_{i} = 1/\sigma_{i}^2$$. The formula for the variance is

$$Var(\tilde{x}) = E(\tilde{x} - \mu)^2 = \frac{1}{\sum_{i=1}^{k}{w_{i}}}$$

In practice, these within lab variances are unknown and so the true $$w_i$$ are also unknown.

The Graybill-Deal method is based on this model. In the Graybill-Deal model, the estimate of the consensus mean is

$$\tilde{x} = \frac{\sum_{i=1}^{k}{x_{i}/s_{i}^2}} {\sum_{i=1}^{k}{1/s_{i}^2}}$$

Dataplot supports four methods for computing the variance of the Graybill-Deal consensus mean.

1. The naive estimate of the variance is obtained by replacing the $$\sigma_{i}^2$$ with the sample estimates $$s_i^2$$

$$\hat{Var(\tilde{x})} = \frac{1} {\sum_{i=1}^{k}{\frac{1}{s_i^2}}}$$

Although this variance is easy to compute and widely used, it is known to underestimate the true variance (and rather badly for small sample sizes).

2. Sinha proposed the variance estimator

$$\hat{Var(\tilde{x})} = \frac{1} {\sum_{i=1}^{k} {\frac{1}{s_i^2}}} \left( 1 + 4 \sum_{i=1}^{k}{\frac{\hat{w}_{i} (1 - \hat{w}_{i})}{n_{i} - 1}} \right)$$

with

$$\hat{w}_{i} = \frac{1/s_{i}^2} {\sum_{j=1}^{k}{1/s_{j}^2}}$$

Zhang performed simulations indicating that while this estimate reduces the bias, it still underestimates the true variance.

3. To reduce the bias further, Zhang proposed the following estimate for the variance

$$\hat{Var(\tilde{x})} = \frac{1} {\sum_{i=1}^{k}{\left( \frac{n_i - 3}{n_i - 1} \right) \left( \frac{1} {s_i^2} \right) }}$$

4. Zhang proposed the following additional estimate for the variance

$$\hat{Var(\tilde{x})} = \frac{1} {\sum_{i=1}^{k}{\left( \frac{n_i - 3}{n_i - 1} \right) \left( \frac{1} {s_i^2} \right) }} \left(1 + 2 \sum_{i=1}^{k} {\frac{\hat{w}_{i}(1 - \hat{w}_{i})}{n_i - 1}} \right)$$

where

$$\hat{w}_{i} = \frac{ \left( \frac{n_i - 3}{n_i - 1} \right) \left(1/s_{i}^2 \right) } {\sum_{j=1}^{k} { \left( \frac{n_j - 3}{n_j - 1} \right) \left( 1/s_{j}^{2} \right) }}$$
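
The Graybill-Deal estimate and the four variance estimates above can be sketched in Python as follows. This is a hedged illustration only: it assumes $$s_i^2$$ is the estimated variance of the lab mean (lab standard deviation squared divided by $$n_i$$), so the weights are $$1/s_i^2$$, and it requires more than three observations per lab so that the Zhang factors are positive. The function name and inputs are hypothetical.

```python
def graybill_deal(means, sds, ns):
    """means, sds, ns: per-lab mean, standard deviation, and sample size."""
    if any(n <= 3 for n in ns):
        raise ValueError("this sketch assumes n_i > 3 for every lab")

    # s_i^2: estimated variance of each lab mean.
    s2 = [sd * sd / n for sd, n in zip(sds, ns)]
    inv = [1.0 / v for v in s2]
    total = sum(inv)

    # Graybill-Deal consensus mean with weights 1 / s_i^2.
    x_gd = sum(w * x for w, x in zip(inv, means)) / total

    # 1. Naive variance.
    naive = 1.0 / total

    # 2. Sinha variance.
    w_hat = [w / total for w in inv]
    sinha = naive * (1.0 + 4.0 * sum(w * (1.0 - w) / (n - 1)
                                     for w, n in zip(w_hat, ns)))

    # 3. Zhang variance with the (n_i - 3) / (n_i - 1) adjustment.
    adj = [(n - 3) / (n - 1) * w for n, w in zip(ns, inv)]
    adj_total = sum(adj)
    zhang1 = 1.0 / adj_total

    # 4. Zhang variance with the additional correction factor.
    w_adj = [a / adj_total for a in adj]
    zhang2 = zhang1 * (1.0 + 2.0 * sum(w * (1.0 - w) / (n - 1)
                                       for w, n in zip(w_adj, ns)))

    return {"mean": x_gd, "naive": naive, "sinha": sinha,
            "zhang1": zhang1, "zhang2": zhang2}

# Hypothetical summary data: two labs with equal precision.
gd = graybill_deal([10.0, 12.0], [2.0, 2.0], [4, 4])
```

For this small example the four variance estimates increase from the naive value through the second Zhang value, which is consistent with the text's remark that the naive estimate is the most optimistic.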

Dataplot currently generates confidence intervals for the Graybill-Deal method using a method proposed by Rukhin (private communication). This method generates conservative intervals.

The Graybill-Deal approach has the following limitations:

1. It does not take into account between lab effects. If the between lab variance is in fact significant, the Graybill-Deal method may not be the appropriate approach.

2. Labs with small variances may receive unjustifiably large weights and therefore dominate the estimate of the consensus mean.

One Way Random Effects Model:
In order to account for between lab variance, we can define a one-way random effects ANOVA model which may be both unbalanced and heteroscedastic:

$$x_{ij} = \mu + b_{i} + e_{ij}$$

where there are $$i = 1, \ldots, k$$ labs and $$j = 1, \ldots, n_i$$ observations for each lab. In this model, $$\mu$$ denotes the consensus mean, $$b_i$$ is the lab effect, and $$e_{ij}$$ is the error term. The $$b_i$$ are distributed as $$N(0,\sigma^2)$$ and the $$e_{ij}$$ are distributed as $$N(0,\sigma_{i}^2)$$. That is, $$\sigma_{i}^2$$ are the within lab variances and $$\sigma^2$$ is the between lab variance.

For convenience, define the following terms:

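$$x_i$$ = mean for lab i
$$\sigma_{i}^2$$ = variance for lab i
$$\tau_{i}^{2}$$ = $$\frac{\sigma_{i}^2} {n_i}$$ (the variance of the lab mean; this was $$\sigma_i^2$$ in the common mean model)
$$v_i$$ = $$n_i - 1$$
$$\gamma_i$$ = $$\frac{\sigma^2} {\sigma^2 + \tau_{i}^2}$$
$$t_{i}^2$$ = $$\frac{s_{i}^2} {n_i}$$ (the estimated variance of the lab mean, with $$s_i^2$$ the sample variance for lab i; this was $$s_i^2$$ in the common mean model)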

The Mandel-Paule, modified Mandel-Paule, maximum likelihood (ML), DerSimonian-Laird, and generalized confidence interval methods are based on this model. We will discuss each of these in turn.

1. Mandel-Paule/Modified Mandel-Paule

The Mandel-Paule estimate of the consensus mean is defined as

$$\tilde{x} = \frac{\sum_{i=1}^{k} {w_{i}x_{i}} } {\sum_{i=1}^{k} {w_i}}$$

with $$w_i$$ denoting the weight function

$$w_{i} = \frac{1} {y + t_{i}^{2}}$$

where y is an estimate of the between lab variance.

The between lab variance is estimated by iteratively solving the following equation:

$$\sum_{i=1}^{k}{\frac{(x_{i} - \tilde{x})^2} {y + t_{i}^2}} = k - 1$$

The modified Mandel-Paule procedure uses k on the right hand side instead of (k-1).

The confidence interval for the consensus mean (equation 19 in the Rukhin and Vangel paper) is computed as

$$\tilde{x} \pm \Phi^{-1}(\alpha/2) \frac{\sqrt{\sum_{i=1}^{k} {(x_{i} - \tilde{x})^2/(y + t_{i}^2)^2}}} {\sum_{i=1}^{k}{1/(y + t_{i}^2)}}$$

The Mandel-Paule estimates can be considered an approximation to maximum likelihood estimates, but they are computationally simpler. The Mandel-Paule methods are a reasonable choice when the number of labs is greater than or equal to six. For a smaller number of labs, the uncertainty intervals are generally too small.

Dataplot uses code provided by Mark Vangel to compute the Mandel-Paule estimates.
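
The iteration can be illustrated with a minimal Python sketch. It is not Mark Vangel's code: it solves for the between lab variance y by simple bisection, on the assumption that the sum of $$(x_i - \tilde{x})^2/(y + t_i^2)$$ over the labs decreases as y grows, and it sets y to zero when that sum is already below the target at y = 0. The function name and arguments are hypothetical.

```python
import math
from statistics import NormalDist

def mandel_paule(means, t2, modified=False, alpha=0.05):
    """means: lab means x_i; t2: variances of the lab means t_i^2."""
    k = len(means)
    target = k if modified else k - 1

    def weighted_mean(y):
        w = [1.0 / (y + v) for v in t2]
        return sum(wi * xi for wi, xi in zip(w, means)) / sum(w)

    def excess(y):
        # Left side of the Mandel-Paule equation minus the target.
        xt = weighted_mean(y)
        return sum((xi - xt) ** 2 / (y + v)
                   for xi, v in zip(means, t2)) - target

    if excess(0.0) <= 0.0:
        y = 0.0  # no evidence of between lab variance
    else:
        lo, hi = 0.0, 1.0
        while excess(hi) > 0.0:  # expand until the root is bracketed
            hi *= 2.0
        for _ in range(200):     # bisection
            mid = 0.5 * (lo + hi)
            if excess(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        y = 0.5 * (lo + hi)

    xt = weighted_mean(y)
    w = [1.0 / (y + v) for v in t2]
    se = math.sqrt(sum(wi ** 2 * (xi - xt) ** 2
                       for wi, xi in zip(w, means))) / sum(w)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return xt, y, se, (xt - z * se, xt + z * se)

# Hypothetical lab means with equal variances of the mean.
x_mp, y_mp, se_mp, ci_mp = mandel_paule([9.0, 10.0, 11.0], [0.1, 0.1, 0.1])
```

Passing `modified=True` uses k instead of k - 1 on the right hand side, which gives the modified Mandel-Paule estimate.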

2. Rukhin-Vangel Maximum Likelihood (ML)

The Rukhin-Vangel paper gives the likelihood function. From this likelihood function, the ML estimate for the consensus mean is obtained from the equation

$$\tilde{x} = \frac{\sum_{i=1}^{k} {\gamma_{i}x_{i}} } {\sum_{i=1}^{k} {\gamma_i}}$$

The ML estimate of the between lab variance is obtained from the equation

$$\sigma^2 = \frac{\sum_{i=1}^{k}{(x_i - \tilde{x})^2 + \frac{v_{i}t_{i}^2}{1-\gamma_i}} } {n+k}$$

Rukhin and Vangel give a ML estimate of $$\gamma_i$$. This estimate is fairly complex and not repeated here. This ML estimate of $$\gamma_i$$ is solved numerically using an iterative algorithm. The Mandel-Paule estimates are used as starting values for the consensus mean and between lab variance.

The confidence interval for the ML estimate has the same form as the Mandel-Paule confidence interval. However, the $$t_{i}^2$$ are replaced with $$\tau_{i}^2$$ in the formula and the ML estimate of the between lab variance is used.

The mathematical details of the ML procedure are given in the Rukhin-Vangel paper. They also show why the Mandel-Paule estimates provide a good approximation to the ML estimates.

This is the recommended method of choice when the number of labs is large (>= 6). As with Mandel-Paule, the uncertainty intervals tend to be too small when the number of labs is small (<= 5).

Dataplot uses code provided by Mark Vangel to compute the maximum likelihood estimates.

3. DerSimonian-Laird

The DerSimonian-Laird procedure is as follows:

• Compute the Graybill-Deal estimate as an initial estimate of the consensus mean (see the above description for Graybill-Deal).

• Determine a non-negative estimate of the between lab variance from

YDL = MAX[0,TERM1/(TERM2 - TERM3/TERM2)]

where

TERM1 = $$\sum_{i=1}^{k} {\frac{(x_{i} - \tilde{x}_{GD})^2} {s_{i}^{2}}} - k + 1$$
TERM2 = $$\sum_{i=1}^{k} {\frac{1} {s_{i}^{2}}}$$
TERM3 = $$\sum_{i=1}^{k} {\frac{1} {s_{i}^{4}}}$$
$$\tilde{x}_{GD}$$ = Graybill-Deal estimate of the consensus mean

• Estimate the DerSimonian-Laird weights and use them to compute the DerSimonian-Laird consensus mean

$$w_{i} = \frac{1} {Y_{DL} + s_{i}^2}$$

$$\tilde{x}_{DL} = \frac{\sum_{i=1}^{k}{w_{i} x_{i}}} {\sum_{i=1}^{k}{w_{i}}}$$

• Compute the variance of the DerSimonian-Laird consensus mean estimate

$$Var(\tilde{x}_{DL}) = \sum_{i=1}^{k}{\frac{w_{i}^2 (x_{i} - \tilde{x})^2} {1 - w_{i}}}$$

The corresponding confidence interval is

$$\tilde{x}_{DL} \pm \sqrt{Var(\tilde{x}_{DL})} t_{1-\alpha/2,k-1}$$
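
The first three steps of the DerSimonian-Laird procedure (the Graybill-Deal starting value, the truncated between lab variance, and the reweighted consensus mean) can be sketched in Python as follows. The sketch assumes $$s_i^2$$ is the estimated variance of each lab mean; the variance step is left out, and the function name is hypothetical.

```python
def dersimonian_laird(means, s2):
    """means: lab means x_i; s2: estimated variances of the lab means."""
    k = len(means)

    # Step 1: Graybill-Deal estimate as the initial consensus mean.
    inv = [1.0 / v for v in s2]
    term2 = sum(inv)
    x_gd = sum(w * x for w, x in zip(inv, means)) / term2

    # Step 2: non-negative estimate of the between lab variance.
    term1 = sum((x - x_gd) ** 2 / v for x, v in zip(means, s2)) - k + 1
    term3 = sum(1.0 / (v * v) for v in s2)
    y_dl = max(0.0, term1 / (term2 - term3 / term2))

    # Step 3: DerSimonian-Laird weights and consensus mean.
    w = [1.0 / (y_dl + v) for v in s2]
    x_dl = sum(wi * xi for wi, xi in zip(w, means)) / sum(w)
    return x_dl, y_dl, x_gd

# Hypothetical lab means with equal variances of the mean.
x_dl, y_dl, x_gd = dersimonian_laird([9.0, 10.0, 11.0], [0.1, 0.1, 0.1])
```

When the lab means agree more closely than their own variances would suggest, TERM1 is negative and the MAX truncation sets the between lab variance to zero, so the result reduces to the Graybill-Deal estimate.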

The variance can also be estimated with several other methods. The Horn-Horn-Duncan (HHD) and minmax variance methods are described in the paper

Andrew Rukhin (2009), "Weighted Means Statistics in Interlaboratory Studies", Metrologia, Vol. 46, pp. 323-331.

The HHD method in particular is recommended by Rukhin. It is conservative and should maintain the nominal coverage even when the sample sizes and variances for the labs vary widely.

In addition, the variance can be estimated using a parametric bootstrap. The parametric bootstrap method used in Dataplot was developed by Antonio Possolo of the NIST Statistical Engineering Division.

Note that 95% confidence intervals for the bootstrap method can be derived in 3 different ways.

1. The 2.5 and 97.5 percentiles of the bootstrap samples can be used. Note that this interval is not symmetric. This is referred to as the percentile method.

2. The percentile method can be adjusted to create an interval that is symmetric about the estimated consensus mean. Compute the distance between the consensus mean and the 2.5 percentile of the bootstrap samples and also the distance between the consensus mean and the 97.5 percentile of the bootstrap samples. The larger of these two distances is used to determine a symmetric confidence interval.

3. Compute a kernel density curve from the bootstrap samples. Determine the 2.5 percentile and 97.5 percentile points of this kernel density curve. As in the adjusted percentile method above, these two points are then used to determine a symmetric confidence interval.
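
The first two interval constructions can be sketched in Python, given a list of bootstrap consensus means. This is an illustration only: the linear-interpolation percentile rule is an assumption (Dataplot's percentile definition may differ), the kernel density variant is not shown, and the function names are hypothetical.

```python
def percentile(sorted_vals, p):
    """Percentile by linear interpolation; p is between 0 and 100."""
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])

def bootstrap_intervals(boot_means, estimate):
    """Return the percentile interval and the symmetric interval."""
    vals = sorted(boot_means)
    lower = percentile(vals, 2.5)
    upper = percentile(vals, 97.5)

    # Method 1: percentile interval (generally not symmetric).
    percentile_ci = (lower, upper)

    # Method 2: take the larger of the two distances from the estimate
    # and use it as the half-width of a symmetric interval.
    half_width = max(estimate - lower, upper - estimate)
    symmetric_ci = (estimate - half_width, estimate + half_width)
    return percentile_ci, symmetric_ci
```

Because the symmetric interval uses the larger of the two distances, it is always at least as wide as the percentile interval.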

4. Iyer and Wang Generalized Confidence Intervals

Iyer, Wang, and Mathew have applied the generalized confidence interval approach of Weerahandi to the problem of finding confidence limits for the consensus mean.

The description of this method is rather involved and not given here. See the Iyer, Wang, and Mathew article listed in the Reference section below. Dataplot uses code provided by Jack Wang to compute the confidence intervals for this approach.

The primary advantage of this method is that it can be applied to cases where there are a small number of labs. It is also more robust than the Mandel-Paule and maximum likelihood when the normality assumptions are violated.

1. BOB (Type B on Bias)

This method is discussed in detail in the "An ISO GUM Approach to Combining Results from Multiple Methods" paper (see the Reference section).

This method should only be applied if there are between two and five methods. It is based on the type B model of bias (which is where the name BOB comes from).

BOB is based on the model

$$\gamma = \mu + \beta$$

where $$\gamma$$ is the unknown value of the measurand, $$\mu$$ is the equally weighted mean of the population means of the methods, and $$\beta$$ is the possible bias of $$\mu$$ as an estimate of $$\gamma$$. Both $$\mu$$ and $$\beta$$ require estimates and uncertainties of the estimates. The estimate of $$\mu$$ is the sample mean of the set of method (or lab) results.

We assume the best estimate of $$\beta$$ is 0, but there is uncertainty in this estimate. A probability distribution is placed on the value of $$\beta$$ that best summarizes the available information. A common choice is to use a uniform distribution (the ISO GUM paper provides the rationale for this choice), and that is the distributional model Dataplot uses. That is, we assume a uniform distribution centered at zero with upper and lower bounds of +a and -a. For this uniform distribution, the standard uncertainty is $$a/\sqrt{3}$$. The choice for a is the difference between the minimum and maximum lab (or method) mean divided by 2. This yields $$(\mbox{xmax} - \mbox{xmin})/\sqrt{12}$$ as the uncertainty for the bias term.

Dataplot combines these to get the following uncertainty factor:

$$KU = 2\sqrt{s_w^2 + s_b^2}$$

where

$$s_w^2 = \frac{ \sum_{i=1}^{\mbox{nlab}}{s_i^2}} {\mbox{nlab}^2}$$

where $$s_i$$ is the standard deviation of the ith lab mean and

$$s_b^2 = \frac{(\bar{x}_{max} - \bar{x}_{min})^2} {12}$$

Here, $$s_w^2$$ is the within lab variability (where s is the standard deviation of the lab means) and $$s_b^2$$ is the between lab variability.
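
The BOB combination above can be sketched in a few lines of Python. This is a hedged illustration using the fixed coverage factor of 2; the function name and the example values are hypothetical.

```python
import math

def bob(lab_means, sd_of_means):
    """lab_means: lab (or method) means; sd_of_means: the standard
    deviation of each lab mean (s_i in the formulas above)."""
    nlab = len(lab_means)

    # Consensus estimate: equally weighted mean of the lab means.
    mean = sum(lab_means) / nlab

    # Within lab component: sum of s_i^2 divided by nlab^2.
    s_w2 = sum(s * s for s in sd_of_means) / nlab ** 2

    # Between lab component: uniform distribution on the bias with
    # half-width (max - min) / 2, giving variance (max - min)^2 / 12.
    s_b2 = (max(lab_means) - min(lab_means)) ** 2 / 12.0

    ku = 2.0 * math.sqrt(s_w2 + s_b2)
    return mean, s_w2, s_b2, ku

# Hypothetical results from two methods.
bob_mean, bob_s2w, bob_s2b, bob_ku = bob([10.0, 10.6], [0.3, 0.4])
```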

The ISO GUM paper discusses some variations of this basic technique. For example, Dataplot uses a factor of 2 for the expanded uncertainty interval. This can be replaced with a t-value where the degrees of freedom are computed using the Welch-Satterthwaite approximation.

The BOB method is intended for the case when the number of labs is small (<= 5).

The BOB method was adapted from a Dataplot macro provided by Stefan Leigh of the NIST Statistical Engineering Division.

2. Schiller-Eberhardt

A number of variants of this method have been used. Dataplot implements the method as discussed in the Schiller-Eberhardt paper (see the Reference section below).

The Schiller-Eberhardt estimate of the consensus mean is:

$$\tilde{x} = \sum_{i=1}^{k} {\omega_{i} x_{i}}$$

where $$x_i$$ is the mean of the ith lab and $$\omega_{i}$$ is the weighting function:

$$\omega_{i} = \frac{w_i} {\sum_{j=1}^{k}{w_j}}$$

where

$$w_{i} = \frac{1} {s_{i}^{2} + s_{b}^{2}}$$

Here, $$s_b^2$$ is the between lab variance and $$s_i^2$$ is the variance of the i-th lab. The between lab variance is estimated as the smallest non-negative value that satisfies

$$\sum_{i=1}^{k} {\frac{w_{i}(x_{i} - \tilde{x})^2} {k-1}} = 1$$

This is solved iteratively using the Mandel-Paule algorithm. Note that Dataplot uses the between lab variance computed by the Mandel-Paule method described above.

The uncertainty interval for Schiller-Eberhardt is defined as

$$U = t_{(1-\alpha/2,df)} \sqrt{s_{\tilde{x}}^{2} + s_{h}^{2}} + \mbox{bias allowance}$$

where $$s_{\tilde{x}}^2$$ is the variance of the consensus mean, $$s_h^2$$ is the material variability variance, and the bias allowance is defined as

$$\mbox{bias allowance} = \max_{i}{|\bar{x}_{i} - \tilde{x}|}$$

The variance of the Schiller-Eberhardt consensus mean is computed as

$$s_{\tilde{x}}^2 = \sum_{i=1}^{k}{\omega^{2}_{i}s_{i}^{2}}$$

where $$s_i^2$$ is the variance of the i-th lab mean and the omega weight function is defined as

$$\omega_{i} = \frac{w_i} {\sum_{j=1}^{k}{w_j}}$$

$$w_{i} = \frac{1} {s_{i}^{2}}$$

Note that the weight function for $$s_{\tilde{x}}^2$$ omits the between lab variance term that is included in the weight function for the consensus mean.

The variance of the material variability is discussed in the Schiller-Eberhardt paper. This is computed independently of the data given to the CONSENSUS MEAN command. To specify a value for $$s_h^2$$, enter the following commands before entering the CONSENSUS MEAN command.

LET SIGMAH = <value>
LET DFH = <value>

SIGMAH contains the value of the variance and DFH contains the corresponding degrees of freedom for the materials variance.

The degrees of freedom for the t percent point function in the uncertainty are computed as

$$df(effective) = \frac{(\sum_{i=1}^{k} {\omega_{i}^{2}s_{i}^{2}} + s_{h}^2)^2} {\sum_{i=1}^{k}{\frac{(\omega_{i}^{2}s_{i}^{2})^2} {n_i - 1}} + \frac{s_{h}^4} {df_h} }$$
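
The pieces of the Schiller-Eberhardt uncertainty can be sketched in Python as follows. Several assumptions are made for illustration: $$s_i^2$$ is taken to be the variance of the i-th lab mean in both weight functions, the between lab variance is supplied by the caller (for example from a Mandel-Paule fit), the effective degrees of freedom use the same omega weights as the variance of the consensus mean, and the material variability term is skipped when no degrees of freedom are given. The function name is hypothetical.

```python
def schiller_eberhardt_pieces(means, s2, ns, s_b2, s_h2=0.0, df_h=None):
    """means: lab means; s2: variances of the lab means; ns: sample
    sizes; s_b2: between lab variance; s_h2, df_h: material variability
    variance and its degrees of freedom (optional)."""
    # Consensus mean: weights include the between lab variance.
    w = [1.0 / (v + s_b2) for v in s2]
    x_tilde = sum(wi * xi for wi, xi in zip(w, means)) / sum(w)

    # Variance of the consensus mean: weights omit the between lab term.
    w0 = [1.0 / v for v in s2]
    w0_total = sum(w0)
    parts = [(wi / w0_total) ** 2 * v for wi, v in zip(w0, s2)]
    var_x = sum(parts)

    # Bias allowance: largest distance of a lab mean from the consensus.
    bias = max(abs(xi - x_tilde) for xi in means)

    # Effective degrees of freedom (Welch-Satterthwaite form).
    denom = sum(p * p / (n - 1) for p, n in zip(parts, ns))
    if df_h is not None:
        denom += s_h2 ** 2 / df_h
    df_eff = (var_x + s_h2) ** 2 / denom
    return x_tilde, var_x, bias, df_eff

# Hypothetical summary data for two labs of five observations each.
se_mean, se_var, se_bias, se_df = schiller_eberhardt_pieces(
    [10.0, 12.0], [1.0, 1.0], [5, 5], s_b2=0.5)
```

The uncertainty U is then the t percent point value at the effective degrees of freedom, multiplied by the square root of the sum of the two variances, plus the bias allowance.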

This method has been superseded by the BOB method. As with BOB, the Schiller-Eberhardt method is intended for a small number of labs (<= 5).

3. Fairweather

This implements the method described in

Fairweather (1972), "A Method for Obtaining an Exact Confidence Interval for the Common Mean of Several Normal Populations", Applied Statistics, Vol. 21, pp. 229-233.

Cox (2002), "The Evaluation of Key Comparison Data", Metrologia, Vol. 39, pp. 589-595.

The Dataplot code for the Fairweather procedure was adapted from a Matlab script provided by Andrew Rukhin.

4. Bayesian Consensus Procedure

This implements the method described in

Hagwood and Guthrie (2006), "Combining Data in Small Multiple-Methods Studies", Technometrics, Vol. 48, No. 2.

This method is an alternative to BOB for the case where there are a small number of labs (typically 5 or less).

Data Analytic Considerations:
In determining the most appropriate estimate of the consensus mean, the following issues need to be addressed.

1. What is the definition of the consensus mean? Is this a lab-independent number which represents an absolute physical truth, or is it a lab-dependent "average" across all participating labs?

2. How many labs are there? Some methods are more appropriate for a small number of labs while others are based on asymptotic results and are thus more appropriate for a larger number of labs.

3. Do between-lab differences (biases) exist?

4. Are there differences in within-lab variation?

5. Are there differences in within lab sample sizes?

6. Does a lab have more data only because its method is cheaper, and thus potentially of poorer quality than the methods of other labs?

7. Are all labs treated equally?

8. Do "star" labs exist? That is, labs that are known to be either super unbiased or super accurate.

9. If a lab that is considered equal on engineering grounds tests out to be a statistically outlying lab, how (a priori) is that lab to be weighted?

Answers to the above questions will determine how to appropriately weight the labs. The consensus mean will be a weighted mean of the lab means. The weighting can be either fixed (i.e., equal weights) or variable where the variable weights can be based on both engineering and statistical considerations.

If the engineering decision is made to treat all labs as equal in importance, then from a statistical point of view the analysis consists primarily of the following two steps:

1. estimation of a consensus mean;
2. estimation of uncertainty limits for the consensus mean.

An additional third step is to carry out formal statistical tests to identify potentially outlying labs. A statistically unresolvable issue persists here: the fact that a lab appears "different" does not necessarily mean that the lab is wrong (i.e., biased). The possibility that all of the mutually consistent labs are well-behaved but biased is real, and it can only be addressed by engineering judgement.

Description of the Dataplot Input:
Dataplot can accept data in either one of the following formats:

1. Raw Data - there should be two columns of data. The first column contains the response values and the second column contains the corresponding lab-id. The data do not need to be sorted by lab-id.

If your data is in the form where each lab is contained in a separate column, you can do something like the following

LET Y LABID = STACK Y1 Y2 Y3 Y4 Y5 Y6
CONSENSUS MEAN Y LABID

This example will take the data for six labs stored in Y1, Y2, Y3, Y4, Y5, and Y6 and save it in the variables Y and LABID in a format that can be used by the CONSENSUS MEAN command.

2. Summary Data - there should be three columns of data. The first column contains the sample means for the labs, the second column contains the sample standard deviations for the labs, and the third column contains the sample sizes for the labs.
Description of Dataplot Output:
Dataplot generates the following four sections of output for the consensus means analysis.

1. The first section prints summary information about the data. Specifically, it prints the overall mean, the number of observations, and the number of labs. It also prints a table giving the sample size, mean, variance, standard deviation, and standard deviation of the mean for each lab. It then prints the pooled within lab variance (and standard deviation). The pooled within lab variance is computed as

$$s_{w}^2 = \frac{\sum_{i=1}^{k}{(n_i - 1) s_{i}^2}} {\sum_{i=1}^{k}{(n_i - 1)}}$$

with $$s_i^2$$ denoting the variance of the ith lab.

2. The second section prints the detailed output for each method.

3. The third section prints a summary table containing the 95% confidence limits for each method.

4. The fourth section prints summary tables containing the uncertainty and the percent relative uncertainty. Separate tables are printed for standard uncertainty (k = 1) and expanded uncertainty (k = 2).
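
The pooled within lab variance printed in the first section of the output can be checked by hand with a short Python sketch. It is an illustration of the formula only; the function name and example values are hypothetical.

```python
def pooled_within_lab_variance(variances, ns):
    """variances: sample variance of each lab; ns: lab sample sizes."""
    # Weight each lab variance by its degrees of freedom (n_i - 1).
    numerator = sum((n - 1) * v for v, n in zip(variances, ns))
    denominator = sum(n - 1 for n in ns)
    return numerator / denominator

# Hypothetical example: two labs with 3 and 5 observations.
s2_pool = pooled_within_lab_variance([1.0, 4.0], [3, 5])
```
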
Syntax 1:
CONSENSUS MEANS <y> <tag>      <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<tag> is a lab id variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes the consensus means based on the raw data.

Syntax 2:
CONSENSUS MEANS <ymean> <ysd> <ni>
<SUBSET/EXCEPT/FOR qualification>
where <ymean> is a variable containing the lab means;
<ysd> is a variable containing the lab standard deviations;
<ni> is a variable containing the lab sample sizes;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes the consensus means based on the lab means, standard deviations, and sample sizes.

Syntax 3:
CONSENSUS MEANS <ymean> <ysd> <ni> <labid>
<SUBSET/EXCEPT/FOR qualification>
where <ymean> is a variable containing the lab means;
<ysd> is a variable containing the lab standard deviations;
<ni> is a variable containing the lab sample sizes;
<labid> is a variable containing the lab-id (numeric values);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes the consensus means based on the mean, standard deviation, and sample size for each lab. The <labid> is used for identification purposes and is not used in the computations.

Examples:
CONSENSUS MEANS Y1 GROUP
CONSENSUS MEANS Y1 GROUP SUBSET GROUP > 2
CONSENSUS MEANS YMEAN YSD NI
Note:
Variations of the above methods are commonly used. For example, the uncertainty intervals may be modified to incorporate external sources of uncertainty. For that reason, the following internal parameters are automatically saved when the CONSENSUS MEAN command is entered. You can use these parameters to compute many variations of the above methods.

XGRAND = the overall mean.
S2POOL = the pooled within lab variance.
YBARSD1 = the standard deviation of the lab means.
YBARSD2 = the standard deviation of the lab means where the standard deviation is computed as a deviation from the grand mean rather than a deviation from the mean of the lab means.
T1STDERR = the standard error for the t method using all the data.
T2STDERR = the standard error for the t method using the lab means.
SEMEAN = the consensus mean using the Schiller-Eberhardt method.
SES2 = the variance of the consensus mean using the Schiller-Eberhardt method.
BIASALLO = the bias allowance for the Schiller-Eberhardt method.
SEDF = the degrees of freedom for the Schiller-Eberhardt method.
MPMEAN = the consensus mean using the Mandel-Paule method.
MPS2 = the between lab variance using the Mandel-Paule method.
SEMP = the standard error of the Mandel-Paule method.
MMPMEAN = the consensus mean using the modified Mandel-Paule method.
MMPS2 = the between lab variance using the modified Mandel-Paule method.
SEMMP = the standard error of the modified Mandel-Paule method.
MLMEAN = the consensus mean using the Rukhin-Vangel maximum likelihood method.
MLS2 = the between lab variance using the Rukhin-Vangel maximum likelihood method.
SEML = the standard error of the Rukhin-Vangel maximum likelihood method.
BOBMEAN = the consensus mean for the BOB method.
BOBS2 = the between lab variance for the BOB method.
BOBS2W = the within lab variance for the BOB method.
BOBKU = the uncertainty value for the BOB method.
GDMEAN = the consensus mean using the Graybill-Deal method.
GDS2 = the variance of the Graybill-Deal consensus mean.
GCIMEAN = the consensus mean using the generalized confidence interval approach.
GCISE = the standard error of the generalized confidence interval consensus mean.
DERSMEAN = the consensus mean using the DerSimonian-Laird approach.
DERSVARI = the variance of the DerSimonian-Laird consensus mean.
DERSSE = the standard error of the DerSimonian-Laird consensus mean.
YDL = the between lab variance for the DerSimonian-Laird consensus mean.
DERS95LL = the lower 95% confidence limit for the DerSimonian-Laird consensus mean based on the original formula for the variance.
DERS95UL = the upper 95% confidence limit for the DerSimonian-Laird consensus mean based on the original formula for the variance.
DHHD95LL = the lower 95% confidence limit for the DerSimonian-Laird consensus mean based on the Horn-Horn-Duncan formula for the variance.
DHHD95UL = the upper 95% confidence limit for the DerSimonian-Laird consensus mean based on the Horn-Horn-Duncan formula for the variance.
DERSSEHD = the standard error of the DerSimonian-Laird consensus mean based on the Horn-Horn-Duncan variance.
DERSSERU = the standard error of the DerSimonian-Laird consensus mean based on the minmax variance.
DERSSEBS = the standard error of the DerSimonian-Laird consensus mean based on the bootstrap.
DERSBOK2 = the coverage factor for a 95% confidence interval for the DerSimonian-Laird consensus mean based on the percentiles of the bootstrap.
DERSBOK2 = the coverage factor for a 95% confidence interval for the DerSimonian-Laird consensus mean based on a kernel density of the bootstrap.
FAIRMEAN = the consensus mean using the Fairweather method.
FAIRSE = the standard error of the Fairweather consensus mean.
BCPMEAN = the consensus mean using the Bayesian consensus procedure method.
BCPSE = the standard error of the Bayesian consensus procedure consensus mean.
MEDOFMEA = the consensus mean using the median of means method.
MEDMEASE = the standard error of the median of means method.
H15OFMEA = the consensus mean using the Huber mean (H15) of means method.
H15MEASE = the standard error of the Huber mean of means method.
Note:
To allow additional analysis, a number of results are written to external files.

The following variables are written to the file dpst1f.dat. These are the statistics for the labs.

1. Lab ID
2. Number of Observations for Lab
3. Mean for the Lab
4. Variance for the Lab
5. Standard Deviation for the Lab
6. Standard Deviation of Mean of the Lab

The following variables are written to the file dpst2f.dat. This is the information contained in table 2 of the CONSENSUS MEAN output. These variables can be used to make plots of the consensus mean results.

1. Consensus mean for the method
2. Lower 95% confidence limit for the method
3. Upper 95% confidence limit for the method
4. Method id

The following variables are written to the file dpst3f.dat. This is the information contained in table 3 of the CONSENSUS MEAN output. These variables can be used to generate plots of the consensus mean results.

1. Consensus mean for the method
2. Standard uncertainty (k = 1) for the method
3. Percentage relative standard uncertainty
4. Method id

The following variables are written to the file dpst4f.dat. This is the information contained in table 4 of the CONSENSUS MEAN output. These variables can be used to generate plots of the consensus mean results.

1. Consensus mean for the method
2. Expanded uncertainty (k = 2) for the method
3. Percentage relative expanded uncertainty
4. Method id

The following variables are written to the file dpst5f.dat.

1. Simulated consensus means from the generalized confidence interval approach
Note:
By default, the 4 tables are generated using an F15.7 format. This uses a column width of 15 with 7 digits to the right of the decimal point. You can specify the number of digits to the right of the decimal point with the command

SET WRITE DECIMALS <value>

If you want to use an exponential format (E15.7), enter

SET WRITE DECIMALS -7
Note:
You can optionally generate the CONSENSUS MEANS output in HTML, LaTeX, or Rich Text Format (RTF). Enter

HELP CAPTURE HTML
HELP CAPTURE LATEX

for details.
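
For example, a minimal sketch of sending the output to an HTML file. It assumes the usual CAPTURE ... END OF CAPTURE bracketing described in the CAPTURE help. The file name consensus.htm and the variable names Y and X are placeholders.

```
. Write the CONSENSUS MEAN tables to an HTML file
CAPTURE HTML consensus.htm
CONSENSUS MEAN Y X
END OF CAPTURE
```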

Note:
For the raw data case, if the number of labs is greater than 6 the Bayesian consensus procedure method is automatically suppressed.

Although the BOB procedure is not recommended when there are more than five laboratories, it is not automatically suppressed in this case.

The Fairweather method requires that each lab have a minimum of five measurements. If at least one lab has five or fewer measurements, then the Fairweather method is automatically suppressed.

Note:
If you have labs with either a single observation or with a zero standard deviation (i.e., all measurements are the same for that lab), this can cause problems for some of the methods.

If this situation is encountered, the following methods can still use the data from that lab:

1. grand mean
2. mean of means
3. BOB
4. Bayesian consensus procedure

For the remaining methods, these labs will be automatically omitted from the consensus means analysis.

Note:
The CONSENSUS MEAN command typically works with the mean, standard deviation, and sample size of each lab. If you have raw data (i.e., response variable and group-id variable), these are automatically computed.

If you have summary data, it may not always be available in this form. Specifically, the following types of summary data are sometimes encountered.

1. You may have a standard uncertainty with an associated degrees of freedom. For example, the uncertainty may incorporate type B components. In some cases, an associated degrees of freedom may not be available.

2. In some cases, the uncertainty may be $$s/\sqrt{n}$$. In these cases, the value of n may or may not be available. If it is, then you should convert your uncertainty to a standard deviation of the data and a sample size before running the CONSENSUS MEAN command. However, in some cases the sample size may not be available.

To address these cases, the summary data may be entered in the following ways.

1. If the sample size is given as a negative value, then the absolute value of the sample size will be interpreted as the "effective degrees of freedom" and the standard deviation column will be interpreted as $$s/\sqrt{n}$$.

If your uncertainty is equivalent to a standard deviation (i.e., s rather than $$s/\sqrt{n}$$), then enter the positive value of the effective degrees of freedom in the sample size column.

2. If the sample size is given as zero, this means that no effective degrees of freedom are available.

Many of the formulas for consensus means contain

$$n_{i} -c$$

terms where c denotes a constant (frequently c = 1). Methods that have terms like this will not be supported.

Your data may contain a mix of labs, some reporting a standard deviation and sample size and others reporting a standard uncertainty. This is allowed, but the following methods will be suppressed if any of the sample sizes has a non-positive value:

1. Vangel-Rukhin
2. generalized confidence intervals
3. Schiller-Eberhardt
4. Bayesian consensus procedure
5. Fairweather
6. grand mean
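
The following sketch shows how such mixed summary data might be entered. All values are hypothetical. Labs 1 and 2 report a standard deviation and sample size, lab 3 reports a standard uncertainty (standard deviation of the mean) with 8 effective degrees of freedom, and lab 4 reports a standard uncertainty with no degrees of freedom available.

```
. Columns: lab mean, SD or standard uncertainty, sample size code
read mx sx nx
10.12   0.30    5
10.31   0.42    6
10.05   0.15   -8
10.20   0.20    0
end of data
.
consensus mean mx sx nx
```

Because two of the sample size entries are non-positive, the methods in the list above would be expected to be suppressed for this data.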
Note:
The CONSENSUS MEANS command is typically used to compute Type A components of uncertainty. In some cases, you may have additional type B components of uncertainty. There are several ways to address this.

1. You can simply add the standard type B error to the sample standard deviations.

2. You can add the type B components to the type A (the sample standard deviation) by summing in quadrature to obtain a standard uncertainty. Then enter the sample size as zero. That is, the standard deviation column will be interpreted as $$s_{i}/\sqrt{n_{i}}$$. Although this is preferable to method 1, it will restrict the methods that are available.

3. You can sum in quadrature to obtain the $$s_{i}/\sqrt{n_{i}}$$ estimate. Instead of using zero degrees of freedom, use Welch-Satterthwaite to obtain an effective degrees of freedom. Enter the negative of the effective degrees of freedom for the sample size column. This allows all methods to be used. However, in some cases insufficient information may be available to determine the effective degrees of freedom.
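
As a worked illustration of options 2 and 3, consider the following hypothetical numbers. The Welch-Satterthwaite formula is quoted in its usual form and is not taken from the Dataplot source. Suppose a lab reports s = 0.36 from n = 3 measurements, so that the type A standard uncertainty is $$u_{A} = 0.36/\sqrt{3} = 0.2078$$ with $$\nu_{A} = 2$$ degrees of freedom, together with a type B standard uncertainty $$u_{B} = 0.10$$ that is treated as having infinite degrees of freedom. Then

$$u_{c} = \sqrt{u_{A}^{2} + u_{B}^{2}} = \sqrt{0.0432 + 0.0100} = 0.2307$$

$$\nu_{eff} = \frac{u_{c}^{4}}{u_{A}^{4}/\nu_{A} + u_{B}^{4}/\nu_{B}} = \frac{(0.0532)^{2}}{(0.0432)^{2}/2} \approx 3.03$$

Under option 3, this lab would be entered with 0.2307 in the standard deviation column and the negative of the effective degrees of freedom (about -3) in the sample size column. Under option 2, the same 0.2307 would be entered with a sample size of 0.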
Note:
In some cases, it may be convenient to extract the value of a particular consensus mean statistic.

If you have raw data, you can enter one of the following

LET A = DERSIMONIAN LAIRD Y X
LET A = DERSIMONIAN LAIRD STANDARD ERROR Y X
LET A = DERSIMONIAN LAIRD HHD Y X
LET A = DERSIMONIAN LAIRD MINMAX Y X
LET A = MANDEL PAULE Y X
LET A = MANDEL PAULE STANDARD ERROR Y X
LET A = MODIFIED MANDEL PAULE Y X
LET A = MODIFIED MANDEL PAULE STANDARD ERROR Y X
LET A = VANGEL RUKHIN Y X
LET A = VANGEL RUKHIN STANDARD ERROR Y X
LET A = GENERALIZED CONFIDENCE INTERVAL Y X
LET A = GENERALIZED CONFIDENCE INTERVAL STANDARD ERROR Y X
LET A = BOB Y X
LET A = BOB STANDARD ERROR Y X
LET A = BCP Y X
LET A = BCP STANDARD ERROR Y X
LET A = MEAN OF MEANS Y X
LET A = MEAN OF MEANS STANDARD ERROR Y X
LET A = FAIRWEATHER Y X
LET A = FAIRWEATHER STANDARD ERROR Y X
LET A = SCHILLER-EBERHARDT Y X
LET A = SCHILLER-EBERHARDT STANDARD ERROR Y X
LET A = GRAYBILL DEAL Y X
LET A = GRAYBILL DEAL SINHA STANDARD ERROR Y X
LET A = GRAYBILL DEAL NAIVE STANDARD ERROR Y X
LET A = GRAYBILL DEAL ZHANG ONE STANDARD ERROR Y X
LET A = GRAYBILL DEAL ZHANG TWO STANDARD ERROR Y X
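
As an illustration, the following sketch extracts the DerSimonian-Laird estimate and its standard error for the data file used in Program 1 and forms an approximate k = 2 interval. The parameter names (DLMEAN, DLSE, LOWK2, UPPK2) are arbitrary choices for this example.

```
SKIP 25
READ STUTZ86.DAT ALITE JUNK2 JUNK3 JUNK4 JUNK5 LABID
.
LET DLMEAN = DERSIMONIAN LAIRD ALITE LABID
LET DLSE = DERSIMONIAN LAIRD STANDARD ERROR ALITE LABID
. Approximate interval using a coverage factor of k = 2
LET LOWK2 = DLMEAN - 2*DLSE
LET UPPK2 = DLMEAN + 2*DLSE
PRINT DLMEAN DLSE LOWK2 UPPK2
```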

If you have summary data, you can enter one of the following

LET A = SUMMARY DERSIMONIAN LAIRD MEAN SD N
LET A = SUMMARY DERSIMONIAN LAIRD STANDARD ERROR MEAN SD N
LET A = SUMMARY DERSIMONIAN LAIRD HHD MEAN SD N
LET A = SUMMARY DERSIMONIAN LAIRD MINMAX MEAN SD N
LET A = SUMMARY MANDEL PAULE MEAN SD N
LET A = SUMMARY MANDEL PAULE STANDARD ERROR ...
MEAN SD N
LET A = SUMMARY MODIFIED MANDEL PAULE MEAN SD N
LET A = SUMMARY MODIFIED MANDEL PAULE STANDARD ERROR MEAN SD N
LET A = SUMMARY VANGEL RUKHIN MEAN SD N
LET A = SUMMARY VANGEL RUKHIN STANDARD ERROR MEAN SD N
LET A = SUMMARY GENERALIZED CONFIDENCE INTERVAL MEAN SD N
LET A = SUMMARY GENERALIZED CONFIDENCE INTERVAL ...
STANDARD ERROR MEAN SD N
LET A = SUMMARY BOB MEAN SD N
LET A = SUMMARY BOB STANDARD ERROR MEAN SD N
LET A = SUMMARY BCP MEAN SD N
LET A = SUMMARY BCP STANDARD ERROR MEAN SD N
LET A = SUMMARY MEAN OF MEANS MEAN SD N
LET A = SUMMARY MEAN OF MEANS STANDARD ERROR MEAN SD N
LET A = SUMMARY FAIRWEATHER MEAN SD N
LET A = SUMMARY FAIRWEATHER STANDARD ERROR MEAN SD N
LET A = SUMMARY SCHILLER-EBERHARDT MEAN SD N
LET A = SUMMARY SCHILLER-EBERHARDT STANDARD ERROR MEAN SD N
LET A = SUMMARY GRAYBILL DEAL MEAN SD N
LET A = SUMMARY GRAYBILL DEAL SINHA STANDARD ERROR MEAN SD N
LET A = SUMMARY GRAYBILL DEAL NAIVE STANDARD ERROR MEAN SD N
LET A = SUMMARY GRAYBILL DEAL ZHANG ONE STANDARD ERROR ...
MEAN SD N
LET A = SUMMARY GRAYBILL DEAL ZHANG TWO STANDARD ERROR ...
MEAN SD N

Dataplot statistics can be used in a number of other commands. For details, enter

For the SUMMARY cases, bootstrapping is not currently supported. However, we anticipate adding this capability in a subsequent release.

Note:
Dataplot supports a large number of methods for determining a consensus mean. In most cases, you will only be interested in a few of these. The following commands can be used to select which methods will be used on subsequent CONSENSUS MEANS commands.

SET MANDEL PAULE - default is ON
SET MODIFIED MANDEL PAULE - default is ON
SET VANGEL RUHKIN - default is ON
SET DERSIMONIAN LAIRD - default is ON
SET DERSIMONIAN LAIRD HHD - default is ON
SET DERSIMONIAN LAIRD MINMAX - default is OFF
SET DERSIMONIAN LAIRD BOOTSTRAP - default is OFF
SET GRAYBILL DEAL - default is ON
SET GENERALIZED CONFIDENCE INTERVAL - default is ON
SET FAIRWEATHER - default is OFF
SET MEAN OF MEANS - default is ON
SET GRAND MEAN - default is ON
SET BOB - default is ON
SET SCHILLER EBERHARDT - default is OFF
SET BAYESIAN CONSENSUS PROCEDURE - default is OFF
SET MEDIAN OF MEANS - default is OFF
SET HUBER MEAN OF MEANS - default is OFF

The following commands are available, but are for methods that are still under development. These commands should not currently be used.

SET VANGEL RUHKIN BOOTSTRAP - default is OFF
SET MEDIAN OF MEANS - default is OFF
SET TRIMMED MEAN OF MEANS - default is OFF
Default:
None
Synonyms:
None
Related Commands:
CONSENSUS MEAN PLOT = Generate a consensus mean plot.
MEAN PLOT = Generate a mean plot.
SD PLOT = Generate a standard deviation plot.
YOUDEN PLOT = Generate a Youden plot.
ANOVA = Perform an analysis of variance.
References:
DerSimonian and Laird (1986), "Meta-analysis in Clinical Trials", Controlled Clinical Trials, 7, pp. 177-188.

Graybill and Deal (1959), "Combining Unbiased Estimators", Biometrics, 15, pp. 543-550.

M. S. Levenson, D. L. Banks, K. R. Eberhardt, L. M. Gill, W. F. Guthrie, H. K. Liu, M. G. Vangel, J. H. Yen, and N. F. Zhang (2000), "An ISO GUM Approach to Combining Results from Multiple Methods", Journal of Research of the National Institute of Standards and Technology, Volume 105, Number 4.

John Mandel and Robert Paule (1970), "Interlaboratory Evaluation of a Material with Unequal Number of Replicates", Analytical Chemistry, 42, pp. 1194-1197.

Robert Paule and John Mandel (1982), "Consensus Values and Weighting Factors", Journal of Research of the National Bureau of Standards, 87, pp. 377-385.

Andrew Rukhin (2009), "Weighted Means Statistics in Interlaboratory Studies", Metrologia, Vol. 46, pp. 323-331.

Andrew Rukhin (2003), "Two Procedures of Meta-analysis in Clinical Trials and Interlaboratory Studies", Tatra Mountains Mathematical Publications, 26, pp. 155-168.

Andrew Rukhin and Mark Vangel (1998), "Estimation of a Common Mean and Weighted Means Statistics", Journal of the American Statistical Association, Vol. 93, No. 441.

Andrew Rukhin, B. Biggerstaff, and Mark Vangel (2000), "Restricted Maximum Likelihood Estimation of a Common Mean and Mandel-Paule Algorithm", Journal of Statistical Planning and Inference, 83, pp. 319-330.

Mark Vangel and Andrew Rukhin (1999), "Maximum Likelihood Analysis for Heteroscedastic One-Way Random Effects ANOVA in Interlaboratory Studies", Biometrics, 55, pp. 129-136.

Susannah Schiller and Keith Eberhardt (1991), "Combining Data from Independent Analysis Methods", Spectrochimica Acta, 46 (12).

Susannah Schiller (1996), "Standard Reference Materials: Statistical Aspects of the Certification of Chemical SRMs", NIST SP 260-125, NIST, Gaithersburg, MD.

Bimal Kumar Sinha (1985), "Unbiased Estimation of the Variance of the Graybill-Deal Estimator of the Common Mean of Several Normal Populations", The Canadian Journal of Statistics, Vol. 13, No. 3, pp. 243-247.

Nien-Fan Zhang (2006), "The Uncertainty Associated with The Weighted Mean of Measurement Data", Metrologia, 43, pp. 195-204.

Hagwood and Guthrie (2006), "Combining Data in Small Multiple-Methods Studies", Technometrics, Vol. 48, No. 2.

Iyer, Wang, and Mathew (2004), "Models and Confidence Intervals for True Values in Interlaboratory Trials", Journal of the American Statistical Association, Vol. 99, No. 468, pp. 1060-1071.

Fairweather (1972), "A Method for Obtaining an Exact Confidence Interval for the Common Mean of Several Normal Populations", Applied Statistics, Vol. 21, pp. 229-233.

Cox (2002), "The Evaluation of Key Comparison Data", Metrologia, Vol. 39, pp. 589-595.

"CCQM Guidance note: Estimation of a Consensus KCRV and associated Degrees of Equivalence", Version 10, 2013.

Applications:
Interlaboratory Studies
Implementation Date:
2000/10
2002/10: Support for Latex and HTML output
2006/3: Reformat output for consistency and clarity
Add Tables 3 and 4 to the output
Updated the Graybill-Deal method
Added the generalized confidence intervals method
Added support for Rich Text Format (RTF) output
Added support for SET WRITE DECIMALS
2006/06: Added the Bayesian Consensus Procedure method
2010/06: Five methods can use labs with zero standard deviations
2011/11: For summary data, add optional lab-id variable
2014/10: For summary data, option to input mean and uncertainty (i.e., s/sqrt(n)) instead of s and n. Not all methods supported for this case.
2017/03: Added support for median of means and Huber mean of means methods.
2017/07: Changed the default for Schiller-Eberhardt, Fairweather, and Bayesian consensus procedure to OFF.
Program 1:

SKIP 25
READ STUTZ86.DAT ALITE JUNK2 JUNK3 JUNK4 JUNK5 LABID
.
FEEDBACK OFF
CONSENSUS MEANS ALITE LABID

The following output is generated:
            Consensus Means Analysis
(Full Sample Case)

Data Summary:
Response Variable: ALITE
Lab-ID Variable: LABID
Number of Observations:                  46
Grand Mean:                              57.22609
Grand Standard Deviation:                1.42742
Total Number of Labs:                    5
Minimum Lab Mean:                        56.50000
Maximum Lab Mean:                        61.20000
Minimum Lab SD:                          0.14142
Maximum Lab SD:                          1.68003
Mean of Lab Means:                       58.59556
SD of Lab Means:                         2.05321
SD of Lab Means (wrt to grand mean):     2.56125
Within Lab (pooled) SD:                  0.83691
Within Lab (pooled) Variance:            0.70042

Table 1: Summary Statistics by Lab

----------------------------------------------------------------------------
                                                                Standard
 Lab                                             Standard      Deviation
  ID    n(i)           Mean       Variance      Deviation    of the Mean
----------------------------------------------------------------------------
   1      36       56.75278        0.55228        0.74315        0.12386
   2       4       58.42500        2.82250        1.68003        0.84001
   3       2       56.50000        0.18000        0.42426        0.30000
   4       2       60.10000        0.02000        0.14142        0.10000
   5       2       61.20000        0.72000        0.84853        0.60000

1. Method: Mandel-Paule
Estimate of (unscaled) Consensus Mean: 58.56633
Estimate of (scaled) Consensus Mean:   0.43964
Between Lab Variance (unscaled):       4.04657
Between Lab SD (unscaled):             2.01161
Between Lab Variance (scaled):         0.18319
Standard Deviation of Consensus Mean:  0.83173
Standard Uncertainty (k = 1):          0.83173
Expanded Uncertainty (k = 2):          1.66345
Expanded Uncertainty (k =  1.9599640): 1.63016
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   56.93617
Upper 95% (normal) Confidence Limit:   60.19648
Note: Mandel-Paule Best Usage:
6 or More Labs:

2. Method: Modified Mandel-Paule
Estimate of (unscaled) Consensus Mean:  58.55906
Estimate of (scaled) Consensus Mean:    0.43810
Between Lab Variance (unscaled):        3.20461
Between Lab SD (unscaled):              1.79014
Between Lab Variance (scaled):          0.14507
Standard Deviation of Consensus Mean:   0.83388
Standard Uncertainty (k = 1):           0.83388
Expanded Uncertainty (k = 2):           1.66775
Expanded Uncertainty (k =  1.9599640):  1.63437
Normal PPF of 0.975:                    1.95996
Lower 95% (normal) Confidence Limit:    56.92470
Upper 95% (normal) Confidence Limit:    60.19343
Note: Modified Mandel-Paule Best Usage:
6 or More Labs:

3. Method: Vangel-Rukhin Maximum Likelihood
Estimate of (unscaled) Consensus Mean: 58.55346
Estimate of (scaled) Consensus Mean:   0.43691
Between Lab Variance (unscaled):       3.23124
Between Lab SD (unscaled):             1.79756
Between Lab Variance (scaled):         0.14628
Standard Deviation of Consensus Mean:  0.83064
Standard Uncertainty (k = 1):          0.83064
Expanded Uncertainty (k = 2):          1.66128
Expanded Uncertainty (k =  1.9599640): 1.62802
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   56.92544
Upper 95% (normal) Confidence Limit:   60.18148
Note: Vangel-Rukhin Maximum Likelihood
Best Usage: 6 or More Labs

4a. Method: DerSimonian Laird (original variance)
Estimate of Consensus Mean:             58.55450
Estimate of Variance of Consensus Mean: 0.60832
Estimate of Between Lab Variance:       2.82722
Standard Uncertainty (k = 1):           0.77995
Expanded Uncertainty (k = 2):           1.55990
Degrees of Freedom:                     4
t Percent Point Value:                  2.77645
Lower 95% (t-value) Confidence Limit:   56.38900
Upper 95% (t-value) Confidence Limit:   60.71999
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

4b. Method: DerSimonian Laird - Horn-Horn-Duncan Variance
Estimate of Consensus Mean:             58.55450
Estimate of Variance of Consensus Mean: 0.87653
Estimate of Between Lab Variance:       2.82722
Standard Uncertainty (k = 1):           0.93623
Expanded Uncertainty (k = 2):           1.87246
Degrees of Freedom:                     4
t Percent Point Value:                  2.77645
Lower 95% (t-value) Confidence Limit:   55.95511
Upper 95% (t-value) Confidence Limit:   61.15389
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

5. Method: Graybill-Deal
Estimate of Consensus Mean:           58.67330
Estimate of Variance (Naive):         0.00554
Standard Uncertainty (Naive) (k = 1): 0.07443
Expanded Uncertainty (Naive) (k = 2): 0.14887
Lower 95% (Rukhin) Confidence Limit:  54.47558
Upper 95% (Rukhin) Confidence Limit:  62.87103
Note: Graybill-Deal Best Usage:
Any Number of Labs,
but no Between Lab Variance

7. Method: Generalized Confidence Intervals
Estimate of Consensus Mean:                       58.45256
Standard Uncertainty (k = 1):                     1.27926
Expanded Uncertainty (k = 2):                     2.55853
Lower 95% (Simulation) Confidence Limit:          55.96620
Upper 95% (Simulation) Confidence Limit:          61.00745
Note: Generalized Confidence Interval Best Usage:
Any Number of Labs:

8. Method: Grand Mean (No Lab Effect)
Mean of All Data:                      57.22609
Standard Deviation of All Data:        2.05321
SD of Consensus Mean (sd/sqrt(n)):     0.30273
Standard Uncertainty (k = 1):          0.30273
Expanded Uncertainty (k = 2):          0.60546
Expanded Uncertainty (k =  2.0141034): 0.60973
Degrees of Freedom:                    45
t Percent Point Value (alpha = 0.05)   2.01410
Lower 95% (t-value) Confidence Limit:  56.61636
Upper 95% (t-value) Confidence Limit:  57.83582
Note: Grand Mean Best Usage:
Any Number of Labs, but no
Lab-to-Lab Differences

9. Method: Mean of Means
Mean of Lab Means:                     58.59556
Standard Deviation of Lab Means:       2.05321
Standard Uncertainty (sd/sqrt(n)):     0.91823
SD of Consensus Mean (sd/sqrt(n)):     0.91823
Standard Uncertainty (k = 1):          0.91823
Expanded Uncertainty (k = 2):          1.83645
Expanded Uncertainty (k =  2.7764451): 2.54940
Degrees of Freedom:                    4
t Percent Point Value (alpha = 0.05):  2.77645
Lower 95% (normal) Confidence Limit:   56.04615
Upper 95% (normal) Confidence Limit:   61.14496
Note: Mean of Means Best Usage:
Any Number of Labs:

11. Method: BOB (Bound on Bias)
Estimate of Consensus Mean:          58.59556
Within Lab Uncertainty:              0.21734
Between Lab Uncertainty:             1.35677
Standard Uncertainty (k = 1):        1.37407
Expanded Uncertainty (k = 2):        2.74814
Lower 95% (k = 2) Confidence Limit:  55.84741
Upper 95% (k = 2) Confidence Limit:  61.34370
Note: BOB Best Usage:
5 or Fewer Labs:

12. Method:Schiller-Eberhardt
Estimate of Consensus Mean:            58.59083
Estimate of Variance of Mean:          0.01692
Bias Allowance:                        2.60917
Sigmah (heterogeneity):                0.00000
Degrees of Freedom for Sigmah:         1
Standard Uncertainty (k = 1):          2.73924
Expanded Uncertainty (k = 2):          2.86931
Expanded Uncertainty (k =  2.3645754): 2.91673
Degrees of Freedom:                    7
t Percent Point Value (alpha = 0.05):  2.36458
Lower 95% Confidence Limit:            55.67410
Upper 95% Confidence Limit:            61.50756
Note: Schiller-Eberhardt Best Usage:
5 or Fewer Labs:

13. Method: BCP (Bayesian Consensus Procedure)
Estimate of Consensus Mean:           58.59556
Standard Deviation of Consensus Mean: 1.36276
Standard Uncertainty (k = 1):         1.36276
Expanded Uncertainty (k = 2):         2.72551
Degrees of Freedom:                   3.89434
t Percent Point Value:                2.80641
Lower 95% (t) Confidence Limit:       54.77110
Upper 95% (t) Confidence Limit:       62.42001
Note: BCP Best Usage:
6 or Fewer Labs:

Table 2:  95% Confidence Limits

----------------------------------------------------------------------------------------------------
                                             Consensus          Lower          Upper    Uncertainty
Method                                            Mean          Limit          Limit         (k*SE)
----------------------------------------------------------------------------------------------------
 1. Mandel-Paule                              58.56633       56.93617       60.19648        1.63016
 2. Modified Mandel-Paule                     58.55906       56.92470       60.19343        1.63437
3a. Vangel-Rukhin ML                          58.55346       56.92544       60.18148        1.62802
4a. DerSimonian-Laird (original)              58.55450       56.38900       60.71999        2.16549
4b. DerSimonian-Laird (H-H-D)                 58.55450       55.95511       61.15389        2.59939
 5. Graybill-Deal                             58.67330       54.47558       62.87103        4.19773
 7. Generalized CI                            58.45256       55.96620       61.00745        2.55490
 8. Grand Mean                                57.22609       56.61636       57.83582        0.60973
 9. Mean of Means                             58.59556       56.04615       61.14496        2.54940
11. BOB                                       58.59556       55.84741       61.34370        2.74814
12. Schiller-Eberhardt                        58.59083       55.67410       61.50756        2.91673
13. BCP                                       58.59556       54.77110       62.42001        3.82445

Table 3:  Standard Uncertainties (k = 1)

-----------------------------------------------------------------------------------
                                                        Standard          Relative
                                        Consensus    Uncertainty          Standard
Method                                       Mean        (k = 1)   Uncertainty (%)
-----------------------------------------------------------------------------------
 1. Mandel-Paule                         58.56633        0.83173           1.42015
 2. Modified Mandel-Paule                58.55906        0.83388           1.42399
3a. Vangel-Rukhin ML                     58.55346        0.83064           1.41860
4a. DerSimonian-Laird (original)         58.55450        0.77995           1.33201
4b. DerSimonian-Laird (H-H-D)            58.55450        0.93623           1.59890
 5. Graybill-Deal                        58.67330        0.07443           0.12686
 7. Generalized CI                       58.45256        1.27926           2.18855
 8. Grand Mean                           57.22609        0.30273           0.52901
 9. Mean of Means                        58.59556        0.91823           1.56706
11. BOB                                  58.59556        1.37407           2.34501
12. Schiller-Eberhardt                   58.59083        2.73924           4.67520
13. BCP                                  58.59556        1.36276           2.32570

Table 4:  Expanded Uncertainties (k = 2)

-----------------------------------------------------------------------------------
                                                        Expanded          Relative
                                        Consensus    Uncertainty          Expanded
Method                                       Mean        (k = 2)   Uncertainty (%)
-----------------------------------------------------------------------------------
 1. Mandel-Paule                         58.56633        1.66345           2.84029
 2. Modified Mandel-Paule                58.55906        1.66775           2.84798
3a. Vangel-Rukhin ML                     58.55346        1.66128           2.83720
4a. DerSimonian-Laird (original)         58.55450        1.55990           2.66402
4b. DerSimonian-Laird (H-H-D)            58.55450        1.87246           3.19781
 5. Graybill-Deal                        58.67330        0.14887           0.25372
 7. Generalized CI                       58.45256        2.55853           4.37710
 8. Grand Mean                           57.22609        0.60546           1.05801
 9. Mean of Means                        58.59556        1.83645           3.13411
11. BOB                                  58.59556        2.74814           4.69002
12. Schiller-Eberhardt                   58.59083        2.86931           4.89719
13. BCP                                  58.59556        2.72551           4.65140
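
As a check on how the columns of Tables 2 through 4 relate to the per-method output, take the Mandel-Paule results above, with consensus mean 58.56633, standard uncertainty u = 0.83173, and normal coverage factor k = 1.959964. This is a worked reading of the printed values, not a statement of Dataplot's internal algorithm.

$$k u = 1.959964 \times 0.83173 = 1.63016$$

$$58.56633 \pm 1.63016 = (56.93617, 60.19649)$$

$$100 \times \frac{0.83173}{58.56633} = 1.42015\%$$

The first line reproduces the "Uncertainty (k*SE)" column of Table 2, the second the confidence limits, and the third the relative standard uncertainty in Table 3. The last printed digit can differ by one (for example, 60.19648 in the output), presumably because Dataplot carries more digits than are displayed.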


Program 2:

read mx sx nx
3.03    0.36       3
3.27    0.33       3
3.44    0.40      12
1.21    0.12       3
1.44    0.21       3
1.18    0.30       8
13.9    0.3        3
13.6    0.04       3
15.0    1.9        8
18.1    0.7        3
18.4    0.5        3
19.7    2.0        8
end of data
.
let n   = number mx
let ind = sequence 1 1 n
let tag = 1 for i = 1 1 n
let tag = 2 for i = 4 1 6
let tag = 3 for i = 7 1 9
let tag = 4 for i = 10 1 12
.
bootstrap samples 100000
set write decimals 5
SET DERSIMONIAN LAIRD BOOTSTRAP ON
SET SCHILLER EBERHARDT OFF
SET MEAN OF MEANS OFF
SET GRAND MEAN OFF
SET GRAYBILL DEAL OFF
SET GENERALIZED CONFIDENCE INTERVAL OFF
SET BAYESIAN CONSENSUS PROCEDURE OFF
SET FAIRWEATHER OFF
.
consensus mean mx sx nx subset tag = 1
consensus mean mx sx nx subset tag = 2
consensus mean mx sx nx subset tag = 3
consensus mean mx sx nx subset tag = 4

The following output is generated


Consensus Means Analysis
(Summary Statistics Case)

Data Summary:
Mean Variable: MX
SD Variable: SX
Sample Size Variable: NX
Total Number of Observations:            18
Grand Mean:                              3.34333
Grand Standard Deviation:                0.45800
Total Number of Labs:                    3
Minimum Lab Mean:                        3.03000
Maximum Lab Mean:                        3.44000
Minimum Lab SD:                          0.33000
Maximum Lab SD:                          0.40000
Within Lab (pooled) SD:                  0.38618
Within Lab (pooled) Variance:            0.14913
Mean of Lab Means:                       3.24667
SD of Lab Means:                         0.20599

Table 1: Summary Statistics by Lab

----------------------------------------------------------------------------
                                                                Standard
 Lab                                             Standard      Deviation
  ID    n(i)           Mean       Variance      Deviation    of the Mean
----------------------------------------------------------------------------
   1       3        3.03000        0.12960        0.36000        0.20785
   2       3        3.27000        0.10890        0.33000        0.19053
   3      12        3.44000        0.16000        0.40000        0.11547

1. Method: Mandel-Paule
Estimate of (unscaled) Consensus Mean: 3.29713
Estimate of (scaled) Consensus Mean:   0.65154
Between Lab Variance (unscaled):       0.01418
Between Lab SD (unscaled):             0.11909
Between Lab Variance (scaled):         0.08436
Standard Deviation of Consensus Mean:  0.09506
Standard Uncertainty (k = 1):          0.09506
Expanded Uncertainty (k = 2):          0.19012
Expanded Uncertainty (k =  1.9599640): 0.18631
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   3.11081
Upper 95% (normal) Confidence Limit:   3.48344
Note: Mandel-Paule Best Usage:
6 or More Labs:

2. Method: Modified Mandel-Paule
Estimate of (unscaled) Consensus Mean:  3.32472
Estimate of (scaled) Consensus Mean:    0.71884
Between Lab Variance (unscaled):        0.00076
Between Lab SD (unscaled):              0.02751
Between Lab Variance (scaled):          0.00450
Standard Deviation of Consensus Mean:   0.08848
Standard Uncertainty (k = 1):           0.08848
Expanded Uncertainty (k = 2):           0.17697
Expanded Uncertainty (k =  1.9599640):  0.17342
Normal PPF of 0.975:                    1.95996
Lower 95% (normal) Confidence Limit:    3.15130
Upper 95% (normal) Confidence Limit:    3.49814
Note: Modified Mandel-Paule Best Usage:
6 or More Labs:

3. Method: Vangel-Rukhin Maximum Likelihood
Estimate of (unscaled) Consensus Mean:    3.32039
Estimate of (scaled) Consensus Mean:      0.70827
Between Lab Variance (unscaled):          0.00000
Between Lab SD (unscaled):                0.00000
Between Lab Variance (scaled):            0.00000
Standard Deviation of Consensus Mean:     0.08355
Standard Uncertainty (k = 1):             0.08355
Expanded Uncertainty (k = 2):             0.16711
Expanded Uncertainty (k =  1.9599640):    0.16376
Normal PPF of 0.975:                      1.95996
Lower 95% (normal) Confidence Limit:      3.15662
Upper 95% (normal) Confidence Limit:      3.48415
Note: Vangel-Rukhin Maximum Likelihood
Best Usage: 6 or More Labs

WARNING: ESTIMATED BETWEEN LAB VARIANCE
IS LESS THAN 0.00001.  THE
ESTIMATED STANDARD ERROR OF THE
CONSENSUS MEAN MAY BE SUSPECT.

4a. Method: DerSimonian Laird (original variance)
Estimate of Consensus Mean:             3.30565
Estimate of Variance of Consensus Mean: 0.01150
Estimate of Between Lab Variance:       0.00866
Standard Uncertainty (k = 1):           0.10722
Expanded Uncertainty (k = 2):           0.21445
Degrees of Freedom:                     2
t Percent Point Value:                  4.30265
Lower 95% (t-value) Confidence Limit:   2.84430
Upper 95% (t-value) Confidence Limit:   3.76699
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

4b. Method: DerSimonian Laird - Horn-Horn-Duncan Variance
Estimate of Consensus Mean:             3.30565
Estimate of Variance of Consensus Mean: 0.01524
Estimate of Between Lab Variance:       0.00866
Standard Uncertainty (k = 1):           0.12344
Expanded Uncertainty (k = 2):           0.24688
Degrees of Freedom:                     2
t Percent Point Value:                  4.30265
Lower 95% (t-value) Confidence Limit:   2.77453
Upper 95% (t-value) Confidence Limit:   3.83677
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

4d. Method: DerSimonian Laird - Bootstrap Variance
Number of Bootstrap Samples                        100000
Estimate of Consensus Mean:                        3.30565
Estimate of Variance of Consensus Mean:            0.01352
Standard Uncertainty (k = 1):                      0.11628
Expanded Uncertainty (k = 2):                      0.23256
Lower 95% (percentile bootstrap) Confidence Limit: 3.07915
Upper 95% (percentile bootstrap) Confidence Limit: 3.53499
Lower 95% (symmetric bootstrap) Confidence Limit:  3.07630
Upper 95% (symmetric bootstrap) Confidence Limit:  3.53499
K (symmetric bootstrap) Coverage Factor:           1.97239
Lower 95% (kernel bootstrap) Confidence Limit:     3.07682
Upper 95% (kernel bootstrap) Confidence Limit:     3.53458
K (kernel bootstrap) Coverage Factor:              1.96884
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

11. Method: BOB (Bound on Bias)
Estimate of Consensus Mean:          3.24667
Within Lab Uncertainty:              0.10156
Between Lab Uncertainty:             0.11836
Standard Uncertainty (k = 1):        0.15596
Expanded Uncertainty (k = 2):        0.31192
Lower 95% (k = 2) Confidence Limit:  2.93475
Upper 95% (k = 2) Confidence Limit:  3.55858
Note: BOB Best Usage:
5 or Fewer Labs:

Table 2:  95% Confidence Limits

----------------------------------------------------------------------------------------------------
                                             Consensus          Lower          Upper    Uncertainty
Method                                            Mean          Limit          Limit         (k*SE)
----------------------------------------------------------------------------------------------------
 1. Mandel-Paule                               3.29713        3.11081        3.48344        0.18631
 2. Modified Mandel-Paule                      3.32472        3.15130        3.49814        0.17342
3a. Vangel-Rukhin ML                           3.32039        3.15662        3.48415        0.16376
4a. DerSimonian-Laird (original)               3.30565        2.84430        3.76699        0.46134
4b. DerSimonian-Laird (H-H-D)                  3.30565        2.77453        3.83677        0.53112
4d. DerSimonian-Laird (perc. bootstrap)        3.30565        3.07915        3.53499        0.22934
4d. DerSimonian-Laird (symm. bootstrap)        3.30565        3.07630        3.53499        0.22934
4d. DerSimonian-Laird (kern bootstrap)         3.30565        3.07682        3.53458        0.22893
11. BOB                                        3.24667        2.93475        3.55858        0.31192

Table 3:  Standard Uncertainties (k = 1)

-----------------------------------------------------------------------------------
                                                         Standard          Relative
                                         Consensus    Uncertainty          Standard
Method                                        Mean        (k = 1)   Uncertainty (%)
-----------------------------------------------------------------------------------
  1. Mandel-Paule                          3.29713        0.09506           2.88311
  2. Modified Mandel-Paule                 3.32472        0.08848           2.66135
 3a. Vangel-Rukhin ML                      3.32039        0.08355           2.51641
 4a. DerSimonian-Laird (original)          3.30565        0.10722           3.24363
 4b. DerSimonian-Laird (H-H-D)             3.30565        0.12344           3.73421
 4d. DerSimonian-Laird (bootstrap)         3.30565        0.11628           3.51755
 11. BOB                                   3.24667        0.15596           4.80366

Table 4:  Expanded Uncertainties (k = 2)

-----------------------------------------------------------------------------------
                                                         Expanded          Relative
                                         Consensus    Uncertainty          Expanded
Method                                        Mean        (k = 2)   Uncertainty (%)
-----------------------------------------------------------------------------------
  1. Mandel-Paule                          3.29713        0.19012           5.76621
  2. Modified Mandel-Paule                 3.32472        0.17697           5.32271
 3a. Vangel-Rukhin ML                      3.32039        0.16711           5.03282
 4a. DerSimonian-Laird (original)          3.30565        0.21445           6.48726
 4b. DerSimonian-Laird (H-H-D)             3.30565        0.24688           7.46842
 4d. DerSimonian-Laird (bootstrap)         3.30565        0.23256           7.03510
 11. BOB                                   3.24667        0.31192           9.60732

Consensus Means Analysis
(Summary Statistics Case)

Data Summary:
Mean Variable: MX
SD Variable: SX
Sample Size Variable: NX
Total Number of Observations:            14
Grand Mean:                              1.24214
Grand Standard Deviation:                0.30818
Total Number of Labs:                    3
Minimum Lab Mean:                        1.18000
Maximum Lab Mean:                        1.44000
Minimum Lab SD:                          0.12000
Maximum Lab SD:                          0.30000
Within Lab (pooled) SD:                  0.26059
Within Lab (pooled) Variance:            0.06791
Mean of Lab Means:                       1.27667
SD of Lab Means:                         0.14224

Table 1: Summary Statistics by Lab

----------------------------------------------------------------------------
                                                                    Standard
     Lab                                             Standard      Deviation
      ID    n(i)           Mean       Variance      Deviation    of the Mean
----------------------------------------------------------------------------
       1       3        1.21000        0.01440        0.12000        0.06928
       2       3        1.44000        0.04410        0.21000        0.12124
       3       8        1.18000        0.09000        0.30000        0.10607
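
The derived column in Table 1 and several entries of the Data Summary can be reproduced from the three per-lab summary statistics alone. The following Python sketch is not Dataplot code. It is a minimal check that assumes the usual formulas (standard deviation of the mean = SD/sqrt(n), pooled variance weighted by n(i) - 1), and all variable names are illustrative.

```python
import math

# Per-lab summary statistics copied from Table 1 above.
n = [3, 3, 8]
means = [1.21, 1.44, 1.18]
sds = [0.12, 0.21, 0.30]

k = len(n)          # number of labs
total_n = sum(n)    # total number of observations

# Standard deviation of each lab mean: s_i / sqrt(n_i).
sd_of_mean = [s / math.sqrt(ni) for s, ni in zip(sds, n)]

# Pooled within-lab variance: sum((n_i - 1) * s_i^2) / (N - k).
pooled_var = sum((ni - 1) * s**2 for ni, s in zip(n, sds)) / (total_n - k)
pooled_sd = math.sqrt(pooled_var)

# Grand mean: lab means weighted by their sample sizes.
grand_mean = sum(ni * m for ni, m in zip(n, means)) / total_n

# Unweighted mean and standard deviation of the k lab means.
mean_of_means = sum(means) / k
sd_of_means = math.sqrt(sum((m - mean_of_means) ** 2 for m in means) / (k - 1))

print(f"{pooled_var:.5f} {pooled_sd:.5f} {grand_mean:.5f}")
print(f"{mean_of_means:.5f} {sd_of_means:.5f}")
```

These agree with the listed values to within rounding of the last printed digit.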

1. Method: Mandel-Paule
Estimate of (unscaled) Consensus Mean: 1.25879
Estimate of (scaled) Consensus Mean:   0.30308
Between Lab Variance (unscaled):       0.00754
Between Lab SD (unscaled):             0.08682
Between Lab Variance (scaled):         0.11150
Standard Deviation of Consensus Mean:  0.05569
Standard Uncertainty (k = 1):          0.05569
Expanded Uncertainty (k = 2):          0.11137
Expanded Uncertainty (k =  1.9599640): 0.10914
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   1.14965
Upper 95% (normal) Confidence Limit:   1.36793
Note: Mandel-Paule Best Usage:
6 or More Labs:
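
The expanded uncertainties and the normal-based confidence limits in this block follow directly from the standard uncertainty. The Python sketch below is an illustrative check, not Dataplot code. It takes the printed consensus mean and standard uncertainty as inputs and assumes the limits are computed as mean ± z * u, with z the 0.975 normal percent point.

```python
from statistics import NormalDist

# Values copied from the Mandel-Paule output above.
consensus_mean = 1.25879
u = 0.05569  # standard uncertainty (k = 1)

# Normal percent point function at 0.975.
z = NormalDist().inv_cdf(0.975)

expanded_k2 = 2 * u   # expanded uncertainty with k = 2
expanded_z = z * u    # expanded uncertainty with k = z

lower = consensus_mean - expanded_z
upper = consensus_mean + expanded_z

print(f"z = {z:.5f}")
print(f"U(k=2) = {expanded_k2:.5f}, U(k=z) = {expanded_z:.5f}")
print(f"95% limits: ({lower:.5f}, {upper:.5f})")
```

Small differences in the last digit are expected because the inputs here are already rounded to five decimals.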

2. Method: Modified Mandel-Paule
Estimate of (unscaled) Consensus Mean:  1.24810
Estimate of (scaled) Consensus Mean:    0.26196
Between Lab Variance (unscaled):        0.00089
Between Lab SD (unscaled):              0.02978
Between Lab Variance (scaled):          0.01312
Standard Deviation of Consensus Mean:   0.04683
Standard Uncertainty (k = 1):           0.04683
Expanded Uncertainty (k = 2):           0.09366
Expanded Uncertainty (k =  1.9599640):  0.09179
Normal PPF of 0.975:                    1.95996
Lower 95% (normal) Confidence Limit:    1.15632
Upper 95% (normal) Confidence Limit:    1.33989
Note: Modified Mandel-Paule Best Usage:
6 or More Labs:

3. Method: Vangel-Rukhin Maximum Likelihood
Estimate of (unscaled) Consensus Mean: 1.24541
Estimate of (scaled) Consensus Mean:   0.25160
Between Lab Variance (unscaled):       0.03538
Between Lab SD (unscaled):             0.18808
Between Lab Variance (scaled):         0.52326
Standard Deviation of Consensus Mean:  0.06383
Standard Uncertainty (k = 1):          0.06383
Expanded Uncertainty (k = 2):          0.12767
Expanded Uncertainty (k =  1.9599640): 0.12511
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   1.12030
Upper 95% (normal) Confidence Limit:   1.37052
Note: Vangel-Rukhin Maximum Likelihood
Best Usage: 6 or More Labs

4a. Method: DerSimonian Laird (original variance)
Estimate of Consensus Mean:             1.25331
Estimate of Variance of Consensus Mean: 0.00405
Estimate of Between Lab Variance:       0.00333
Standard Uncertainty (k = 1):           0.06363
Expanded Uncertainty (k = 2):           0.12726
Degrees of Freedom:                     2
t Percent Point Value:                  4.30265
Lower 95% (t-value) Confidence Limit:   0.97953
Upper 95% (t-value) Confidence Limit:   1.52709
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:
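
The much wider DerSimonian-Laird interval comes from the t percent point with only 2 degrees of freedom (presumably the number of labs minus one). For 2 degrees of freedom the t quantile has a closed form, so the limits can be checked without a statistics library. The Python sketch below is illustrative only, and the function name is hypothetical.

```python
import math


def t_quantile_df2(p):
    """Quantile of Student's t distribution with 2 degrees of freedom.

    Uses the closed form t = (2p - 1) / sqrt(2 p (1 - p)), which is
    valid only for 2 degrees of freedom.
    """
    return (2 * p - 1) / math.sqrt(2 * p * (1 - p))


# Values copied from the 4a output above.
consensus_mean = 1.25331
u = 0.06363  # standard uncertainty (k = 1)

t_value = t_quantile_df2(0.975)
half_width = t_value * u
lower = consensus_mean - half_width
upper = consensus_mean + half_width

print(f"t = {t_value:.5f}, half-width = {half_width:.5f}")
print(f"95% limits: ({lower:.5f}, {upper:.5f})")
```

The half-width computed this way matches the 0.27378 reported for method 4a in Table 2.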

4b. Method: DerSimonian Laird - Horn-Horn-Duncan Variance
Estimate of Consensus Mean:             1.25331
Estimate of Variance of Consensus Mean: 0.00377
Estimate of Between Lab Variance:       0.00333
Standard Uncertainty (k = 1):           0.06136
Expanded Uncertainty (k = 2):           0.12272
Degrees of Freedom:                     2
t Percent Point Value:                  4.30265
Lower 95% (t-value) Confidence Limit:   0.98930
Upper 95% (t-value) Confidence Limit:   1.51732
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

4d. Method: DerSimonian Laird - Bootstrap Variance
Number of Bootstrap Samples                        100000
Estimate of Consensus Mean:                        1.25331
Estimate of Variance of Consensus Mean:            0.00468
Standard Uncertainty (k = 1):                      0.06837
Expanded Uncertainty (k = 2):                      0.13675
Lower 95% (percentile bootstrap) Confidence Limit: 1.11892
Upper 95% (percentile bootstrap) Confidence Limit: 1.38668
Lower 95% (symmetric bootstrap) Confidence Limit:  1.11892
Upper 95% (symmetric bootstrap) Confidence Limit:  1.38771
K (symmetric bootstrap) Coverage Factor:           1.96557
Lower 95% (kernel bootstrap) Confidence Limit:     1.11858
Upper 95% (kernel bootstrap) Confidence Limit:     1.38814
K (kernel bootstrap) Coverage Factor:              1.97194
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

11. Method: BOB (Bound on Bias)
Estimate of Consensus Mean:          1.27667
Within Lab Uncertainty:              0.05845
Between Lab Uncertainty:             0.07506
Standard Uncertainty (k = 1):        0.09513
Expanded Uncertainty (k = 2):        0.19026
Lower 95% (k = 2) Confidence Limit:  1.08640
Upper 95% (k = 2) Confidence Limit:  1.46693
Note: BOB Best Usage:
5 or Fewer Labs:
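
The BOB values above can be reproduced from the Table 1 summary statistics. The Python sketch below is a check, not Dataplot code. It assumes the within-lab term is sqrt(sum(s_i^2/n_i))/k, the between-lab term is the range of the lab means divided by 2*sqrt(3) (a uniform distribution over the range), and the two are combined in quadrature. These assumptions are consistent with the printed numbers.

```python
import math

# Per-lab summary statistics copied from Table 1 for this data set.
n = [3, 3, 8]
means = [1.21, 1.44, 1.18]
sds = [0.12, 0.21, 0.30]
k = len(means)

# Consensus estimate: unweighted mean of the lab means.
estimate = sum(means) / k

# Within-lab component: uncertainty of the unweighted mean of lab means.
within = math.sqrt(sum(s**2 / ni for s, ni in zip(sds, n))) / k

# Between-lab component: uniform distribution over the range of lab means.
between = (max(means) - min(means)) / (2 * math.sqrt(3))

# Combine in quadrature, then expand with k = 2.
u = math.hypot(within, between)
expanded = 2 * u
lower, upper = estimate - expanded, estimate + expanded

print(f"within = {within:.5f}, between = {between:.5f}, u = {u:.5f}")
print(f"limits: ({lower:.5f}, {upper:.5f})")
```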

Table 2:  95% Confidence Limits

----------------------------------------------------------------------------------------------------
                                              Consensus          Lower          Upper    Uncertainty
Method                                             Mean          Limit          Limit         (k*SE)
----------------------------------------------------------------------------------------------------
  1. Mandel-Paule                               1.25879        1.14965        1.36793        0.10914
  2. Modified Mandel-Paule                      1.24810        1.15632        1.33989        0.09179
 3a. Vangel-Rukhin ML                           1.24541        1.12030        1.37052        0.12511
 4a. DerSimonian-Laird (original)               1.25331        0.97953        1.52709        0.27378
 4b. DerSimonian-Laird (H-H-D)                  1.25331        0.98930        1.51732        0.26401
 4d. DerSimonian-Laird (perc. bootstrap)        1.25331        1.11892        1.38668        0.13440
 4d. DerSimonian-Laird (symm. bootstrap)        1.25331        1.11892        1.38771        0.13440
 4d. DerSimonian-Laird (kern bootstrap)         1.25331        1.11858        1.38814        0.13483
 11. BOB                                        1.27667        1.08640        1.46693        0.19026

Table 3:  Standard Uncertainties (k = 1)

-----------------------------------------------------------------------------------
                                                         Standard          Relative
                                         Consensus    Uncertainty          Standard
Method                                        Mean        (k = 1)   Uncertainty (%)
-----------------------------------------------------------------------------------
  1. Mandel-Paule                          1.25879        0.05569           4.42371
  2. Modified Mandel-Paule                 1.24810        0.04683           3.75215
 3a. Vangel-Rukhin ML                      1.24541        0.06383           5.12544
 4a. DerSimonian-Laird (original)          1.25331        0.06363           5.07702
 4b. DerSimonian-Laird (H-H-D)             1.25331        0.06136           4.89583
 4d. DerSimonian-Laird (bootstrap)         1.25331        0.06837           5.45553
 11. BOB                                   1.27667        0.09513           7.45155

Table 4:  Expanded Uncertainties (k = 2)

-----------------------------------------------------------------------------------
                                                         Expanded          Relative
                                         Consensus    Uncertainty          Expanded
Method                                        Mean        (k = 2)   Uncertainty (%)
-----------------------------------------------------------------------------------
  1. Mandel-Paule                          1.25879        0.11137           8.84741
  2. Modified Mandel-Paule                 1.24810        0.09366           7.50430
 3a. Vangel-Rukhin ML                      1.24541        0.12767          10.25087
 4a. DerSimonian-Laird (original)          1.25331        0.12726          10.15403
 4b. DerSimonian-Laird (H-H-D)             1.25331        0.12272           9.79165
 4d. DerSimonian-Laird (bootstrap)         1.25331        0.13675          10.91107
 11. BOB                                   1.27667        0.19026          14.90311

Consensus Means Analysis
(Summary Statistics Case)

Data Summary:
Mean Variable: MX
SD Variable: SX
Sample Size Variable: NX
Total Number of Observations:            14
Grand Mean:                              14.46429
Grand Standard Deviation:                1.47455
Total Number of Labs:                    3
Minimum Lab Mean:                        13.60000
Maximum Lab Mean:                        15.00000
Minimum Lab SD:                          0.04000
Maximum Lab SD:                          1.90000
Within Lab (pooled) SD:                  1.52116
Within Lab (pooled) Variance:            2.31393
Mean of Lab Means:                       14.16667
SD of Lab Means:                         0.73711

Table 1: Summary Statistics by Lab

----------------------------------------------------------------------------
                                                                    Standard
     Lab                                             Standard      Deviation
      ID    n(i)           Mean       Variance      Deviation    of the Mean
----------------------------------------------------------------------------
       1       3       13.90000        0.09000        0.30000        0.17321
       2       3       13.60000        0.00160        0.04000        0.02309
       3       8       15.00000        3.61000        1.90000        0.67175

1. Method: Mandel-Paule
Estimate of (unscaled) Consensus Mean: 13.94840
Estimate of (scaled) Consensus Mean:   0.24886
Between Lab Variance (unscaled):       0.26733
Between Lab SD (unscaled):             0.51704
Between Lab Variance (scaled):         0.13639
Standard Deviation of Consensus Mean:  0.23146
Standard Uncertainty (k = 1):          0.23146
Expanded Uncertainty (k = 2):          0.46292
Expanded Uncertainty (k =  1.9599640): 0.45365
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   13.49475
Upper 95% (normal) Confidence Limit:   14.40205
Note: Mandel-Paule Best Usage:
6 or More Labs:

2. Method: Modified Mandel-Paule
Estimate of (unscaled) Consensus Mean:  13.85264
Estimate of (scaled) Consensus Mean:    0.18047
Between Lab Variance (unscaled):        0.10383
Between Lab SD (unscaled):              0.32222
Between Lab Variance (scaled):          0.05297
Standard Deviation of Consensus Mean:   0.16986
Standard Uncertainty (k = 1):           0.16986
Expanded Uncertainty (k = 2):           0.33972
Expanded Uncertainty (k =  1.9599640):  0.33292
Normal PPF of 0.975:                    1.95996
Lower 95% (normal) Confidence Limit:    13.51973
Upper 95% (normal) Confidence Limit:    14.18556
Note: Modified Mandel-Paule Best Usage:
6 or More Labs:

3. Method: Vangel-Rukhin Maximum Likelihood
Estimate of (unscaled) Consensus Mean: 13.97095
Estimate of (scaled) Consensus Mean:   0.26497
Between Lab Variance (unscaled):       3.62182
Between Lab SD (unscaled):             1.90311
Between Lab Variance (scaled):         1.84784
Standard Deviation of Consensus Mean:  0.09755
Standard Uncertainty (k = 1):          0.09755
Expanded Uncertainty (k = 2):          0.19509
Expanded Uncertainty (k =  1.9599640): 0.19119
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   13.77976
Upper 95% (normal) Confidence Limit:   14.16214
Note: Vangel-Rukhin Maximum Likelihood
Best Usage: 6 or More Labs

4a. Method: DerSimonian Laird (original variance)
Estimate of Consensus Mean:             13.63630
Estimate of Variance of Consensus Mean: 0.00296
Estimate of Between Lab Variance:       0.00275
Standard Uncertainty (k = 1):           0.05445
Expanded Uncertainty (k = 2):           0.10889
Degrees of Freedom:                     2
t Percent Point Value:                  4.30265
Lower 95% (t-value) Confidence Limit:   13.40203
Upper 95% (t-value) Confidence Limit:   13.87057
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

4b. Method: DerSimonian Laird - Horn-Horn-Duncan Variance
Estimate of Consensus Mean:             13.63630
Estimate of Variance of Consensus Mean: 0.01177
Estimate of Between Lab Variance:       0.00275
Standard Uncertainty (k = 1):           0.10851
Expanded Uncertainty (k = 2):           0.21702
Degrees of Freedom:                     2
t Percent Point Value:                  4.30265
Lower 95% (t-value) Confidence Limit:   13.16941
Upper 95% (t-value) Confidence Limit:   14.10319
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

4d. Method: DerSimonian Laird - Bootstrap Variance
Number of Bootstrap Samples                        100000
Estimate of Consensus Mean:                        13.63630
Estimate of Variance of Consensus Mean:            0.00855
Standard Uncertainty (k = 1):                      0.09248
Expanded Uncertainty (k = 2):                      0.18497
Lower 95% (percentile bootstrap) Confidence Limit: 13.44749
Upper 95% (percentile bootstrap) Confidence Limit: 13.82056
Lower 95% (symmetric bootstrap) Confidence Limit:  13.44749
Upper 95% (symmetric bootstrap) Confidence Limit:  13.82510
K (symmetric bootstrap) Coverage Factor:           2.04152
Lower 95% (kernel bootstrap) Confidence Limit:     13.44899
Upper 95% (kernel bootstrap) Confidence Limit:     13.82378
K (kernel bootstrap) Coverage Factor:              2.02727
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:
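
The symmetric bootstrap coverage factor reported above appears to be the interval half-width divided by the standard uncertainty. This is an inference from the printed numbers, not a statement of how Dataplot computes it. The Python sketch below checks that relationship for this data set.

```python
# Values copied from the 4d output above.
u = 0.09248            # standard uncertainty (k = 1)
lower_sym = 13.44749   # lower 95% symmetric bootstrap limit
upper_sym = 13.82510   # upper 95% symmetric bootstrap limit

# Half-width of the symmetric interval.
half_width = (upper_sym - lower_sym) / 2

# Implied coverage factor: how many standard uncertainties the half-width spans.
k_sym = half_width / u

print(f"half-width = {half_width:.5f}, implied K = {k_sym:.5f}")
```

The implied value agrees with the printed coverage factor of 2.04152 to about four significant digits. The remaining difference is consistent with the inputs being rounded to five decimals.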

11. Method: BOB (Bound on Bias)
Estimate of Consensus Mean:          14.16667
Within Lab Uncertainty:              0.23137
Between Lab Uncertainty:             0.40415
Standard Uncertainty (k = 1):        0.46569
Expanded Uncertainty (k = 2):        0.93137
Lower 95% (k = 2) Confidence Limit:  13.23529
Upper 95% (k = 2) Confidence Limit:  15.09804
Note: BOB Best Usage:
5 or Fewer Labs:

Table 2:  95% Confidence Limits

----------------------------------------------------------------------------------------------------
                                              Consensus          Lower          Upper    Uncertainty
Method                                             Mean          Limit          Limit         (k*SE)
----------------------------------------------------------------------------------------------------
  1. Mandel-Paule                              13.94840       13.49475       14.40205        0.45365
  2. Modified Mandel-Paule                     13.85264       13.51973       14.18556        0.33292
 3a. Vangel-Rukhin ML                          13.97095       13.77976       14.16214        0.19119
 4a. DerSimonian-Laird (original)              13.63630       13.40203       13.87057        0.23427
 4b. DerSimonian-Laird (H-H-D)                 13.63630       13.16941       14.10319        0.46689
 4d. DerSimonian-Laird (perc. bootstrap)       13.63630       13.44749       13.82056        0.18880
 4d. DerSimonian-Laird (symm. bootstrap)       13.63630       13.44749       13.82510        0.18880
 4d. DerSimonian-Laird (kern bootstrap)        13.63630       13.44899       13.82378        0.18749
 11. BOB                                       14.16667       13.23529       15.09804        0.93137

Table 3:  Standard Uncertainties (k = 1)

-----------------------------------------------------------------------------------
                                                         Standard          Relative
                                         Consensus    Uncertainty          Standard
Method                                        Mean        (k = 1)   Uncertainty (%)
-----------------------------------------------------------------------------------
  1. Mandel-Paule                         13.94840        0.23146           1.65939
  2. Modified Mandel-Paule                13.85264        0.16986           1.22618
 3a. Vangel-Rukhin ML                     13.97095        0.09755           0.69821
 4a. DerSimonian-Laird (original)         13.63630        0.05445           0.39928
 4b. DerSimonian-Laird (H-H-D)            13.63630        0.10851           0.79576
 4d. DerSimonian-Laird (bootstrap)        13.63630        0.09248           0.67821
 11. BOB                                  14.16667        0.46569           3.28721
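
The last column of Table 3 is the standard uncertainty expressed as a percentage of the consensus mean. The Python sketch below is illustrative, with a hypothetical helper name. It recomputes that column for three of the rows from the printed mean and uncertainty.

```python
def relative_uncertainty_percent(mean, uncertainty):
    """Uncertainty as a percentage of the consensus mean."""
    return 100.0 * uncertainty / mean


# (consensus mean, standard uncertainty) pairs copied from Table 3 above.
rows = {
    "Mandel-Paule": (13.94840, 0.23146),
    "Modified Mandel-Paule": (13.85264, 0.16986),
    "BOB": (14.16667, 0.46569),
}

relative = {
    name: relative_uncertainty_percent(mean, u)
    for name, (mean, u) in rows.items()
}

for name, value in relative.items():
    print(f"{name:<25s} {value:8.4f} %")
```

The same helper applied to the k = 2 uncertainties gives the relative expanded uncertainties of Table 4.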

Table 4:  Expanded Uncertainties (k = 2)

-----------------------------------------------------------------------------------
                                                         Expanded          Relative
                                         Consensus    Uncertainty          Expanded
Method                                        Mean        (k = 2)   Uncertainty (%)
-----------------------------------------------------------------------------------
  1. Mandel-Paule                         13.94840        0.46292           3.31878
  2. Modified Mandel-Paule                13.85264        0.33972           2.45237
 3a. Vangel-Rukhin ML                     13.97095        0.19509           1.39641
 4a. DerSimonian-Laird (original)         13.63630        0.10889           0.79856
 4b. DerSimonian-Laird (H-H-D)            13.63630        0.21702           1.59152
 4d. DerSimonian-Laird (bootstrap)        13.63630        0.18497           1.35642
 11. BOB                                  14.16667        0.93137           6.57441

Consensus Means Analysis
(Summary Statistics Case)

Data Summary:
Mean Variable: MX
SD Variable: SX
Sample Size Variable: NX
Total Number of Observations:            14
Grand Mean:                              19.07857
Grand Standard Deviation:                1.78182
Total Number of Labs:                    3
Minimum Lab Mean:                        18.10000
Maximum Lab Mean:                        19.70000
Minimum Lab SD:                          0.50000
Maximum Lab SD:                          2.00000
Within Lab (pooled) SD:                  1.63707
Within Lab (pooled) Variance:            2.68000
Mean of Lab Means:                       18.73333
SD of Lab Means:                         0.85049

Table 1: Summary Statistics by Lab

----------------------------------------------------------------------------
                                                                    Standard
     Lab                                             Standard      Deviation
      ID    n(i)           Mean       Variance      Deviation    of the Mean
----------------------------------------------------------------------------
       1       3       18.10000        0.49000        0.70000        0.40415
       2       3       18.40000        0.25000        0.50000        0.28868
       3       8       19.70000        4.00000        2.00000        0.70711

1. Method: Mandel-Paule
Estimate of (unscaled) Consensus Mean: 18.57390
Estimate of (scaled) Consensus Mean:   0.29619
Between Lab Variance (unscaled):       0.34970
Between Lab SD (unscaled):             0.59135
Between Lab Variance (scaled):         0.13660
Standard Deviation of Consensus Mean:  0.30625
Standard Uncertainty (k = 1):          0.30625
Expanded Uncertainty (k = 2):          0.61251
Expanded Uncertainty (k =  1.9599640): 0.60025
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   17.97365
Upper 95% (normal) Confidence Limit:   19.17415
Note: Mandel-Paule Best Usage:
6 or More Labs:

2. Method: Modified Mandel-Paule
Estimate of (unscaled) Consensus Mean:  18.49855
Estimate of (scaled) Consensus Mean:    0.24910
Between Lab Variance (unscaled):        0.10964
Between Lab SD (unscaled):              0.33112
Between Lab Variance (scaled):          0.04283
Standard Deviation of Consensus Mean:   0.23892
Standard Uncertainty (k = 1):           0.23892
Expanded Uncertainty (k = 2):           0.47784
Expanded Uncertainty (k =  1.9599640):  0.46828
Normal PPF of 0.975:                    1.95996
Lower 95% (normal) Confidence Limit:    18.03028
Upper 95% (normal) Confidence Limit:    18.96683
Note: Modified Mandel-Paule Best Usage:
6 or More Labs:

3. Method: Vangel-Rukhin Maximum Likelihood
Estimate of (unscaled) Consensus Mean: 18.73333
Estimate of (scaled) Consensus Mean:   0.39584
Between Lab Variance (unscaled):       0.48222
Between Lab SD (unscaled):             0.69442
Between Lab Variance (scaled):         0.18837
Standard Deviation of Consensus Mean:  0.40092
Standard Uncertainty (k = 1):          0.40092
Expanded Uncertainty (k = 2):          0.80185
Expanded Uncertainty (k =  1.9599640): 0.78580
Normal PPF of 0.975:                   1.95996
Lower 95% (normal) Confidence Limit:   17.94754
Upper 95% (normal) Confidence Limit:   19.51913
Note: Vangel-Rukhin Maximum Likelihood
Best Usage: 6 or More Labs

4a. Method: DerSimonian Laird (original variance)
Estimate of Consensus Mean:             18.49153
Estimate of Variance of Consensus Mean: 0.08945
Estimate of Between Lab Variance:       0.09459
Standard Uncertainty (k = 1):           0.29908
Expanded Uncertainty (k = 2):           0.59817
Degrees of Freedom:                     2
t Percent Point Value:                  4.30265
Lower 95% (t-value) Confidence Limit:   17.20468
Upper 95% (t-value) Confidence Limit:   19.77838
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

4b. Method: DerSimonian Laird - Horn-Horn-Duncan Variance
Estimate of Consensus Mean:             18.49153
Estimate of Variance of Consensus Mean: 0.07139
Estimate of Between Lab Variance:       0.09459
Standard Uncertainty (k = 1):           0.26719
Expanded Uncertainty (k = 2):           0.53439
Degrees of Freedom:                     2
t Percent Point Value:                  4.30265
Lower 95% (t-value) Confidence Limit:   17.34189
Upper 95% (t-value) Confidence Limit:   19.64117
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

4d. Method: DerSimonian Laird - Bootstrap Variance
Number of Bootstrap Samples                        100000
Estimate of Consensus Mean:                        18.49153
Estimate of Variance of Consensus Mean:            0.10234
Standard Uncertainty (k = 1):                      0.31991
Expanded Uncertainty (k = 2):                      0.63982
Lower 95% (percentile bootstrap) Confidence Limit: 17.86617
Upper 95% (percentile bootstrap) Confidence Limit: 19.11832
Lower 95% (symmetric bootstrap) Confidence Limit:  17.86474
Upper 95% (symmetric bootstrap) Confidence Limit:  19.11832
K (symmetric bootstrap) Coverage Factor:           1.95928
Lower 95% (kernel bootstrap) Confidence Limit:     17.86219
Upper 95% (kernel bootstrap) Confidence Limit:     19.11986
K (kernel bootstrap) Coverage Factor:              1.96411
Note: DerSimonian-Laird Best Usage:
Any Number of Labs:

11. Method: BOB (Bound on Bias)
Estimate of Consensus Mean:          18.73333
Within Lab Uncertainty:              0.28803
Between Lab Uncertainty:             0.46188
Standard Uncertainty (k = 1):        0.54433
Expanded Uncertainty (k = 2):        1.08866
Lower 95% (k = 2) Confidence Limit:  17.64467
Upper 95% (k = 2) Confidence Limit:  19.82200
Note: BOB Best Usage:
5 or Fewer Labs:

Table 2:  95% Confidence Limits

----------------------------------------------------------------------------------------------------
                                              Consensus          Lower          Upper    Uncertainty
Method                                             Mean          Limit          Limit         (k*SE)
----------------------------------------------------------------------------------------------------
  1. Mandel-Paule                              18.57390       17.97365       19.17415        0.60025
  2. Modified Mandel-Paule                     18.49855       18.03028       18.96683        0.46828
 3a. Vangel-Rukhin ML                          18.73333       17.94754       19.51913        0.78580
 4a. DerSimonian-Laird (original)              18.49153       17.20468       19.77838        1.28685
 4b. DerSimonian-Laird (H-H-D)                 18.49153       17.34189       19.64117        1.14964
 4d. DerSimonian-Laird (perc. bootstrap)       18.49153       17.86617       19.11832        0.62679
 4d. DerSimonian-Laird (symm. bootstrap)       18.49153       17.86474       19.11832        0.62679
 4d. DerSimonian-Laird (kern bootstrap)        18.49153       17.86219       19.11986        0.62934
 11. BOB                                       18.73333       17.64467       19.82200        1.08866

Table 3:  Standard Uncertainties (k = 1)

-----------------------------------------------------------------------------------
                                                         Standard          Relative
                                         Consensus    Uncertainty          Standard
Method                                        Mean        (k = 1)   Uncertainty (%)
-----------------------------------------------------------------------------------
  1. Mandel-Paule                         18.57390        0.30625           1.64885
  2. Modified Mandel-Paule                18.49855        0.23892           1.29157
 3a. Vangel-Rukhin ML                     18.73333        0.40092           2.14017
 4a. DerSimonian-Laird (original)         18.49153        0.29908           1.61741
 4b. DerSimonian-Laird (H-H-D)            18.49153        0.26719           1.44495
 4d. DerSimonian-Laird (bootstrap)        18.49153        0.31991           1.73002
 11. BOB                                  18.73333        0.54433           2.90568

Table 4:  Expanded Uncertainties (k = 2)

-----------------------------------------------------------------------------------
                                                         Expanded          Relative
                                         Consensus    Uncertainty          Expanded
Method                                        Mean        (k = 2)   Uncertainty (%)
-----------------------------------------------------------------------------------
  1. Mandel-Paule                         18.57390        0.61251           3.29769
  2. Modified Mandel-Paule                18.49855        0.47784           2.58313
 3a. Vangel-Rukhin ML                     18.73333        0.80185           4.28034
 4a. DerSimonian-Laird (original)         18.49153        0.59817           3.23482
 4b. DerSimonian-Laird (H-H-D)            18.49153        0.53439           2.88989
 4d. DerSimonian-Laird (bootstrap)        18.49153        0.63982           3.46004
 11. BOB                                  18.73333        1.08866           5.81136



NIST is an agency of the U.S. Commerce Department.

Date created: 06/05/2001
Last updated: 05/17/2016