# ODDS RATIO CHI-SQUARE TEST

Name:
ODDS RATIO CHI-SQUARE TEST (LET)
Type:
Analysis Command
Purpose:
Perform an odds ratio chi-square test of a series of fourfold (2x2) tables.
Description:
Given two variables where each variable has exactly two possible outcomes (typically defined as success and failure), we define the odds ratio as:

o = (N11/N12)/ (N21/N22)
= (N11N22)/ (N12N21)

where

N11 = number of successes in sample 1
N21 = number of failures in sample 1
N12 = number of successes in sample 2
N22 = number of failures in sample 2

The first definition shows the meaning of the odds ratio clearly, although it is more commonly given in the literature with the second definition.

The log odds ratio is the logarithm of the odds ratio:

l(o) = LOG{(N11/N12)/ (N21/N22)}
= LOG{(N11N22)/ (N12N21)}

Alternatively, the log odds ratio can be given in terms of the proportions

l(o) = LOG{(p11/p12)/ (p21/p22)}
= LOG{(p11p22)/ (p12p21)}

where

p11 = N11/ (N11 + N21)
= proportion of successes in sample 1
p21 = N21/ (N11 + N21)
= proportion of failures in sample 1
p12 = N12/ (N12 + N22)
= proportion of successes in sample 2
p22 = N22/ (N12 + N22)
= proportion of failures in sample 2
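
As a short worked step (added here for clarity, using only the definitions above), substituting the proportions shows why the two forms agree: the sample totals cancel.

$$\begin{array}{lcl} \frac{p_{11} p_{22}}{p_{12} p_{21}} & = & \frac{\frac{N_{11}}{N_{11}+N_{21}} \cdot \frac{N_{22}}{N_{12}+N_{22}}} {\frac{N_{12}}{N_{12}+N_{22}} \cdot \frac{N_{21}}{N_{11}+N_{21}}} \\ & = & \frac{N_{11} N_{22}}{N_{12} N_{21}} \end{array}$$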

Success and failure can denote any binary response. Dataplot expects "success" to be coded as "1" and "failure" to be coded as "0".

The bias corrected version of the statistic is:

l'(o) = LOG[{(N11+0.5) (N22+0.5)}/ {(N12+0.5) (N21+0.5)}]

In addition to reducing bias, this statistic also has the advantage that the odds ratio is still defined even when N12 or N21 is zero (the uncorrected statistic will be undefined for these cases).
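
The bias corrected statistic can be illustrated with a small Python sketch. This is not Dataplot code, and the function name `bias_corrected_log_odds_ratio` is an assumption introduced here. The first set of counts is the one implied by group 1 of the sample program on this page; the second set is hypothetical and only shows that a zero cell still yields a finite value.

```python
import math

def bias_corrected_log_odds_ratio(n11, n21, n12, n22):
    """Return l'(o) = LOG[(N11+0.5)(N22+0.5) / ((N12+0.5)(N21+0.5))]."""
    return math.log(((n11 + 0.5) * (n22 + 0.5)) / ((n12 + 0.5) * (n21 + 0.5)))

# Sample 1: 81 successes, 24 failures; sample 2: 34 successes, 71 failures.
l_prime = bias_corrected_log_odds_ratio(n11=81, n21=24, n12=34, n22=71)
odds = math.exp(l_prime)  # bias corrected odds ratio

# Hypothetical counts with N21 = 0: the uncorrected ratio would divide by zero.
l_zero = bias_corrected_log_odds_ratio(n11=10, n21=0, n12=5, n22=5)

print(l_prime, odds, l_zero)
```

For the first set of counts, the result should agree with the group 1 row of the sample output (odds ratio 6.894114, log odds ratio 1.930668).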

Note that N11, N21, N12, and N22 define a 2x2 contingency table. These types of contingency tables are also referred to as fourfold tables.

The odds ratio chi-square test is applied when we have a series of fourfold tables. That is, the two variables for the fourfold tables are the same, but the data are collected from different populations or groups. Fleiss, Levin, and Paik (p. 234) list the following questions that are typically asked about this type of data:

1. Is there evidence that the degree of association, whatever its magnitude, is consistent from one group to another?

2. Assuming that the degree of association is found to be consistent, is the common degree of association statistically significant?

3. Assuming that the common degree of association is significant, what is the best estimate of the common value for the measure of association? What is its standard error? How does one construct a confidence interval for the common measure?

The following description for this test is summarized from Chapter 10 of Fleiss, Levin, and Paik. Consult this reference for a more detailed discussion.

Suppose we have g fourfold tables. Then

$$y_{i}$$ = measure of association for table i
$$s_{y_{i}}$$ = standard error of $$y_{i}$$
$$w_{i}$$ = $$1/s_{y_{i}}^{2}$$
g = number of groups (i.e., number of 2x2 tables)

This test is based on decomposing the total chi-square in the following way:

$$\begin{array}{lcl} \chi_{\mbox{total}}^{2} & = & \sum_{i=1}^{g}{w_{i} y_{i}^{2}} \\ & = & \chi_{\mbox{homogeneity}}^{2} + \chi_{\mbox{association}}^{2} \end{array}$$

The $$\chi_{\mbox{homogeneity}}^{2}$$ assesses the degree of homogeneity (i.e., equality) among the g measures of association. The $$\chi_{\mbox{association}}^{2}$$ assesses the significance of the average degree of association.

The overall measure of association (across all groups) is the weighted average of the g individual measures:

$$\bar{y} = \frac{\sum_{i=1}^{g}{w_{i} y_{i}}} {\sum_{i=1}^{g}{w_{i}}}$$

Under the hypothesis of zero overall association, $$\bar{y}$$ has an expected value of zero and a standard error of

$$s_{\bar{y}} = \frac{1}{\sqrt{\sum_{i=1}^{g}{w_{i}}}}$$

From this

$$\frac{\bar{y}} {s_{\bar{y}}} = \frac{\sum_{i=1}^{g}{w_{i} y_{i}}} {\sqrt{\sum_{i=1}^{g}{w_{i}}}}$$

approximately follows a standard normal distribution under the null hypothesis and

$$\begin{array}{lcl} \chi_{\mbox{association}}^{2} & = & \bar{y}^{2} \sum_{i=1}^{g}{w_{i}} \\ & = & \frac{\left( \sum_{i=1}^{g}{w_{i} y_{i}} \right)^2} {\sum_{i=1}^{g}{w_{i}}} \end{array}$$

approximately follows a chi-square distribution with one degree of freedom.

$$\begin{array}{lcl} \chi_{\mbox{homogeneity}}^{2} & = & \chi_{\mbox{total}}^{2} - \chi_{\mbox{association}}^{2} \\ & = & \sum_{i=1}^{g}{w_{i} y_{i}^2} - \bar{y}^{2} \sum_{i=1}^{g}{w_{i}} \\ & = & \sum_{i=1}^{g}{w_{i} (y_{i} - \bar{y})^2} \end{array}$$

approximately follows a chi-square distribution with g - 1 degrees of freedom.

Note that $$\chi_{\mbox{association}}^{2}$$ and $$\chi_{\mbox{homogeneity}}^{2}$$ are uncorrelated.
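
The decomposition can be sketched in Python as follows. This is an illustrative sketch rather than Dataplot's implementation, and the function name `chi_square_decomposition` is an assumption. The inputs are the log odds ratios and standard errors printed in the sample output on this page.

```python
def chi_square_decomposition(y, se):
    """Split the total chi-square into association and homogeneity parts.

    y  : measures of association, one per 2x2 table
    se : their standard errors
    """
    w = [1.0 / s ** 2 for s in se]                  # weights w_i = 1/s_i^2
    sum_w = sum(w)
    sum_wy = sum(wi * yi for wi, yi in zip(w, y))
    y_bar = sum_wy / sum_w                          # weighted average
    total = sum(wi * yi ** 2 for wi, yi in zip(w, y))
    association = sum_wy ** 2 / sum_w               # 1 degree of freedom
    homogeneity = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))  # g - 1 df
    return {
        "y_bar": y_bar,
        "s_y_bar": sum_w ** -0.5,
        "total": total,
        "association": association,
        "homogeneity": homogeneity,
    }

result = chi_square_decomposition(
    y=[1.930668, 0.8814980, 0.8389067],
    se=[0.3099319, 0.2138429, 0.2400251],
)
print(result)
```

Computing the homogeneity term directly from the deviations, rather than as total minus association, gives an independent check that the two parts sum to the total.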

Based on the above formulas, we can answer the above questions as follows.

1. Consistency of association can be tested using the $$\chi_{\mbox{homogeneity}}^{2}$$ statistic. If this statistic is significant, this indicates that groups are different with respect to the measure of association.

2. If $$\chi_{\mbox{homogeneity}}^{2}$$ is not significant (i.e., the groups can be considered equivalent), then the overall degree of association can be tested using the $$\chi_{\mbox{association}}^{2}$$ statistic.

3. The estimate of overall association is $$\bar{y}$$ and a large sample confidence interval is

$$\bar{y} \pm \Phi^{-1}(\alpha/2) s_{\bar{y}}$$
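
A minimal Python sketch of this interval, assuming the standard library's `statistics.NormalDist` for the normal quantile (the function name `large_sample_ci` is hypothetical). The estimate and standard error are taken from the sample output on this page, where the measure of association is a log odds ratio, so exponentiating the limits gives limits on the odds ratio scale.

```python
import math
from statistics import NormalDist

def large_sample_ci(y_bar, s_y_bar, alpha):
    """Two-sided large sample interval: y_bar -/+ z * s_y_bar."""
    # Use the upper quantile so z is positive; the +/- form is symmetric.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return y_bar - z * s_y_bar, y_bar + z * s_y_bar

# Combined log(odds ratio) and its standard error from the sample output.
lower, upper = large_sample_ci(1.086652, 0.1419390, alpha=0.05)

# Back-transform to the odds ratio scale.
odds_limits = (math.exp(lower), math.exp(upper))

print(lower, upper, odds_limits)
```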

The above discussion is based on a generic statistic for the measure of association. For the odds ratio chi-square test, the specific measure of association is the bias corrected log odds ratio (given above). Note that the standard error of the bias corrected log odds ratio is:

$$s_{l'(o)} = \sqrt{\frac{1}{N_{11}+0.5} + \frac{1}{N_{21}+0.5} + \frac{1}{N_{12}+0.5} + \frac{1}{N_{22}+0.5}}$$
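
As a hedged illustration (Python, with hypothetical function names), the standard error and the corresponding weight for a single table can be computed as shown below. The counts are those implied by group 1 of the sample program on this page.

```python
import math

def log_odds_standard_error(n11, n21, n12, n22):
    """Standard error of the bias corrected log odds ratio."""
    return math.sqrt(sum(1.0 / (n + 0.5) for n in (n11, n21, n12, n22)))

def weight(se):
    """Weight w_i = 1/s_i^2 used in the chi-square decomposition."""
    return 1.0 / se ** 2

se = log_odds_standard_error(n11=81, n21=24, n12=34, n22=71)
w = weight(se)

print(se, w)
```

These values should agree with the SE(L(i)) and w(i) columns for group 1 in the sample output.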

The ODDS RATIO CHI-SQUARE TEST generates the following output:

1. A summary table of various statistics (odds ratio, log(odds ratio), standard error of log(odds ratio), wi and wi*log(odds ratio)).

2. A table summarizing the combined log(odds ratio) and its standard error and the chi-square test statistics (total, association, and homogeneity).

3. A table for the chi-square test for homogeneity.

4. A table for the chi-square test for overall degree of association.

5. Estimates and large sample confidence intervals for the common log(odds ratio) and the common odds ratio.
Syntax 1:
ODDS RATIO CHI-SQUARE TEST <y1> <y2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax is used for the case where <y1> and <y2> denote a series of 2x2 tables (i.e., rows 1 and 2 are group 1, rows 3 and 4 are group 2, and so on).

Syntax 2:
ODDS RATIO CHI-SQUARE TEST <y1> <y2> <groupid>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<groupid> is a group id variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated into a two-way table). In this case, the two response variables have an equal number of cases for each group.

Syntax 3:
ODDS RATIO CHI-SQUARE TEST <y1> <groupid1> <y2> <groupid2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<groupid1> is a group id variable corresponding to <y1>;
<y2> is the second response variable;
<groupid2> is a group id variable corresponding to <y2>;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax is used for the case where you have raw data (i.e., the data has not yet been cross tabulated into a two-way table). In this case, the two response variables may have an unequal number of cases for each group, so <y1> and <y2> require different group id variables.
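
To make the raw data case concrete, the following Python sketch tabulates 0/1 responses into one fourfold table per group. It is not Dataplot code: the helper names `tabulate` and `fourfold_tables` are assumptions, and the mapping of `y1` to sample 1 and `y2` to sample 2 is inferred from the sample program on this page. The example data mimic group 2 of that program, where the two responses have different numbers of cases.

```python
from collections import defaultdict

def tabulate(y, groupid):
    """Count successes (1) and failures (0) per group for one response."""
    counts = defaultdict(lambda: [0, 0])  # group -> [successes, failures]
    for value, g in zip(y, groupid):
        if value == 1:
            counts[g][0] += 1
        elif value == 0:
            counts[g][1] += 1
        else:
            raise ValueError(f"response must be coded 0 or 1, got {value!r}")
    return counts

def fourfold_tables(y1, groupid1, y2, groupid2):
    """Return {group: (N11, N21, N12, N22)} for groups present in both."""
    c1 = tabulate(y1, groupid1)
    c2 = tabulate(y2, groupid2)
    tables = {}
    for g in sorted(set(c1) & set(c2)):
        n11, n21 = c1[g]  # successes, failures in sample 1
        n12, n22 = c2[g]  # successes, failures in sample 2
        tables[g] = (n11, n21, n12, n22)
    return tables

# Group 2 of the sample program: 192 cases for y1, 174 cases for y2.
y1 = [1] * 118 + [0] * 74
y2 = [1] * 69 + [0] * 105
tables = fourfold_tables(y1, [2] * len(y1), y2, [2] * len(y2))

print(tables)
```

Separate group id lists are what allow the two responses to differ in length, which is the situation Syntax 3 addresses.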

Examples:
ODDS RATIO CHI-SQUARE TEST Y1 Y2
ODDS RATIO CHI-SQUARE TEST Y1 Y2 X
ODDS RATIO CHI-SQUARE TEST Y1 X1 Y2 X2
Note:
This test is similar to the Mantel-Haenszel test. Fleiss, Levin, and Paik make the following recommendations in regard to these two tests (they include other tests in their comparison).

1. If the number of groups is small or moderate and the sample sizes within each group are large, the log(odds ratio) test performs well.

2. If the number of groups is large, but the sample sizes within the groups are small to moderate, then the Mantel-Haenszel test can be recommended. The log(odds ratio) test may perform poorly for this case.

3. If the number of groups and the sample sizes within the groups are both small, exact methods may be required. Dataplot does not currently support any exact methods for this problem.
Note:
The following information is written to the file dpst1f.dat (in the current directory):

Column 1 = significance level
Column 2 = lower confidence limit for common log(odds ratio)
Column 3 = upper confidence limit for common log(odds ratio)
Column 4 = lower confidence limit for common odds ratio
Column 5 = upper confidence limit for common odds ratio

To read this information into Dataplot, enter

READ DPST1F.DAT SIGLEV LOGLOWCL LOGUPPCL ODDLOWCL ODDUPPCL

Dataplot saves the following internal parameters:

STATTOT = the "total" test statistic
CDFTOTAL = the cdf for the "total" test statistic
STATASSO = the "association" test statistic
CDFASSOC = the cdf for the "association" test statistic
STATHOMO = the "homogeneity" test statistic
CDFHOMOG = the cdf for the "homogeneity" test statistic
Default:
None
Synonyms:
None
Related Commands:
MANTEL-HAENSZEL TEST = Perform a Mantel-Haenszel test.
ODDS RATIO INDEPENDENCE TEST = Perform a log(odds ratio) independence test.
CHI-SQUARE INDEPENDENCE TEST = Perform a chi-square independence test.
FISHER EXACT TEST = Perform Fisher's exact test.
ASSOCIATION PLOT = Generate an association plot.
SIEVE PLOT = Generate a sieve plot.
ROSE PLOT = Generate a Rose plot.
BINARY TABULATION PLOT = Generate a binary tabulation plot.
ROC CURVE = Generate a ROC curve.
ODDS RATIO = Compute the bias corrected odds ratio.
LOG ODDS RATIO = Compute the bias corrected log(odds ratio).
Reference:
Fleiss, Levin, and Paik (2003), Statistical Methods for Rates and Proportions, Third Edition, pp. 234-238.
Applications:
Categorical Data Analysis
Implementation Date:
2007/5
Program:

let n1 = 105
let n2 = 192
let n3 = 145
let n = n1 + n2 + n3
let x = 3 for i = 1 1 n
let istop = n1 + n2
let x = 2 for i = 1 1 istop
let x = 1 for i = 1 1 n1
.
set statistic missing value -99
.
.  Group 1 values
.
let y1 = 0 for i = 1 1 n
let y2 = 0 for i = 1 1 n
let y1 = 1 for i = 1 1  81
let y2 = 1 for i = 1 1  34
.
.  Group 2 values (the samples have unequal sizes here, so fill
.          with missing values)
.
let istrt = n1 + 1
let istop1 = istrt + 118 - 1
let istop2 = istrt + 69 - 1
let y1 = 1 for i = istrt 1 istop1
let y2 = 1 for i = istrt 1 istop2
let istrt2 = n1 + 174 + 1
let istop2 = n1 + n2
let y2 = -99 for i = istrt2 1 istop2
.
.  Group 3 values
.
let istrt = n1 + n2 + 1
let istop1 = istrt + 82 - 1
let istop2 = istrt + 52 - 1
let y1 = 1 for i = istrt 1 istop1
let y2 = 1 for i = istrt 1 istop2
.
odds ratio chi-square test y1 y2 x

The following output is generated.
                   SUMMARY OF LOG(ODDS RATIO)

       |                      LOG OF      STANDARD
       |    ODDS RATIO    ODDS RATIO         ERROR 1/SE(L(i))**2         w(i)*
 GROUP |          O(i)          L(i)      SE(L(i))          w(i)       L(i)**2
==============================================================================
    1. |      6.894114      1.930668     0.3099319      10.41040      38.80455
    2. |      2.414514     0.8814980     0.2138429      21.86806      16.99233
    3. |      2.313836     0.8389067     0.2400251      17.35748      12.21558
==============================================================================
 TOTAL |                                                49.63593      68.01245

CHI-SQUARE ANALYSIS OF LOG(ODDS RATIO)

NUMBER OF GROUPS                            =        3
ESTIMATE OF COMBINED LOG(ODDS RATIO)        =    1.086652
STANDARD ERROR OF COMBINED LOG(ODDS RATIO)  =   0.1419390

CHI-SQUARE TEST STATISTIC (TOTAL)           =    68.01245
DEGRESS OF FREEDOM                          =        3
CDF OF TEST STATISTIC                       =    1.000000

CHI-SQUARE TEST STATISTIC (ASSOCIATION)     =    58.61073
DEGRESS OF FREEDOM                          =        1
CDF OF TEST STATISTIC                       =    1.000000

CHI-SQUARE TEST STATISTIC (HOMOGENEITY)     =    9.401718
DEGRESS OF FREEDOM                          =        2
CDF OF TEST STATISTIC                       =   0.9978321

CHI-SQUARE TEST FOR CONSISTENCY OF ASSOCIATION (HOMOGENEITY)
                                      NULL HYPOTHESIS   NULL
NULL          CONFIDENCE    CRITICAL  ACCEPTANCE        HYPOTHESIS
HYPOTHESIS    LEVEL         VALUE     INTERVAL          CONCLUSION
===================================================================
CONSISTENT       50.0%        1.39     (0,0.500)        REJECT
CONSISTENT       80.0%        3.22     (0,0.800)        REJECT
CONSISTENT       90.0%        4.61     (0,0.900)        REJECT
CONSISTENT       95.0%        5.99     (0,0.950)        REJECT
CONSISTENT       97.5%        7.38     (0,0.975)        REJECT
CONSISTENT       99.0%        9.21     (0,0.990)        REJECT

CHI-SQUARE TEST FOR OVERALL DEGREE OF ASSOCIATION
                                      NULL HYPOTHESIS   NULL
NULL          CONFIDENCE    CRITICAL  ACCEPTANCE        HYPOTHESIS
HYPOTHESIS    LEVEL         VALUE     INTERVAL          CONCLUSION
===================================================================
NO ASSOCIATION   50.0%        0.45     (0,0.500)        REJECT
NO ASSOCIATION   80.0%        1.64     (0,0.800)        REJECT
NO ASSOCIATION   90.0%        2.71     (0,0.900)        REJECT
NO ASSOCIATION   95.0%        3.84     (0,0.950)        REJECT
NO ASSOCIATION   97.5%        5.02     (0,0.975)        REJECT
NO ASSOCIATION   99.0%        6.63     (0,0.990)        REJECT

LARGE SAMPLE CONFIDENCE INTERVAL FOR LOG(ODDS RATIO)
LOG(ODDS RATIO)                  ODDS RATIO
(   1.086652    )           (   2.964333    )
CONFIDENCE           LOWER         UPPER         LOWER         UPPER
VALUE (%)            LIMIT         LIMIT         LIMIT         LIMIT
-----------------------------------------------------------------------
50.000          0.990915       1.18239       2.69370       3.26216
80.000          0.904750       1.26855       2.47131       3.55571
90.000          0.853183       1.32012       2.34711       3.74387
95.000          0.808457       1.36485       2.24444       3.91513
97.500          0.768509       1.40479       2.15655       4.07469
99.000          0.721041       1.45226       2.05657       4.27277
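
As a quick arithmetic check on the output above (a worked example added for illustration, using only the printed values), the summary quantities follow from the TOTAL row and the combined estimate:

$$\begin{array}{lcl} s_{\bar{y}} & = & 1/\sqrt{49.63593} \approx 0.14194 \\ \chi_{\mbox{association}}^{2} & = & \bar{y}^{2} \sum_{i=1}^{g}{w_{i}} = (1.086652)^{2} \times 49.63593 \approx 58.61 \\ \chi_{\mbox{homogeneity}}^{2} & = & 68.01245 - 58.61073 \approx 9.40172 \\ \exp(\bar{y}) & = & \exp(1.086652) \approx 2.96433 \end{array}$$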


NIST is an agency of the U.S. Commerce Department.

Date created: 10/10/2008
Last updated: 11/04/2015