
# RANK SUM TEST

Name:
RANK SUM TEST
Type:
Analysis Command
Purpose:
Perform a two sample rank sum test.
Description:
The t-test is the standard test of whether the population means of two non-paired samples are equal. If the populations are non-normal, particularly for small samples, the t-test may not be valid. The rank sum test is an alternative that can be applied when the distributional assumptions are suspect. However, it is not as powerful as the t-test when those assumptions are in fact valid.

The rank sum test is also commonly called the Mann-Whitney rank sum test or simply the Mann-Whitney test. Note that even though this test is commonly called the Mann-Whitney test, it was in fact developed by Wilcoxon.

To form the rank sum test, rank the combined samples. Then compute the sum of the ranks for sample one, T1, and the sum of the ranks for sample two, T2. If the sample sizes are equal, the rank sum test statistic is the minimum of T1 and T2. If the sample sizes are unequal, let T1 be the sum of the ranks for the smaller sample (of size n1) and compute T2 = n1(n1 + n2 + 1) - T1. T is the minimum of T1 and T2. Sufficiently small values of T cause rejection of the null hypothesis that the population means are equal.
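The ranking steps above can be sketched in Python. This is an illustration only, not Dataplot's implementation: the function names `midranks` and `rank_sum_statistic` are hypothetical, and assigning tied values their average rank is an assumption, since the description above does not say how ties are handled.

```python
def midranks(values):
    """Return 1-based ranks; tied values share their average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_statistic(sample_a, sample_b):
    """Return (T, n1, n2), where n1 is the size of the smaller sample."""
    if len(sample_a) <= len(sample_b):
        small, large = list(sample_a), list(sample_b)
    else:
        small, large = list(sample_b), list(sample_a)
    n1, n2 = len(small), len(large)
    ranks = midranks(small + large)      # rank the combined samples
    t1 = sum(ranks[:n1])                 # rank sum for the smaller sample
    t2 = n1 * (n1 + n2 + 1) - t1         # T2 = n1(n1 + n2 + 1) - T1
    return min(t1, t2), n1, n2


# Made-up data: the smaller sample receives ranks 1, 2 and 4.
print(rank_sum_statistic([1.1, 2.3, 3.8], [2.9, 4.0, 5.2, 6.1]))  # (7.0, 3, 4)
```

For equal sample sizes, n1(n1 + n2 + 1) - T1 equals the rank sum of the other sample, so the same formula covers both cases.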

Significance levels have been tabulated for small values of n1 and n2. For sufficiently large n1 and n2, the following normal approximation is used:

$$Z = \frac{|\mu - T| - 0.5} {\sigma}$$

where

$$\mu = n_1(n_1 + n_2 + 1)/2$$

$$\sigma = \sqrt{n_2 \mu /6}$$
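As a minimal sketch, the normal approximation above can be evaluated directly from T, n1 and n2. The function name `rank_sum_z` and the input values are hypothetical and serve only to show the arithmetic.

```python
import math


def rank_sum_z(t, n1, n2):
    """Continuity-corrected normal approximation for the rank sum statistic T."""
    mu = n1 * (n1 + n2 + 1) / 2        # mean of T under the null hypothesis
    sigma = math.sqrt(n2 * mu / 6)     # standard deviation of T under the null
    return (abs(mu - t) - 0.5) / sigma


# Hypothetical values: n1 = n2 = 10 and T = 70 give mu = 105 and sigma = sqrt(175).
z = rank_sum_z(70, 10, 10)
print(round(z, 3))  # about 2.608
```

Since mu = n1(n1 + n2 + 1)/2, the quantity n2*mu/6 is the same as n1*n2*(n1 + n2 + 1)/12, which is the more commonly quoted form of the variance.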

Syntax:
RANK SUM TEST <y1> <y2>             <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
RANK SUM TEST Y1 Y2
RANK SUM TEST Y1 Y2 SUBSET TAG > 2
Note:
Dataplot saves the following internal parameters after a rank sum test:

STATVAL  - the rank sum test statistic
STATCD2  - the normal cdf value of T (only applies for sufficiently large N1 and N2)
CUTLOW90 - 0.05 critical value
CUTUPP90 - 0.95 critical value
CUTLOW95 - 0.025 critical value
CUTUPP95 - 0.975 critical value
CUTLOW99 - 0.005 critical value
CUTUPP99 - 0.995 critical value

Note that the above critical values are the lower and upper tails for two-sided tests (i.e., each tail is alpha/2). For example, CUTLOW90 is the lower 5% point of the normal percent point function (adjusted for the mean and standard deviation). This is the critical region for alpha = 0.10, so there is 0.05 in each tail.
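One way to read "adjusted for the mean and standard deviation" is that each cutoff is a percent point of a normal distribution with the mean and standard deviation of T. The sketch below takes that reading as an assumption; it does not reproduce Dataplot's internal computation, and the name `two_sided_cutoffs` and the input values are hypothetical.

```python
from statistics import NormalDist


def two_sided_cutoffs(mu, sigma, alpha):
    """Lower and upper cutoffs that leave alpha/2 in each tail of N(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(alpha / 2), dist.inv_cdf(1 - alpha / 2)


# Hypothetical mean and standard deviation; alpha = 0.10 puts 0.05 in each tail,
# which corresponds to the CUTLOW90 / CUTUPP90 pair described above.
low, upp = two_sided_cutoffs(105.0, 175 ** 0.5, 0.10)
print(low, upp)
```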

Note:
The following statistics are also supported:

LET A = MANN WHITNEY RANK SUM TEST Y1 Y2
LET A = MANN WHITNEY RANK SUM TEST CDF Y1 Y2
LET A = MANN WHITNEY RANK SUM TEST PVALUE Y1 Y2
LET A = MANN WHITNEY RANK SUM LOWER TAILED PVALUE Y1 Y2
LET A = MANN WHITNEY RANK SUM UPPER TAILED PVALUE Y1 Y2

In addition to the above LET commands, built-in statistics are supported for 20+ different commands (enter HELP STATISTICS for details).

Default:
None
Synonyms:
The following are synonyms for RANK SUM TEST:

MANN WHITNEY RANK SUM TEST
MANN WHITNEY RANK SUM
MANN WHITNEY TEST
MANN WHITNEY
RANK SUM
Related Commands:
T-TEST                    = Compute a t-test.
SIGN TEST                 = Compute a sign test.
SIGNED RANK TEST          = Compute a signed rank test.
CHI-SQUARED 2 SAMPLE TEST = Compute a two sample chi-square test.
BIHISTOGRAM               = Generate a bihistogram.
QUANTILE-QUANTILE PLOT    = Generate a quantile-quantile plot.
BOX PLOT                  = Generate a box plot.
Reference:
Snedecor and Cochran (1989), "Statistical Methods," Eighth Edition, Iowa State University Press, pp. 142-144.
Applications:
Confirmatory Data Analysis
Implementation Date:
1999/5
Program:

SKIP 25
RETAIN Y2 SUBSET Y2 > -90
SET WRITE DECIMALS 4
RANK SUM TEST Y1 Y2

The following output is generated.
            Two Sample Two-Sided Mann Whitney Rank Sum Test
(Conover Formulation)

First Response Variable: Y1
Second Response Variable: Y2

H0: F(x) = G(x)   for all x
Ha: F(x) <> G(x)  for some x

Summary Statistics:
Number of Observations for Sample 1:     13
Mean for Sample 1:                       80.0208
Median for Sample 1:                     80.0300
Number of Observations for Sample 2:     8
Mean for Sample 2:                       79.9788
Median for Sample 2:                     79.9700
Number of Tied Ranks:                    14

Test (Normal Approximation):
Test Statistic Value (W):                2.7105
CDF Value:                               0.9966
P-Value (2-tailed test):                 0.0067
P-Value (lower-tailed test):             0.9966
P-Value (upper-tailed test):             0.0034

Two-Tailed Test: Normal Approximation

H0: F(x) = G(x); Ha: F(x) <> G(x)  for some x
------------------------------------------------------------
                                                     Null
Significance           Test       Critical     Hypothesis
       Level      Statistic    Value (+/-)     Conclusion
------------------------------------------------------------
       80.0%         2.7105         1.2816         REJECT
       90.0%         2.7105         1.6449         REJECT
       95.0%         2.7105         1.9600         REJECT
       99.0%         2.7105         2.5758         REJECT
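The p-value and critical values in this output can be checked against the standard normal distribution. The sketch below uses Python's standard library rather than Dataplot; the function name `two_sided_summary` and the "FAIL TO REJECT" label are illustrative choices, not Dataplot output.

```python
from statistics import NormalDist


def two_sided_summary(z, levels=(0.80, 0.90, 0.95, 0.99)):
    """Return the two-tailed p-value and one (level, critical value, conclusion) row per level."""
    std = NormalDist()                       # standard normal, mean 0 and sd 1
    p_two = 2 * (1 - std.cdf(abs(z)))        # two-tailed p-value
    rows = []
    for level in levels:
        alpha = 1 - level
        crit = std.inv_cdf(1 - alpha / 2)    # upper alpha/2 percent point
        conclusion = "REJECT" if abs(z) > crit else "FAIL TO REJECT"
        rows.append((level, crit, conclusion))
    return p_two, rows


p_two, rows = two_sided_summary(2.7105)      # test statistic from the output above
print(round(p_two, 4))                       # 0.0067
for level, crit, conclusion in rows:
    print(f"{level:.1%}  {crit:.4f}  {conclusion}")
```

The statistic 2.7105 exceeds every critical value in the table, which matches the REJECT conclusion at all four levels.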


NIST is an agency of the U.S. Commerce Department.

Date created: 06/05/2001
Last updated: 11/04/2015