
RANK SUM TEST

Name:
    RANK SUM TEST
Type:
    Analysis Command
Purpose:
    Perform a two sample rank sum test.
Description:
    The t-test is the standard test of whether the population means of two non-paired samples are equal. If the populations are non-normal, particularly for small samples, the t-test may not be valid. The rank sum test is an alternative that can be applied when the distributional assumptions are suspect. However, it is not as powerful as the t-test when the distributional assumptions are in fact valid.

    The rank sum test is also commonly called the Mann-Whitney rank sum test or simply the Mann-Whitney test. Note that even though this test is commonly called the Mann-Whitney test, it was in fact developed by Wilcoxon.

    To form the rank sum test, rank the combined samples. Then compute the sum of the ranks for sample one, T1, and the sum of the ranks for sample two, T2. If the sample sizes are equal, the rank sum test statistic, T, is the minimum of T1 and T2. If the sample sizes are unequal, let n1 be the size of the smaller sample and n2 the size of the larger sample, and let T1 be the sum of the ranks for the smaller sample. Then compute T2 = n1(n1 + n2 + 1) - T1. T is again the minimum of T1 and T2. Sufficiently small values of T cause rejection of the null hypothesis that the two samples come from the same population.
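The ranking steps above can be sketched in Python as follows. This is a minimal illustration, not Dataplot's implementation: the function names midranks and rank_sum_statistic are hypothetical, and giving tied values the average of their ranks is an assumption, since the description above does not say how ties are handled.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks.

    Averaging tied ranks is an assumption made for this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_statistic(y1, y2):
    """Return (T1, T2, T) as described in the text (hypothetical helper)."""
    # Treat the smaller sample as "sample one" so that n1 <= n2.
    small, large = (y1, y2) if len(y1) <= len(y2) else (y2, y1)
    n1, n2 = len(small), len(large)
    ranks = midranks(list(small) + list(large))
    t1 = sum(ranks[:n1])            # rank sum of the smaller sample
    t2 = n1 * (n1 + n2 + 1) - t1    # companion value from the formula above
    return t1, t2, min(t1, t2)


# Made-up data, for illustration only.
print(rank_sum_statistic([1.1, 2.3, 3.0], [2.9, 4.2, 5.5, 6.1]))
```

When the sample sizes are equal, n1(n1 + n2 + 1) is the total of all ranks, so the formula for T2 reduces to the rank sum of sample two and the same code covers both cases.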

    Significance levels have been tabulated for small values of n1 and n2. For sufficiently large n1 and n2, the following normal approximation is used:

      \( Z = \frac{|\mu - T| - 0.5} {\sigma} \)

    where

      \( \mu = n_1(n_1 + n_2 + 1)/2 \)

      \( \sigma = \sqrt{n_2 \mu /6} \)
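As a rough illustration, the normal approximation above can be evaluated as in the following Python sketch. The function name rank_sum_z is hypothetical, n1 is assumed to be the size of the smaller sample, and the value of T in the usage line is made up rather than taken from any data set.

```python
import math


def rank_sum_z(t, n1, n2):
    """Return (mu, sigma, Z) for the normal approximation given above.

    t  : rank sum test statistic T
    n1 : size of the smaller sample (assumption of this sketch)
    n2 : size of the larger sample
    """
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n2 * mu / 6)
    z = (abs(mu - t) - 0.5) / sigma  # 0.5 is the continuity correction
    return mu, sigma, z


# Hypothetical inputs: T = 60 with n1 = 8 and n2 = 13.
mu, sigma, z = rank_sum_z(60, 8, 13)
print(f"mu = {mu:.1f}, sigma = {sigma:.4f}, Z = {z:.4f}")
```

Note that sigma squared equals n1 n2 (n1 + n2 + 1)/12, which is the usual variance of the rank sum in the absence of ties.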

Syntax:
    RANK SUM TEST <y1> <y2>             <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <y2> is the second response variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.
Examples:
    RANK SUM TEST Y1 Y2
    RANK SUM TEST Y1 Y2 SUBSET TAG > 2
Note:
    Dataplot saves the following internal parameters after a rank sum test:

      STATVAL - The rank sum test statistic
      STATCD2 - the normal cdf value of T (only applies for sufficiently large N1 and N2)
      CUTLOW90 - 0.05 critical value
      CUTUPP90 - 0.95 critical value
      CUTLOW95 - 0.025 critical value
      CUTUPP95 - 0.975 critical value
      CUTLOW99 - 0.005 critical value
      CUTUPP99 - 0.995 critical value

    Note that the above critical values are the lower and upper tails for two-sided tests (i.e., each tail is alpha/2). For example, CUTLOW90 is the lower 5% point of the normal percent point function (adjusted for the mean and standard deviation). This is the critical region for alpha = 0.10, so there is 0.05 in each tail.
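To make the alpha/2 convention concrete, the following Python sketch computes lower and upper two-sided cutoffs from the normal percent point function. It uses only the standard library's statistics.NormalDist; the function name two_sided_cutoffs is hypothetical, and the way mu and sigma enter is an assumption about what "adjusted for the mean and standard deviation" means, not a statement of how Dataplot computes these parameters.

```python
from statistics import NormalDist


def two_sided_cutoffs(alpha, mu=0.0, sigma=1.0):
    """Return (lower, upper) cutoffs with alpha/2 in each tail.

    mu and sigma default to the standard normal; passing other values
    shifts and scales the cutoffs (an assumed reading of the note above).
    """
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(alpha / 2), dist.inv_cdf(1 - alpha / 2)


# alpha = 0.10, 0.05, 0.01 correspond to the 90, 95 and 99 cutoffs.
for alpha in (0.10, 0.05, 0.01):
    low, upp = two_sided_cutoffs(alpha)
    print(f"alpha = {alpha:.2f}: lower = {low:.4f}, upper = {upp:.4f}")
```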

Note:
    The following statistics are also supported:

      LET A = MANN WHITNEY RANK SUM TEST Y1 Y2
      LET A = MANN WHITNEY RANK SUM TEST CDF Y1 Y2
      LET A = MANN WHITNEY RANK SUM TEST PVALUE Y1 Y2
      LET A = MANN WHITNEY RANK SUM LOWER TAILED PVALUE Y1 Y2
      LET A = MANN WHITNEY RANK SUM UPPER TAILED PVALUE Y1 Y2

    In addition to the above LET commands, built-in statistics are supported for more than 20 different commands (enter HELP STATISTICS for details).

Default:
    None
Synonyms:
    The following are synonyms for RANK SUM TEST:

      MANN WHITNEY RANK SUM TEST
      MANN WHITNEY RANK SUM
      MANN WHITNEY TEST
      MANN WHITNEY
      RANK SUM
Related Commands:
Reference:
    Snedecor and Cochran (1989), "Statistical Methods," Eighth Edition, Iowa State University Press, pp. 142-144.
Applications:
    Confirmatory Data Analysis
Implementation Date:
    1999/5
Program:
     
    SKIP 25
    READ NATR323.DAT Y1 Y2
    RETAIN Y2 SUBSET Y2 > -90
    SET WRITE DECIMALS 4
    RANK SUM TEST Y1 Y2
        
    The following output is generated.
                Two Sample Two-Sided Mann Whitney Rank Sum Test
                             (Conover Formulation)
     
    First Response Variable: Y1
    Second Response Variable: Y2
     
    H0: F(x) = G(x)   for all x
    Ha: F(x) <> G(x)  for some x
     
    Summary Statistics:
    Number of Observations for Sample 1:     13
    Mean for Sample 1:                       80.0208
    Median for Sample 1:                     80.0300
    Number of Observations for Sample 2:     8
    Mean for Sample 2:                       79.9788
    Median for Sample 2:                     79.9700
    Number of Tied Ranks:                    14
     
    Test (Normal Approximation):
    Test Statistic Value (W):                2.7105
    CDF Value:                               0.9966
    P-Value (2-tailed test):                 0.0067
    P-Value (lower-tailed test):             0.9966
    P-Value (upper-tailed test):             0.0034
     
     
                Two-Tailed Test: Normal Approximation
     
    H0: F(x) = G(x); Ha: F(x) <> G(x)  for some x
    ------------------------------------------------------------
                                                            Null
       Significance           Test       Critical     Hypothesis
              Level      Statistic    Value (+/-)     Conclusion
    ------------------------------------------------------------
              80.0%         2.7105         1.2816         REJECT
              90.0%         2.7105         1.6449         REJECT
              95.0%         2.7105         1.9600         REJECT
              99.0%         2.7105         2.5758         REJECT
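The CDF and p-values in the output above are consistent with treating the reported test statistic as a standard normal deviate. The short Python sketch below illustrates that relationship using the standard library only; it is a check on the printed numbers under that assumption, not a description of Dataplot's internal computation, and the variable names are arbitrary.

```python
from statistics import NormalDist

w = 2.7105  # Test Statistic Value (W) reported in the output above

cdf = NormalDist().cdf(w)          # standard normal CDF at W
p_lower = cdf                      # lower-tailed p-value
p_upper = 1 - cdf                  # upper-tailed p-value
p_two = 2 * min(p_lower, p_upper)  # two-tailed p-value

print(f"CDF value:            {cdf:.4f}")
print(f"P-value (2-tailed):   {p_two:.4f}")
print(f"P-value (lower-tail): {p_lower:.4f}")
print(f"P-value (upper-tail): {p_upper:.4f}")
```

Because the two-tailed p-value is below 0.01, the statistic exceeds the 99% critical value of 2.5758, which agrees with the REJECT conclusions in the table.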
        
Date created: 06/05/2001
Last updated: 12/11/2023

Please email comments on this WWW page to alan.heckert@nist.gov.