MANN WHITNEY U STATISTIC

Name:
    MANN WHITNEY U STATISTIC (LET)
Type:
    Analysis Command
Purpose:
    Compute the test statistic, or alternatively the cumulative frequencies and CDF values, for the U version of the Mann Whitney rank sum test.
Description:
    The t-test is the standard test of whether the population means of two non-paired samples are equal. The Mann Whitney rank sum test is a non-parametric alternative to the t-test.

    The Mann Whitney rank sum test statistic is computed by:

    1. Rank the combined samples.

    2. Compute the sum of the ranks for each sample (call these T1 and T2).

    3. If the sample sizes are equal, the test statistic is

      T = min(T1,T2)

    4. If the sample sizes are unequal, let T1 be the sum of the ranks for the smaller sample (of size N1); the test statistic is

      T = MIN(T1,N1*(N1 + N2 + 1) - T1)
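
    The four steps above can be sketched in Python (an illustration only, not Dataplot code). The function names are hypothetical, and assigning average ranks to tied values is an assumption, since the description above does not say how ties are handled. The data are those of the Program section.

```python
def midranks(values):
    """Return 1-based ranks; tied values share their average rank (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_statistic(y1, y2):
    """Return (T1, T2, T) following steps 1 to 4."""
    n1, n2 = len(y1), len(y2)
    ranks = midranks(list(y1) + list(y2))   # step 1: rank the combined samples
    t1 = sum(ranks[:n1])                    # step 2: rank sum of each sample
    t2 = sum(ranks[n1:])
    if n1 == n2:
        t = min(t1, t2)                     # step 3: equal sample sizes
    else:
        # Step 4: work with the rank sum of the smaller sample.
        ts, ns, nl = (t1, n1, n2) if n1 < n2 else (t2, n2, n1)
        t = min(ts, ns * (ns + nl + 1) - ts)
    return t1, t2, t


t1, t2, t = rank_sum_statistic([1, 2, 3, 5], [4, 6, 7, 8, 9])
print(t1, t2, t)
```

    For these data the sketch gives T1 = 11, T2 = 34 and T = 11.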

    Sufficiently small values of T cause rejection of the null hypothesis that the sample locations are equal. Significance levels have been tabulated for small values of N1 and N2. For sufficiently large N1 and N2, the following normal approximation is used:

      \( Z = \frac{|\mu - T| - 0.5}{\sigma} \)

    where

      \( \mu = \frac{N_1 (N_1 + N_2 + 1)}{2} \)
      \( \sigma = \sqrt{\frac{N_2 \mu}{6}} \)
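
    As a minimal Python sketch of the arithmetic in the normal approximation (the function name is hypothetical, and this is not Dataplot code), the formulas for Z, mu and sigma can be evaluated directly. The values T = 11, N1 = 4, N2 = 5 come from the Program section and are used only to show the calculation; samples this small would normally be handled with the tabulated significance levels rather than the approximation.

```python
import math


def normal_approx_z(t, n1, n2):
    """Continuity-corrected Z for the rank sum statistic T."""
    mu = n1 * (n1 + n2 + 1) / 2        # mean of T under the null hypothesis
    sigma = math.sqrt(n2 * mu / 6)     # standard deviation of T
    return (abs(mu - t) - 0.5) / sigma


z = normal_approx_z(11, 4, 5)
print(round(z, 3))  # 2.082
```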

    Some analysts prefer a slightly different formulation for this test

      \( U = N_1 N_2 + 0.5 N_1(N_1 + 1) - T \)

    This form of the statistic can be computed with the command (Syntax 1)

      LET U = MANN WHITNEY U STATISTIC Y1 Y2
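
    The conversion from T to U can be checked with a short Python sketch (illustrative only; the function name is hypothetical). Taking T as the rank sum of the first sample is an assumption here, chosen because it reproduces the value 19 reported in the Program section for N1 = 4, N2 = 5 and T = 11.

```python
def u_from_rank_sum(t, n1, n2):
    """U = N1*N2 + 0.5*N1*(N1 + 1) - T."""
    return n1 * n2 + 0.5 * n1 * (n1 + 1) - t


print(u_from_rank_sum(11, 4, 5))  # 19.0
```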

    Dataplot uses Applied Statistics algorithm AS 62 (as updated by Alan Miller) to obtain the cumulative frequencies and the corresponding CDF values of the U test statistic (Syntax 2).

    In summary, Syntax 1 computes the value of the test statistic and Syntax 2 returns the CDF of the test statistic.

Syntax 1:
    LET <U> = MANN WHITNEY U STATISTIC <y1> <y2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> is the first response variable;
                <y2> is the second response variable;
                <U> is a parameter where the U version of the Mann Whitney rank sum statistic is saved;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax returns the value of the U version of the Mann Whitney statistic.

Syntax 2:
    LET <x> <freq> <cdf> = MANN WHITNEY U STATISTIC FREQUENCY <n1> <n2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <n1> is a parameter that specifies the sample size for the first response variable;
                <n2> is a parameter that specifies the sample size for the second response variable;
                <x> is a variable that returns the potential values of the test statistic;
                <freq> is a variable containing the cumulative frequencies corresponding to <x>;
                <cdf> is a variable containing the CDF values corresponding to <x>;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax returns the cumulative frequency table (and the corresponding CDF value) for the U version of the Mann Whitney statistic. Note that it only depends on the sample sizes for the two variables, not the data.
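
    Because the table depends only on the sample sizes, it can be reproduced for small samples by brute-force enumeration. The Python sketch below is an illustration under that approach, not the AS 62 algorithm that Dataplot uses, and the function name is hypothetical. It lists every way of assigning N1 of the N1 + N2 ranks to the first sample, converts each rank sum to U, and accumulates the counts.

```python
from itertools import combinations


def u_frequency_table(n1, n2):
    """Return rows (u, cumulative frequency, cdf) for u = 0 .. n1*n2."""
    n = n1 + n2
    counts = [0] * (n1 * n2 + 1)
    offset = n1 * n2 + n1 * (n1 + 1) // 2      # U = offset - T
    # Under the null hypothesis every choice of n1 ranks is equally likely.
    for ranks in combinations(range(1, n + 1), n1):
        counts[offset - sum(ranks)] += 1
    total = sum(counts)
    rows, running = [], 0
    for u, count in enumerate(counts):
        running += count
        rows.append((u, running, running / total))
    return rows


for u, freq, cdf in u_frequency_table(4, 5):
    print(f"{u:3d} {freq:5d} {cdf:.3f}")
```

    For N1 = 4 and N2 = 5 the cumulative frequencies agree with the FREQ column in the Program section. The printed CDF values can differ from that output in the last digit because this sketch rounds to three decimals. The enumeration grows combinatorially, so it is practical only for small sample sizes.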

Examples:
    LET U = MANN WHITNEY U STATISTIC Y1 Y2

    LET N1 = SIZE Y1
    LET N2 = SIZE Y2
    LET X FREQ CDF = MANN WHITNEY U STATISTIC FREQUENCY N1 N2

Default:
    None
Synonyms:
    None
Related Commands:
Reference:
    Applied Statistics, AS 62.

    Conover (1999), "Practical Nonparametric Statistics," Third Edition, Wiley, pp. 272-281.

    Snedecor and Cochran (1989), "Statistical Methods," Eighth Edition, Iowa State University Press, pp. 142-144.

Applications:
    Non-Parametric Analysis, Two Sample Tests
Implementation Date:
    2011/5
Program:
     
    . Step 1: Read Data (example 2 from pp. 278-279 of Conover)
    .
    let y1 = data 1 2 3 5
    let y2 = data 4 6 7 8 9
    .
    set write decimals 3
    let u = mann whitney u statistic y1 y2
    let n1 = size y1
    let n2 = size y2
    let x freq cdf = mann whitney u statistic frequency  n1 n2
    print "Test Statistic = ^u"
    print x freq cdf
        
    The following output is generated
    Test Statistic = 19
      
     ---------------------------------------------
                   X           FREQ            CDF
     ---------------------------------------------
               0.000          1.000          0.007
               1.000          2.000          0.015
               2.000          4.000          0.031
               3.000          7.000          0.055
               4.000         12.000          0.095
               5.000         18.000          0.142
               6.000         26.000          0.206
               7.000         35.000          0.277
               8.000         46.000          0.365
               9.000         57.000          0.452
              10.000         69.000          0.547
              11.000         80.000          0.634
              12.000         91.000          0.722
              13.000        100.000          0.793
              14.000        108.000          0.857
              15.000        114.000          0.904
              16.000        119.000          0.944
              17.000        122.000          0.968
              18.000        124.000          0.984
              19.000        125.000          0.992
              20.000        126.000          1.000
        

Date created: 12/11/2013
Last updated: 12/11/2013

Please email comments on this WWW page to alan.heckert@nist.gov.