
# KENDALLS TAU

Name:
KENDALLS TAU (LET)
Type:
Let Subcommand
Purpose:
Compute Kendall's tau coefficient between two paired variables.
Description:
Kendall's tau coefficient is a measure of concordance between two paired variables. Given the pairs $$(X_i, Y_i)$$ and $$(X_j, Y_j)$$, then

$$\frac{Y_j - Y_i}{X_j - X_i} > 0$$ - pair is concordant

$$\frac{Y_j - Y_i}{X_j - X_i} < 0$$ - pair is discordant

$$\frac{Y_j - Y_i}{X_j - X_i} = 0$$ - pair is considered a tie

$$X_i = X_j$$ - pair is not compared

Kendall's tau is computed as

$$\tau = \frac{N_c - N_d}{N_c + N_d}$$

with $$N_c$$ and $$N_d$$ denoting the number of concordant pairs and the number of discordant pairs, respectively, in the sample. Each tied pair adds 0.5 to both the concordant and the discordant counts. There are $$\binom{N}{2}$$ possible pairs in a bivariate sample of $$N$$ observations.
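
The counting rules above can be sketched in Python as follows. This is a minimal illustration of the definition as stated here, not Dataplot's implementation; the function name `conover_tau` and the sample data are assumptions made for the example.

```python
from itertools import combinations


def conover_tau(x, y):
    """Kendall's tau with each tie adding 0.5 to both counts (illustrative sketch)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of elements")
    n_c = n_d = 0.0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi == xj:
            continue  # equal X values: the pair is not compared
        slope = (yj - yi) / (xj - xi)
        if slope > 0:
            n_c += 1.0  # concordant pair
        elif slope < 0:
            n_d += 1.0  # discordant pair
        else:
            n_c += 0.5  # tie: half to each count
            n_d += 0.5
    if n_c + n_d == 0:
        raise ValueError("no comparable pairs")
    return (n_c - n_d) / (n_c + n_d)


# Hypothetical data: 7 concordant pairs, 1 discordant pair, 1 tie,
# and 1 pair that is not compared because the X values are equal.
x = [1, 2, 3, 4, 4]
y = [2, 1, 3, 3, 5]
print(conover_tau(x, y))  # (7.5 - 1.5) / (7.5 + 1.5)
```

Because a tie adds 0.5 to both counts, ties leave the numerator unchanged and only enlarge the denominator.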

The above definition of Kendall's tau is from Conover. This is equivalent to the Goodman and Kruskal gamma coefficient. There are several alternative definitions of Kendall's tau in the literature. In particular, Kendall's original definition, referred to as tau-a, is

$$\tau_a = \frac{N_c - N_d}{N(N-1)/2}$$

where $$N_c$$ and $$N_d$$ are counted without adding 0.5 for tied values.

The Conover formulation accounts for ties while the Kendall tau-a statistic does not.
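
As a rough sketch of tau-a under the definition above (the function name `kendall_tau_a` and the data are hypothetical, chosen only for illustration), tied pairs contribute nothing to the numerator while the denominator always counts all N(N-1)/2 pairs:

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: ties are ignored in the counts (illustrative sketch)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n_c = n_d = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xj - xi) * (yj - yi)
        if s > 0:
            n_c += 1  # concordant pair
        elif s < 0:
            n_d += 1  # discordant pair
        # s == 0: tied in X or Y, counted in neither total
    return (n_c - n_d) / (n * (n - 1) / 2)


# Hypothetical data with one X tie and one Y tie among the 10 pairs.
x = [1, 2, 3, 4, 4]
y = [2, 1, 3, 3, 5]
print(kendall_tau_a(x, y))  # (7 - 1) / 10
```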

Kendall's tau-b is defined as

$$\tau_b = \frac{N_c - N_d} {\sqrt{(N_c + N_d + T_x) (N_c + N_d + T_y)}}$$

with $$T_x$$ denoting the number of pairs tied for the first response variable only and $$T_y$$ denoting the number of pairs tied for the second variable only. As with Kendall's tau-a, the $$N_c$$ and $$N_d$$ do not add 0.5 for tied values. Kendall's tau-b is equal to Kendall's tau-a when there are no ties but is preferred to Kendall's tau-a when there are ties.
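
The tau-b formula can be sketched the same way. The helper name `kendall_tau_b` and the data are assumptions for illustration. Pairs tied on both variables are left out of every count here, following the "only" wording above.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b with separate X-only and Y-only tie counts (sketch)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of elements")
    n_c = n_d = t_x = t_y = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xj - xi, yj - yi
        if dx == 0 and dy == 0:
            continue  # tied on both variables: not counted anywhere
        if dx == 0:
            t_x += 1  # tied on the first variable only
        elif dy == 0:
            t_y += 1  # tied on the second variable only
        elif dx * dy > 0:
            n_c += 1  # concordant pair
        else:
            n_d += 1  # discordant pair
    denom = sqrt((n_c + n_d + t_x) * (n_c + n_d + t_y))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (n_c - n_d) / denom


# Hypothetical data: 4 concordant pairs, 1 X-only tie, 1 Y-only tie.
x = [1, 1, 2, 3]
y = [1, 2, 2, 3]
print(kendall_tau_b(x, y))  # 4 / sqrt(5 * 5)
```

For these data tau-a would be 4/6, so the tie correction in the denominator raises the coefficient.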

Kendall's tau-c is used when the two response variables can only take a discrete number of values, but the scales for the response variables are different. For example, X can take integer values from 1 to 10 while Y can take integer values from 1 to 20. The formula for Kendall's tau-c is

$$\tau_c = \frac{2(N_c - N_d)} {N^2 (m - 1)/m}$$

where m is the minimum of $$X_d$$ and $$Y_d$$, with $$X_d$$ the number of distinct values of X and $$Y_d$$ the number of distinct values of Y.
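
A minimal sketch of the tau-c formula follows. The function name `kendall_tau_c` and the data are hypothetical; the distinct-value counts are taken from the observed sample.

```python
from itertools import combinations


def kendall_tau_c(x, y):
    """Kendall's tau-c for discrete variables on different scales (sketch)."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same number of elements")
    # m is the smaller of the numbers of distinct observed X and Y values.
    m = min(len(set(x)), len(set(y)))
    if m < 2:
        raise ValueError("each variable needs at least two distinct values")
    n_c = n_d = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xj - xi) * (yj - yi)
        if s > 0:
            n_c += 1  # concordant pair
        elif s < 0:
            n_d += 1  # discordant pair
    return 2 * (n_c - n_d) / (n ** 2 * (m - 1) / m)


# Hypothetical data: X takes 3 distinct values, Y takes 2, so m = 2.
x = [1, 1, 2, 2, 3, 3]
y = [1, 1, 1, 2, 2, 2]
print(kendall_tau_c(x, y))  # 2 * 8 / (36 * 1 / 2)
```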

Kendall's tau is an alternative to Spearman's rho rank correlation.

Kendall's tau or the rank correlation may be preferred to the standard correlation coefficient in the following cases:

1. When the underlying data does not have a meaningful numerical measure, but it can be ranked;

2. When the relationship between the two variables is not linear;

3. When the normality assumption for two variables is not valid.
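
The second case can be illustrated with a small sketch. It uses made-up data in which Y is a monotone but nonlinear function of X; the function names are hypothetical and tau is computed in its tau-a form for simplicity.

```python
from itertools import combinations
from math import sqrt


def tau_a(x, y):
    """Kendall's tau-a by direct pair counting (illustrative sketch)."""
    n = len(x)
    score = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xj - xi) * (yj - yi)
        score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tied
    return score / (n * (n - 1) / 2)


def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: Y = X**3 is strictly increasing but not linear in X.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(tau_a(x, y))      # every pair is concordant
print(pearson_r(x, y))  # below 1 because the relationship is not linear
```

Kendall's tau depends only on the ordering of the values, so any strictly increasing transformation of Y leaves it unchanged, while the Pearson coefficient is affected.
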
Syntax 1:
LET <par> = KENDALLS TAU <y1> <y2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<par> is a parameter where the computed Kendall's tau is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes Kendall's tau as formulated by Conover.

Syntax 2:
LET <par> = KENDALLS TAU A <y1> <y2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<par> is a parameter where the computed Kendall's tau is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes Kendall's tau-a.

Syntax 3:
LET <par> = KENDALLS TAU B <y1> <y2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<par> is a parameter where the computed Kendall's tau is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes Kendall's tau-b.

Syntax 4:
LET <par> = KENDALLS TAU C <y1> <y2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<par> is a parameter where the computed Kendall's tau is stored;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax computes Kendall's tau-c.

Examples:
LET A = KENDALLS TAU Y1 Y2
LET A = KENDALLS TAU Y1 Y2 SUBSET TAG > 2
LET A = KENDALLS TAU A Y1 Y2
LET A = KENDALLS TAU B Y1 Y2
LET A = KENDALLS TAU C Y1 Y2
Note:
The two variables must have the same number of elements.
Note:
Dataplot statistics can be used in a number of commands. For details, enter

Default:
None
Synonyms:
None
Related Commands:
CORRELATION (LET) = Compute the Pearson correlation coefficient between two variables.
RANK CORRELATION (LET) = Compute the Spearman's rho correlation coefficient between two variables.
STATISTICS PLOT = Generate a statistic versus subset plot.
BOOTSTRAP PLOT = Generate a bootstrap plot.
TABULATE = Perform a tabulation for a specified statistic.
Reference:
Conover (1999), "Practical Nonparametric Statistics," Third Edition, Wiley, pp. 319-323.
Applications:
Exploratory Data Analysis
Implementation Date:
2004/10
Program:

. Following data from page 320 of Conover, "Practical
. Nonparametric Statistics", Third Edition, 1999, Wiley.
LET Y1 = DATA 7 8 4 5.5 4.5 4 5 3 2 0.5 1
LET Y2 = DATA 4 2 5 0.5 1.5 2 0 1 0 1.5 0
LET A1 = KENDALLS TAU Y1 Y2


The computed value of Kendall's tau is 0.4355.

Date created: 12/22/2004
Last updated: 08/29/2019