
FISHER DISCRIMINATION PLOT

Name:
    FISHER DISCRIMINATION PLOT
Type:
    Graphics Command
Purpose:
    Generate a Fisher (linear) discrimination plot.
Description:
    The classification problem is to assign an observation to a group. It is assumed that the groups are mutually exclusive (i.e., an observation belongs to at most one group) and exhaustive (i.e., every observation belongs to one of the groups).

    The data consist of both training data and observations to be classified. The training data are observations for which the group-id is known.

    The basic idea behind Fisher discriminant analysis is to determine a few linear combinations of the training data that provide good separation of the groups.

    Dataplot follows the derivation given in Johnson and Wichern (1992). From Johnson and Wichern, given g groups, p response variables x1, x2, ..., xp, and group sample sizes n1, n2, ..., ng, define

    • \( \bar{x}_{i} \) = the sample mean vector of the i-th group

    • \( \bar{x} \) = the mean vector of the combined groups
          = \( \frac{1}{g} \sum_{i=1}^{g}{\bar{x}_{i}} \)

    • \( \hat{B}_{0} \) = the sample between groups matrix

          = \( \sum_{i=1}^{g}{(\bar{x}_{i} - \bar{x})(\bar{x}_{i} - \bar{x})'} \)

    • W = sample within groups matrix

          = \( \sum_{i=1}^{g}{(n_{i} - 1) S_{i}} \)

      where \( S_{i} \) is the sample covariance matrix of the i-th group and \( n_{i} \) is the number of observations in the i-th group

    • \( S_{\mbox{pooled}} \) = the pooled covariance matrix

                      = \( \frac{W}{n_1 + n_2 + \ldots + n_g - g} \)

    • \( s = \min(g-1,p) \)

    • \( \hat{\lambda}_{1}, \hat{\lambda}_{2}, \dots, \hat{\lambda}_{s} \) > 0 denote the nonzero eigenvalues of \( W^{-1} \hat{B}_{0} \)

    • \( \hat{e}_{1}, \hat{e}_{2}, \dots, \hat{e}_{s} \) are the corresponding eigenvectors scaled so that

      \( \hat{e}_{i}' S_{\mbox{pooled}} \hat{e}_{i} = 1, \hspace{0.2in} i = 1, \ldots, s \)

    Then the vector of coefficients \( \hat{l} \) that maximizes the ratio

      \( \frac{\hat{l}' \hat{B}_{0} \hat{l}} {\hat{l}' W \hat{l}} \)

    is given by \( \hat{l}_{1} = \hat{e}_{1}, \hat{l}_{2} = \hat{e}_{2} \), and so on. That is, the scaled eigenvectors are the discriminant functions. The Fisher discriminants are based on the ratio of the between groups variability to the common within groups variability.

    Fisher's discriminant analysis is based on two assumptions. The first is that the response variables have a multivariate normal distribution. The second is that the groups have a common covariance matrix. According to Johnson and Wichern, the second of these assumptions is the most critical.

    Once the discriminants have been defined, a new observation, x, can be classified to one of the g groups. Specifically, allocate x to the k-th group if

    \[ \sum_{j=1}^{r}{(\hat{y}_{j} - \bar{y}_{kj})^{2}} \le \sum_{j=1}^{r}{(\hat{y}_{j} - \bar{y}_{ij})^{2}} \hspace{0.2in} \mbox{for all} \hspace{0.1in} i \ne k \]

    where r denotes the number of discriminants to use, \( \hat{y}_{j} = \hat{l}_{j}' x \), and \( \bar{y}_{ij} = \hat{l}_{j}' \bar{x}_{i} \).

    The FISHER DISCRIMINATION PLOT is then a plot of \( \hat{y}_{1} \) versus \( \hat{y}_{2} \). Note that although the plot is based on the first two discriminants, the number of discriminants used for classification can be greater than two.
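    For example, the iris data in the Program example below have g = 3 groups and p = 4 response variables, so s = min(g-1, p) = 2 and at most two discriminants are available. This is reflected in the sample output, where only the first two sorted eigenvalues are nonzero.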

    The first g traces in the plot are the g categories for the training data. Trace g+1 is the group means. Traces g+2 to 2g+1 are the g categories for the observations to be classified. Within each set of traces, the ordering is from the lowest group-id value to the highest. For example, if there are two categories, you might do something like

      LINE BLANK ALL
      CHARACTER CIRCLE SQUARE TRIANGLE CIRCLE SQUARE
      CHARACTER FILL OFF OFF ON ON ON
      CHARACTER COLOR BLACK BLACK BLACK RED BLUE

    This will draw the training observations as unfilled black circles and squares, the group means as unfilled black triangles, and the observations to be classified as filled red circles or filled blue squares, depending on the group to which they are assigned. This is demonstrated in the Program example below.

    In addition to the plot, the following will be printed:

    1. The B0 matrix

    2. The sample pooled covariance matrix

    3. The sample W^(-1)*B0 matrix

    4. The sorted eigenvalues
Syntax:
    FISHER DISCRIMINATION PLOT <y1> ... <yk> <tag>
                            <SUBSET/EXCEPT/FOR qualification>
    where <y1> ... <yk> is a list of k response variables;
                <tag> is the group-id variable;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    All of the variables must have the same length. Rows where the <tag> variable equals zero are the observations to be classified. If the <tag> variable contains no zero values, an error is reported and no plot is generated.
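    For example, the following sketch marks every tenth observation for classification while leaving the original group-id variable intact (TAG is a hypothetical group-id variable with 150 rows and X1 to X4 are hypothetical response variables):

      . copy the group-id, then flag every tenth row for classification
      LET TAG2 = TAG
      LET TAG2 = 0 FOR I = 10 10 150
      FISHER DISCRIMINATION PLOT X1 X2 X3 X4 TAG2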

Examples:
    FISHER DISCRIMINATION PLOT X1 X2 X3 X4 TAG
    FISHER DISCRIMINATION PLOT X1 X2 X3 TAG SUBSET TAG >= 0
Note:
    By default, Dataplot will use the first two discriminants for classification. There are two ways to specify the number of discriminants to use.

    You can explicitly set the number of discriminants to use with the command

      SET FISHER NUMBER OF DISCRIMINANTS <value>

    where <value> is a positive integer. If <value> is less than 2, then two discriminants will be used. If <value> is greater than the minimum of the number of groups minus 1 (g-1) and the number of variables (p), then min(g-1, p) discriminants will be used.
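    For example, the following sketch (with hypothetical response variables X1 to X4 and group-id variable TAG, and an illustrative value of 3) requests three discriminants for the classification:

      . request three discriminants for classification
      SET FISHER NUMBER OF DISCRIMINANTS 3
      FISHER DISCRIMINATION PLOT X1 X2 X3 X4 TAG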

    Alternatively, you can specify a cut-off for the eigenvalues with the command

      SET FISHER DISCRIMINANTS EIGENVALUE EPSILON <value>

    where <value> specifies the cut-off for the eigenvalues. For example, if 3 eigenvalues are greater than the cut-off, then 3 discriminants will be used in the classification. If the number of discriminants determined this way is greater than min(g-1, p), then min(g-1, p) discriminants will be used.
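    For example, the following sketch (the 0.01 cut-off is an arbitrary illustrative value; variable names are hypothetical) uses all discriminants whose eigenvalues exceed 0.01:

      . keep discriminants whose eigenvalues exceed the cut-off
      SET FISHER DISCRIMINANTS EIGENVALUE EPSILON 0.01
      FISHER DISCRIMINATION PLOT X1 X2 X3 X4 TAG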

Note:
    The sorted eigenvalues are written to dpst1f.dat.

    The normalized eigenvectors are written to dpst2f.dat.

    The discriminants are written to dpst3f.dat.

    The discriminant group means are written to dpst4f.dat.

    The classification results are written to dpst5f.dat. The first column contains the row-id of the observation being classified and the second column specifies the group to which the observation is assigned.
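    For example, the classification results can be read back into Dataplot for further analysis (a minimal sketch; the variable names ROWID and GROUPID are arbitrary, and it is assumed that dpst5f.dat has no header lines to skip):

      . read the row-id and assigned group from the classification file
      SKIP 0
      READ DPST5F.DAT ROWID GROUPID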

Note:
    If the variables are on different scales, it may be useful to standardize the variables. The most common standardization is the z-score (i.e., subtract the mean and divide by the standard deviation).

    The FISHER DISCRIMINATION PLOT does not standardize the data. If you want to standardize the data, do that before utilizing this command. This is demonstrated in the Program example below. Performing the standardization as a separate step allows more flexibility in the choice of standardization method.
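    For example, a z-score standardization can be performed with LET commands before generating the plot (a minimal sketch with hypothetical response variables X1 and X2 and group-id variable TAG):

      . z-score each response variable, then generate the plot
      LET XMEAN = MEAN X1
      LET XSD = STANDARD DEVIATION X1
      LET X1Z = (X1 - XMEAN)/XSD
      LET XMEAN = MEAN X2
      LET XSD = STANDARD DEVIATION X2
      LET X2Z = (X2 - XMEAN)/XSD
      FISHER DISCRIMINATION PLOT X1Z X2Z TAG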

Note:
    If there are a large number of variables, it may be helpful to perform some dimension reduction first. For example, you can take the principal components of the original data and generate the FISHER DISCRIMINATION PLOT based on the most important principal components.
Note:
    This command currently allows a maximum of 50 variables and a maximum of 50 classes in the group-id variable.

    If there are more than 50 variables, it is recommended that some type of dimension reduction, such as principal components, be used before generating this plot.

Default:
    The first two discriminants will be used for classification
Synonyms:
    FISHER CLASSIFICATION PLOT
    FISHER LINEAR DISCRIMINATION PLOT
    FISHER LINEAR CLASSIFICATION PLOT
    LINEAR DISCRIMINATION PLOT
    LINEAR CLASSIFICATION PLOT
Related Commands:
Reference:
    Johnson, R. A. and Wichern, D. W. (1992), "Applied Multivariate Statistical Analysis," Third Edition, Prentice-Hall, Chapter 11.
Applications:
    Classification
Implementation Date:
    2024/07
Program:
     
    . Step 1:   Read the data
    .
    skip 25
    read iris.dat seplen sepwidth petlen petwidth tag
    skip 0
    .
    . Step 2:   Generate the plot
    .
    line blank all
    character hw 1 0.75 all
    character circle triangle revtri box circle triangle revtri
    character fill off off off on on on on
    .
    y1label Discriminant One
    x1label Discriminant Two
    title Fisher Discriminant Analysis for Iris Data
    .
    fisher discriminant plot seplen sepwidth petlen petwidth tag

               Fisher Linear Discriminant Analysis
      
      
      
                 B0 (Sample Between Group) Matrix
      
     ------------------------------------------------------------
               Col 1          Col 2          Col 3          Col 4
     ------------------------------------------------------------
             1.26424       -0.39905        4.14230        1.33292
            -0.39905        0.22690       -1.51546       -0.17132
             4.14230       -1.51546       14.00072        3.85348
             1.33292       -0.17132        3.85348        2.02160
     ------------------------------------------------------------
      
      
                 Sample Pooled Variance-Covariance Matrix
      
     ------------------------------------------------------------
               Col 1          Col 2          Col 3          Col 4
     ------------------------------------------------------------
             0.26501        0.09272        0.16751        0.03840
             0.09272        0.11539        0.05524        0.03271
             0.16751        0.05524        0.18519        0.04267
             0.03840        0.03271        0.04267        0.04188
     ------------------------------------------------------------
      
      
                 Sample W^(-1)*B0 Matrix
      
     ------------------------------------------------------------
               Col 1          Col 2          Col 3          Col 4
     ------------------------------------------------------------
            -0.11454        0.05098       -0.40584       -0.08413
            -0.09041        0.01942       -0.27745       -0.11784
             0.25140       -0.11355        0.89419        0.18056
             0.13603        0.02594        0.30380        0.31360
     ------------------------------------------------------------
      
      
                 Sorted Eigenvalues
      
     ---------------
               Col 1
     ---------------
             0.89013
             0.22254
             0.00000
             0.00000
     ---------------
      
    
    .
    . Step 3:   Test "classification"
    .
    let tag2 = tag
    let tag2 = 0 for i = 10 10 150
    fisher discriminant plot seplen sepwidth petlen petwidth tag2
    
               Fisher Linear Discriminant Analysis
      
      
      
                 B0 (Sample Between Group) Matrix
      
     ------------------------------------------------------------
               Col 1          Col 2          Col 3          Col 4
     ------------------------------------------------------------
             1.25660       -0.41416        4.16615        1.30654
            -0.41416        0.22318       -1.53940       -0.18845
             4.16615       -1.53940       14.13148        3.86717
             1.30654       -0.18845        3.86717        2.03507
     ------------------------------------------------------------
      
      
                 Sample Pooled Variance-Covariance Matrix
      
     ------------------------------------------------------------
               Col 1          Col 2          Col 3          Col 4
     ------------------------------------------------------------
             0.27365        0.09310        0.17266        0.03758
             0.09310        0.11699        0.05342        0.02954
             0.17266        0.05342        0.19157        0.04095
             0.03758        0.02954        0.04095        0.03944
     ------------------------------------------------------------
      
      
                 Sample W^(-1)*B0 Matrix
      
     ------------------------------------------------------------
               Col 1          Col 2          Col 3          Col 4
     ------------------------------------------------------------
            -0.12739        0.05448       -0.44634       -0.09754
            -0.08886        0.02063       -0.27801       -0.11658
             0.27058       -0.12029        0.95678        0.19441
             0.15796        0.02133        0.38290        0.36929
     ------------------------------------------------------------
      
      
                 Sorted Eigenvalues
      
     ---------------
               Col 1
     ---------------
             0.96592
             0.25339
             0.00000
            -0.00000
     ---------------
    
Date created: 07/31/2024
Last updated: 07/31/2024
