
CONDITION INDICES

Name:
    CONDITION INDICES (LET)
Type:
    Let Subcommand
Purpose:
    Compute condition indices of a regression design matrix.
Description:
    Condition indices measure the multicollinearity in a regression design matrix (i.e., among the independent variables).

    Multicollinearity arises when the columns of X are strongly interdependent (i.e., one or more columns of X are close to a linear combination of the other columns). It can make the estimated regression coefficients numerically unstable: small changes in X can produce large changes in the estimates.

    Pairwise collinearity can be detected by inspecting the correlation matrix of the independent variables. However, a correlation matrix will not reveal collinearity that involves three or more variables.
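
    The pairwise check can be sketched in Dataplot as follows. This assumes the independent variables X1 to X4 have already been read and that CORRELATION MATRIX takes a matrix argument in the same way as CONDITION INDICES; the names M and R are arbitrary. Enter HELP CORRELATION MATRIX to confirm the details.

```
. Assumes X1 to X4 already hold the independent variables
LET M = CREATE MATRIX X1 X2 X3 X4
. R is the 4x4 correlation matrix of the columns of M
LET R = CORRELATION MATRIX M
SET WRITE DECIMALS 2
. Print the matrix column by column
PRINT R1 R2 R3 R4
```

    Off-diagonal entries close to 1 or -1 flag a strongly correlated pair of variables.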

    There are a number of approaches to dealing with multicollinearity. Some of these include:

    1. Delete one or more of the independent variables from the fit.
    2. Perform a principal components regression.
    3. Compute the regression using a singular value decomposition approach. Note that the Dataplot fit uses a modified Gram-Schmidt method; Dataplot can perform a singular value decomposition, but this has not been incorporated into the fit.
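
    The first approach can be tried directly with FIT. This is a minimal sketch that assumes the variables Y and X1 to X4 have already been read; dropping X3 is an arbitrary choice for illustration, not a recommendation.

```
. Fit with all four independent variables
FIT Y X1 X2 X3 X4
. Refit with X3 removed (arbitrary choice for illustration)
FIT Y X1 X2 X4
```

    Comparing the coefficient estimates from the two fits shows how sensitive they are to the deleted variable.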

    Condition indices are one measure that can be used to detect multicollinearity (variance inflation factors are another). The condition indices are calculated as follows:

    1. Scale the columns of the X matrix to have unit sums of squares.
    2. Calculate the singular values of the scaled X matrix and square them.

    Condition indices between 30 and 100 indicate moderate to strong collinearity.
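
    For orientation, the definition commonly used in the regression diagnostics literature can be written as below. The symbols (x_j for column j of X, p for the number of columns, mu_k for the singular values of the scaled matrix, eta_k for the condition indices) are notation introduced here. The steps above do not spell out how the squared singular values are normalized, so treat the ratio form as an assumption about the final step rather than a statement of Dataplot's internals.

```latex
% Step 1: scale each column of X to unit length (unit sum of squares)
\[
  \tilde{x}_j = \frac{x_j}{\lVert x_j \rVert_2}, \qquad j = 1, \dots, p
\]
% Step 2: with \mu_1, \dots, \mu_p the singular values of the scaled matrix,
% each condition index is the ratio of the largest singular value to \mu_k
\[
  \eta_k = \frac{\mu_{\max}}{\mu_k}, \qquad k = 1, \dots, p
\]
```

    Under this definition the smallest index is always 1, and the largest is the condition number of the scaled matrix.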

Syntax:
    LET <y1> = CONDITION INDICES <mat1>     <SUBSET/EXCEPT/FOR qualification>
    where <mat1> is the design matrix for which the condition indices are to be computed;
          <y1> is a vector where the resulting condition indices are saved;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional (and rarely used in this context).
Examples:
    LET Y = CONDITION INDICES X
Note:
    Matrices are created with the READ MATRIX, CREATE MATRIX, or MATRIX DEFINITION command. Enter HELP READ MATRIX, HELP CREATE MATRIX, and HELP MATRIX DEFINITION for details.
Note:
    The columns of a matrix are accessible as variables by appending an index to the matrix name. For example, the 4x4 matrix C has columns C1, C2, C3, and C4. These columns can be operated on like any other DATAPLOT variable.
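
    A short sketch of column access, assuming X1 to X4 are existing variables of equal length; the names C, M2, and D are arbitrary.

```
. Assumes X1 to X4 are existing variables of equal length
LET C = CREATE MATRIX X1 X2 X3 X4
. C2 is the second column of C and behaves like any other variable
LET M2 = MEAN C2
LET D = C1 + C2
PRINT C1 C2 D
```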
Note:
    The maximum matrix size that DATAPLOT can handle is set when DATAPLOT is built at a particular site. Enter the command HELP MATRIX DIMENSION for details on the maximum matrix size that can be accommodated.
Default:
    None
Synonyms:
    None
Related Commands:
    VARIANCE INFLATION FACTORS = Compute variance inflation factors.
    CREATE MATRIX = Create a matrix from a list of variables.
    FIT = Perform a least squares fit.
    CATCHER MATRIX = Compute the catcher matrix.
    PARTIAL REGRESSION PLOT = Generate a partial regression plot.
Reference:
    "Efficient Computing of Regression Diagnostics", Velleman and Welsch, American Statistician, November, 1981, Vol. 35, No. 4, pp. 234-242.
Applications:
    Regression Diagnostics
Implementation Date:
    2002/6
Program:
    DIMENSION 100 COLUMNS 
    SKIP 25 
    READ HALD647.DAT Y X1 X2 X3 X4 
    SKIP 0 
    LET N = SIZE X1 
    LET X0 = SEQUENCE 1 1 N 
    LET Z = CREATE MATRIX X0 X1 X2 X3 X4 
    LET C = CONDITION INDICES Z 
    SET WRITE DECIMALS 2
    PRINT C 
        
    The following output is generated.
     
     VARIABLES--C
    
               1.00
               7.11
              10.19
              55.34
             149.90
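
    As a cross-check, variance inflation factors can be computed for the same matrix. This sketch assumes the matrix Z from the program above is still defined and that VARIANCE INFLATION FACTORS takes a matrix argument in the same way as CONDITION INDICES; the output name V is arbitrary. Whether X0 should be part of the matrix for this command is not stated here, so enter HELP VARIANCE INFLATION FACTORS to confirm.

```
. Assumes Z was created by the program above
LET V = VARIANCE INFLATION FACTORS Z
SET WRITE DECIMALS 2
PRINT V
```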
        

Date created: 8/6/2002
Last updated: 4/4/2003