
CROSS TABULATE PLOT

Name:
    CROSS TABULATE PLOT
Type:
    Graphics Command
Purpose:
    Generates a plot of a given statistic versus a cross tabulated subsample index.
Description:
    A cross tabulate plot shows a subsample statistic versus a cross tabulated subsample index. The subsample statistic is the value of some statistic computed from the data in the subsample.

    This plot is an extension of the STATISTIC PLOT. The STATISTIC PLOT uses one index variable while the CROSS TABULATE PLOT uses two index variables.

    The X axis coordinate is determined from the two group variables in the following way:

    1. The levels of the first group variable (e.g., X1) are plotted at 1, 2, 3, etc.

    2. For each level of the group 1 variable, the levels of the group 2 variable are spread over +/- 0.2 around the position of that group 1 level.

    For example, if X1 has 2 levels (at 1 and 2) and X2 has 3 levels (1, 2, and 3), then the following x-coordinates are used:

      X1 X2 X-COOR
      1 1 0.8
      1 2 1.0
      1 3 1.2
      2 1 1.8
      2 2 2.0
      2 3 2.2
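
    The coordinate rule above can be sketched in Python (not Dataplot). This is only an illustration of the arithmetic, not Dataplot's implementation: the function name cross_tab_x_coords is hypothetical, and the even spacing of the group 2 levels between -0.2 and +0.2 is an assumption that happens to reproduce the table above.

```python
def cross_tab_x_coords(n_levels_x1, n_levels_x2, half_width=0.2):
    """Return {(i, j): x} for 1-based level indices i (group 1) and j (group 2)."""
    coords = {}
    for i in range(1, n_levels_x1 + 1):
        for j in range(1, n_levels_x2 + 1):
            if n_levels_x2 == 1:
                # Assumption: a single group 2 level sits on the group 1 position.
                offset = 0.0
            else:
                # Assumption: levels are evenly spaced from -half_width to +half_width.
                offset = -half_width + 2 * half_width * (j - 1) / (n_levels_x2 - 1)
            coords[(i, j)] = round(i + offset, 10)
    return coords


coords = cross_tab_x_coords(2, 3)
for (i, j), x in sorted(coords.items()):
    print(i, j, x)
```

    With 2 levels for the first variable and 3 for the second, the printed values match the X-COOR column of the table.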

    The syntax CROSS TABULATE PLOT X1 X2 (with no statistic and no response variable) is a special case. It plots the value of X1 on the X axis and the value of X2 on the Y axis. The plot character is then set to the count for that cell (this is done automatically, so you do not need to set the plot character). This form of the plot has application in the design of experiments.
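
    As a rough Python illustration of this special case (not Dataplot code), the cell counts that label each (X1, X2) point can be tallied as follows. The data values are made up for the example.

```python
from collections import Counter

# Hypothetical index data: one entry per observation.
x1 = [1, 1, 1, 2, 2, 2, 2]
x2 = [1, 1, 2, 1, 2, 2, 2]

# Count the observations that fall in each (x1, x2) cell.
cell_counts = Counter(zip(x1, x2))

# Each cell is drawn at (x1, x2) and labeled with its count.
for (a, b), n in sorted(cell_counts.items()):
    print(f"x={a} y={b} label={n}")
```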

    The cross tabulate plot is used to answer the question "Does the subsample statistic change over different subsamples?". The plot consists of:

      Vertical axis = subsample statistic
      Horizontal axis = subsample index

    The cross tabulate plot yields 3 traces:

    1. a subsample statistic trace;
    2. a subsample statistic reference line for the distinct values of the first index variable; and
    3. a full-sample statistic reference line.

    As usual, the appearance of these 3 traces is controlled by the first 3 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
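
    The quantities behind the traces listed above can be sketched in Python (not Dataplot) for the MEAN statistic. The data are made up, and the variable names are hypothetical; the sketch only shows which subsets of the data each trace is computed from.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical response and index data.
y = [10.0, 12.0, 20.0, 22.0, 30.0, 32.0]
x1 = [1, 1, 1, 2, 2, 2]
x2 = [1, 2, 2, 1, 1, 2]

cells = defaultdict(list)   # data in each (x1, x2) cell
groups = defaultdict(list)  # data in each level of x1
for yi, a, b in zip(y, x1, x2):
    cells[(a, b)].append(yi)
    groups[a].append(yi)

cell_means = {k: mean(v) for k, v in cells.items()}    # subsample statistic trace
group_means = {k: mean(v) for k, v in groups.items()}  # reference line per x1 level
overall_mean = mean(y)                                 # full-sample reference line

print(cell_means)
print(group_means)
print(overall_mean)
```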
Syntax 1:
    CROSS TABULATE <stat> PLOT <y> <x1> <x2> <SUBSET/EXCEPT/FOR qualification>
    where <stat> is one of the following statistics:
        MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN, GEOMETRIC MEAN, HARMONIC MEAN,
        SUM, PRODUCT, SIZE (or NUMBER),
        STANDARD DEVIATION, STANDARD DEVIATION OF MEAN, AVERAGE ABSOLUTE DEVIATION (or AAD), MEDIAN ABSOLUTE DEVIATION (or MAD), VARIANCE, VARIANCE OF THE MEAN, RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE (or COEFFICIENT OF VARIATION),
        RANGE, MIDRANGE, MAXIMUM, MINIMUM, EXTREME, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE, <FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH/TENTH> DECILE,
        SKEWNESS, KURTOSIS, NORMAL PPCC,
        AUTOCORRELATION, AUTOCOVARIANCE, SINE FREQUENCY, SINE AMPLITUDE,
        CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
        TAGUCHI SN0 (or SN), TAGUCHI SN+ (or SNL),
        TAGUCHI SN- (or SNS), TAGUCHI SN00 (or SN2);
    <y> is the response (= dependent) variable;
    <x1> is the first subsample identifier variable (this variable appears on the horizontal axis);
    <x2> is the second subsample identifier variable (this variable appears on the horizontal axis);
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for statistics that require one response variable to compute.

Syntax 2:
    CROSS TABULATE <stat> PLOT <y1> <y2> <x1> <x2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <stat> is one of the following statistics:
        LINEAR INTERCEPT, LINEAR SLOPE,
        LINEAR RESSD, LINEAR CORRELATION,
        CORRELATION, RANK CORRELATION,
        COVARIANCE, RANK COVARIANCE,
        WINSORIZED CORRELATION, WINSORIZED COVARIANCE,
        BIWEIGHT MIDCOVARIANCE, BIWEIGHT MIDCORRELATION,
        PERCENTAGE BEND CORRELATION;
    <y1> is the first response variable;
    <y2> is the second response variable;
    <x1> is the first subsample identifier variable (this variable appears on the horizontal axis);
    <x2> is the second subsample identifier variable (this variable appears on the horizontal axis);
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used for statistics that require two response variables to compute. If a linear fit is performed, the first variable is the dependent variable while the second variable is the independent variable.

Syntax 3:
    CROSS TABULATE WEIGHTED <stat> PLOT <y1> <wt> <x1> <x2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <stat> is one of the following statistics:
        MEAN, STANDARD DEVIATION (or SD), VARIANCE;
    <y1> is the response variable;
    <wt> is the weights variable;
    <x1> is the first subsample identifier variable (this variable appears on the horizontal axis);
    <x2> is the second subsample identifier variable (this variable appears on the horizontal axis);
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used to compute the weighted version of the specified statistic for a single response variable.
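
    As a hedged Python illustration (not Dataplot code), a weighted mean per cell can be computed with the standard formula sum(w*y)/sum(w). The data are made up, and Dataplot's exact definitions of the weighted standard deviation and weighted variance are not reproduced here.

```python
from collections import defaultdict

# Hypothetical response, weights, and index data.
y = [1.0, 3.0, 10.0, 20.0]
wt = [1.0, 3.0, 1.0, 1.0]
x1 = [1, 1, 2, 2]
x2 = [1, 1, 1, 1]

# For each (x1, x2) cell, accumulate [sum of w*y, sum of w].
sums = defaultdict(lambda: [0.0, 0.0])
for yi, wi, a, b in zip(y, wt, x1, x2):
    sums[(a, b)][0] += wi * yi
    sums[(a, b)][1] += wi

# Weighted mean of each cell: sum(w*y) / sum(w).
weighted_means = {cell: swy / sw for cell, (swy, sw) in sums.items()}
print(weighted_means)
```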

Syntax 4:
    CROSS TABULATE DIFFERENCE OF <stat> PLOT <y1> <y2> <x1> <x2>
                            <SUBSET/EXCEPT/FOR qualification>
    where <stat> is one of the following statistics:
        MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN, GEOMETRIC MEAN, HARMONIC MEAN, HODGES LEHMAN, MIDRANGE, BIWEIGHT LOCATION, SUM,
        STANDARD DEVIATION, STANDARD DEVIATION OF MEAN, VARIANCE, VARIANCE OF THE MEAN, TRIMMED MEAN STANDARD ERROR, AVERAGE ABSOLUTE DEVIATION (or AAD), MEDIAN ABSOLUTE DEVIATION (or MAD), IQ RANGE, BIWEIGHT MIDVARIANCE, BIWEIGHT SCALE, PERCENTAGE BEND MIDVARIANCE, WINSORIZED VARIANCE, WINSORIZED STANDARD DEVIATION, RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE, COEFFICIENT OF VARIATION,
        RANGE, MAXIMUM, MINIMUM, EXTREME, QUANTILE,
        SKEWNESS, KURTOSIS;
    <y1> is the first response variable;
    <y2> is the second response variable;
    <x1> is the first subsample identifier variable (this variable appears on the horizontal axis);
    <x2> is the second subsample identifier variable (this variable appears on the horizontal axis);
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    This syntax is used to compute the difference between two response variables for the specified statistic. The variables can be either independent (i.e., not paired) or dependent (i.e., paired), but the response variables must have the same number of elements.
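
    A rough Python sketch (not Dataplot code) of a per-cell difference of means follows. It assumes the difference is taken as the statistic of the first response minus the statistic of the second response; that sign convention and the data values are assumptions made for the illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical paired responses and index data.
y1 = [5.0, 7.0, 9.0, 11.0]
y2 = [4.0, 4.0, 10.0, 8.0]
x1 = [1, 1, 2, 2]
x2 = [1, 1, 1, 2]

# For each (x1, x2) cell, collect the y1 values and the y2 values separately.
cells = defaultdict(lambda: ([], []))
for a, b, g1, g2 in zip(y1, y2, x1, x2):
    cells[(g1, g2)][0].append(a)
    cells[(g1, g2)][1].append(b)

# Assumed convention: statistic of y1 minus statistic of y2 within each cell.
mean_diffs = {cell: mean(v1) - mean(v2) for cell, (v1, v2) in cells.items()}
print(mean_diffs)
```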

Examples:
    CROSS TABULATE MEAN PLOT Y X1 X2
    CROSS TABULATE STANDARD DEVIATION PLOT Y X1 X2
Note:
    If you have more than 2 index variables, you can use the SCATTER PLOT MATRIX command to generate all the pairwise combinations of the index variables on a single page. Enter

      HELP SCATTER PLOT MATRIX

    for details (and specifically, SET SCATTER PLOT MATRIX TYPE).
Note:
    There may be cases where you want one of the statistics displayed in the format used by CROSS TABULATE PLOT X1 X2. This can be specified with the following command:

      SET CROSS TABULATE PLOT DIMENSION 2

    To reset the default, enter

      SET CROSS TABULATE PLOT DIMENSION 1

Default:
    None
Synonyms:
    None
Related Commands:
    CHARACTERS = Sets the type for plot characters.
    LINES = Sets the type for plot lines.
    STATISTIC PLOT = Generates a statistic versus subsample plot for one index variable.
    CROSS TABULATE = Generates a table of cross tabulated values for several common statistics.
Applications:
    Exploratory Data Analysis
Implementation Date:
    2000/1
    2003/3: Support added for "WEIGHTED" and "DIFFERENCE OF" statistics.
Program:
    SKIP 25
    READ RIPKEN.DAT Y X1 TO X4
    .
    CHARACTER X BLANK BLANK
    LINE BLANK SOLID SOLID
    TITLE AUTOMATIC
    Y1LABEL BATTING AVERAGE
    XLIMITS 1 3
    XTIC OFFSET 0.6 0.6
    TIC OFFSET UNITS DATA
    MAJOR XTIC MARK NUMBER 3
    MINOR XTIC MARK NUMBER 0
    XTIC MARK LABEL FORMAT ALPHA
    XTIC MARK LABEL CONTENT INSIDE MIDDLECR()LSP()MSO()H OUTSIDE
    CROSS TABULATE MEAN PLOT Y X1 X2

    plot generated by sample program

Date created: 06/05/2001
Last updated: 12/04/2023

Please email comments on this WWW page to alan.heckert@nist.gov.