FACTOR PLOT

Name:
    FACTOR PLOT
The underlying plot generated can be any univariate or bivariate plot. The scatter plot is the most common application. Although factor plots can be generated using the MULTIPLOT command (and typically LOOPING), the FACTOR PLOT command allows some fairly involved multiplots to be generated with a minimum number of commands (and without looping).
There are a couple of variations on this. If a univariate plot (e.g., a histogram) is being generated, then FACTOR PLOT Y1 Y2 ... Yk would generate HISTOGRAM Y1, HISTOGRAM Y2, ... HISTOGRAM Yk. The most general case would have multiple response and multiple factor variables. For example,
    FACTOR PLOT Y1 Y2 Y3 Y4 X1 X2 X3 X4
with four response variables would generate

             col 1        col 2        col 3        col 4
    row 1:   PLOT Y1 X1   PLOT Y1 X2   PLOT Y1 X3   PLOT Y1 X4
    row 2:   PLOT Y2 X1   PLOT Y2 X2   PLOT Y2 X3   PLOT Y2 X4
    row 3:   PLOT Y3 X1   PLOT Y3 X2   PLOT Y3 X3   PLOT Y3 X4
    row 4:   PLOT Y4 X1   PLOT Y4 X2   PLOT Y4 X3   PLOT Y4 X4

There are a number of alternatives for the appearance of this plot. Dataplot tries to balance simplicity with flexibility by using default settings while providing numerous SET commands to control the appearance of the plot. These are described in detail in the NOTES section below.

Syntax 1:
    FACTOR PLOT <y1> <y2> ... <yk>   <SUBSET/EXCEPT/FOR qualification>
where <y1> through <yk> are the response variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. Up to 25 response variables can be specified. This syntax is used when generating a univariate plot.
Syntax 2:
    FACTOR PLOT <y1> <x1> ... <xk>   <SUBSET/EXCEPT/FOR qualification>
where <y1> is the response variable; <x1> through <xk> are the factor variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax generates PLOT Y1 X1, PLOT Y1 X2, etc. Up to 25 factor variables can be specified. This syntax is used when generating a bivariate plot. In this case, the response variable stays the same from plot to plot while the factor variable changes.
Syntax 3:
    FACTOR PLOT <y1> ... <yl> <x1> ... <xk>   <SUBSET/EXCEPT/FOR qualification>
where <y1> ... <yl> are the response variables; <x1> through <xk> are the factor variables; and where the <SUBSET/EXCEPT/FOR qualification> is optional. This syntax generates a matrix of plots where the number of response variables determines the number of rows and the number of factor variables determines the number of columns. This syntax is used when generating a bivariate plot with more than one response variable and more than one factor variable.
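The row and column layout described for syntax 2 and syntax 3 can be modeled outside Dataplot. The following Python sketch is illustrative only: the function name expand_factor_plot and its parameters are hypothetical, and it merely lists the plot commands the text says a variable list stands for. It is not how Dataplot implements the command.

```python
def expand_factor_plot(variables, n_response=1, plot_type="PLOT"):
    """Return the grid of plot commands that a FACTOR PLOT variable list stands for.

    Response variables come first in the list and factor variables follow.
    Each row of the result holds one response variable and each column
    holds one factor variable.
    """
    if not 1 <= n_response < len(variables):
        raise ValueError("need at least one response and one factor variable")
    responses = variables[:n_response]
    factors = variables[n_response:]
    return [[f"{plot_type} {y} {x}" for x in factors] for y in responses]


# Two response variables and three factor variables give a 2 x 3 grid.
for row in expand_factor_plot(["Y1", "Y2", "X1", "X2", "X3"], n_response=2):
    print(", ".join(row))
```

With n_response left at 1 the grid has a single row, which corresponds to syntax 2.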
Examples:
    FACTOR PLOT Y1 Y2 Y3 Y4 Y5
    SET FACTOR PLOT TYPE PLOT
The type of plot is set with the command
    SET FACTOR PLOT TYPE <value>
where <value> is one of the following.

The following plot types plot two variables (e.g., BIHISTOGRAM Y1 Y2). For the FACTOR PLOT command, use either syntax 2 or syntax 3 above, depending on whether you have one or multiple response variables.
The following plot types plot one variable (e.g., HISTOGRAM Y1). Use syntax 1 above.
Dataplot automatically defines X1LABEL, X2LABEL, and YLABEL commands for these plots. You can control the attributes of these labels with the standard label setting commands. If you have defined variable labels (with the VARIABLE LABEL command), these will automatically be substituted for variable names in the labels. If you have defined variable labels with the VARIABLE LABEL command and you want to suppress the automatic expansion of the variable name to the variable label, enter
To restore the default that variable names will be expanded to the corresponding variable label, enter
OFF means that all axis labels are suppressed (this can be useful if a large number of variables are being plotted). ON means that both X and Y axis labels are printed. XON prints only the x axis labels and YON prints only the y axis labels. BOX is a special option that creates an extra column on the left and an extra row on the bottom. The axis label is printed in this box. BOX is typically reserved for the plot types that plot the variable names in the axis labels. The default is ON (both x and y axis labels are printed).
BOTTOM specifies that the x axis labels are printed on the bottom axis (on the last row only). TOP specifies that the x axis labels are printed on the top axis (first row only). ALTERNATE specifies that the x axis labels alternate between the top (first row) and bottom axis (last row). We recommend using the TIC OFFSET command to avoid overlap of axis labels and tic marks. The default is ALTERNATE.
LEFT specifies that the y axis labels are printed on the left axis (on the first column only). RIGHT specifies that the y axis labels are printed on the right axis (last column only). ALTERNATE specifies that the y axis labels alternate between the left (first column) and right axis (last column). We recommend using the TIC OFFSET command to avoid overlap of axis labels and tic marks. The default is ALTERNATE.
DEFAULT connects neighboring frames (i.e., the FRAME CORNER
COORDINATES are set to 0 0 100 100). USER uses whatever
frame coordinates are currently set (15 20 85 90 by default)
and makes no special provisions for axis labels and tic marks
(i.e., you set them as you normally would, each plot uses
whatever you have set). CONNECTED uses whatever frame
coordinates have been set by the user, but it draws the axis
labels and tic marks as if DEFAULT were being used (that is, as
determined by the SET FACTOR PLOT label and axis settings described above).
Since the plots can often have different limits for the axes,
the default is USER.
NORMAL means that all tic labels are plotted at a distance determined by the TIC LABEL DISPLACEMENT command. STAGGERED means that alternating plots will be staggered. That is, one will use the standard displacement while the next uses a staggered value. Entering this command with a numeric value specifies the amount of the displacement for the staggered tic labels. For example,
    SET FACTOR PLOT LABEL DISPLACEMENT STAGGERED
    SET FACTOR PLOT LABEL DISPLACEMENT 25
These commands specify that the default tic label displacement is 10 and the staggered tic label displacement is 25.
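The alternation described for STAGGERED can be pictured with a short Python sketch. It is a model under stated assumptions, not Dataplot code: the function name is hypothetical, the first plot is assumed to use the normal displacement, and the alternation is assumed to run over the plots in the order they are generated.

```python
def tic_label_displacements(n_plots, normal=10, staggered=25):
    """Return one displacement per plot, alternating normal and staggered.

    Assumption: plot 0 uses the normal displacement, plot 1 the staggered
    one, plot 2 the normal one again, and so on.
    """
    if n_plots < 0:
        raise ValueError("n_plots must be non-negative")
    return [normal if i % 2 == 0 else staggered for i in range(n_plots)]


# Displacements for four neighboring plots with the values from the text.
print(tic_label_displacements(4, normal=10, staggered=25))
```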
NONE means that no fitted line is plotted. LOWESS means that a locally weighted least squares line will be overlaid. LINE means that a linear fit (Y = A0 + A1*X) will be overlaid. QUAD means that a quadratic fit (Y = A0 + A1*X + A2*X**2) will be overlaid. SMOOTH means that a least squares smoothing will be overlaid. For LOWESS, it is recommended that the lowess fraction be set fairly high (e.g., LOWESS FRACTION 0.6). The fitted line is currently only generated if the factor plot type is PLOT. The default is for no fitted line to be overlaid on the plot. If an overlaid fit is desired, the most common choice is LOWESS.
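For readers who want to see what the LINE option's model Y = A0 + A1*X means numerically, here is a minimal Python sketch of an ordinary least squares line fit. It is a generic textbook calculation with a hypothetical function name; it is not taken from Dataplot, and Dataplot's own fitting routine may differ in detail.

```python
def linear_fit(x, y):
    """Return (a0, a1) for the least squares line y = a0 + a1*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Points lying exactly on y = 1 + 2x.
a0, a1 = linear_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(a0, a1)
```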
where <value> identifies the number of response variables. On the FACTOR PLOT command, Dataplot assumes that the response variables (y axis) come first, then the factor variables (x axis). For the two variable plot types, the default is one. For the univariate plot types, all variables are assumed to be response variables.
In this form of the plot command, TAG is a group identifier variable. Points belonging to the same group are plotted with the same attributes (controlled by the CHARACTER and LINE commands and their various attribute setting commands). Using a tag variable has two common purposes:
You can specify that the factor plot use this form of the PLOT command with the SET FACTOR PLOT TAG command (for example, SET FACTOR PLOT TAG ON).
OFF specifies that the standard plot command (PLOT Y1 Y2) will be used. ON specifies that the last variable on the FACTOR PLOT command is a tag variable. That is, it is not plotted directly, but is instead the third variable on all the plot commands generated by the factor plot. Currently, this command only applies if the factor plot type is set to PLOT. In some cases, you may want to use a tag variable for both purposes. That is, you may have natural groups in your data, but you also want to flag certain outlying points. You can do this by using a SUBSET clause. For example,
    SET FACTOR PLOT TAG ON
    CHARACTER CIRCLE SQUARE TRIANGLE
    CHARACTER FILL OFF OFF OFF
    FACTOR PLOT Y X1 X2 X3 TAG SUBSET Y2 <= 100
    PRE-ERASE OFF
    CHARACTER FILL ON ON ON
    FACTOR PLOT Y X1 X2 X3 TAG SUBSET Y2 > 100
The SET FACTOR PLOT LIMITS command, discussed below, can be used to control the axis limits for the individual plots.

The default is OFF.
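The two-pass example above combines grouping by TAG with flagging by a SUBSET condition. The Python sketch below models the resulting appearance of each point. It is an illustration only: the function name is hypothetical, and it assumes that the distinct tag values, taken in sorted order, are matched to CIRCLE, SQUARE, and TRIANGLE in that order.

```python
SYMBOLS = ["CIRCLE", "SQUARE", "TRIANGLE"]


def point_styles(tags, y2, threshold=100):
    """Return (symbol, filled) for each point.

    The symbol comes from the point's tag group. The point is filled when
    its y2 value exceeds the threshold, mirroring the second pass
    (SUBSET Y2 > 100); otherwise it is unfilled (SUBSET Y2 <= 100).
    """
    groups = sorted(set(tags))
    if len(groups) > len(SYMBOLS):
        raise ValueError("more tag groups than symbols")
    symbol_for = {group: SYMBOLS[i] for i, group in enumerate(groups)}
    return [(symbol_for[t], v > threshold) for t, v in zip(tags, y2)]


# Four points in three tag groups; two of them exceed the threshold.
for style in point_styles([1, 2, 3, 1], [50, 150, 100, 101]):
    print(style)
```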
    SET FACTOR PLOT XLIMITS
Note that the pairs of limits correspond to the variable list in the FACTOR PLOT command. For univariate plot types, the plot order corresponds to the variable list. For bivariate plot types, the YLIMITS refer to the response variables and XLIMITS refer to the factor variables. That is, Dataplot determines which variable is being plotted on each axis, and gets the corresponding limits.

The default is to allow the axis limits to float with the data.
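The correspondence between limit pairs and variables can be sketched in Python. This is a model of the rule stated above for bivariate plot types, with hypothetical function names; it does not reproduce Dataplot's internals.

```python
def pair_limits(values):
    """Group a flat list (low1, upp1, low2, upp2, ...) into (low, upp) pairs."""
    if len(values) % 2:
        raise ValueError("limits must be given as low/upper pairs")
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


def limits_for_plot(row, col, ylimits, xlimits):
    """Return the axis limits for the plot in a given row and column.

    Rows index the response variables (YLIMITS pairs) and columns index
    the factor variables (XLIMITS pairs), both counted from zero.
    """
    return {"y": pair_limits(ylimits)[row], "x": pair_limits(xlimits)[col]}


# Two response variables and three factor variables.
ylimits = [0, 10, 0, 100]
xlimits = [-1, 1, 0, 5, 0, 50]
print(limits_for_plot(1, 0, ylimits, xlimits))
```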
    SET FACTOR PLOT SUBREGION YLIMITS <LOW1> <UPP1> <LOW2> <UPP2> ...
This command is similar to the SET FACTOR PLOT XLIMITS and SET FACTOR PLOT YLIMITS commands in that the list corresponds to the variables entered on the FACTOR PLOT command. Only one set of subregion limits can be set for each variable.

The default is that no subregion limits are set.
The following commands can be used to add a prefix and suffix to the X2LABEL. For example, with PERCENT CORRELATION you might want the label to start with "CORR = " and to end with "%".
    SET X2LABEL SUFFIX
The appearance and location of the X2LABEL are controlled with the standard X2LABEL attribute setting commands.

There are occasions where you may want to use the values computed in the X2LABEL for additional numeric computations. These values are automatically written to the file "dpst5f.dat". The values are printed in the order the plots are generated. You can control the number of digits printed with the SET WRITE DECIMALS command.
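If the values in "dpst5f.dat" are needed outside Dataplot, they could be read back with a few lines of Python. The sketch below rests on assumptions that the text does not confirm: that the file contains only whitespace-separated numbers, and that the plots are generated row by row so the values can be regrouped by the number of columns. The function name is hypothetical.

```python
from pathlib import Path


def read_x2label_values(path="dpst5f.dat", n_columns=None):
    """Read the numeric values from the file, in the order they were written.

    If n_columns is given, regroup the flat list into rows of that length
    (assumes the plots were generated one row at a time).
    """
    values = [float(token) for token in Path(path).read_text().split()]
    if n_columns is None:
        return values
    if n_columns <= 0 or len(values) % n_columns:
        raise ValueError("value count is not a multiple of n_columns")
    return [values[i:i + n_columns] for i in range(0, len(values), n_columns)]


# Only attempt to read the file if a Dataplot run has produced it.
if Path("dpst5f.dat").exists():
    print(read_x2label_values("dpst5f.dat"))
```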
    MULTIPLOT SCALE FACTOR 3
    TIC OFFSET UNITS SCREEN
    TIC OFFSET 5 5
is a fairly typical set of commands used with factor plots.
SET SCATTER PLOT is a synonym for SET FACTOR PLOT.
"Graphical Exploratory Data Analysis", du Toit, Steyn, and Stumpf, Springer-Verlag, 1986.
    skip 25
    read simon1.dat y1 y2 x1 to x5 block runseq
    .
    multiplot scale factor 2
    multiplot corner coordinates 10 5 90 90
    tic offset units screen
    xtic offset 5 10
    major xtic mark number 3
    ytic offset 5 5
    y1label displacement 40
    y2label displacement 25
    x1label displacement 3
    x2label displacement 7
    char x
    line blank
    .
    set factor plot frame type connected
    frame corner coordinates 0 0 100 100
    set factor plot response variables 2
    factor plot y1 y2 x1 x2 x3 x4 x5
Date created: 06/05/2001
Last updated: 12/04/2023
Please email comments on this WWW page to alan.heckert@nist.gov.