STATISTIC PLOT

Name:

... STATISTIC PLOT

Type:

Graphics Command

Purpose:

Generates a statistic versus index plot for a given statistic.

Description:

Vertical axis:	subsample statistic;
Horizontal axis:	subsample index.

The statistic plot yields 2 traces:

a subsample statistic trace; and
a full-sample statistic reference line.

The appearance of these two traces is controlled by the first two settings of the LINES, CHARACTERS, SPIKES, BARS, and associated attribute setting commands.
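
As a minimal sketch of how these settings map onto the two traces (the same settings appear in Program 1 below), the first setting of each command applies to the subsample statistic trace and the second to the full-sample reference line. Here Y and X are placeholder names for a response variable and a group-ID variable that are assumed to exist already:

. Trace 1 (subsample statistics): no connecting line, drawn with the character X
. Trace 2 (full-sample reference): solid line, no character
LINE BLANK SOLID
CHARACTER X BLANK
MEAN PLOT Y X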

Syntax 1:

For a list of supported statistics, enter

HELP STATISTICS

Syntax 2:

This syntax is used for multiple response variables. See the Note section below for details on this syntax.

For a list of supported statistics, enter

HELP STATISTICS

Syntax 3:

For this syntax, there are two group variables. The <x> variable is used as in Syntax 1; that is, it defines the subgroups for computing the statistic. Syntax 1 creates two plot traces: the first contains the statistic value for each group and the second contains the statistic for the full data set. With this syntax, the <tag> variable defines groups that share the same plot attributes. For example, if <tag> contains three distinct values (1, 2, and 3), four plot traces are created. The first trace is for the groups (defined by <x>) where the corresponding <tag> value is 1, the second is for the groups where the <tag> value is 2, the third is for the groups where the <tag> value is 3, and the fourth is the statistic for the full data set.

The <tag> value should be the same for all rows in a group defined by <x>. However, if this is not the case, the <tag> value corresponding to the first row in <x> for that group will be used.

This syntax is used to highlight certain groups. For example, groups that denote potential outliers might be highlighted in a different color.

This syntax is demonstrated in the Program 3 example.

Examples:
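
The following invocations, taken from the sample programs below, illustrate the three syntaxes. The variable names (DIAMETER, BATCH, Y1 to Y4, X, Y, TAG) are those used in the sample programs and are assumed to have been read or created beforehand:

. Syntax 1: one response variable and one group-ID variable
MEAN PLOT DIAMETER BATCH
STANDARD DEVIATION PLOT DIAMETER BATCH
RANGE PLOT DIAMETER BATCH
.
. Syntax 2: multiple response variables and one group-ID variable
MEAN PLOT Y1 TO Y4 X
.
. Syntax 3: a tag variable selects the plot attributes for each group
MEAN TAG PLOT Y X TAG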

Note:

A number of the subcommands (e.g., MEAN PLOT) are documented individually.

Note:

This basic idea can be easily adapted to other statistics (even ones that are not built into Dataplot). It can also be adapted to statistics that require an arbitrary number of variables to compute.

The 2016/08 version of Dataplot added the STATISTIC BLOCK command that can be used to define a statistic.

Note:

The STATISTIC PLOT command also supports multiple response variables (Syntax 2). For example,

MEAN PLOT Y1 TO Y4 X

That is, for each distinct value of X, there are now 4 means plotted instead of just one.

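The following commands can be used to control the appearance of the plot (shown in the form used in Program 2):

SET STAT PLOT FORMAT <OVERLAY/DEX>
SET STAT PLOT SUMMARY <VARIABLE/GROUP>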

If the FORMAT option is set to OVERLAY and the SUMMARY option is set to VARIABLE, there will be a curve corresponding to each response variable and a reference line corresponding to each variable.

If the FORMAT option is set to DEX, then this plot uses a format similar to the DEX <stat> PLOT command. That is, for each distinct value of X, there will be a curve connecting the mean values for the 4 response variables.

If the SUMMARY option is set to GROUP, there will be a single reference curve. At each distinct value of X, a single overall mean is computed for all 4 of the response variables.

In addition, the following option is added to this command:

<stat> <zscore/uscore> PLOT

If ZSCORE is given, then a z-score transformation (subtract the mean and then divide by the standard deviation) is applied to each response variable first. If USCORE is given, then a u-score transformation (subtract the minimum and then divide by the range) is applied to each response variable first. Note that these z-score and u-score transformations apply to the entire response variable, not to each distinct group within the response variable.
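
A short sketch of the ZSCORE and USCORE variants, assuming the same IRIS.DAT layout read in Program 2 (four response variables Y1 to Y4 and a group-ID variable X). Transforming first puts the four response variables on a common scale, which may make the per-group means easier to compare on a single plot:

SKIP 25
READ IRIS.DAT Y1 TO Y4 X
.
. Group means of the z-scores (each variable standardized over all rows)
MEAN ZSCORE PLOT Y1 TO Y4 X
.
. Group means of the u-scores (each variable rescaled to the interval 0 to 1)
MEAN USCORE PLOT Y1 TO Y4 X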

Note:

By default, the reference line is drawn at the value of the statistic for the full data set. For some statistics (e.g., STANDARD DEVIATION and other scale statistics), this may not be particularly meaningful. Alternatively, you can specify either the mean or the median value of the statistic over the groups. For example, if you are generating a standard deviation plot and you have 10 groups, you can specify that the reference line be drawn at the mean (or the median) of the 10 computed standard deviations.

To specify what reference line is drawn, enter

SET STATISTIC PLOT REFERENCE LINE <OVERALL/AVERAGE/MEDIAN>

where OVERALL is the value of the statistic for all of the data, AVERAGE is the mean of the statistic over the groups, and MEDIAN is the median of the statistic over the groups.

The default is OVERALL.
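
A minimal sketch of changing the reference line, assuming the GEAR.DAT data are read as in Program 1 (DIAMETER as the response variable, BATCH as the group-ID variable):

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
LINE BLANK SOLID
CHARACTER X BLANK
.
. Draw the reference line at the mean of the per-batch standard deviations
SET STATISTIC PLOT REFERENCE LINE AVERAGE
STANDARD DEVIATION PLOT DIAMETER BATCH
.
. Restore the default (statistic computed from all of the data)
SET STATISTIC PLOT REFERENCE LINE OVERALL

Resetting the option at the end keeps later statistic plots in the same session on the default reference line.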

Default:

None

Synonyms:

On most of the commands, the word STATISTIC is optional and is usually omitted (e.g., the mean plot is documented under MEAN PLOT rather than MEAN STATISTIC PLOT).

Related Commands:

CHARACTERS	= Sets the type for plot characters.
LINES	= Sets the type for plot lines.
BOX PLOT	= Generates a box plot.
CONTROL CHART	= Generates a control chart.
PLOT	= Generates a data or function plot.
SUMMARY	= Computes various statistics for a variable.
STATISTIC BLOCK	= Defines a new statistic.

Applications:

Exploratory Data Analysis

Implementation Date:

The list of supported statistics has been regularly updated since the original 1988/2 implementation.

Program 1:

 
SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
TITLE AUTOMATIC
TITLE OFFSET 2
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 3 0 100 100
MULTIPLOT SCALE FACTOR 2
X1LABEL DISPLACEMENT 14
Y1LABEL DISPLACEMENT 12
TIC MARK LABEL SIZE 1.8
.
XTIC OFFSET 1 1
X1LABEL BATCH
LINE BLANK SOLID
CHARACTER X BLANK
Y1LABEL MEAN
TITLE MEAN PLOT
MEAN PLOT DIAMETER BATCH
Y1LABEL STANDARD DEVIATION
TITLE SD PLOT
STANDARD DEVIATION PLOT DIAMETER BATCH
Y1LABEL RELATIVE STANDARD DEVIATION
TITLE RELSD PLOT
RELSD PLOT DIAMETER BATCH
Y1LABEL RANGE
TITLE RANGE PLOT
RANGE PLOT DIAMETER BATCH
.
END OF MULTIPLOT

Program 2:

 
skip 25
read iris.dat y1 to y4 x
.
title case asis
title offset 2
label case asis
y1label Mean
x1label Group-ID
xlimits 1 3
major xtic mark number 3
minor xtic mark number 0
xtic offset 0.6 0.6
ytic offset 1 1
.
set stat plot format  dex
set stat plot summary vari
title sp()Case 1: Format = DEX, Summary = Variable
line color black black black blue red green cyan
mean plot y1 to y4 x
.
set stat plot format  dex
set stat plot summary group
title sp()Case 2: Format = DEX, Summary = Group
mean plot y1 to y4 x
.
set stat plot format overlay
set stat plot summary group
line color blue red green cyan
line so so so so bl
char bl bl bl bl x
title sp()Case 3: Format = Overlay, Summary = Group
mean plot y1 to y4 x
.
set stat plot format overlay
set stat plot summary variable
line so all
char bl all
line color blue red green cyan blue red green cyan
title sp()Case 4: Format = Overlay, Summary = Variable
mean plot y1 to y4 x


Program 3:

 
. Step 1:   Read the data
.
skip 25
read gear.dat y x
skip 0
let tag = sequence 1 10 1 2 for i = 1 1 100
.
. Step 2:   Define plot control settings
.
case asis
label case asis
title case asis
title offset 2
.
xlimits 1 10
major x1tic mark number 10
minor x1tic mark number  0
tic offset units data
x1tic mark offset 0.5 0.5
.
title Mean Plot of GEAR.DAT
y1label Mean Diameter
x1label Batch
.
. Step 3:   Generate the plot without tags
.
character X blank
line blank solid
mean plot y x
.
. Step 4:   Generate the plot with tags
.
character X X blank
character color blue red
line blank blank solid
.
mean tag plot y x tag
