READ PAD MISSING COLUMNS (SET)
Name:
READ PAD MISSING COLUMNS (SET)
Type:
Purpose:
If the number of variables specified on a READ command is
greater than the number of values read from a data record,
specify whether these extra values will be set to a
"missing value" or this data record will be considered an
error.
Description:
Dataplot typically expects all variables to be of equal length. That
is, the data is rectangular with no empty fields.
Dataplot reads the data file one row at a time. When reading a row,
Dataplot assigns the first value read to the first variable name, the
second value read to the second variable name, and so on. By default,
the row with the smallest number of values defines the number of
variables that will be read. For example, if you entered a READ command
listing Y, X1, and at least one more variable, and one of the rows has
only two values, then only Y and X1 will be read into Dataplot.
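The original default can be modeled in a few lines of Python. This is only an illustrative sketch, not Dataplot code: the function name `variables_read`, the third variable name X2, and the sample rows are all assumptions made for the example.

```python
def variables_read(rows, names):
    """Return the variable names that receive data under the original default.

    The shortest row caps how many of the requested names are read.
    """
    shortest = min(len(row) for row in rows)
    return names[:shortest]


# Hypothetical data: the second row has only two values.
rows = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0]]
print(variables_read(rows, ["Y", "X1", "X2"]))  # ['Y', 'X1']
```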
In some cases, it may be more convenient to assign a missing value to
any variables that did not have a corresponding data item. To do
this enter the command
SET READ PAD MISSING COLUMNS ON
You can specify the value to use for missing values (the default is
zero) with the command
SET READ MISSING VALUE <value>
This works well if the empty fields are the end columns. However,
be aware that if your empty fields are the beginning or middle
columns, the data values may not be assigned to the variables
in the way you expect. See the Note: section below for alternative
methods for dealing with empty fields or columns of unequal length.
NOTE 2019/04: The default behavior was modified. Now, Dataplot will
always pad with missing values. The only distinction between ON
and OFF is that when OFF is set, a warning message will be printed
when a row is encountered with fewer than the expected number of values.
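The padding behavior after the 2019/04 change can be sketched in Python as follows. This models the description above and is not Dataplot's implementation; the function name `read_rows`, its parameters, and the wording of the warning message are assumptions made for the example.

```python
def read_rows(rows, names, missing_value=0.0, pad_on=False):
    """Assign row values to variables in order, padding short rows.

    Returns (columns, warnings). Short rows are always padded; with
    pad_on=False (the OFF setting) each short row also yields a warning.
    """
    columns = {name: [] for name in names}
    warnings = []
    for line_no, row in enumerate(rows, start=1):
        if len(row) < len(names) and not pad_on:
            # Hypothetical message text; Dataplot's actual wording may differ.
            warnings.append(
                f"row {line_no}: expected {len(names)} values, read {len(row)}"
            )
        padded = list(row) + [missing_value] * (len(names) - len(row))
        for name, value in zip(names, padded):
            columns[name].append(value)
    return columns, warnings


# Hypothetical data: the second row is missing its last value.
rows = [[1.0, 2.0, 3.0], [4.0, 5.0]]
columns, warnings = read_rows(rows, ["Y", "X1", "X2"], missing_value=-99.0)
print(columns["X2"])   # the padded value lands in the last variable
print(len(warnings))   # one warning because pad_on is False
```

Because values are assigned left to right, the pad always lands in the trailing variables, which is why an empty field at the beginning or middle of a row does not end up where you might expect.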
Syntax:
SET READ PAD MISSING COLUMNS <ON/OFF>
where ON specifies that missing columns will be padded with
a missing data value and OFF specifies that missing
columns will be treated as an error.
NOTE 2019/04: Now missing columns will always be padded with the
missing value code. If OFF is specified, a warning
message will be printed when rows with fewer than the
expected number of values are encountered. If ON
is specified, no warning messages will be printed.
Examples:
SET READ PAD MISSING COLUMNS ON
SET READ PAD MISSING COLUMNS OFF
Note:
When your data file has columns of unequal length or empty
fields, there are several alternative approaches.
- Pick some value to represent a missing value and fill in
missing data points with that value. After reading the data,
you can use a RETAIN command to remove them. For example, if
you use -99 to signify a missing value, you can enter
something like
RETAIN Y X SUBSET Y > -99 SUBSET X > -99
Alternatively, you can use a SUBSET clause on subsequent plot
and analysis commands.
There are two SET commands that pertain to missing values.
SET DATA MISSING VALUE <value> specifies a character
string that will be interpreted as a missing value in the data
file (this character string can be a numeric value).
SET READ MISSING VALUE <value> specifies the numeric
value that will be saved to the Dataplot variable when a
missing value (as defined by the SET DATA MISSING VALUE) is
encountered.
Where feasible, filling in missing data points with an explicit
missing value is the recommended solution.
- If your data file has consistent formats for the rows, then
there are two possible solutions.
If the fields are justified by the decimal point so that a
Fortran format statement can be applied, then you can use the
SET READ FORMAT command. In this case, empty fields are read as
zero. If zero can be a valid data value for one or more of your
variables, then it can be ambiguous whether a zero in your
variable denotes a valid data point or a missing value. The
SET READ MISSING VALUE setting does not apply when
SET READ FORMAT is used.
Many spreadsheets have an option for saving data to a "fixed
width" ASCII text file. In these cases, the fields are typically
either right or left justified. However, the column for the
decimal point will not be consistent so that the SET READ FORMAT
command cannot be used. In this case, you can use the variable
form of the COLUMN LIMITS command. That is,
COLUMN LIMITS LOWLIMIT UPPLIMIT
where LOWLIMIT and UPPLIMIT are variables containing the start
and stop columns, respectively, for each of the variables. By
default, when a blank field is encountered, it is set to zero.
You can specify the value to use by entering the command
SET READ MISSING VALUE <value>
- If your data has both columns of unequal length (or empty
fields) and inconsistent columns for given data fields, an
alternative is to use a comma delimited data file. If there is
no data between successive commas, this is treated as a missing
value. The default is to assign a value of zero. Alternatively,
you can use the SET READ MISSING VALUE command described above.
You can specify a delimiter other than a comma with the command
SET READ DELIMITER <character>
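The treatment of empty fields in delimited and fixed-width records can be illustrated with a short Python sketch. It is a model of the behavior described in this note, not Dataplot code. The function names are hypothetical, and the sketch assumes that column limits are 1-based and inclusive.

```python
def parse_delimited(line, delimiter=",", missing_value=0.0):
    """Split a delimited record; an empty field becomes missing_value."""
    fields = [field.strip() for field in line.split(delimiter)]
    return [float(field) if field else missing_value for field in fields]


def parse_fixed_width(line, low_limits, upp_limits, missing_value=0.0):
    """Read one field per (start, stop) column pair.

    Column numbers are assumed to be 1-based and inclusive. A blank
    field becomes missing_value.
    """
    values = []
    for low, upp in zip(low_limits, upp_limits):
        field = line[low - 1:upp].strip()
        values.append(float(field) if field else missing_value)
    return values


# Nothing between the two commas, so the middle value is missing.
print(parse_delimited("1.5,,3", missing_value=-99.0))

# Three 5-column fields; the middle field is blank.
record = "  1.5" + " " * 5 + "  3.0"
print(parse_fixed_width(record, [1, 6, 11], [5, 10, 15], missing_value=-99.0))
```

In both cases the position of the empty field is preserved, unlike padding at the end of a short row, so the remaining values stay aligned with their variables.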
Default:
Synonyms:
Related Commands:
Applications:
Implementation Date:
2004/10
2019/04: The default behavior for empty fields was changed.
Program:
SET READ MISSING VALUE -99
SET READ PAD MISSING COLUMNS ON
READ DUMMY.DAT Y X
RETAIN Y X SUBSET Y > -99 SUBSET X > -99
PLOT Y X
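The effect of the RETAIN step in this program can be sketched in Python. This is an illustration only: it assumes the two SUBSET clauses combine as a logical AND, and the function name `retain_valid` and the sample values are hypothetical (the contents of DUMMY.DAT are not shown here).

```python
def retain_valid(y, x, missing_value=-99.0):
    """Keep only the positions where both Y and X exceed the missing code."""
    kept = [
        (yi, xi)
        for yi, xi in zip(y, x)
        if yi > missing_value and xi > missing_value
    ]
    return [yi for yi, _ in kept], [xi for _, xi in kept]


# Hypothetical values after reading with -99 as the missing value.
y = [1.0, -99.0, 3.0, 4.0]
x = [10.0, 20.0, -99.0, 40.0]
print(retain_valid(y, x))
```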
NIST is an agency of the U.S.
Commerce Department.
Date created: 12/05/2005
Last updated: 04/18/2019