READ PAD MISSING COLUMNS (SET)

Name:

READ PAD MISSING COLUMNS (SET)

Type:

Subcommand under SET

Purpose:

If the number of variables specified on a READ command is greater than the number of values read from a data record, specify whether the extra variables will be set to a "missing value" or the data record will be treated as an error.

Description:

Dataplot reads the data file one row at a time. When reading a row, Dataplot assigns the first value read to the first variable name, the second value read to the second variable name, and so on. By default, the row with the smallest number of values defines the number of variables that will be read. For example, if you entered the command

READ FILE.DAT Y X1 X2 X3

and one of the rows has only two values, then only Y and X1 will be read into Dataplot.

In some cases, it may be more convenient to assign a missing value to any variables that did not have a corresponding data item. To do this enter the command

SET READ PAD MISSING COLUMNS ON

You can specify the value to use for missing values (the default is zero) with the command

SET READ MISSING VALUE <value>

This works well if the empty fields are the end columns. However, if the empty fields are in the beginning or middle columns, the remaining values on the row are still assigned to the variables in order, so they may not end up in the variables you expect. See the Note: section below for alternative methods for dealing with empty fields or columns of unequal length.
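
As an illustration of the padding behavior, the following sketch assumes a hypothetical file RAGGED.DAT whose contents are shown in the comment lines. The file name and the data values are assumptions made for this example and are not part of the Dataplot distribution.

```dataplot
. Hypothetical contents of RAGGED.DAT (an assumption for illustration):
.    1  10  100  1000
.    2  20
.    3  30  300
.
. Use -99 rather than the default of zero as the missing value code
SET READ MISSING VALUE -99
SET READ PAD MISSING COLUMNS ON
READ RAGGED.DAT Y X1 X2 X3
PRINT Y X1 X2 X3
```

With these settings, the second row would be expected to yield X2 = X3 = -99 and the third row X3 = -99. Had the empty field been in the X1 position of a space-delimited row, the remaining values would simply shift left and be assigned to the wrong variables.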

NOTE 2019/04: The default behavior was modified. Dataplot now always pads with missing values. The only distinction between ON and OFF is that, if OFF is set, a warning message is printed when a row is encountered with fewer than the expected number of values.

Syntax:

SET READ PAD MISSING COLUMNS <ON/OFF>

NOTE 2019/04: Missing columns are now always padded with the missing value code. If OFF is specified, a warning message is printed when a row with fewer than the expected number of values is encountered. If ON is specified, no warning messages are printed.

Examples:

SET READ PAD MISSING COLUMNS ON
SET READ PAD MISSING COLUMNS OFF

Note:

There are several ways to deal with empty fields or columns of unequal length.

1. Pick some value to represent a missing value and fill in missing data points with that value. After reading the data, you can use a RETAIN command to remove them. For example, if you use -99 to signify a missing value, you can enter something like

RETAIN Y SUBSET Y > -99

Alternatively, you can use a SUBSET clause on subsequent plot and analysis commands.

There are two SET commands that pertain to missing values.

SET DATA MISSING VALUE <value> specifies a character string that will be interpreted as a missing value in the data file (this character string can be a numeric value).

SET READ MISSING VALUE <value> specifies the numeric value that will be saved to the Dataplot variable when a missing value (as defined by SET DATA MISSING VALUE) is encountered.

Where feasible, filling in an explicit missing value code in this way is the recommended solution.

2. If your data file has a consistent format for every row, there are two possible solutions.

a. If the fields are justified by the decimal point so that a Fortran format statement can be applied, you can use the SET READ FORMAT command. In this case, empty fields are read as zero. If zero is a valid data value for one or more of your variables, a zero in the variable is ambiguous: it may denote either a valid data point or a missing value. The SET READ MISSING VALUE setting does not apply when SET READ FORMAT is used.

b. Many spreadsheets have an option for saving data to a "fixed width" ASCII text file. In these files, the fields are typically either right or left justified. However, the column containing the decimal point will not be consistent, so the SET READ FORMAT command cannot be used. In this case, you can use the variable form of the COLUMN LIMITS command. That is

COLUMN LIMITS LOWLIMIT UPPLIMIT

where LOWLIMIT and UPPLIMIT are variables containing the start and stop columns, respectively, for each of the variables. By default, when a blank field is encountered, it is set to zero. You can specify the value to use by entering the command

SET READ MISSING VALUE <value>

3. If your data has both columns of unequal length (or empty fields) and inconsistent columns for given data fields, an alternative is to use a comma delimited data file. If there is no data between successive commas, the field is treated as a missing value. The default is to assign a value of zero. Alternatively, you can use the SET READ MISSING VALUE command described above.

You can specify a delimiter other than a comma with the command

SET READ DELIMITER <character>
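
The comma delimited approach can be sketched as follows. The file COMMA.DAT and its contents, shown in the comment lines, are hypothetical and serve only to illustrate an empty field in a middle column.

```dataplot
. Hypothetical contents of COMMA.DAT (an assumption for illustration):
.    1,10,100
.    2,,200
.    3,30,300
.
. Store -99 instead of the default zero for the empty field in row 2
SET READ MISSING VALUE -99
READ COMMA.DAT Y X1 X2
. Drop the row whose X1 value carries the missing value code
RETAIN Y X1 X2 SUBSET X1 > -99
```

Because the commas mark the position of every field, the value 200 in the second row would be expected to stay in X2 rather than shifting into X1, which is the case that padding alone cannot handle.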

Default:

OFF

Synonyms:

None

Related Commands:

READ	=	Carries out a column-wise input of data.
COLUMN LIMITS	=	Specify what columns to read.
SET READ MISSING VALUE	=	Specify the numeric value saved to a variable when a missing value is read.
SET DATA MISSING VALUE	=	Specify the character string that denotes a missing value in the data file.
SET READ DELIMITER	=	Specify the character that will be interpreted as a delimiter between fields on a data record.

Applications:

Input/Output

Implementation Date:

Program:

 
SET READ MISSING VALUE -99
SET READ PAD MISSING COLUMNS ON
READ DUMMY.DAT  Y X
RETAIN Y X SUBSET Y > -99 SUBSET X > -99
PLOT Y X
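
A variant of the program above, sketched on the assumption that the same DUMMY.DAT file is used, applies the SUBSET clause directly on the PLOT command instead of removing rows with RETAIN. This corresponds to the alternative mentioned in the Note: section.

```dataplot
SET READ MISSING VALUE -99
SET READ PAD MISSING COLUMNS ON
READ DUMMY.DAT  Y X
. Leave Y and X unchanged and exclude the -99 codes at plot time
PLOT Y X SUBSET Y > -99 SUBSET X > -99
```

This form keeps the padded rows in Y and X, which may be preferable when later commands need the variables at their original length.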