READ PAD MISSING COLUMNS (SET)

Name:
    READ PAD MISSING COLUMNS (SET)
Type:
    Subcommand under SET
Purpose:
    If the number of variables specified on a READ command is greater than the number of values read from a data record, specify whether the unmatched variables will be set to a "missing value" or the data record will be treated as an error.
Description:
    Dataplot typically expects all variables to be of equal length. That is, the data is rectangular with no empty fields.

    Dataplot reads the data file one row at a time. When reading a row, Dataplot assigns the first value read to the first variable name, the second value read to the second variable name, and so on. By default, the row with the smallest number of values defines the number of variables that will be read. For example, if you entered the command

      READ FILE.DAT Y X1 X2 X3

    and one of the rows has only two values, then only Y and X1 will be read into Dataplot.

    In some cases, it may be more convenient to assign a missing value to any variables that did not have a corresponding data item. To do this, enter the command

      SET READ PAD MISSING COLUMNS ON

    You can specify the value to use for missing values (the default is zero) with the command

      SET READ MISSING VALUE <value>

    This works well if the empty fields are the end columns. However, be aware that if your empty fields are the beginning or middle columns, the data values may not be assigned to the variables in the way you expect. See the Note section below for alternative methods for dealing with empty fields or columns of unequal length.
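
    As an illustrative sketch (the file name FILE.DAT and its contents are hypothetical, and -99 is an arbitrary choice of missing value), suppose FILE.DAT contains the three space-delimited rows

      1 2 3 4
      5 6
      7 8 9 10

    and the following commands are entered

      SET READ MISSING VALUE -99
      SET READ PAD MISSING COLUMNS ON
      READ FILE.DAT Y X1 X2 X3

    Based on the description above, the second row should be read as Y = 5, X1 = 6, X2 = -99, and X3 = -99. If the second row were instead "5   9 10" (with the X1 field left blank), the values would presumably still be assigned from left to right, giving Y = 5, X1 = 9, X2 = 10, and X3 = -99. That is the kind of misassignment to watch for when the empty field is not an end column.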

    NOTE 2019/04: The default behavior was modified. Now, Dataplot will always pad with missing values. The only distinction between ON and OFF is that when OFF is set, a warning message will be printed when a row is encountered with fewer than the expected number of values.

Syntax:
    SET READ PAD MISSING COLUMNS <ON/OFF>
    where ON specifies that missing columns will be padded with a missing data value and OFF specifies that missing columns will be treated as an error.

    NOTE 2019/04: Now missing columns will always be padded with the missing value code. If OFF is specified, a warning message will be printed when rows with fewer values than expected are encountered. If ON is specified, no warning messages will be printed.

Examples:
    SET READ PAD MISSING COLUMNS ON
    SET READ PAD MISSING COLUMNS OFF
Note:
    When your data file has columns of unequal length or empty fields, there are several alternative approaches.

    1. Pick some value to represent a missing value and fill in missing data points with that value. After reading the data, you can use a RETAIN command to remove them. For example, if you use -99 to signify a missing value, you can enter something like

        RETAIN Y SUBSET Y > -99

      Alternatively, you can use a SUBSET clause on subsequent plot and analysis commands.

      There are two SET commands that pertain to missing values.

      SET DATA MISSING VALUE <value> specifies a character string that will be interpreted as a missing value in the data file (this character string can be a numeric value).

      SET READ MISSING VALUE <value> specifies the numeric value that will be saved to the Dataplot variable when a missing value (as defined by SET DATA MISSING VALUE) is encountered.

      Where feasible, this is the recommended solution.

    2. If your data file has consistent formats for the rows, then there are two possible solutions.

      If the fields are aligned on the decimal point so that a Fortran format statement can be applied, then you can use the SET READ FORMAT command. In this case, empty fields are read as zero. If zero is a valid data value for one or more of your variables, a zero in a variable is then ambiguous: it may denote either a valid data point or a missing value. The SET READ MISSING VALUE setting does not apply when SET READ FORMAT is used.

      Many spreadsheets have an option for saving data to a "fixed width" ASCII text file. In these cases, the fields are typically either right or left justified. However, the decimal point will not fall in a consistent column, so the SET READ FORMAT command cannot be used. In this case, you can use the variable form of the COLUMN LIMITS command. That is,

        COLUMN LIMITS LOWLIMIT UPPLIMIT

      where LOWLIMIT and UPPLIMIT are variables containing the start and stop columns, respectively, for each of the variables. By default, when a blank field is encountered, it is set to zero. You can specify the value to use by entering the command

        SET READ MISSING VALUE <value>

    3. If your data has both columns of unequal length (or empty fields) and inconsistent columns for given data fields, an alternative is to use a comma-delimited data file. If there is no data between successive commas, this is treated as a missing value. The default is to assign a value of zero. Alternatively, you can use the SET READ MISSING VALUE command described above.

      You can specify a delimiter other than a comma with the command

        SET READ DELIMITER <character>
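
    As a hedged sketch of the third approach (the file name COMMA.DAT and its contents are hypothetical, and -99 is an arbitrary choice of missing value), suppose COMMA.DAT contains

        1,10,100
        2,,200
        3,30,300

    Then the commands

        SET READ MISSING VALUE -99
        READ COMMA.DAT Y X1 X2
        RETAIN Y X1 X2 SUBSET X1 > -99

    should, per the description above, assign X1 = -99 for the second row and then drop that row from all three variables.

    A similar sketch for the fixed-width case of the second approach is given below. The file name FIXED.DAT and the column positions are hypothetical, and the LET ... = DATA commands used to create the limit variables are an assumption, since they are not described on this page.

        LET LOWLIMIT = DATA 1 11 21
        LET UPPLIMIT = DATA 10 20 30
        COLUMN LIMITS LOWLIMIT UPPLIMIT
        SET READ MISSING VALUE -99
        READ FIXED.DAT Y X1 X2

    Here each variable is taken from its own range of columns, so a blank field should be set to -99 for that variable only rather than shifting the remaining values to the left.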
Default:
    OFF
Synonyms:
    None
Related Commands:
Applications:
    Input/Output
Implementation Date:
    2004/10
    2019/04: The default behavior for empty fields was changed.
Program:
     
    SET READ MISSING VALUE -99
    SET READ PAD MISSING COLUMNS ON
    READ DUMMY.DAT  Y X
    RETAIN Y X SUBSET Y > -99 SUBSET X > -99
    PLOT Y X
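
    As an illustrative trace of this program (the contents of DUMMY.DAT are hypothetical), suppose DUMMY.DAT contains

      1 10
      2
      3 30

    With the settings above, the second row should be read as Y = 2 and X = -99. The RETAIN command then removes that row from both variables, so the PLOT command should see only the points (X, Y) = (10, 1) and (30, 3).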
        

NIST is an agency of the U.S. Commerce Department.

Date created: 12/05/2005
Last updated: 04/18/2019

Please email comments on this WWW page to alan.heckert@nist.gov.