SED navigation bar go to SED home page go to Dataplot home page go to NIST home page SED Home Page SED Staff SED Projects SED Products and Publications Search SED Pages
Dataplot Vol 1 Vol 2

READ EXCEL

Name:
    READ EXCEL
Type:
    Support Command
Purpose:
    Read variables (vectors) from an Excel file.
Description:
    Spreadsheet programs are a popular method for saving data. Most spreadsheet programs support the Microsoft Excel file format.

    Dataplot does not natively support reading or writing to Excel format files. The READ EXCEL command works as follows

    1. Dataplot writes the name of the Excel file to line 1 of the file "dpst5f.dat". It writes the name of the Excel sheet to line 2 of "dpst5f.dat". The default sheet name is "Sheet1". To change the sheet name, enter the command

        SET EXCEL SHEET <sheet-name>

    2. Dataplot then invokes a Python script to read the variables from the specified Excel file and then writes the variables to the file "dpst1f.dat".

      The Python script, "read_excel.py", is located in the "scripts" subdirectory of the Dataplot auxiliary directory. This script uses the Pandas function "dataframe.read_excel" to read the data from the Excel file. It then uses the Pandas function "pandas.to_csv" to write the variables as a comma separated file (CSV) to the file "dpst1f.dat".

    3. Dataplot reads the contents of the variables from the file "dpst1f.dat".

    This command assumes that Python and the Python package Pandas are already installed on your local platform. Dataplot does not check if Python and Pandas are installed on your system and it does not initiate the Python installation if it is not already installed.

    As Python is used by many popular applications on Linux platforms, most Linux platforms will already have Python installed. However, this is not the case for Windows and MacOS platforms. See the Notes section below for alternatively using Power Shell on Windows platforms.

    If you need to install Python, there are a number of Python distributions (see https://wiki.python.org/moin/PythonDistributions). However, the most common are ActivePython from ActiveState and Anaconda from Continuum Analytics. Dataplot does not depend on a specific Python distribution and we make no recommendation for the preferred distribution.

    This command will typically work well with Excel files that are clean rectangular data sets. If you have added graphs, formulas, cross-tabulations, and so on to the Excel sheet, then the READ EXCEL command is likely to be unsatisfactory. You can either copy the data portion of the spreadsheet to a new sheet or use one of the other methods for reading data from spreadsheets listed below.

    Broman and Woo (see References below) wrote a useful article in The American Statistician that provides useful guidelines for creating Excel files that can be more easily read by non-Excel programs.

    If you infrequently need to read Excel files and do not already have Python installed on your local platform, there are several additional methods for reading the data from Excel files. However, if you anticipate the need for frequently reading Excel files, then going through the Python installation is probably worth the effort.

    In addition to READ EXCEL, the following methods can also be used to read data from Excel files.

    1. Spreadsheets typically support writing data to ASCII files. For example, Excel can write formatted text (space delimited) (.PRN extension), comma delimited files (.CSV extension), or tab delimited files (.TXT extension). Enter HELP READ for details of this.

      As with READ EXCEL, this option works best with clean rectangular data sets. If you do not have a clean Excel sheet, you will probably need to manually edit the ASCII file before trying to read it with Dataplot. However, it is a viable alternative to READ EXCEL when you do not have Python installed on your platform and do not wish to install it.

    2. You can also use Copy and Paste. You can Copy the desired data to the clipboard and then use the READ CLIPBOARD command in Dataplot (READ CLIPBOARD is system dependent, but should be available on most platforms). Alternatively, you can paste the data into an ASCII editor (e.g., Notepad or Wordpad on Windows, vi or Emacs on Linux), save the file and then use the Dataplot READ command.

      This option is particularly useful when your spreadsheet is not a clean rectangular data set.

Syntax:
    READ EXCEL <fname> <variable list>
                            <SUBSET/EXCEPT/FOR qualification>
    where <fname> is the name of the Excel file;
                <variable list> is the list of variables to read;
    and where the <SUBSET/EXCEPT/FOR qualification> is optional.

    Note that Pandas adds a "row index" variable as the first column in the CSV file. Be sure to add this variable to <variable list>.

Examples:
    READ EXCEL FILE.XLSX Y1 Y2 Y3 X
Note:
    Dataplot does not search for the Excel file in the Dataplot auxiliary files directories as it does for the READ for ASCII files. It also does not check for all lower case or all upper case versions of the file name. You need to enter the case sensitive version of the file name and include the full path name if the Excel file is not in the currrent directory.
Note:
    This command will not work successfully on all Excel files. Spreadsheets commonly contain information such as cross tabulations and graphs. Dataplot is not a spreadsheet, so cells containing graphs, tables, cross tabulations and so on will have unpredictable results.
Note:
    The Pandas pandas.to_csv function adda a row id variable to the ASCII file.
Note:
    Pandas by default assumes that the first row of the Excel file is a header line containing the variable names. If your Excel file does not contain a header line, enter the command

      SET EXCEL HEADER NONE

    To reset the default of a header line, enter the command

      SET EXCEL HEADER ON

    This command was added 2021/07. The "read_python.py" script was also updated at this time to support this option.

Note:
    The python script will by default read the entire spreadsheet.

    To specify the first and last rows of the Excel file to read, enter the following commands

      SET EXCEL START ROW <value>
      SET EXCEL STOP ROW <value>

    You can specify the start row only if you simply want to skip over the beginning lines of the Excel file. For example, if you want to skip the first five rows, enter

      SET EXCEL START ROW 6

    If you specify only the last row, the first row is assumed to be 1.

    Note that although Python/Pandas starts row indexing as row 0, you should start the row index at 1. Dataplot will automatically adjust the row number to use the Python/Pandas convention.

    Likewise, you can specify the first and last columns to read with the commands

      SET EXCEL START COLUMN <value>
      SET EXCEL STOP COLUMN <value>

    You should start the column indexing with 1 rather than 0.

Note:
    Excel (or some other spreadsheet program) does not need to be installed on your local platform. For example, you can create the Excel file on a Windows platform and then move the Excel file to a Linux platform.
Note:
    If Python is not installed on your default path, you can specify it using the SET PYTHON PATH command. For example, the following is for the Anaconda installation of Python 3 under Windows (where Anaconda is installed for the single user heckert)

      set python path c:\Users\heckert\AppData\Local\Continum\anaconda3\

    There are several different Python distributions. The appropriate Python path will depend on the specific distribution you used to install Python and whether you choose to install it for a single user or for all users.

Note:
    Both Python 2 and Python 3 are available. The Dataplot python script for reading Excel files assumes you are running Python 3. On some systems, you may need to enter the following command to run version 3 of Python

      SET PYTHON VERSION 3

    This will use the command "python3" rather than "python".

Note:
    On Windows platforms, the "read_excel.py" script will be copied to the current directory.
Note:
    The 2026/01 version added the option for using PowerShell rather than Python on Windows platforms. Although Python is likely to be installed on most Linux and MacOS platforms, this is less likely on Windows platforms.

    To specify that PowerShell be used rather than Python/Pandas, enter the command

      SET EXCEL METHOD POWERSHELL

    To reset the default to Python, enter

      SET EXCEL METHOD PYTHON

    Dataplot utilizes the freely importable module Import-Excel. To install this module, open up a PowerShell terminal and enter the command

      Install-Module -Name ImportExcel -Scope CurrentUser

    In order to run this command, you may need to set your PowerShell execution policy by entering the command

      Set-ExecutionPolicy RemoteSigned -Scope CurrentUser

    Note that this only needs to be done once. Dataplot does not check to see if this module has been installed and it does not initiate the installation.

    When using PowerShell, Dataplot dynamically creates the script "read_excel.ps1" in the current directory. This will look something like

      $excelFilePath = "C:\consult\ALAN_TEST.XLSX"
      $worksheetName = "Sheet1"
      $csvFilePath = "C:\consult\dpst1f.dat"
      $startRow = 3
      $endRow = 77
      $startColumn = 1
      $endColumn = 3
      Import-Excel -Path $excelFilePath -worksheetName $worksheetName
                  -StartRow $startRow -EndRow $endRow
                  -StartColumn $startColumn -EndColumn $endColumn |
                  Export-Csv -Path $csvFilePath -NoTypeInformation
                  -Encoding UTF8
      exit

    You can specify the starting and stopping rows and columns in the Excel sheet with the commands

      SET EXCEL START ROW <value>
      SET EXCEL STOP ROW <value>
      SET EXCEL START COLUMN <value>
      SET EXCEL STOP COLUMN <value>

    If these are not entered, Dataplot will try to read all rows and columns. The PowerShell option does not automatically account for a header line, so you may need to enter the start and stop rows to skip over a header line. Alternatively, you can enter a SKIP 1 before entering the READ EXCEL command.

    For convenience, the start and stop columns can also be entered as A, B, and so on up to ZZ (i.e., this works for the first 52 columns).

    PowerShell exports the data to the comma separated file "dpst1f.dat" in the current working directory. PowerShell encloses all fields in double quotes ("). The command

      SET READ REMOVE QUOTES ON

    was added to specify that all quotes on a line being read should be removed before parsing that line. Dataplot sets this automatically for the READ EXCEL command, so you do not have to enter it. However, you can enter this command manually if you have a file that also uses this syntax. Set this to OFF to revert back to the default of not removing quotes.

    As with the Python version, Dataplot does not search its sub-directories for the Excel file. So you should enter the full path name for the Excel file unless the the file is in the current working directory (if Dataplot does not detect a full path name, it will pre-pend the current working directory name to the file name).

Default:
    None
Synonyms:
    None
Related Commands:
    WRITE EXCEL = Write variables to an Excel file.
    PYTHON = Run a user specified Python script from within Dataplot.
    READ = Read variables, strings, parameters and matrices.
References:
    McKinney (2018), "Python for Data Analysis," Second Edition, O'Reilly.

    Chen (2018), "Pandas for Everyone: Python Data Analysis," O'Reilly.

    Broman and Woo (2018), "Data Organization in Spreadsheets," The American Statistician, Vol. 72, No. 1, pp. 2-10.

Applications:
    Data Input
Implementation Date:
    2020/02
    2020/05: Added SET EXCEL START ROW and SET EXCEL STOP ROW
    2021/07: Added SET EXCEL HEADER
    2026/01: Added SET EXCEL METHOD 2026/01: Added SET EXCEL START COLUMN and SET EXCEL STOP COLUMN
Program 1:
     
    dimension 40 columns
    . Path name for Windows
    . read excel "C:\Program Files (x86)\NIST\DATAPLOT\LIB\DATA\FIBERS.XLSX" ...
    .     rowid breakid xcent ycent xc1 yc1 xc2 yc2 xc3 yc3 xc4 yc4 length
    . Path name for Linux
    read excel /usr/local/lib/data/FIBERS.XLSX ...
         rowid breakid xcent ycent xc1 yc1 xc2 yc2 xc3 yc3 xc4 yc4 length
    set write decimals 3
    print breakid xcent length
        
    The following output is generated
    ---------------------------------------------
            BREAKID          XCENT         LENGTH
    ---------------------------------------------
              1.000       3924.447          0.000
              2.000       7230.652       3260.000
              3.000       7825.889        556.000
              4.000       9600.147       1728.000
              5.000      11662.307       2013.000
              6.000      12588.740        880.000
              7.000      13559.739        927.000
              8.000      14799.128       1194.000
              9.000      15984.171       1141.000
             10.000      16951.411        922.000
             11.000      19179.848       2177.000
             12.000      20234.844       1009.000
             13.000      21569.521       1287.000
             14.000      22745.419       1133.000
             15.000      23840.270       1052.000
             16.000      24850.845        962.000
             17.000      25998.789       1103.000
             18.000      26910.859        867.000
             19.000      28470.836       1517.000
             20.000      29140.799        623.000
             21.000      30514.016       1327.000
             22.000      31554.430        995.000
             23.000      32368.668        766.000
             24.000      33224.684        808.000
             25.000      34101.653        829.000
             26.000      35314.130       1165.000
             27.000      36619.810       1260.000
             28.000      37353.368        684.000
             29.000      38151.635        747.000
             30.000      39244.924       1048.000
             31.000      39919.396        629.000
        
Program 2:
     
    . Step 1: READ EXCEL does not search Dataplot sub-directories, so need
    .         to pre-pend the appropriate sub-directory to Excel file name.
    .
    probe path
    let string ipath = ^probestr
    probe iopsy1
    if probeval = 2
       let string islash = \
    else
       let string islash = /
    end of if
    let string subdir = data
    let ipath = string concatenate ipath subdir islash
    .
    let string fname = TEST.XLSX
    let fname = string concatenate ipath fname
    .
    . Step 2: Read the Excel file
    .
    set excel start row 1
    set excel stop  row 77
    read excel ^fname rowid labid y1 y2
    .
    . Step 3: Print the first few lines of the Excel file
    .
    set write decimals 2
    print rowid labid y1 y2 for i = 1 1 10
        
     ------------------------------------------------------------
               ROWID          LABID             Y1             Y2
     ------------------------------------------------------------
                0.00      101014.00          10.25           5.92
                1.00      101031.00          16.62           7.00
                2.00      101032.00          13.75           5.75
                3.00      101048.00          10.25           3.57
                4.00      101125.00           7.10           4.37
                5.00      101130.00          10.00           4.60
                6.00      101143.00          15.13           5.80
                7.00      101147.00           8.35           3.80
                8.00      101151.00           7.37           5.00
                9.00      101165.00           8.15           5.10
      
        
Program 3:
     
    set read excel method power shell
    set excel start row 3
    set excel stop row 77
    . set excel start column a
    . set excel stop column c
    set excel start column 1
    set excel stop column 3
    read excel "C:\Program Files (x86)\NIST\DATAPLOT\TEST.XLSX" labid y1 y2
        
Date created: 02/20/2020
Last updated: 01/20/2026

Please email comments on this WWW page to [email protected].