|
READ EXCELName:
Dataplot does not natively support reading or writing to Excel format files. The READ EXCEL command works as follows
This command assumes that Python and the Python package Pandas are already installed on your local platform. Dataplot does not check if Python is installed and it does not initiate the Python installation if it is not already installed. As Python is used by many popular applications on Linux platforms, most Linux platforms will already have Python installed. However, this is not the case for Windows and MacOS platforms. If you need to install Python, there are a number of Python distributions (see https://wiki.python.org/moin/PythonDistributions). However, the most common are ActivePython from ActiveState and Anaconda from Continuum Analytics. Dataplot does not depend on a specific Python distribution and we make no recommendation for the preferred distribution. This command will typically work well with Excel files that are clean rectangular data sets. If you have added graphs, formulas, cross-tabulations, and so on to the Excel sheet, then the READ EXCEL command is likely to be unsatisfactory. You can either copy the data portion of the spreadsheet to a new sheet or use one of the other methods for reading data from spreadsheets listed below. Broman and Woo (see References below) wrote a useful article in The American Statistician that provides useful guidelines for creating Excel files that can be more easily read by non-Excel programs. If you infrequently need to read Excel files and do not already have Python installed on your local platform, there are several additional methods for reading the data from Excel files. However, if you anticipate the need for frequently reading Excel files, then going through the Python installation is probably worth the effort. In addition to READ EXCEL, the following methods can also be used to read data from Excel files.
<SUBSET/EXCEPT/FOR qualification> where <fname> is the name of the Excel file; <variable list> is the list of variables to read; and where the <SUBSET/EXCEPT/FOR qualification> is optional. Note that Pandas adds a "row index" variable as the first column in the CSV file. Be sure to add this variable to <variable list>.
Also, Dataplot will not search for the Excel file in the Dataplot auxiliary files directories as it does for the READ for ASCII files. Note: The 2020/05 version of Dataplot now supports specifying the first and last rows of the Excel file to read with the following commands
SET EXCEL STOP ROW <value> Python/Pandas start row numbers with 0. Also, the first line read is assumed to contain the variable names. So if the first good line is row 3 in the Excel file, use
Currently you need to specify both the START ROW and the STOP row. Otherwise, Pandas will include all rows in the Excel sheet. One use of this is to skip over header lines in the Excel file.
To reset the default of a header line, enter the command
This command was added 2021/07. The "read_python.py" script was also updated at this time to support this option.
There are several different Python distributions. The appropriate Python path will depend on the specific distribution you used to install Python and whether you choose to install it for a single user or for all users.
Chen (2018), "Pandas for Everyone: Python Data Analysis," O'Reilly. Broman and Woo (2018), "Data Organization in Spreadsheets," The American Statistician, Vol. 72, No. 1, pp. 2-10.
2020/05: Added SET EXCEL START ROW and SET EXCEL STOP ROW 2021/07: Added SET EXCEL HEADER dimension 40 columns . Path name for Windows . read excel "C:\Program Files (x86)\NIST\DATAPLOT\LIB\DATA\FIBERS.XLSX" ... . rowid breakid xcent ycent xc1 yc1 xc2 yc2 xc3 yc3 xc4 yc4 length . Path name for Linux read excel /usr/local/lib/data/FIBERS.XLSX ... rowid breakid xcent ycent xc1 yc1 xc2 yc2 xc3 yc3 xc4 yc4 length set write decimals 3 print breakid xcent lengthThe following output is generated --------------------------------------------- BREAKID XCENT LENGTH --------------------------------------------- 1.000 3924.447 0.000 2.000 7230.652 3260.000 3.000 7825.889 556.000 4.000 9600.147 1728.000 5.000 11662.307 2013.000 6.000 12588.740 880.000 7.000 13559.739 927.000 8.000 14799.128 1194.000 9.000 15984.171 1141.000 10.000 16951.411 922.000 11.000 19179.848 2177.000 12.000 20234.844 1009.000 13.000 21569.521 1287.000 14.000 22745.419 1133.000 15.000 23840.270 1052.000 16.000 24850.845 962.000 17.000 25998.789 1103.000 18.000 26910.859 867.000 19.000 28470.836 1517.000 20.000 29140.799 623.000 21.000 30514.016 1327.000 22.000 31554.430 995.000 23.000 32368.668 766.000 24.000 33224.684 808.000 25.000 34101.653 829.000 26.000 35314.130 1165.000 27.000 36619.810 1260.000 28.000 37353.368 684.000 29.000 38151.635 747.000 30.000 39244.924 1048.000 31.000 39919.396 629.000Program 2: . Step 1: READ EXCEL does not search Dataplot sub-directories, so need . to pre-pend the appropriate sub-directory to Excel file name. . probe path let string ipath = ^probestr probe iopsy1 if probeval = 2 let string islash = \ else let string islash = / end of if let string subdir = data let ipath = string concatenate ipath subdir islash . let string fname = TEST.XLSX let fname = string concatenate ipath fname . . Step 2: Read the Excel file . set excel start row 1 set excel stop row 77 read excel ^fname rowid labid y1 y2 . . Step 3: Print the first few lines of the Excel file . set write decimals 2 print rowid labid y1 y2 for i = 1 1 10 ------------------------------------------------------------ ROWID LABID Y1 Y2 ------------------------------------------------------------ 0.00 101014.00 10.25 5.92 1.00 101031.00 16.62 7.00 2.00 101032.00 13.75 5.75 3.00 101048.00 10.25 3.57 4.00 101125.00 7.10 4.37 5.00 101130.00 10.00 4.60 6.00 101143.00 15.13 5.80 7.00 101147.00 8.35 3.80 8.00 101151.00 7.37 5.00 9.00 101165.00 8.15 5.10
|
Privacy
Policy/Security Notice
NIST is an agency of the U.S.
Commerce Department.
Date created: 02/20/2020 |