|
READ EXCELName:
Dataplot does not natively support reading or writing to Excel format files. The READ EXCEL command works as follows
This command assumes that Python and the Python package Pandas are already installed on your local platform. Dataplot does not check if Python is installed and it does not initiate the Python installation if it is not already installed. As Python is used by many popular applications on Linux platforms, most Linux platforms will already have Python installed. However, this is not the case for Windows and MacOS platforms. If you need to install Python, there are a number of Python distributions (see https://wiki.python.org/moin/PythonDistributions). However, the most common are ActivePython from ActiveState and Anaconda from Continuum Analytics. Dataplot does not depend on a specific Python distribution and we make no recommendation for the preferred distribution. This command will typically work well with Excel files that are clean rectangular data sets. If you have added graphs, formulas, cross-tabulations, and so on to the Excel sheet, then the READ EXCEL command is likely to be unsatisfactory. You can either copy the data portion of the spreadsheet to a new sheet or use one of the other methods for reading data from spreadsheets listed below. Broman and Woo (see References below) wrote a useful article in The American Statistician that provides useful guidelines for creating Excel files that can be more easily read by non-Excel programs. If you infrequently need to read Excel files and do not already have Python installed on your local platform, there are several additional methods for reading the data from Excel files. However, if you anticipate the need for frequently reading Excel files, then going through the Python installation is probably worth the effort. In addition to READ EXCEL, the following methods can also be used to read data from Excel files.
<SUBSET/EXCEPT/FOR qualification> where <fname> is the name of the Excel file; <variable list> is the list of variables to read; and where the <SUBSET/EXCEPT/FOR qualification> is optional. Note that Pandas adds a "row index" variable as the first column in the CSV file. Be sure to add this variable to <variable list>.
Also, Dataplot will not search for the Excel file in the Dataplot auxiliary files directories as it does for the READ for ASCII files. Note: The 2020/05 version of Dataplot now supports specifying the first and last rows of the Excel file to read with the following commands
SET EXCEL STOP ROW <value> Python/Pandas start row numbers with 0. Also, the first line read is assumed to contain the variable names. So if the first good line is row 3 in the Excel file, use
Currently you need to specify both the START ROW and the STOP row. Otherwise, Pandas will include all rows in the Excel sheet. One use of this is to skip over header lines in the Excel file.
To reset the default of a header line, enter the command
This command was added 2021/07. The "read_python.py" script was also updated at this time to support this option.
There are several different Python distributions. The appropriate Python path will depend on the specific distribution you used to install Python and whether you choose to install it for a single user or for all users.
Chen (2018), "Pandas for Everyone: Python Data Analysis," O'Reilly. Broman and Woo (2018), "Data Organization in Spreadsheets," The American Statistician, Vol. 72, No. 1, pp. 2-10.
2020/05: Added SET EXCEL START ROW and SET EXCEL STOP ROW 2021/07: Added SET EXCEL HEADER
dimension 40 columns
. Path name for Windows
. read excel "C:\Program Files (x86)\NIST\DATAPLOT\LIB\DATA\FIBERS.XLSX" ...
. rowid breakid xcent ycent xc1 yc1 xc2 yc2 xc3 yc3 xc4 yc4 length
. Path name for Linux
read excel /usr/local/lib/data/FIBERS.XLSX ...
rowid breakid xcent ycent xc1 yc1 xc2 yc2 xc3 yc3 xc4 yc4 length
set write decimals 3
print breakid xcent length
The following output is generated
---------------------------------------------
BREAKID XCENT LENGTH
---------------------------------------------
1.000 3924.447 0.000
2.000 7230.652 3260.000
3.000 7825.889 556.000
4.000 9600.147 1728.000
5.000 11662.307 2013.000
6.000 12588.740 880.000
7.000 13559.739 927.000
8.000 14799.128 1194.000
9.000 15984.171 1141.000
10.000 16951.411 922.000
11.000 19179.848 2177.000
12.000 20234.844 1009.000
13.000 21569.521 1287.000
14.000 22745.419 1133.000
15.000 23840.270 1052.000
16.000 24850.845 962.000
17.000 25998.789 1103.000
18.000 26910.859 867.000
19.000 28470.836 1517.000
20.000 29140.799 623.000
21.000 30514.016 1327.000
22.000 31554.430 995.000
23.000 32368.668 766.000
24.000 33224.684 808.000
25.000 34101.653 829.000
26.000 35314.130 1165.000
27.000 36619.810 1260.000
28.000 37353.368 684.000
29.000 38151.635 747.000
30.000 39244.924 1048.000
31.000 39919.396 629.000
Program 2:
. Step 1: READ EXCEL does not search Dataplot sub-directories, so need
. to pre-pend the appropriate sub-directory to Excel file name.
.
probe path
let string ipath = ^probestr
probe iopsy1
if probeval = 2
let string islash = \
else
let string islash = /
end of if
let string subdir = data
let ipath = string concatenate ipath subdir islash
.
let string fname = TEST.XLSX
let fname = string concatenate ipath fname
.
. Step 2: Read the Excel file
.
set excel start row 1
set excel stop row 77
read excel ^fname rowid labid y1 y2
.
. Step 3: Print the first few lines of the Excel file
.
set write decimals 2
print rowid labid y1 y2 for i = 1 1 10
------------------------------------------------------------
ROWID LABID Y1 Y2
------------------------------------------------------------
0.00 101014.00 10.25 5.92
1.00 101031.00 16.62 7.00
2.00 101032.00 13.75 5.75
3.00 101048.00 10.25 3.57
4.00 101125.00 7.10 4.37
5.00 101130.00 10.00 4.60
6.00 101143.00 15.13 5.80
7.00 101147.00 8.35 3.80
8.00 101151.00 7.37 5.00
9.00 101165.00 8.15 5.10
|
Privacy
Policy/Security Notice
NIST is an agency of the U.S.
Commerce Department.
Date created: 02/20/2020 | |||||||||||||