cape.attdb.ftypes.xlsfile
: Excel spreadsheet data interface¶
This module provides a class XLSFile
for reading and writing
tabular data from Excel-like spreadsheets with the file extension
.xls
or .xlsx
. It relies on two third-party libraries readily
available from the Python Package Index (PyPI):
xlrd
for reading spreadsheets
xlsxwriter
for writing them
These can be readily installed on any machine with both Python and access to the internet (even without elevated privileges) using the commands below:
$ pip install --user xlrd $ pip install --user xlsxwriter
Because CAPE may also be used on machines without regular access to the
internet, this module does not raise an ImportError
in the case that
these third-party modules are not available. However, the module will
provide no functionality if these modules are not available.
- class cape.attdb.ftypes.xlsfile.XLSFile(fname=None, sheet=None, **kw)¶
Class for reading
.xls
and.xlsx
files- Call:
>>> db = XLSFile(fname, sheet=0, **kw) >>> db = XLSFile(wb, sheet=0, **kw) >>> db = XLSFile(ws, **kw)
- Inputs:
- fname:
str
Name of
.xls
or.xlsx
file to read- sheet: {
0
} |int
|str
Worksheet name or number
- wb:
xlrd.book.Book
Open workbook (spreadsheet file)
- ws:
xlrd.sheet.Sheet
Direct access to a worksheet
- fname:
- Outputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- db.cols:
list
[str
] List of columns read
- db.opts:
dict
Options for this interface
- db[col]:
np.ndarray
|list
Numeric array or list of strings for each column
- db:
- Versions:
2019-12-12
@ddalle
: First version
- get_autoskip(ws, **kw)¶
Automatically determine number of rows and columns to skip
- Call:
>>> wsopts = db.get_autoskip(ws, **kw)
- Inputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- ws:
xlrd.sheet.Sheet
Direct access to a worksheet
- skiprows: {
None
} |int
>= 0 Number of rows to skip before reading data
- subcols: {
0
} |int
>= 0 Number of cols to skip right of first col (for arrays)
- subrows: {
0
} |int
>= 0 Number of rows below header row to skip
- skipcols: {
None
} |int
>= 0 Number of columns to skip before first data column
- maxcols: {
None
} |int
> skipcols Maximum column number of data
- db:
- Outputs:
- wsopts:
XLSSheetOpts
Options from db.opts_by_sheet[ws.name],
- wsopts:
- Versions:
2019-12-26
@ddalle
: Split fromread_xls_header()
2020-01-14
@ddalle
: Moved everything to smaller funcs2020-02-07
@ddalle
: UsingXLSSheetOpts
- get_worksheet_opts(sheet, **kw)¶
Get or create specific options for each worksheet
- Call:
>>> opts = db.get_worksheet_opts(sheet, **kw)
- Inputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- sheet:
str
Name of worksheet in question
- kw:
dict
Keyword options valid to db._optscls
- db:
- Outputs:
- opts:
XLSFileOpts
| db._optscls Combined options from db.opts and kw
- opts:
- Effects:
- db.opts_by_sheet[sheet]:
XLSFileOpts
Set to opts
- db.opts_by_sheet[sheet]:
- Versions:
2020-02-06
@ddalle
: First version
- read_xls(fname, sheet=None, **kw)¶
Read an
.xls
or.xlsx
file- Call:
>>> db.read_xls(fname, sheet, **kw) >>> db.read_xls(wb, sheet, **kw) >>> db.read_xls(wb, **kw) >>> db.read_xls(ws, **kw)
- Inputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- fname:
str
Name of
.xls
or.xlsx
file to read- sheet: {
0
} |int
|str
Worksheet name or number
- wb:
xlrd.book.Book
Open workbook (spreadsheet file)
- ws:
xlrd.sheet.Sheet
Direct access to a worksheet
- db:
- Versions:
2019-12-12
@ddalle
: First version2019-12-26
@ddalle
: Support “array” worksheets2020-02-07
@ddalle
: UsingXLSSheetOpts
- read_xls_coldata(ws, cols, **kw)¶
Read column data from one
.xls
or.xlsx
worksheet- Call:
>>> db.read_xls_coldata(ws, cols, **kw)
- Inputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- ws:
xlrd.sheet.Sheet
Direct access to a worksheet
- cols:
list
[str
] List of column names to process
- skiprows: {
None
} |int
>= 0 Number of rows to skip before reading data
- subrows: {
0
} |int
> 0 Number of rows below header row to skip
- skipcols: {
None
} |int
>= 0 Number of columns to skip before first data column
- maxcols: {
None
} |int
> skipcols Maximum column number of data
- db:
- Effects:
- db[col]:
list
|np.ndarray
Column of data read from each column
- db._n[col]:
int
Length of each column
- db[col]:
- Versions:
2019-12-12
@ddalle
: First version
- read_xls_header(ws, **kw)¶
Read header row from a worksheet
- Call:
>>> cols = db.read_xls_header(ws, **kw)
- Inputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- ws:
xlrd.sheet.Sheet
Direct access to a worksheet
- colspec: {
None
} |list
List of column names col or
(col, colwidth)
; using this option preempts reading column names from header row- skiprows: {
None
} |int
>= 0 Number of rows to skip before reading data
- subrows: {
None
} |int
>= 0 Number of rows below header row to skip
- skipcols: {
None
} |int
>= 0 Number of columns to skip before first data column
- maxcols: {
None
} |int
> skipcols Maximum column number of data
- db:
- Outputs:
- cols:
list
[str
] List of column names if read
- cols:
- Versions:
2019-12-12
@ddalle
: First version2020-01-16
@ddalle
: Full scalar/array support2020-02-07
@ddalle
: UsingXLSSheetOpts
- read_xls_workbook(wb, **kw)¶
Read
.xls
or.xlsx
workbook with multiple worksheets- Call:
>>> db.read_xls_workbook(wb, **kw)
- Inputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- wb:
xlrd.Book
Direct access to a workbook
- ndim: {
0
} |1
Dimensionality of one row of data column(s) to read
- skiprows: {
None
} |int
>= 0 Number of rows to skip before reading data
- subrows: {
None
} |int
>= 0 Number of rows below header row to skip
- skipcols: {
None
} |int
>= 0 Number of columns to skip before first data column
- maxrows: {
None
} |int
> skiprows Maximum row number of data
- maxcols: {
None
} |int
> skipcols Maximum column number of data
- db:
- Versions:
2020-01-08
@jmeeroff
: First version2020-02-07
@ddalle
: UsingXLSSheetOpts
- read_xls_worksheet(ws, **kw)¶
Read one worksheet of an
.xls
or.xlsx
file- Call:
>>> db.read_xls_worksheet(ws, **kw)
- Inputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- ws:
xlrd.sheet.Sheet
Direct access to a worksheet
- db:
- Versions:
2019-12-12
@ddalle
: First version2019-12-26
@ddalle
: Support “array” worksheets2020-01-10
@jmeeroff
: Array read as fallback2020-01-16
@ddalle
: Unified two worksheet methods2020-02-07
@ddalle
: UsingXLSSheetOpts
- write_xls(fname, cols=None, **kw)¶
Write data to one or more worksheets
- Call:
>>> db.write_xls(fname, cols=None, **kw) >>> db.write_xls(wb, cols=None, **kw)
- Inputs:
- db:
cape.attdb.ftypes.xlsfile.XLSFile
XLS file interface
- fname: {db.fname} |
str
Name of XLS file to create
- wb:
xlsxwriter.workbook.Workbook
Workbook (write) interface
- cols:
list
[str
] List of data columns to write
- SheetNames | sheets:
list
[str
] List of worksheets to write
- SheetCols | sheetcols:
dict
[list
] List of columns to write to each worksheet
- SheetWriters:
dict
[callable] Handles to standalone writers for some worksheets
- SheetPreWriters:
dict
[callable] Functions to write some things to worksheet before also writing data columns
- SheetPostWriters:
dict
[callable] Functions to write to worksheet after writing data
- SheetWritersSelfArg:
list
|set
List/set of sheets whose Writers take db as an arg
- SheetWritersWorksheetArg:
list
|set
List/set of sheets whose Writers take worksheet instead of workbook as arg
- SkipCols: {
0
} |int
> 0 Number of columns to skip before writing
- SkipRows: {
0
} |int
> 0 Number of rows to skip before writing
- ColMasks: {
{}
} |dict
Masks (subset indices) for some cols
- Translators: {
{}
} |dict
Column name translators (applied in reverse)
- Prefix: {
None
} |dict
|str
Prefix (applied in reverse)
- Suffix: {
None
} |dict
|str
Suffix (applied in reverse)
- db:
- Versions:
2020-05-21
@ddalle
: First version
- class cape.attdb.ftypes.xlsfile.XLSFileDefn(_optsdict=None, _warnmode=1, **kw)¶
- class cape.attdb.ftypes.xlsfile.XLSFileOpts(_optsdict=None, _warnmode=1, **kw)¶
- class cape.attdb.ftypes.xlsfile.XLSSheetOpts(_optsdict=None, _warnmode=1, **kw)¶