cape.dkit.textdata: Generic textual data interface
This module contains a basic interface in the spirit of
cape.dkit.ftypes for standard text data files. It creates a
class, TextDataFile that does not rely on the popular
numpy.loadtxt() function and supports more capabilities than
the cape.dkit.ftypes.csv.CSVFile class.
For example, the TextDataFile class supports a variety of
delimiters, whereas a CSVFile instance must use ',' as the
delimiter. The TextDataFile class also remembers its text, keeping the lines it read in db.lines.
If possible, the column names (which become keys in the
dict-like class) are read from the header row. If the file
begins with multiple comment lines, the column names are read from the
final comment before the beginning of data.
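The header rule described above can be sketched with the standard library alone. This is only an illustration of the idea, not the module's code: the function name read_header_cols, the "#" comment character, the "," delimiter, and the sample text are all assumptions made for the example.

```python
import io


def read_header_cols(f, comment="#", delim=","):
    """Return column names from the last comment line before the data.

    Hypothetical helper; leaves the file positioned at the first data row.
    """
    last_comment = None
    while True:
        pos = f.tell()
        line = f.readline()
        if line == "":
            # End of file reached without finding a data row
            break
        stripped = line.strip()
        if stripped == "":
            # Skip blank lines
            continue
        if stripped.startswith(comment):
            # Remember the most recent comment line and keep going
            last_comment = stripped
            continue
        # First data row: rewind so the caller can read it
        f.seek(pos)
        break
    if last_comment is None:
        return []
    body = last_comment.lstrip(comment).strip()
    return [name.strip() for name in body.split(delim)]


sample = io.StringIO(
    "# Results from a hypothetical sweep\n"
    "# mach, alpha, CL\n"
    "0.8, 2.0, 0.31\n"
)
cols = read_header_cols(sample)
```

Here the first comment line is ignored and the names come from the second, because it is the final comment before the data begins.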
- class cape.dkit.textdata.TextDataDefn(_optsdict=None, _warnmode=1, **kw)
- class cape.dkit.textdata.TextDataFile(fname=None, **kw)
Interface to generic data text files
- Call:
>>> db = TextDataFile(fname=None, **kw)
- Inputs:
- Outputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Text data file interface
- db.cols:
list[str] List of columns read
- db.lines:
list[str] Lines of text from the file that was read
- db.opts:
TextDataOpts Options for this instance
- db.defns:
dict[TextDataDefn] Definitions for each column
- db[col]:
np.ndarray|list Numeric array or list of strings for each column
- Versions:
2019-12-02
@ddalle: v1.0
- finish_defns()
Process Definitions of column types
- Call:
>>> db.finish_defns(**kw)
- Inputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Data file interface
- Versions:
2014-06-05
@ddalle: v1.0
2014-06-17
@ddalle: Read from defns dict
2019-11-12
@ddalle: Forked from RunMatrix
2020-02-06
@ddalle: Using self.opts
- fromtext_boolmap(txt, col)
Convert boolean flag text to dictionary
- Call:
>>> v, vmap = db.fromtext_boolmap(txt, col)
- Inputs:
- Outputs:
- Versions:
2019-12-02
@ddalle: v1.0
- fromtext_val(txt, clsname, col=None)
Convert a string to appropriate type
- Call:
>>> v = db.fromtext_val(txt, clsname, col)
- Inputs:
- Outputs:
- v:
clsname Text translated to requested type
- Versions:
2019-12-02
@ddalle: v1.0
- process_defns_boolmap(col, bmap)
Process definitions for columns of type BoolMap
- Call:
>>> db.process_defns_boolmap(col, bmap)
- Inputs:
- See Also:
- Versions:
2019-12-03
@ddalle: v1.0
- read_textdata(fname)
Read an entire text data file
- Call:
>>> db.read_textdata(fname)
- Inputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Text data file interface
- fname:
str Name of file to read
- See Also:
- Versions:
2019-12-02
@ddalle: v1.0
- read_textdata_data(f)
Read data portion of text data file
- read_textdata_firstrowtypes(f)
Get initial guess at data types from first data row
If (and only if) the DefaultType input is an integer type, guessed types can be integers. Otherwise the sequence of possibilities is float, complex, str.
- Call:
>>> db.read_textdata_firstrowtypes(f, **kw)
- Inputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Text data file interface
- f:
file Open file handle
- DefaultType:
{"float"} | str Name of default class
- Versions:
2019-11-25
@ddalle: v1.0
2019-12-02
@ddalle: Copied from CSVFile
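The type-guessing order described for read_textdata_firstrowtypes() can be illustrated with a small stand-alone function. This is a sketch under assumptions, not the method's implementation: the name guess_type and the rule that an "integer type" is any class name starting with "int" are invented for the example.

```python
def guess_type(txt, default_type="float"):
    """Guess a class name for one text entry (hypothetical helper)."""
    candidates = []
    # Integers are only considered when the default type is an integer type
    # (assumed here to mean a name that starts with "int")
    if default_type.startswith("int"):
        candidates.append(("int", int))
    # Otherwise the sequence of possibilities is float, complex, str
    candidates += [("float", float), ("complex", complex)]
    for name, cls in candidates:
        try:
            cls(txt)
        except ValueError:
            # Conversion failed; try the next candidate
            continue
        return name
    # Nothing numeric worked, so treat the entry as a string
    return "str"


first_row = ["3", "2.5", "1+2j", "abc"]
types_float = [guess_type(v) for v in first_row]
types_int = [guess_type(v, "int32") for v in first_row]
```

With the default "float", the entry "3" is guessed as a float; it is guessed as an integer only when an integer default such as "int32" is given.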
- read_textdata_header(f)
Read column names from beginning of open file
- read_textdata_headerdefaultcols(f)
Create column names “col1”, “col2”, etc. if needed
- Call:
>>> db.read_textdata_headerdefaultcols(f)
- Inputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Text data file interface
- f:
file Open file handle
- Effects:
- Versions:
2019-11-27
@ddalle: v1.0
2019-12-02
@ddalle: Copied from CSVFile
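The default naming scheme ("col1", "col2", etc.) is simple enough to show directly. The helper name default_cols and the sample row are assumptions for illustration; how the method itself counts columns is not documented here.

```python
def default_cols(ncol):
    """Return placeholder names "col1", "col2", ... (hypothetical helper)."""
    return ["col%i" % (i + 1) for i in range(ncol)]


# Count the entries in a sample first data row, assuming "," as delimiter
first_row = "0.8, 2.0, 0.31"
cols = default_cols(len(first_row.split(",")))
```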
- read_textdata_headerline(f)
Read line and process column names if possible
- Call:
>>> db.read_textdata_headerline(f)
- Inputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Text data file interface
- f:
file Open file handle
- Effects:
- Versions:
2019-11-22
@ddalle: v1.0
2019-12-02
@ddalle: Copied from CSVFile
- read_textdata_line(f)
Read a data row from a text data file
- Call:
>>> db.read_textdata_line(f)
- Inputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Text data file interface
- f:
file Open file handle
- Versions:
2019-11-25
@ddalle: v1.0
- set_regex_linesplitter()
Generate regular expression used to split a line
- Call:
>>> db.set_regex_linesplitter()
- Inputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Text data file interface
- Effects:
- db.regex_linesplit:
re.SRE_Pattern Compiled regular expression object
- Versions:
2019-12-02
@ddalle: v1.0
- split_textdata_line(line)
Split a line into its parts
Splits line of text by specified delimiter and strips whitespace and delimiter from each entry
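One way to split a line on a delimiter while stripping surrounding whitespace is a compiled regular expression, in the spirit of db.regex_linesplit. The sketch below is an assumption about how such a splitter could look, not the pattern the class actually compiles; make_linesplitter and split_line are hypothetical names.

```python
import re


def make_linesplitter(delim=","):
    """Compile a pattern matching the delimiter and any whitespace around it."""
    return re.compile(r"\s*" + re.escape(delim) + r"\s*")


def split_line(line, regex):
    """Split one line of text into its stripped entries."""
    # Strip the newline and outer whitespace first, then split
    return regex.split(line.strip())


comma = make_linesplitter(",")
parts = split_line("0.8, 2.0 ,0.31\n", comma)

semicolon = make_linesplitter(";")
parts_semi = split_line("  a ; b;c  \n", semicolon)
```

Because the whitespace is part of the match, no separate strip is needed for each entry.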
- validate_boolmap(boolmap)
Translate free-form Type option into validated code
- Call:
>>> bmap = db.validate_boolmap(boolmap)
- Inputs:
- Outputs:
- Versions:
2019-12-03
@ddalle: v1.0
- write_textdata(fname=None)
Write text data file based on existing db.lines
No check is made that the values in, e.g., db[col] have been synchronized with the text in db.lines. It is therefore possible to write a file that does not match the values in the database. To avoid this, use set_colval().
- Call:
>>> db.write_textdata()
>>> db.write_textdata(fname)
- Inputs:
- db:
cape.dkit.ftypes.textdata.TextDataFile Text data file interface
- fname:
{db.fname} | str Name of file to write
- Versions:
2019-12-04
@ddalle: v1.0
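The synchronization caveat for write_textdata() can be seen in a toy model that keeps parsed values and raw lines side by side. Everything below is hypothetical and uses plain Python containers rather than a TextDataFile; set_value_and_line is an invented helper and makes no claim about the signature or behavior of set_colval().

```python
# Toy stand-ins for db.cols, db.lines, and db[col]
cols = ["mach", "alpha"]
lines = ["# mach, alpha\n", "0.8, 2.0\n"]
values = {"mach": [0.8], "alpha": [2.0]}

# Changing only the parsed value leaves the stored text untouched
values["alpha"][0] = 4.0
stale_text = "".join(lines)


def set_value_and_line(col, irow, v, nheader=1, delim=", "):
    """Update a value and rewrite its line of text together."""
    values[col][irow] = v
    parts = [str(values[c][irow]) for c in cols]
    lines[nheader + irow] = delim.join(parts) + "\n"


# Updating both keeps the text consistent with the values
set_value_and_line("alpha", 0, 4.0)
fresh_text = "".join(lines)
```

Writing stale_text would produce a file that still says 2.0 for alpha, which is the mismatch the caveat warns about.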
- class cape.dkit.textdata.TextDataOpts(_optsdict=None, _warnmode=1, **kw)