`cape.attdb.ftypes.textdata`: Generic textual data interface¶

This module contains a basic interface in the spirit of cape.attdb.ftypes for standard text data files. It creates a class, TextDataFile, that does not rely on the popular numpy.loadtxt() function and supports more capabilities than the cape.attdb.ftypes.csv.CSVFile class.

For example, the TextDataFile class supports a variety of delimiters, whereas a CSVFile instance must use ',' as the delimiter. The TextDataFile class also remembers its text: the lines read from the file are kept in db.lines.

If possible, the column names (which become keys in the dict-like class) are read from the header row. If the file begins with multiple comment lines, the column names are read from the final comment before the beginning of data.
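
The header rule above can be illustrated with a minimal, standalone sketch. It does not use the cape package; the function name guess_header_cols, the "#" comment character, and the choice to skip blank lines are assumptions made only for this illustration.

```python
def guess_header_cols(lines, comment="#", delim=","):
    """Return column names from the last comment line before the data.

    Hypothetical helper; not part of cape.attdb.ftypes.textdata.
    """
    last_comment = None
    for line in lines:
        stripped = line.strip()
        if not stripped:
            # Assumption: blank lines are ignored while scanning the header
            continue
        if stripped.startswith(comment):
            # Keep overwriting so the final comment line wins
            last_comment = stripped
        else:
            # First non-comment line marks the beginning of the data
            break
    if last_comment is None:
        return None
    body = last_comment.lstrip(comment)
    return [part.strip() for part in body.split(delim)]


sample = [
    "# Wind tunnel results",
    "# mach, alpha, CN",
    "0.8, 2.0, 0.15",
]
cols = guess_header_cols(sample)
```

Here cols holds the names from the second comment line, since that is the last comment before the first data row.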

class cape.attdb.ftypes.textdata.TextDataDefn(_optsdict=None, _warnmode=1, **kw)¶

class cape.attdb.ftypes.textdata.TextDataFile(fname=None, **kw)¶

Interface to generic data text files

Call:

>>> db = TextDataFile(fname=None, **kw)

Inputs:

fname: str: Name of file to read
delim, Delimiter: {", "} | str: Delimiter(s) option

Outputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
db.cols: list[str]: List of columns read
db.lines: list[str]: Lines of text from the file that was read
db.opts: TextDataOpts: Options for this instance
db.defns: dict[TextDataDefn]: Definitions for each column
db[col]: np.ndarray | list: Numeric array or list of strings for each column

Versions:

2019-12-02 @ddalle: v1.0

finish_defns()¶

Process Definitions of column types

Call:

>>> db.finish_defns(**kw)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Data file interface

Versions:

2014-06-05 @ddalle: v1.0
2014-06-17 @ddalle: Read from defns dict
2019-11-12 @ddalle: Forked from RunMatrix
2020-02-06 @ddalle: Using self.opts

fromtext_boolmap(txt, col)¶

Convert boolean flag text to dictionary

Call:

>>> v, vmap = db.fromtext_boolmap(txt, col)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
txt: str: Flag text to be converted
col: str: Name of flag column, for "boolmap" keys

Outputs:

v: str: Text returned
vmap: dict[True | False]: Flags for each flag in col definition

Versions:

2019-12-02 @ddalle: v1.0
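
As a rough illustration of the idea, the sketch below maps one flag entry to a dictionary of booleans. It is not the library's implementation: the function boolmap_from_text and the example map (keys "PASS" and "ERROR" with their abbreviations) are hypothetical, and the map is passed in directly rather than looked up from a column definition.

```python
def boolmap_from_text(txt, bmap):
    """Return the stripped flag text and a True/False entry per flag name.

    Hypothetical helper; *bmap* maps each boolean name to its abbreviations.
    """
    flag = txt.strip()
    # A name is True when the text matches one of its abbreviations
    vmap = {name: flag in abbrevs for name, abbrevs in bmap.items()}
    return flag, vmap


# Assumed example map, in the dict[list[str]] form
bmap = {"PASS": ["p", "P"], "ERROR": ["e", "E"]}
v, vmap = boolmap_from_text(" p ", bmap)
```

With this input, v is the stripped text and vmap marks only "PASS" as True.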

fromtext_val(txt, clsname, col=None)¶

Convert a string to appropriate type

Call:

>>> v = db.fromtext_val(txt, clsname, col)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
txt: str: Text to be converted to the type named by clsname
clsname: {"float64"} | "int32" | str: Valid data type name
col: str: Name of flag column, for "boolmap" keys

Outputs:

v: clsname: Text translated to requested type

Versions:

2019-12-02 @ddalle: v1.0
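
A simplified picture of name-based conversion is sketched below. The helper text_to_value is hypothetical, and it uses plain Python types as stand-ins for the named data types (so "int32" becomes a Python int); the actual method may use NumPy types and handles "boolmap" columns, which this sketch omits.

```python
def text_to_value(txt, clsname="float64"):
    """Convert *txt* according to the family of the type name *clsname*.

    Hypothetical helper using built-in Python types as stand-ins.
    """
    if clsname.startswith("int"):
        return int(txt)
    if clsname.startswith("float"):
        return float(txt)
    if clsname.startswith("complex"):
        return complex(txt)
    # Anything else is returned as text
    return txt


n = text_to_value("3", "int32")
x = text_to_value("1.5")
s = text_to_value("run01", "str")
```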

process_defns_boolmap(col, bmap)¶

Process definitions for columns of type BoolMap

Call:

>>> db.process_defns_boolmap(col, bmap)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Data file interface
col: str: Name of column with type "BoolMap"
bmap: dict: Map for abbreviations that set boolean columns

See Also:

validate_boolmap()

Versions:

2019-12-03 @ddalle: v1.0

read_textdata(fname)¶

Read an entire text data file

Call:

>>> db.read_textdata(fname)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
fname: str: Name of file to read

See Also:

read_textdata_header()
read_textdata_data()

Versions:

2019-12-02 @ddalle: v1.0

read_textdata_data(f)¶

Read data portion of text data file

Call:

>>> db.read_textdata_data(f)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
f: file: Open file handle

Effects:

db.cols: list[str]: List of column names

Versions:

2019-11-25 @ddalle: v1.0

read_textdata_firstrowtypes(f)¶

Get initial guess at data types from first data row

If (and only if) the DefaultType input is an integer type, guessed types can be integers. Otherwise the sequence of possibilities is float, complex, str.

Call:

>>> db.read_textdata_firstrowtypes(f, **kw)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
f: file: Open file handle
DefaultType: {"float"} | str: Name of default class

Versions:

2019-11-25 @ddalle: v1.0
2019-12-02 @ddalle: Copied from CSVFile
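
The type-guessing order described above can be sketched as follows. The function guess_type and its return values are assumptions for illustration; the real method reads from an open file handle and stores its guesses on the instance, which this sketch does not attempt.

```python
def guess_type(txt, default_type="float"):
    """Guess a type name for one entry of the first data row.

    Hypothetical helper: integers are only considered when the default
    type is an integer type; otherwise try float, then complex, then str.
    """
    if default_type.startswith("int"):
        try:
            int(txt)
            return default_type
        except ValueError:
            pass
    for name, converter in (("float", float), ("complex", complex)):
        try:
            converter(txt)
            return name
        except ValueError:
            continue
    return "str"


first_row = [part.strip() for part in "3, 2.5, 1+2j, run01".split(",")]
types_float = [guess_type(part) for part in first_row]
types_int = [guess_type(part, "int32") for part in first_row]
```

Note that "3" is guessed as a float unless the default type is an integer type, which matches the rule stated above.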

read_textdata_header(f)¶

Read column names from beginning of open file

Call:

>>> db.read_textdata_header(f)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
f: file: Open file handle

Effects:

db.cols: list[str]: List of column names

Versions:

2019-11-12 @ddalle: v1.0

read_textdata_headerdefaultcols(f)¶

Create column names “col1”, “col2”, etc. if needed

Call:

>>> db.read_textdata_headerdefaultcols(f)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
f: file: Open file handle

Effects:

db.cols: list[str]: If not previously determined, this becomes ["col1", "col2", ...] based on number of columns in the first data row

Versions:

2019-11-27 @ddalle: v1.0
2019-12-02 @ddalle: Copied from CSVFile
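
For illustration only, default names of this form can be generated from the first data row as below. The helper default_cols and the "," delimiter are assumptions, not part of the module.

```python
def default_cols(first_row, delim=","):
    """Build ["col1", "col2", ...] from the number of entries in a row."""
    ncol = len(first_row.split(delim))
    return ["col%i" % (i + 1) for i in range(ncol)]


cols = default_cols("0.8, 2.0, 0.15")
```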

read_textdata_headerline(f)¶

Read line and process column names if possible

Call:

>>> db.read_textdata_headerline(f)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
f: file: Open file handle

Effects:

db.cols: None | list[str]: List of column names if read
db._textdata_header_once: True | False: Set to True if column names are read at all
db._textdata_header_complete: True | False: Set to True if next line is expected to be data

Versions:

2019-11-22 @ddalle: v1.0
2019-12-02 @ddalle: Copied from CSVFile

read_textdata_line(f)¶

Read a data row from a text data file

Call:

>>> db.read_textdata_line(f)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
f: file: Open file handle

Versions:

2019-11-25 @ddalle: v1.0

set_regex_linesplitter()¶

Generate regular expression used to split a line

Call:

>>> db.set_regex_linesplitter()

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface

Effects:

db.regex_linesplit: re.SRE_Pattern: Compiled regular expression object

Versions:

2019-12-02 @ddalle: v1.0

split_textdata_line(line)¶

Split a line into its parts

Splits a line of text by the specified delimiter and strips whitespace and delimiters from each entry.

Call:

>>> parts = db.split_textdata_line(line)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
line: str: Line of text to be split

Outputs:

parts: list[str]: List of strings

Versions:

2019-12-02 @ddalle: v1.0
2024-01-10 @ddalle: v1.1; allow whitespace in cols
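
One way to picture regex-based splitting is the standalone sketch below. It is not the pattern the class actually compiles: make_linesplitter and split_line are hypothetical names, and the sketch assumes a single delimiter string with optional whitespace on either side, so whitespace inside an entry is preserved.

```python
import re


def make_linesplitter(delim=","):
    """Compile a pattern matching *delim* plus any surrounding whitespace."""
    return re.compile(r"\s*" + re.escape(delim) + r"\s*")


def split_line(line, delim=","):
    """Split *line* on *delim* and strip whitespace from each entry."""
    pattern = make_linesplitter(delim)
    # Strip the line first so the newline and edge spaces do not
    # end up in the first or last entry
    return pattern.split(line.strip())


parts = split_line("0.8, 2.0 ,0.15\n")
named = split_line("run 01; case A", delim=";")
```

Because re.escape() is applied to the delimiter, characters such as "|" or "." can be used as delimiters without being interpreted as regular expression syntax.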

validate_boolmap(boolmap)¶

Translate free-form Type option into validated code

Call:

>>> bmap = db.validate_boolmap(boolmap)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Data file interface
boolmap: dict[str | list]: Initial boolean flag map; the keys are names of the boolean coefficients that are set, and the item values are the one or more abbreviations for each key

Outputs:

bmap: dict[list[str]]: Validated map

Versions:

2019-12-03 @ddalle: v1.0
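
The kind of normalization described here, where each value ends up as a list of abbreviations, might look like the sketch below. The function normalize_boolmap and its error handling are assumptions for illustration; the checks performed by the actual method are not documented on this page.

```python
def normalize_boolmap(boolmap):
    """Return a copy of *boolmap* in which every value is a list of str.

    Hypothetical helper; raises TypeError for unsupported values.
    """
    bmap = {}
    for key, abbrevs in boolmap.items():
        if isinstance(abbrevs, str):
            # A single abbreviation becomes a one-item list
            abbrevs = [abbrevs]
        elif not isinstance(abbrevs, (list, tuple)):
            raise TypeError("abbreviations for %r must be str or list" % key)
        for abbrev in abbrevs:
            if not isinstance(abbrev, str):
                raise TypeError("abbreviation %r is not a str" % (abbrev,))
        bmap[key] = list(abbrevs)
    return bmap


bmap = normalize_boolmap({"PASS": "p", "ERROR": ["e", "E"]})
```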

write_textdata(fname=None)¶

Write text data file based on existing db.lines

No check is performed that the values in, e.g., db[col] have been synchronized with the text in db.lines. It is therefore possible to write a file that does not match the values in the database. To avoid this, use set_colval().

Call:

>>> db.write_textdata()
>>> db.write_textdata(fname)

Inputs:

db: cape.attdb.ftypes.textdata.TextDataFile: Text data file interface
fname: {db.fname} | str: Name of file to write

Versions:

2019-12-04 @ddalle: v1.0
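
The synchronization caveat can be demonstrated without the cape package. In the sketch below, the variables lines and values are plain stand-ins for db.lines and db[col], and the file name out.dat is arbitrary; writing from the stored lines ignores a value that was changed only in the dictionary.

```python
import os
import tempfile

# Stand-ins for db.lines and the parsed column values
lines = ["# mach, alpha\n", "0.8, 2.0\n"]
values = {"mach": [0.8], "alpha": [2.0]}

# Change a value without updating the stored text
values["alpha"][0] = 4.0

with tempfile.TemporaryDirectory() as tmpdir:
    fname = os.path.join(tmpdir, "out.dat")
    # Write the file from the stored lines only
    with open(fname, "w") as fp:
        fp.writelines(lines)
    with open(fname) as fp:
        written = fp.read()
```

The text in written still contains the original alpha entry, which is why edits should go through a method that updates both the values and the stored lines.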

class cape.attdb.ftypes.textdata.TextDataOpts(_optsdict=None, _warnmode=1, **kw)¶
