cape.attdb.ftypes.textdata: Generic textual data interface
This module contains a basic interface in the spirit of
cape.attdb.ftypes
for standard text data files. It creates a
class, TextDataFile,
that does not rely on the popular
numpy.loadtxt()
function and supports more capabilities than
the cape.attdb.ftypes.csv.CSVFile
class.
For example, the TextDataFile
class supports a variety of
delimiters, whereas a CSVFile
instance must use ','
as the
delimiter. The TextDataFile
class also remembers its text: the lines read from the file are kept in db.lines.
If possible, the column names (which become keys in the
dict
-like class) are read from the header row. If the file
begins with multiple comment lines, the column names are read from the
final comment before the beginning of data.
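The header rule above can be illustrated with a minimal standalone sketch. It does not use the cape package; the helper name find_header_cols, the "#" comment character, and the sample lines are assumptions made for illustration only.

```python
import re


def find_header_cols(lines, comment="#", delims=", "):
    """Return column names from the last comment line before the data.

    The comment character and this helper are illustrative assumptions,
    not part of the TextDataFile interface.
    """
    header = None
    for line in lines:
        text = line.strip()
        if not text:
            # Skip blank lines
            continue
        if text.startswith(comment):
            # Remember the most recent comment; a later one replaces it
            header = text.lstrip(comment).strip()
        else:
            # First data row reached
            break
    if header is None:
        return None
    # Split on any run of delimiter characters
    regex = re.compile("[" + re.escape(delims) + "]+")
    return [col for col in regex.split(header) if col]


lines = [
    "# Sample aerodynamic database",
    "# mach, alpha, CN",
    "0.80, 2.0, 0.15",
]
cols = find_header_cols(lines)
```

Here the first comment is skipped in favor of the second, so cols holds the three names from the final comment line.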
- class cape.attdb.ftypes.textdata.TextDataDefn(_optsdict=None, _warnmode=1, **kw)
- class cape.attdb.ftypes.textdata.TextDataFile(fname=None, **kw)
Interface to generic data text files
- Call:
>>> db = TextDataFile(fname=None, **kw)
- Inputs:
- fname:
str
Name of file to read
- delim, Delimiter: {
", "
} |str
Delimiter(s) option
- Outputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- db.cols:
list
[str
] List of columns read
- db.lines:
list
[str
] Lines of text from the file that was read
- db.opts:
TextDataOpts
Options for this instance
- db.defns:
dict
[TextDataDefn
] Definitions for each column
- db[col]:
np.ndarray
|list
Numeric array or list of strings for each column
- Versions:
2019-12-02
@ddalle
: v1.0
- finish_defns()
Process definitions of column types
- Call:
>>> db.finish_defns(**kw)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Data file interface
- Versions:
2014-06-05
@ddalle
: v1.0
2014-06-17
@ddalle
: Read from defnsdict
2019-11-12
@ddalle
: Forked from RunMatrix
2020-02-06
@ddalle
: Using self.opts
- fromtext_boolmap(txt, col)
Convert boolean flag text to dictionary
- Call:
>>> v, vmap = db.fromtext_boolmap(txt, col)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- txt:
str
Text of boolean flags to be converted
- col:
str
Name of flag column, for
"boolmap"
keys
- Outputs:
- v:
str
Text returned
- vmap:
dict
[True
|False
] Flags for each flag in col definition
- Versions:
2019-12-02
@ddalle
: v1.0
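As a rough illustration of turning flag text into a dictionary of booleans, the sketch below maps one abbreviation to a set of named flags. It is not the cape implementation: the function flags_from_text, the flag names, and the assumption that each cell holds a single abbreviation are all hypothetical.

```python
def flags_from_text(txt, bmap):
    """Map flag text to a dict of True/False values.

    bmap maps each flag name to its list of accepted abbreviations.
    Assumes (for illustration) one abbreviation per cell.
    """
    abbrev = txt.strip()
    # A flag is True when the text matches one of its abbreviations
    return {name: abbrev in abbrevs for name, abbrevs in bmap.items()}


# Hypothetical flag map for a status column
bmap = {"PASS": ["p", "P"], "ERROR": ["e", "E"]}
vmap = flags_from_text(" p ", bmap)
empty = flags_from_text("", bmap)
```

Every flag in the map gets an entry, so text that matches no abbreviation yields all False values.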
- fromtext_val(txt, clsname, col=None)
Convert a string to appropriate type
- Call:
>>> v = db.fromtext_val(txt, clsname, col)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- txt:
str
Text to be converted to the type named by clsname
- clsname: {
"float64"
} |"int32"
|str
Valid data type name
- col:
str
Name of flag column, for
"boolmap"
keys
- Outputs:
- v:
clsname
Text translated to requested type
- Versions:
2019-12-02
@ddalle
: v1.0
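The idea of converting text according to a data type name can be sketched with plain Python types. This is a simplified stand-in, not the method itself: text_to_value is a hypothetical helper, and mapping names such as "float64" and "int32" onto the built-in float and int is an assumption.

```python
def text_to_value(txt, clsname="float64"):
    """Convert text to a value based on a data type name.

    Built-in Python types stand in for sized NumPy types here.
    """
    txt = txt.strip()
    if clsname.startswith("float"):
        return float(txt)
    if clsname.startswith(("int", "uint")):
        return int(txt)
    if clsname.startswith("complex"):
        return complex(txt)
    # Anything else is kept as text
    return txt


x = text_to_value("1.5e2")
n = text_to_value("7", "int32")
s = text_to_value(" run01 ", "str")
```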
- process_defns_boolmap(col, bmap)
Process definitions for columns of type BoolMap
- Call:
>>> db.process_defns_boolmap(col, bmap)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Data file interface
- col:
str
Name of column with type
"BoolMap"
- bmap:
dict
Map for abbreviations that set boolean columns
- Versions:
2019-12-03
@ddalle
: v1.0
- read_textdata(fname)
Read an entire text data file
- Call:
>>> db.read_textdata(fname)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- fname:
str
Name of file to read
- Versions:
2019-12-02
@ddalle
: v1.0
- read_textdata_data(f)
Read data portion of text data file
- Call:
>>> db.read_textdata_data(f)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- f:
file
Open file handle
- Effects:
- db.cols:
list
[str
] List of column names
- Versions:
2019-11-25
@ddalle
: v1.0
- read_textdata_firstrowtypes(f)
Get initial guess at data types from first data row
If (and only if) the DefaultType input is an integer type, guessed types can be integers. Otherwise the sequence of possibilities is float, complex, str.
- Call:
>>> db.read_textdata_firstrowtypes(f, **kw)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- f:
file
Open file handle
- DefaultType: {
"float"
} |str
Name of default class
- Versions:
2019-11-25
@ddalle
: v1.0
2019-12-02
@ddalle
: Copied from CSVFile
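The fallback order described for this method (integer only when the default type is an integer type, then float, complex, str) can be sketched as a chain of conversion attempts. The helper guess_type and the type names it returns are assumptions for illustration; the actual method works on an open file handle rather than a single string.

```python
def guess_type(txt, default_type="float"):
    """Guess a type name for one entry of the first data row."""
    candidates = []
    # Integers are only considered if the default type is an integer type
    if default_type.startswith(("int", "uint")):
        candidates.append(("int", int))
    candidates.append(("float", float))
    candidates.append(("complex", complex))
    for name, converter in candidates:
        try:
            converter(txt.strip())
        except ValueError:
            # Conversion failed; try the next candidate
            continue
        return name
    # Nothing numeric worked, so treat the entry as text
    return "str"


types = [guess_type(part) for part in ["3", "1+2j", "abc"]]
int_types = [guess_type(part, "int32") for part in ["3", "3.5"]]
```

With the default "float" setting, the entry "3" is reported as a float even though it would parse as an integer, which matches the "if and only if" condition stated above.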
- read_textdata_header(f)
Read column names from beginning of open file
- Call:
>>> db.read_textdata_header(f)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- f:
file
Open file handle
- Effects:
- db.cols:
list
[str
] List of column names
- Versions:
2019-11-12
@ddalle
: v1.0
- read_textdata_headerdefaultcols(f)
Create column names “col1”, “col2”, etc. if needed
- Call:
>>> db.read_textdata_headerdefaultcols(f)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- f:
file
Open file handle
- Effects:
- db.cols:
list
[str
] If not previously determined, this becomes
["col1", "col2", ...]
based on number of columns in the first data row
- Versions:
2019-11-27
@ddalle
: v1.0
2019-12-02
@ddalle
: Copied from CSVFile
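Generating the fallback names from the first data row amounts to counting its entries. A minimal sketch, assuming a comma delimiter and a hypothetical helper default_cols that takes the row as a string:

```python
def default_cols(first_row, delim=","):
    """Build ["col1", "col2", ...] from the number of entries in a row."""
    ncol = len(first_row.strip().split(delim))
    # Column numbering starts at 1
    return ["col%i" % (i + 1) for i in range(ncol)]


cols = default_cols("0.80, 2.0, 0.15\n")
```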
- read_textdata_headerline(f)
Read line and process column names if possible
- Call:
>>> db.read_textdata_headerline(f)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- f:
file
Open file handle
- Effects:
- db.cols:
None
|list
[str
] List of column names if read
- db._textdata_header_once:
True
|False
Set to
True
if column names are read at all
- db._textdata_header_complete:
True
|False
Set to
True
if next line is expected to be data
- Versions:
2019-11-22
@ddalle
: v1.0
2019-12-02
@ddalle
: Copied from CSVFile
- read_textdata_line(f)
Read a data row from a text data file
- Call:
>>> db.read_textdata_line(f)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- f:
file
Open file handle
- Versions:
2019-11-25
@ddalle
: v1.0
- set_regex_linesplitter()
Generate regular expression used to split a line
- Call:
>>> db.set_regex_linesplitter()
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- Effects:
- db.regex_linesplit:
re.SRE_Pattern
Compiled regular expression object
- Versions:
2019-12-02
@ddalle
: v1.0
- split_textdata_line(line)
Split a line into its parts
Splits a line of text by the specified delimiter and strips whitespace and delimiter characters from each entry.
- Call:
>>> parts = db.split_textdata_line(line)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- line:
str
Line of text to be split
- Outputs:
- parts:
list
[str
] List of strings
- Versions:
2019-12-02
@ddalle
: v1.0
2024-01-10
@ddalle
: v1.1; allow whitespace in cols
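One way to compile a delimiter option into a regular expression and use it to split a line is sketched below. This is only a plausible reading of the behavior described, not the pattern the class actually builds: make_linesplitter and split_line are hypothetical names, and treating whitespace as padding whenever a non-whitespace delimiter is present is an assumption.

```python
import re


def make_linesplitter(delim=", "):
    """Compile a regular expression that splits a line into entries."""
    # Delimiter characters other than spaces and tabs
    hard = delim.replace(" ", "").replace("\t", "")
    if hard:
        # Split on a delimiter character plus any surrounding whitespace
        return re.compile(r"\s*[" + re.escape(hard) + r"]\s*")
    # Whitespace-only delimiter: split on runs of whitespace
    return re.compile(r"\s+")


def split_line(line, regex):
    """Split a line and strip whitespace from each entry."""
    return [part.strip() for part in regex.split(line.strip())]


parts = split_line("  run 01, 0.80 ,2.0\n", make_linesplitter(", "))
nums = split_line("1 2   3", make_linesplitter(" "))
```

Under this assumption an entry such as "run 01" keeps its internal space when a comma is among the delimiters.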
- validate_boolmap(boolmap)
Translate free-form Type option into validated code
- Call:
>>> bmap = db.validate_boolmap(boolmap)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Data file interface
- boolmap:
dict
[str
|list
] Initial boolean flag map; the keys are names of the boolean coefficients that are set, and the item values are the one or more abbreviations for each key
- Outputs:
- bmap:
dict
[list
[str
]] Validated map
- Versions:
2019-12-03
@ddalle
: v1.0
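The validation described here, where each key may map to one abbreviation or several and the result always maps to a list, might look like the following sketch. The function normalize_boolmap and its error handling are assumptions, not the method's actual code.

```python
def normalize_boolmap(boolmap):
    """Return a map from each flag name to a list of abbreviations."""
    if not isinstance(boolmap, dict):
        raise TypeError("boolean map must be a dict")
    bmap = {}
    for key, abbrevs in boolmap.items():
        if isinstance(abbrevs, str):
            # A single abbreviation becomes a one-item list
            abbrevs = [abbrevs]
        elif isinstance(abbrevs, (list, tuple)):
            abbrevs = list(abbrevs)
        else:
            raise TypeError("abbreviations for '%s' must be str or list" % key)
        for abbrev in abbrevs:
            if not isinstance(abbrev, str):
                raise TypeError("abbreviation for '%s' is not a str" % key)
        bmap[key] = abbrevs
    return bmap


# Hypothetical flag names and abbreviations
bmap = normalize_boolmap({"PASS": "p", "ERROR": ["e", "E"]})
```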
- write_textdata(fname=None)
Write text data file based on existing db.lines
No check is performed that values in, e.g., db[col] are synchronized with the text in db.lines. It is therefore possible to write a file that does not match the values in the database. To avoid this, use set_colval().
- Call:
>>> db.write_textdata()
>>> db.write_textdata(fname)
- Inputs:
- db:
cape.attdb.ftypes.textdata.TextDataFile
Text data file interface
- fname: {db.fname} |
str
Name of file to write
- Versions:
2019-12-04
@ddalle
: v1.0
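The synchronization caveat can be demonstrated without the cape package by keeping stored lines and parsed values side by side. Everything in this sketch is a stand-in: write_lines, the lines and values variables, and the sample file name are hypothetical and only mimic the roles of db.lines and db[col].

```python
import os
import tempfile


def write_lines(lines, fname):
    """Write stored lines of text to a file exactly as they are."""
    with open(fname, "w") as fp:
        fp.writelines(lines)


# Stand-ins for db.lines and the parsed column values
lines = ["# mach, alpha\n", "0.80, 2.0\n"]
values = {"mach": [0.80], "alpha": [2.0]}

# Change a value without updating the matching line of text
values["alpha"][0] = 4.0

fname = os.path.join(tempfile.mkdtemp(), "sample.dat")
write_lines(lines, fname)

with open(fname) as fp:
    written = fp.read()
```

The written file still contains the original alpha of 2.0, which is the mismatch the note above warns about.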
- class cape.attdb.ftypes.textdata.TextDataOpts(_optsdict=None, _warnmode=1, **kw)