cape.attdb.datakithub: Hub for importing DataKits by name
This module provides the class DataKitHub, a tool that simplifies
importing named “datakits” (instances of cape.attdb.rdb.DataKit).
More specifically, it allows users to create one or more naming
conventions for databases/datakits and to read that data with minimal
low-level Python programming.
An instance of the DataKitHub class is created by reading a
JSON file that contains the naming conventions, such as the following:

    from cape.attdb.datakithub import DataKitHub

    # Create an instance
    hub = DataKitHub()

This will look for a file
data/datakithub/datakithub.json
in the current folder and each parent folder.
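The upward search described above is similar to how git locates a .git folder. A minimal sketch of such a search is shown below. The function name find_hub_json is hypothetical and is not part of cape; DataKitHub may implement the search differently.

```python
from pathlib import Path


def find_hub_json(start, relpath="data/datakithub/datakithub.json"):
    """Return the first *relpath* found in *start* or any parent, else None.

    Hypothetical helper for illustration; not part of cape.
    """
    start = Path(start).resolve()
    # Check the starting folder first, then walk up toward the root
    for folder in (start, *start.parents):
        candidate = folder / relpath
        if candidate.is_file():
            return candidate
    return None
```

Calling find_hub_json(".") from any subfolder of a project would return the path to the nearest matching file, or None if no parent folder contains one.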
A simple datakithub.json
file might contain the following:

    {
        "DB-ATT": {
            "repo": "/home/user/datakit/db",
            "type": "module",
            "module_attribute": "db",
            "module_regex": {
                "DB-ATT-([0-9]+)": "dbatt.db%s"
            }
        }
    }

It will make more sense to explain this content after seeing an example.
Now we can use the DataKitHub instance to read databases by
their title, such as "DB-ATT-1" or "DB-ATT-002", as long as they
start with "DB-ATT" or some other string defined in the JSON file.

    from cape.attdb.datakithub import DataKitHub

    # Create an instance
    hub = DataKitHub("/home/user/datakit/datakithub.json")
    # Read the database "DB-ATT-1"
    db1 = hub.read_db("DB-ATT-1")
    # Read the database "DB-ATT-002"
    db2 = hub.read_db("DB-ATT-002")

This is roughly the same as

    # Read the database "DB-ATT-1"
    import dbatt.db1
    db1 = dbatt.db1.db
    # Read the database "DB-ATT-002"
    import dbatt.db002
    db2 = dbatt.db002.db

but without having to deal with either sys.path
or the PYTHONPATH
environment variable, which can be both tedious and difficult to make
work for multiple users on different types of computers.
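The translation from a database name to a module name in the example above can be sketched with the standard re module. This is only an illustration of the idea behind the "module_regex" rule; the helper name modname_from_dbname is hypothetical, and the hub's own logic (with named groups and templates) is more general.

```python
import re

# Rule copied from the example "module_regex" section above
module_regex = {"DB-ATT-([0-9]+)": "dbatt.db%s"}


def modname_from_dbname(dbname, rules):
    """Return the module name for *dbname*, or None if no rule matches.

    Hypothetical helper for illustration; not part of cape.
    """
    for regex, template in rules.items():
        match = re.fullmatch(regex, dbname)
        if match:
            # Fill the "%s" slots with the captured groups, in order
            return template % match.groups()
    return None


print(modname_from_dbname("DB-ATT-1", module_regex))    # dbatt.db1
print(modname_from_dbname("DB-ATT-002", module_regex))  # dbatt.db002
```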
Here is a description of the JSON parameters:
- repo:
str
Name of the folder containing the data or modules
- module_attribute:
str | list | None
Name of variable(s) in imported module to use as datakit
- module_function:
str | list | None
Name of function(s) from imported module that return datakit
- module_regex:
dict[str]
Rules for converting database names that match a regular expression to module names
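To illustrate how module_attribute and module_function might be applied once a module has been imported, here is a minimal sketch. The helper datakit_from_module is hypothetical, and the order of the checks (attributes first, then functions) is an assumption rather than documented behavior.

```python
import types


def datakit_from_module(mod, module_attribute=None, module_function=None):
    """Return the first datakit found in *mod*, or None.

    Hypothetical helper for illustration; not part of cape.
    """
    def as_list(v):
        # Accept None, a single name, or a list of names
        if v is None:
            return []
        return [v] if isinstance(v, str) else list(v)

    # Assumed order: try plain variables first
    for name in as_list(module_attribute):
        db = getattr(mod, name, None)
        if db is not None:
            return db
    # Then try functions that return a datakit
    for name in as_list(module_function):
        func = getattr(mod, name, None)
        if callable(func):
            return func()
    return None


# Stand-in module with a "db" variable and a "read_db" function
mod = types.ModuleType("dbatt.db1")
mod.db = {"title": "DB-ATT-1"}
mod.read_db = lambda: {"title": "DB-ATT-1 (from function)"}
```

With the example JSON file above, the equivalent call would be datakit_from_module(mod, module_attribute="db").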
- class cape.attdb.datakithub.DataKitHub(fjson=None, cwd=None)
Load datakits using only the database name
- Call:
>>> hub = DataKitHub(fjson)
- Inputs:
- fjson:
{None} | str
Path to JSON file with import rules for one or more db names
- cwd:
{None} | str
Path from which to begin search
- Outputs:
- hub:
DataKitHub
Instance that implements import rules by name
- Versions:
- 2019-02-17 @ddalle: Version 1.0
- 2021-08-19 @ddalle: Version 2.0
  - simpler search for JSON file, similar to how git finds .git folder
  - better regular expression support
  - can try multiple sections if one matches but fails
- abspath(path)
Expand a relative path to an absolute path
- Call:
>>> abspath = hub.abspath(path)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- path:
str
Path to some file, relative or absolute
- Outputs:
- abspath:
None | str
Absolute path to path
- Versions:
2021-08-18
@ddalle
: Version 1.0
- expand_regex(regex_template)
Expand a regular expression template using the groups defined in hub.regex_groups
- Call:
>>> regex = hub.expand_regex(regex_template)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- regex_template:
str
Raw template with <grp> or %(grp)s groups
- Outputs:
- regex:
str
Expanded regex with (?P<grp>...) filled in
- Versions:
2021-08-17
@ddalle
: Version 1.0
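The expansion described above can be sketched as follows. This is not cape's implementation: the helper expand_regex_sketch is hypothetical, and the shape of hub.regex_groups (a dict mapping each group name to a regular expression) is an assumption.

```python
import re


def expand_regex_sketch(regex_template, regex_groups):
    """Replace <grp> and %(grp)s with named groups (?P<grp>...).

    Hypothetical helper for illustration; not part of cape.
    """
    def repl(match):
        name = match.group(1) or match.group(2)
        # Leave unknown group names untouched
        if name not in regex_groups:
            return match.group(0)
        return "(?P<%s>%s)" % (name, regex_groups[name])

    # Match <grp> (unless already part of "(?P<grp>") or %(grp)s
    return re.sub(r"(?<!\?P)<(\w+)>|%\((\w+)\)s", repl, regex_template)


groups = {"num": "[0-9]+"}  # assumed shape of hub.regex_groups
regex = expand_regex_sketch("DB-ATT-<num>", groups)
print(regex)                                          # DB-ATT-(?P<num>[0-9]+)
print(re.fullmatch(regex, "DB-ATT-002").groupdict())  # {'num': '002'}
```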
- fullmatch(regex_template, dbname)
Match a full string (usually DB name) to a regex template
- Call:
>>> groupdict = hub.fullmatch(regex_template, dbname)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- regex_template:
str
Regular expression template for section of datakits
- dbname:
str
Database name for one datakit
- Outputs:
- groupdict:
None | dict[str]
Augmented dict of groups from regex
- Versions:
2021-08-17
@ddalle
: Version 1.0
- genr8_modname(dbname, regex, template)
Determine module name from DB name, regex, and template
- Call:
>>> modname = hub.genr8_modname(dbname, regex, template)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- dbname:
str
Database name for one datakit
- regex:
str
Regular expression template for database names
- template:
str
Template for module name based on regex match groups
- Outputs:
- modname:
None | str
Name of module according to regex and template
- Versions:
2021-08-17
@ddalle
: Version 1.0
- genr8_modpath(dbname, sec)
Generate $PYTHONPATH entry for given database name (if any)
- Call:
>>> modpath = hub.genr8_modpath(dbname, sec)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- dbname:
str
Database name for one datakit
- sec:
str
Regular expression template for section of datakits
- Outputs:
- modpath:
None | str
Path to module if not in existing $PYTHONPATH
- Versions:
2021-08-18
@ddalle
: Version 1.0
- get_regex_groups()
Get expanded regular expressions from hub.regex_groups
- Call:
>>> regex_dict = hub.get_regex_groups()
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- Outputs:
- regex_dict:
dict[str]
Expanded regex with (?P<grp>...) for each group
- Versions:
2021-08-17
@ddalle
: Version 1.0
- get_section(sec)
Get options for specified module section
- Call:
>>> secopts = hub.get_section(sec)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- sec:
str
Name of datakit section
- Outputs:
- secopts:
dict
Options for sec loaded in hub[sec]
- Versions:
2021-08-18
@ddalle
: Version 1.0
- get_section_opt(sec, opt, vdef=None)
Get the value of an option for a given datakit section
- Call:
>>> v = hub.get_section_opt(sec, opt, vdef=None)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- sec:
str
Name of datakit section
- opt:
str
Name of option to access
- vdef:
{None} | any
Default value for opt
- Outputs:
- v:
{vdef} | any
Value of hub[sec][opt] or vdef
- Versions:
- 2021-02-18 @ddalle: Version 1.0
- 2021-08-18 @ddalle: Version 1.1
  - was get_group_opt()
  - add module-level defaults
- get_section_repo(sec)
Get repo option for section
- Call:
>>> repo = hub.get_section_repo(sec)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- sec:
str
Name of datakit section
- Outputs:
- repo:
None | str
Name of folder to add to path
- Versions:
2021-08-18
@ddalle
: Version 1.0
- get_section_type(sec)
Get type option for section
- Call:
>>> sectype = hub.get_section_type(sec)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- sec:
str
Name of datakit section
- Outputs:
- sectype:
str
Value of the "type" option for sec, such as "module"
- Versions:
2021-08-18
@ddalle
: Version 1.0
- import_dbname(dbname, **kw)
Import a datakit module based on DB name
- Call:
>>> mod = hub.import_dbname(dbname, **kw)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- dbname:
str
Database name for one datakit
- Keyword Arguments:
- v, verbose:
True | {False}
Option to report results of matching modules
- vv, veryverbose:
True | {False}
Option to report all attempts in matching sections
- vvv, veryveryverbose:
True | {False}
Option to report all attempts
- Outputs:
- mod:
None | module
Imported module if possible
- Versions:
- 2021-02-18 @ddalle: Version 1.0
- 2021-08-19 @ddalle: Version 2.0
  - forked from load_module()
  - better regular expression support
  - better fallback if more than one section matches
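The general idea of importing a module from a folder that is not on sys.path, with a fallback to other candidate folders, can be sketched with the standard importlib module. The helper import_from_repos is hypothetical and is not how cape necessarily does it; it only shows why callers no longer need to edit sys.path or PYTHONPATH themselves.

```python
import importlib
import sys


def import_from_repos(modname, repos):
    """Try importing *modname* with each folder in *repos* on sys.path.

    Hypothetical helper for illustration; not part of cape.
    """
    for repo in repos:
        # Temporarily put the candidate folder at the front of the path
        sys.path.insert(0, repo)
        try:
            # Make sure recently created files are visible to the importer
            importlib.invalidate_caches()
            return importlib.import_module(modname)
        except ImportError:
            # This candidate failed; fall through to the next one
            continue
        finally:
            # Always restore sys.path, even after a successful import
            sys.path.remove(repo)
    return None
```

The function returns None when no candidate folder yields an importable module, which mirrors the None | module output documented above.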
- import_module(dbname, **kw)
Import a datakit module based on DB name
- Call:
>>> mod = hub.import_module(dbname, **kw)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- dbname:
str
Database name for one datakit
- Keyword Arguments:
- v, verbose:
True | {False}
Option to report results of matching modules
- vv, veryverbose:
True | {False}
Option to report all attempts in matching sections
- vvv, veryveryverbose:
True | {False}
Option to report all attempts
- Outputs:
- mod:
None | module
Imported module if possible
- Versions:
- 2021-02-18 @ddalle: Version 1.0
- 2021-08-19 @ddalle: Version 2.0
  - forked from load_module()
  - better regular expression support
  - better fallback if more than one section matches
- match(regex_template, dbname)
Match a regular expression template to a target string
- Call:
>>> groupdict = hub.match(regex_template, dbname)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- regex_template:
str
Regular expression template for section of datakits
- dbname:
str
Database name for one datakit
- Outputs:
- groupdict:
None | dict[str]
Augmented dict of groups from regex
- Versions:
2021-08-17
@ddalle
: Version 1.0
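The names match() and fullmatch() mirror two functions in the standard re module. Whether the hub methods wrap exactly these functions is an assumption, but the difference between the two kinds of matching can be shown directly:

```python
import re

regex = "DB-ATT-(?P<num>[0-9]+)"

# re.match() only anchors at the start, so trailing text is allowed
partial = re.match(regex, "DB-ATT-002-extra")
print(partial.groupdict())  # {'num': '002'}

# re.fullmatch() requires the entire string to match
strict = re.fullmatch(regex, "DB-ATT-002-extra")
print(strict)               # None

exact = re.fullmatch(regex, "DB-ATT-002")
print(exact.groupdict())    # {'num': '002'}
```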
- match_section(sec, dbname)
Check if a database name matches a given section
- Call:
>>> groupdict = hub.match_section(sec, dbname)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- sec:
str
Regular expression template for section of datakits
- dbname:
str
Database name for one datakit
- Outputs:
- groupdict:
None | dict[str]
Augmented dict of groups from regex
- Versions:
2021-08-17
@ddalle
: Version 1.0
- read_db(dbname, **kw)
Read a datakit based on DB name
- Call:
>>> db = hub.read_db(dbname, **kw)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- dbname:
str
Database name for one datakit
- Keyword Arguments:
- v, verbose:
True | {False}
Option to report results of matching modules
- vv, veryverbose:
True | {False}
Option to report all attempts in matching sections
- vvv, veryveryverbose:
True | {False}
Option to report all attempts
- Outputs:
- db:
None | DataKit
Data interface if successful
- Versions:
- 2021-02-18 @ddalle: Version 1.0
- 2021-08-19 @ddalle: Version 2.0
  - better regex and fallback support
  - verbosity options
  - calls read_dbname()
- read_dbname(dbname, **kw)
Read a datakit based on DB name
- Call:
>>> db = hub.read_dbname(dbname, **kw)
- Inputs:
- hub:
DataKitHub
Instance of datakit-reading hub
- dbname:
str
Database name for one datakit
- Keyword Arguments:
- v, verbose:
True | {False}
Option to report results of matching modules
- vv, veryverbose:
True | {False}
Option to report all attempts in matching sections
- vvv, veryveryverbose:
True | {False}
Option to report all attempts
- Outputs:
- db:
None | DataKit
Data interface if successful
- Versions:
2021-08-18
@ddalle
: Version 1.0
- cape.attdb.datakithub.prepare_template(template)
Expand a string template with some substitutions
The substitutions made include:
- r"\g<grp>" --> "%(grp)s"
- r"\l\g<grp>" --> "%(l-grp)s"
- r"\u\1" --> "%(u-1)s"
- r"\1" --> "%(1)s"
- Call:
>>> fmt = prepare_template(template)
- Inputs:
- template:
str
Initial template, mixing dict string expansion and re.sub() syntax
- Outputs:
- fmt:
str
Template ready for standard string expansion, for example using
fmt % grpdict where grpdict is a dict
- Versions:
2021-08-18
@ddalle
: Version 1.0
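The four substitutions listed above can be reproduced with a single regular expression. The following is a minimal sketch, not cape's actual implementation; the names _GROUP_REF and prepare_template_sketch are hypothetical.

```python
import re

# Optional \l or \u prefix, then either \g<name> or a numbered group \N
_GROUP_REF = re.compile(
    r"(?:\\(?P<case>[lu]))?\\(?:g<(?P<name>\w+)>|(?P<num>[0-9]+))")


def prepare_template_sketch(template):
    """Convert re.sub()-style group references to %(key)s fields.

    Hypothetical helper for illustration; not cape's implementation.
    """
    def repl(match):
        key = match.group("name") or match.group("num")
        if match.group("case"):
            # "\l" and "\u" become "l-" and "u-" key prefixes
            key = "%s-%s" % (match.group("case"), key)
        return "%(" + key + ")s"

    return _GROUP_REF.sub(repl, template)


fmt = prepare_template_sketch(r"dbatt.db\1")
print(fmt)                  # dbatt.db%(1)s
print(fmt % {"1": "002"})   # dbatt.db002
```

The resulting string is then filled in with ordinary dict-based string expansion, as in fmt % grpdict, where the dict keys are the group names or numbers as strings.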