cape.attdb.datakithub: Hub for importing DataKits by name
This module provides the class DataKitHub, a tool
that simplifies importing named “datakits” (from
cape.attdb.rdb.DataKit). More specifically, it allows users to
create one or more naming conventions for databases/datakits and to
read that data with minimal low-level Python programming.
An instance of the DataKitHub class is created by reading a
JSON file that contains the naming conventions, such as the following:
    from cape.attdb.datakithub import DataKitHub

    # Create an instance
    hub = DataKitHub()
This will look for a file
data/datakithub/datakithub.json
in the current folder and each parent folder.
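The upward search can be pictured with a short sketch. This is not the code used by DataKitHub; the helper name find_hub_json is hypothetical, and only the relative path and the "current folder, then each parent folder" order are taken from the description above.

```python
from pathlib import Path
from typing import Optional

# Relative path looked for in each folder (taken from the text above)
HUB_JSON = Path("data") / "datakithub" / "datakithub.json"


def find_hub_json(cwd: Optional[str] = None) -> Optional[Path]:
    """Hypothetical helper: look in *cwd*, then in each parent folder."""
    start = Path(cwd) if cwd is not None else Path.cwd()
    start = start.resolve()
    # Check the starting folder first, then walk toward the filesystem root
    for folder in (start, *start.parents):
        candidate = folder / HUB_JSON
        if candidate.is_file():
            return candidate
    # No hub file found in the starting folder or any parent
    return None
```

This mirrors how git locates a repository from any subfolder, which is the comparison made in the class version notes.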
A simple datakithub.json file might contain the following:
    {
        "DB-ATT": {
            "repo": "/home/user/datakit/db",
            "type": "module",
            "module_attribute": "db",
            "module_regex": {
                "DB-ATT-([0-9]+)": "dbatt.db%s"
            }
        }
    }
It will make more sense to explain this content after seeing an example.
Now we can use the DataKitHub instance to read databases by
their title, such as "DB-ATT-1" or "DB-ATT-002", as long as they
start with "DB-ATT" or some other string defined in the JSON file.
    from cape.attdb.datakithub import DataKitHub

    # Create an instance
    hub = DataKitHub("/home/user/datakit/datakithub.json")

    # Read the database "DB-ATT-1"
    db1 = hub.read_db("DB-ATT-1")

    # Read the database "DB-ATT-002"
    db2 = hub.read_db("DB-ATT-002")
This is roughly the same as
    # Read the database "DB-ATT-1"
    import dbatt.db1
    db1 = dbatt.db1.db

    # Read the database "DB-ATT-002"
    import dbatt.db002
    db2 = dbatt.db002.db
but without having to deal with either sys.path or the PYTHONPATH
environment variable, which can be both tedious and difficult to make
work for multiple users on different types of computers.
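To make the mapping from database name to module name concrete, here is a minimal sketch of how the module_regex rule in the example file could be applied. The helper guess_modname is hypothetical, and the assumptions that the whole name must match and that captured groups fill the "%s" slots in order are inferred from the example rather than taken from the DataKitHub source.

```python
import re
from typing import Optional

# The single rule from the example datakithub.json above
MODULE_REGEX = {"DB-ATT-([0-9]+)": "dbatt.db%s"}


def guess_modname(dbname: str, rules: dict) -> Optional[str]:
    """Hypothetical helper: turn a database name into a module name."""
    for pattern, template in rules.items():
        match = re.fullmatch(pattern, dbname)
        if match is None:
            continue
        # Fill the "%s" slots with the captured groups, in order
        return template % match.groups()
    # No rule matched this database name
    return None


print(guess_modname("DB-ATT-1", MODULE_REGEX))    # dbatt.db1
print(guess_modname("DB-ATT-002", MODULE_REGEX))  # dbatt.db002
```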
Here is a description of the JSON parameters:
- repo:
str Name of the folder containing the data or modules
- module_attribute:
str|list|None Name of variable(s) in imported module to use as datakit
- module_function:
str|list|None Name of function(s) from imported module that return datakit
- module_regex:
dict[str] Rules mapping regular expressions for database names to module-name templates
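As a rough illustration of module_attribute and module_function, the sketch below pulls a datakit out of an already imported module. The helper names, the order in which the two options are tried, and the assumption that the functions take no arguments are all hypothetical.

```python
import types
from typing import Any, Optional


def _as_list(value) -> list:
    """Accept None, a single name, or a list of names."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def datakit_from_module(mod, module_attribute=None,
                        module_function=None) -> Optional[Any]:
    """Hypothetical helper: pull a datakit out of an imported module."""
    # Try plain variables first (assumed order)
    for name in _as_list(module_attribute):
        db = getattr(mod, name, None)
        if db is not None:
            return db
    # Then try zero-argument functions that return a datakit (assumed)
    for name in _as_list(module_function):
        func = getattr(mod, name, None)
        if callable(func):
            return func()
    return None


# Stand-in for an imported datakit module such as dbatt.db1
fake_mod = types.ModuleType("fake_db1")
fake_mod.db = {"title": "DB-ATT-1"}
print(datakit_from_module(fake_mod, module_attribute="db"))
```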
- class cape.attdb.datakithub.DataKitHub(fjson=None, cwd=None)
Load datakits using only the database name
- Call:
>>> hub = DataKitHub(fjson)
- Inputs:
- Outputs:
- hub:
DataKitHub Instance that implements import rules by name
- Versions:
2019-02-17
@ddalle: Version 1.0
2021-08-19
@ddalle: Version 2.0
    - simpler search for JSON file, similar to how git finds the .git folder
    - better regular expression support
    - can try multiple sections if one matches but fails
- abspath(path)
Expand a relative path to an absolute path
- Call:
>>> abspath = hub.abspath(path)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- path:
str Path to some file, relative or absolute
- Outputs:
- abspath:
None|str Absolute path to path
- Versions:
2021-08-18
@ddalle: Version 1.0
- expand_regex(regex_template)
Expand a regular expression template
Use defined groups from hub.regex_groups
- Call:
>>> regex = hub.expand_regex(regex_template)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- regex_template:
str Raw template with <grp> or %(grp)s groups
- Outputs:
- regex:
str Expanded regex with (?P<grp>...) filled in
- Versions:
2021-08-17
@ddalle: Version 1.0
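The expansion described for expand_regex() can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the helper expand_template and the contents of REGEX_GROUPS (standing in for hub.regex_groups) are hypothetical.

```python
import re

# Hypothetical contents of hub.regex_groups: group name -> sub-pattern
REGEX_GROUPS = {"num": "[0-9]+", "name": "[A-Za-z]+"}

# Matches "<grp>" (but not the "<grp>" inside "(?P<grp>") or "%(grp)s"
_PLACEHOLDER = re.compile(r"(?<!\?P)<(\w+)>|%\((\w+)\)s")


def expand_template(regex_template: str, groups: dict) -> str:
    """Hypothetical helper: replace placeholders with named groups."""
    def _fill(match):
        grp = match.group(1) or match.group(2)
        if grp not in groups:
            # Leave unknown placeholders untouched
            return match.group(0)
        return "(?P<%s>%s)" % (grp, groups[grp])
    return _PLACEHOLDER.sub(_fill, regex_template)


regex = expand_template("DB-<name>-%(num)s", REGEX_GROUPS)
print(regex)  # DB-(?P<name>[A-Za-z]+)-(?P<num>[0-9]+)
print(re.fullmatch(regex, "DB-ATT-002").groupdict())
```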
- fullmatch(regex_template, dbname)
Match a full string (usually DB name) to a regex template
- Call:
>>> groupdict = hub.fullmatch(regex_template, dbname)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- regex_template:
str Regular expression template for section of datakits
- dbname:
str Database name for one datakit
- Outputs:
- Versions:
2021-08-17
@ddalle: Version 1.0
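Assuming fullmatch() and match() follow the semantics of Python's re.fullmatch and re.match, the practical difference is whether trailing text in the database name is tolerated. The two helpers below are hypothetical stand-ins that return the dictionary of named groups, or None when there is no match.

```python
import re
from typing import Optional


def match_groups(regex: str, dbname: str) -> Optional[dict]:
    """Match at the start of *dbname*; trailing text is allowed."""
    m = re.match(regex, dbname)
    return None if m is None else m.groupdict()


def fullmatch_groups(regex: str, dbname: str) -> Optional[dict]:
    """Match only if *regex* consumes all of *dbname*."""
    m = re.fullmatch(regex, dbname)
    return None if m is None else m.groupdict()


regex = r"DB-ATT-(?P<num>[0-9]+)"
print(match_groups(regex, "DB-ATT-002-rev1"))      # {'num': '002'}
print(fullmatch_groups(regex, "DB-ATT-002-rev1"))  # None
print(fullmatch_groups(regex, "DB-ATT-002"))       # {'num': '002'}
```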
- genr8_modname(dbname, regex, template)
Determine module name from DB name, regex, and template
- Call:
>>> modname = hub.genr8_modname(dbname, regex, template)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- dbname:
str Database name for one datakit
- regex:
str Regular expression template for database names
- template:
str Template for module name based on regex match groups
- Outputs:
- modname:
None|str Name of module according to regex and template
- Versions:
2021-08-17
@ddalle: Version 1.0
- genr8_modpath(dbname, sec)
Generate $PYTHONPATH for given database name (if any)
- Call:
>>> modpath = hub.genr8_modpath(dbname, sec)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- dbname:
str Database name for one datakit
- sec:
str Regular expression template for section of datakits
- Outputs:
- modpath:
None|str Path to module if not in existing $PYTHONPATH
- Versions:
2021-08-18
@ddalle: Version 1.0
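One way to read "path to module if not in existing $PYTHONPATH" is sketched below: return the section's repo folder only when the top-level package cannot already be found. The helper repo_needed and this decision rule are assumptions made for illustration, not a description of what genr8_modpath() actually does.

```python
import importlib.util
from typing import Optional


def repo_needed(modname: str, repo: str) -> Optional[str]:
    """Hypothetical helper: return *repo* only if it must be added to the path."""
    top_level = modname.split(".")[0]
    # find_spec() returns None when a top-level package cannot be located
    if importlib.util.find_spec(top_level) is not None:
        return None
    return repo
```

A caller could then temporarily insert the returned folder into sys.path before importing the module, which is the bookkeeping the hub is meant to hide from users.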
- get_regex_groups()
Get expanded regular expressions from hub.regex_groups
- Call:
>>> regex_dict = hub.get_regex_groups()
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- Outputs:
- Versions:
2021-08-17
@ddalle: Version 1.0
- get_section(sec)
Get options for specified module section
- Call:
>>> secopts = hub.get_section(sec)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- sec:
str Name of datakit section
- Outputs:
- secopts:
dict Options for sec loaded in hub[sec]
- Versions:
2021-08-18
@ddalle: Version 1.0
- get_section_opt(sec, opt, vdef=None)
Get the value of an option for a given datakit section
- Call:
>>> v = hub.get_section_opt(sec, opt, vdef=None)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- sec:
str Name of datakit section
- opt:
str Name of option to access
- vdef:
{None} | any Default value for opt
- Outputs:
- v:
{vdef} | any Value of hub[sec][opt] or vdef
- Versions:
2021-02-18
@ddalle: Version 1.0
2021-08-18
@ddalle: Version 1.1
    - was get_group_opt()
    - add module-level defaults
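The lookup-with-default behavior described for get_section_opt() amounts to a nested dictionary access. The sketch below uses a hypothetical helper section_opt and a plain dict shaped like the example datakithub.json; it does not model the module-level defaults mentioned in the version notes.

```python
from typing import Any

# Options as they might look after reading the example datakithub.json
HUB_OPTS = {
    "DB-ATT": {
        "repo": "/home/user/datakit/db",
        "type": "module",
        "module_attribute": "db",
    }
}


def section_opt(opts: dict, sec: str, opt: str, vdef: Any = None) -> Any:
    """Hypothetical helper: return opts[sec][opt], or *vdef* if missing."""
    return opts.get(sec, {}).get(opt, vdef)


print(section_opt(HUB_OPTS, "DB-ATT", "type"))             # module
print(section_opt(HUB_OPTS, "DB-ATT", "module_function"))  # None
```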
- get_section_repo(sec)
Get repo option for section
- Call:
>>> repo = hub.get_section_repo(sec)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- sec:
str Name of datakit section
- Outputs:
- repo:
None|dict Name of folder to add to path
- Versions:
2021-08-18
@ddalle: Version 1.0
- get_section_type(sec)
Get type option for section
- Call:
>>> sectype = hub.get_section_type(sec)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- sec:
str Name of datakit section
- Outputs:
- sectype:
str Value of the type option for sec
- Versions:
2021-08-18
@ddalle: Version 1.0
- import_dbname(dbname, **kw)
Import a datakit module based on DB name
- Call:
>>> mod = hub.import_dbname(dbname, **kw)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- dbname:
str Database name for one datakit
- Keyword Arguments:
- v, verbose:
True| {False} Option to report results of matching modules
- vv, veryverbose:
True| {False} Option to report all attempts in matching sections
- vvv, veryveryverbose:
True| {False} Option to report all attempts
- Outputs:
- mod:
None|module Imported module if possible
- Versions:
2021-02-18
@ddalle: Version 1.0
2021-08-19
@ddalle: Version 2.0
    - forked from load_module()
    - better regular expression support
    - better fallback if more than one section matches
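The fallback behavior noted in the version history (continuing to the next section when one matches but fails to import) might look roughly like the sketch below. The helper import_first and the two demo sections are hypothetical; a standard-library module is used as the import target so the example runs anywhere.

```python
import importlib
import re


def import_first(dbname: str, sections: dict):
    """Hypothetical helper: try each matching section until one imports."""
    for sec, rules in sections.items():
        for pattern, template in rules.items():
            m = re.fullmatch(pattern, dbname)
            if m is None:
                continue
            modname = template % m.groups()
            try:
                return importlib.import_module(modname)
            except ImportError:
                # This section matched but the import failed; keep looking
                continue
    return None


# Two made-up sections that both match "DB-json"; only the second resolves
SECTIONS = {
    "broken": {"DB-([a-z]+)": "no_such_package_for_demo.%s"},
    "stdlib": {"DB-([a-z]+)": "%s"},
}
mod = import_first("DB-json", SECTIONS)
print(mod.__name__)  # json
```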
- import_module(dbname, **kw)
Import a datakit module based on DB name
- Call:
>>> mod = hub.import_module(dbname, **kw)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- dbname:
str Database name for one datakit
- Keyword Arguments:
- v, verbose:
True| {False} Option to report results of matching modules
- vv, veryverbose:
True| {False} Option to report all attempts in matching sections
- vvv, veryveryverbose:
True| {False} Option to report all attempts
- Outputs:
- mod:
None|module Imported module if possible
- Versions:
2021-02-18
@ddalle: Version 1.0
2021-08-19
@ddalle: Version 2.0
    - forked from load_module()
    - better regular expression support
    - better fallback if more than one section matches
- match(regex_template, dbname)
Match a regular expression template to a target string
- Call:
>>> groupdict = hub.match(regex_template, dbname)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- regex_template:
str Regular expression template for section of datakits
- dbname:
str Database name for one datakit
- Outputs:
- Versions:
2021-08-17
@ddalle: Version 1.0
- match_section(sec, dbname)
Check if a database name matches a given section
- Call:
>>> groupdict = hub.match_section(sec, dbname)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- sec:
str Regular expression template for section of datakits
- dbname:
str Database name for one datakit
- Outputs:
- Versions:
2021-08-17
@ddalle: Version 1.0
- read_db(dbname, **kw)
Read a datakit based on DB name
- Call:
>>> db = hub.read_db(dbname, **kw)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- dbname:
str Database name for one datakit
- Keyword Arguments:
- v, verbose:
True| {False} Option to report results of matching modules
- vv, veryverbose:
True| {False} Option to report all attempts in matching sections
- vvv, veryveryverbose:
True| {False} Option to report all attempts
- Outputs:
- db:
None|DataKit Data interface if successful
- Versions:
2021-02-18
@ddalle: Version 1.0
2021-08-19
@ddalle: Version 2.0
    - better regex and fallback support
    - verbosity options
    - calls read_dbname()
- read_dbname(dbname, **kw)
Read a datakit based on DB name
- Call:
>>> db = hub.read_dbname(dbname, **kw)
- Inputs:
- hub:
DataKitHub Instance of datakit-reading hub
- dbname:
str Database name for one datakit
- Keyword Arguments:
- v, verbose:
True| {False} Option to report results of matching modules
- vv, veryverbose:
True| {False} Option to report all attempts in matching sections
- vvv, veryveryverbose:
True| {False} Option to report all attempts
- Outputs:
- db:
None|DataKit Data interface if successful
- Versions:
2021-08-18
@ddalle: Version 1.0
- cape.attdb.datakithub.prepare_template(template)
Expand a string template with some substitutions
The substitutions made include:
- r"\g<grp>" -> "%(grp)s"
- r"\l\g<grp>" -> "%(l-grp)s"
- r"\u\1" -> "%(u-1)s"
- r"\1" -> "%(1)s"
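The four substitutions listed above can be reproduced with a single regular expression. This is only a sketch of the documented input/output pairs; the helper sketch_prepare_template is hypothetical and the real prepare_template() may handle additional cases. The l- and u- prefixes presumably request lower-case and upper-case versions of the group, but that interpretation is an assumption.

```python
import re

# Backslash, optional "l\" or "u\" modifier, then "g<name>" or a group number
_BACKREF = re.compile(r"\\(?:([lu])\\)?(?:g<(\w+)>|(\d+))")


def sketch_prepare_template(template: str) -> str:
    """Hypothetical helper: rewrite regex-style references as %-style keys."""
    def _repl(match):
        modifier, name, number = match.groups()
        key = name if name is not None else number
        if modifier:
            # Keep the case modifier as a prefix on the key
            key = "%s-%s" % (modifier, key)
        return "%(" + key + ")s"
    return _BACKREF.sub(_repl, template)


print(sketch_prepare_template(r"dbatt.db\1"))          # dbatt.db%(1)s
print(sketch_prepare_template(r"dbatt.\l\g<grp>"))     # dbatt.%(l-grp)s
```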