cape.attdb.datakitloader: Tools for reading DataKits from a collection

This module provides the DataKitLoader class, which takes as input the module __name__ and __file__ to automatically determine a variety of DataKit parameters.

class cape.attdb.datakitloader.DataKitLoader(name=None, fname=None, **kw)

Tool for reading datakits based on module name and file

Call:
>>> dkl = DataKitLoader(name, fname, **kw)
Inputs:
name: str

Module name, from __name__

fname: str

Absolute path to module file name, from __file__

Outputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

Versions:
  • 2021-06-25 @ddalle: v0.1; started
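
A minimal usage sketch, as it might appear inside a datakit package module (keyword options omitted; the dict-style access mirrors the dkl["DATAKIT_CLS"] subscripting documented below):

# Minimal sketch: a datakit module constructing its own loader
from cape.attdb.datakitloader import DataKitLoader

# __name__ and __file__ identify the defining module and its location
dkl = DataKitLoader(__name__, __file__)

# settings such as MODULE_DIR are available by dict-style access
print(dkl["MODULE_DIR"])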

check_dvcfile(fname, f=False)

Check if a file exists with an appended .dvc extension

Call:
>>> q = dkl.check_dvcfile(fname)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: str

Name of file [optionally relative to MODULE_DIR]

Keys:
  • MODULE_DIR

Outputs:
q: True | False

Whether or not fname or DVC file exists

Versions:
  • 2021-07-19 @ddalle: v1.0

check_file(fname, f=False, dvc=True)

Check if a file exists OR a .dvc version

  • If f is True, this returns False always

  • If fabs exists, this returns True

  • If fabs plus .dvc exists, it also returns True

Call:
>>> q = dkl.check_file(fname, f=False, dvc=True)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: str

Name of file [optionally relative to MODULE_DIR]

f: True | {False}

Force-overwrite option; always returns False

dvc: {True} | False

Option to check for .dvc extension

Keys:
  • MODULE_DIR

Outputs:
q: True | False

Whether or not fname or DVC file exists

Versions:
  • 2021-07-19 @ddalle: v1.0
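
A hedged sketch of guarding a write step with check_file(); the file name is hypothetical, and dkl is an existing DataKitLoader:

# returns True if the file or its .dvc stub exists (and f is False)
fname = "db/csv/datakit.csv"  # hypothetical file, relative to MODULE_DIR
if not dkl.check_file(fname, f=False, dvc=True):
    print("neither %s nor %s.dvc exists" % (fname, fname))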

check_modfile(fname)

Check if a file exists OR a .dvc version

Call:
>>> q = dkl.check_modfile(fname)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: str

Name of file [optionally relative to MODULE_DIR]

Keys:
  • MODULE_DIR

Outputs:
q: True | False

Whether or not fname or DVC file exists

Versions:
  • 2021-07-19 @ddalle: v1.0

create_db_name()

Create and save database name from module name

This utilizes the following parameters:

  • MODULE_NAME_REGEX_LIST

  • MODULE_NAME_REGEX_GROUPS

  • DB_NAME_TEMPLATE_LIST

Call:
>>> dbname = dkl.create_db_name()
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

Outputs:
dbname: str

Prescribed datakit name

Versions:
  • 2021-06-28 @ddalle: v1.0
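
A sketch of the regex-and-template mechanism using plain re; the module name, regular expression, and template below are hypothetical stand-ins for entries of MODULE_NAME_REGEX_LIST and DB_NAME_TEMPLATE_LIST:

import re

# hypothetical analogues of MODULE_NAME_REGEX_LIST and DB_NAME_TEMPLATE_LIST
regexes = [r"sls(?P<num>[0-9]+)(?P<cfg>[a-z]+)"]
templates = ["SLS-%(num)s-%(cfg)s"]

modname = "sls10afa"  # hypothetical module (sub)name
for regex, template in zip(regexes, templates):
    match = re.fullmatch(regex, modname)
    if match:
        # the first matching regex determines the database name
        print(template % match.groupdict())  # -> SLS-10-afa
        break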

dvc_add(frel, **kw)

Add (cache) a file using DVC

Call:
>>> ierr = dkl.dvc_add(frel, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

frel: str

Name of file relative to MODULE_DIR

Outputs:
ierr: int

Return code

  • 0: success

  • 512: not a git repo

Versions:
  • 2021-09-15 @ddalle: v1.0

dvc_pull(frel, **kw)

Pull a DVC file

Call:
>>> ierr = dkl.dvc_pull(frel, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

frel: str

Name of file relative to MODULE_DIR

Outputs:
ierr: int

Return code

  • 0: success

  • 256: no DVC file

  • 512: not a git repo

Versions:
  • 2021-07-19 @ddalle: v1.0

  • 2023-02-21 @ddalle: v2.0; DVC -> LFC

dvc_push(frel, **kw)

Push a DVC file

Call:
>>> ierr = dkl.dvc_push(frel, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

frel: str

Name of file relative to MODULE_DIR

Outputs:
ierr: int

Return code

  • 0: success

  • 256: no DVC file

  • 512: not a git repo

Versions:
  • 2021-09-15 @ddalle: v1.0

dvc_status(frel, **kw)

Check the status of a DVC file

Call:
>>> ierr = dkl.dvc_status(frel, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

frel: str

Name of file relative to MODULE_DIR

Outputs:
ierr: int

Return code

  • 0: success

  • 1: out-of-date

  • 256: no DVC file

  • 512: not a git repo

Versions:
  • 2021-09-23 @ddalle: v1.0
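
A sketch of reacting to these documented return codes, combining dvc_status() and dvc_pull(); the file name is hypothetical:

frel = "db/mat/datakit.mat"  # hypothetical file, relative to MODULE_DIR
if dkl.dvc_status(frel) == 1:  # 1: out-of-date
    ierr = dkl.dvc_pull(frel)
    if ierr == 256:
        print("no DVC file for %s" % frel)
    elif ierr == 512:
        print("not a git repo")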

genr8_db_name(modname=None)

Get database name based on first matching regular expression

This utilizes the following parameters:

  • MODULE_NAME

  • MODULE_NAME_REGEX_LIST

  • MODULE_NAME_REGEX_GROUPS

  • DB_NAME_TEMPLATE_LIST

Call:
>>> dbname = dkl.genr8_db_name(modname=None)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

modname: {None} | str

Name of module to parse (default: MODULE_NAME)

Outputs:
dbname: str

Prescribed datakit name

Versions:
  • 2021-06-28 @ddalle: v1.0

  • 2021-07-15 @ddalle: v1.1; add modname arg

genr8_modnames(dbname=None)

Generate list of candidate module names based on a DB name

This utilizes the following parameters:

  • DB_NAME

  • DB_NAME_REGEX_LIST

  • DB_NAME_REGEX_GROUPS

  • MODULE_NAME_TEMPLATE_LIST

Call:
>>> modnames = dkl.genr8_modnames(dbname=None)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

dbname: {None} | str

Database name to parse (default: DB_NAME)

Outputs:
modnames: list[str]

Candidate module names

Versions:
  • 2021-10-22 @ddalle: v1.0

get_abspath(frel)

Get the full filename from path relative to MODULE_DIR

Call:
>>> fabs = dkl.get_abspath(frel)
>>> fabs = dkl.get_abspath(fabs)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

frel: str

Name of file relative to MODULE_DIR

fabs: str

Existing absolute path

Keys:
  • MODULE_DIR

Outputs:
fabs: str

Absolute path to file

Versions:
  • 2021-07-05 @ddalle: v1.0
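
For instance, with a hypothetical relative path:

# a relative name is joined to MODULE_DIR ...
fabs = dkl.get_abspath("rawdata/grid.csv")
# ... while an already-absolute path is returned as-is (second call form)
assert fabs == dkl.get_abspath(fabs)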

get_db_filenames_by_type(ext)

Get list of file names for a given data file type

Call:
>>> fnames = dkl.get_db_filenames_by_type(ext)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

ext: str

File extension type

Outputs:
fnames: list[str]

List of datakit file names; one for each suffix

Versions:
  • 2021-07-01 @ddalle: v1.0

get_db_suffixes_by_type(ext)

Get list of suffixes for given data file type

Call:
>>> suffixes = dkl.get_db_suffixes_by_type(ext)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

ext: str

File extension type

Keys:
  • DB_SUFFIXES_BY_TYPE

Outputs:
suffixes: list[str | None]

List of additional suffixes (if any) for ext type

Versions:
  • 2021-07-01 @ddalle: v1.0
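
A sketch of how suffixes expand into one file name each; the suffix list and the joining convention shown are assumptions for illustration (see also get_db_filenames_by_type()):

dbname = "SLS-10-afa"  # hypothetical database name
suffixes = [None, "rbf"]  # hypothetical DB_SUFFIXES_BY_TYPE entry for "csv"
for suffix in suffixes:
    # one file per suffix; None means no suffix is appended
    if suffix is None:
        print("%s.csv" % dbname)
    else:
        print("%s-%s.csv" % (dbname, suffix))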

get_dbdir(ext)

Get containing folder for specified datakit file type

Call:
>>> fdir = dkl.get_dbdir(ext)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

ext: str

File type

Outputs:
fdir: str

Absolute path to ext datakit folder

Keys:
  • MODULE_DIR

  • DB_DIR

  • DB_DIRS_BY_TYPE

Versions:
  • 2021-07-07 @ddalle: v1.0

get_dbdir_by_type(ext)

Get datakit directory for given file type

Call:
>>> fdir = dkl.get_dbdir_by_type(ext)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

ext: str

File extension type

Keys:
  • MODULE_DIR

  • DB_DIR

  • DB_DIRS_BY_TYPE

Outputs:
fdir: str

Absolute path to ext datakit folder

Versions:
  • 2021-06-29 @ddalle: v1.0

get_dbfile(fname, ext)

Get a file name relative to the datakit folder

Call:
>>> fabs = dkl.get_dbfile(fname, ext)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: None | str

Name of file relative to DB_DIRS_BY_TYPE for ext

ext: str

File type

Outputs:
fabs: str

Absolute path to file

Keys:
  • MODULE_DIR

  • DB_DIR

  • DB_DIRS_BY_TYPE

Versions:
  • 2021-07-07 @ddalle: v1.0

get_dbfiles(dbname, ext)

Get list of datakit filenames for specified type

Call:
>>> fnames = dkl.get_dbfiles(dbname, ext)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

dbname: None | str

Database name (default if None)

ext: str

File type

Outputs:
fnames: list[str]

Absolute paths to files for datakit

Keys:
  • MODULE_DIR

  • DB_DIR

  • DB_DIRS_BY_TYPE

  • DB_SUFFIXES_BY_TYPE

Versions:
  • 2021-07-07 @ddalle: v1.0
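
A hedged sketch combining get_dbfiles() with check_file(), using the default database name:

# list absolute .mat file names for the default database name,
# then report whether each exists (or has a .dvc stub)
for fabs in dkl.get_dbfiles(None, "mat"):
    print(fabs, dkl.check_file(fabs))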

get_rawdata_opt(opt, remote='origin', vdef=None)

Get a rawdata/datakit-sources.json setting

Call:
>>> v = dkl.get_rawdata_opt(opt, remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

opt: str

Name of option to read

remote: {"origin"} | str

Name of remote from which to read opt

vdef: {None} | any

Default value if opt not present

Outputs:
v: {vdef} | any

Value from JSON file if possible, else vdef

Versions:
  • 2021-09-01 @ddalle: v1.0

  • 2022-01-26 @ddalle: v1.1; add substitutions
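
For example, reading two option names that appear in the sample rawdata/datakit-sources.json shown under update_rawdata() below:

# "url" and "branch" are options from the sample JSON; vdef supplies
# a fallback when the option is absent for the named remote
url = dkl.get_rawdata_opt("url", remote="origin")
branch = dkl.get_rawdata_opt("branch", remote="origin", vdef="main")
print(url, branch)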

get_rawdata_ref(remote='origin')

Get optional SHA-1 hash, tag, or branch for raw data source

Call:
>>> ref = dkl.get_rawdata_ref(remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote

Outputs:
ref: {"HEAD"} | str

Valid git reference name

Versions:
  • 2021-09-01 @ddalle: v1.0

get_rawdata_remotelist()

Get list of remotes from rawdata/datakit-sources.json

Call:
>>> remotes = dkl.get_rawdata_remotelist()
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

Outputs:
remotes: list[str]

List of remotes

Versions:
  • 2021-09-02 @ddalle: v1.0

get_rawdata_sourcecommit(remote='origin')

Get the latest used SHA-1 hash for a remote

Call:
>>> sha1 = dkl.get_rawdata_sourcecommit(remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote from which to read opt

Outputs:
sha1: None | str

40-character SHA-1 hash if possible from datakit-sources-commit.json

Versions:
  • 2021-09-02 @ddalle: v1.0

get_rawdatadir()

Get absolute path to module’s raw data folder

Call:
>>> fdir = dkl.get_rawdatadir()
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

Outputs:
fdir: str

Absolute path to raw data folder

Keys:
  • MODULE_DIR

  • RAWDATA_DIR

Versions:
  • 2021-07-08 @ddalle: v1.0

get_rawdatafilename(fname, dvc=False)

Get a file name relative to the raw data folder

Call:
>>> fabs = dkl.get_rawdatafilename(fname, dvc=False)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: None | str

Name of file relative to RAWDATA_DIR

dvc: True | {False}

Option to pull DVC file if fabs doesn’t exist

Outputs:
fabs: str

Absolute path to raw data file

Keys:
  • MODULE_DIR

  • RAWDATA_DIR

Versions:
  • 2021-07-07 @ddalle: v1.0

get_rawdataremote_git(remote='origin', f=False)

Get full URL and SHA-1 hash for raw data source repo

Call:
>>> url, sha1 = dkl.get_rawdataremote_git(remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote

f: True | {False}

Option to override dkl.rawdata_remotes if present

Outputs:
url: None | str

Full path to valid git repo, if possible

sha1: None | str

40-character hash of specified commit, if possible

Versions:
  • 2021-09-01 @ddalle: v1.0

get_rawdataremote_gitfiles(remote='origin')

List all files in candidate raw data remote source

Call:
>>> fnames = dkl.get_rawdataremote_gitfiles(remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote

Outputs:
fnames: list[str]

List of files to be copied from remote repo

Versions:
  • 2021-09-01 @ddalle: v1.0

get_rawdataremote_rsync(remote='origin')

Get full URL for rsync raw data source repo

If several options are present, this function checks for the first with an extant folder.

Call:
>>> url = dkl.get_rawdataremote_rsync(remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote

Outputs:
url: None | str

Full path to valid rsync source folder, if possible

Versions:
  • 2021-09-02 @ddalle: v1.0

get_rawdataremote_rsyncfiles(remote='origin')

List all files in candidate remote folder

Call:
>>> fnames = dkl.get_rawdataremote_rsyncfiles(remote)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote

Outputs:
fnames: list[str]

List of files to be copied from remote repo

Versions:
  • 2021-09-02 @ddalle: v1.0

import_db_name(dbname=None)

Import first available module based on a DB name

This utilizes the following parameters:

  • DB_NAME

  • DB_NAME_REGEX_LIST

  • DB_NAME_REGEX_GROUPS

  • MODULE_NAME_TEMPLATE_LIST

Call:
>>> mod = dkl.import_db_name(dbname=None)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

dbname: {None} | str

Database name to parse (default: DB_NAME)

Outputs:
mod: module

Module with DB_NAME equal to dbname

Versions:
  • 2021-07-15 @ddalle: v1.0
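
A short sketch with a hypothetical database name:

# resolve the DB name to the first importable candidate module
mod = dkl.import_db_name("SLS-10-afa")  # hypothetical DB name
print(mod.__name__)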

list_rawdataremote_git(remote='origin')

List all files in candidate raw data remote source

Call:
>>> ls_files = dkl.list_rawdataremote_git(remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote

Outputs:
ls_files: list[str]

List of all files tracked by remote repo

Versions:
  • 2021-09-01 @ddalle: v1.0

list_rawdataremote_rsync(remote='origin')

List all files in candidate raw data remote folder

Call:
>>> ls_files = dkl.list_rawdataremote_rsync(remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote

Outputs:
ls_files: list[str]

List of all files in remote source folder

Versions:
  • 2021-09-02 @ddalle: v1.0

make_db_name()

Retrieve or create database name from module name

This utilizes the following parameters:

  • MODULE_NAME_REGEX_LIST

  • MODULE_NAME_REGEX_GROUPS

  • DB_NAME_TEMPLATE_LIST

Call:
>>> dbname = dkl.make_db_name()
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

Outputs:
dbname: str

Prescribed datakit name

Versions:
  • 2021-06-28 @ddalle: v1.0

prep_dirs(frel)

Prepare folders needed for a file

Any folders in frel that don’t exist will be created. For example "db/csv/datakit.csv" will create the folders db/ and db/csv/ if they don’t already exist.

Call:
>>> dkl.prep_dirs(frel)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

frel: str

Name of file relative to MODULE_DIR

fabs: str

Existing absolute path

Keys:
  • MODULE_DIR

Versions:
  • 2021-07-07 @ddalle: v1.0
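
Mirroring the db/csv/datakit.csv example above:

# creates db/ and db/csv/ under MODULE_DIR if they don't already exist
dkl.prep_dirs("db/csv/datakit.csv")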

prep_dirs_rawdata(frel)

Prepare folders relative to rawdata/ folder

Call:
>>> dkl.prep_dirs_rawdata(frel)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

frel: str

Name of file relative to rawdata/ folder

fabs: str

Existing absolute path

Keys:
  • MODULE_DIR

Versions:
  • 2021-09-01 @ddalle: v1.0

read_db_csv(cls=None, **kw)

Read a datakit using .csv file type

Call:
>>> db = dkl.read_db_csv(cls=None, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

cls: {None} | type

Class to read fname other than dkl["DATAKIT_CLS"]

Outputs:
db: dkl["DATAKIT_CLS"] | cls

DataKit instance read from fname

Versions:
  • 2021-07-03 @ddalle: v1.0

read_db_mat(cls=None, **kw)

Read a datakit using .mat file type

Call:
>>> db = dkl.read_db_mat(cls=None, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

cls: {None} | type

Class to read fname other than dkl["DATAKIT_CLS"]

Outputs:
db: dkl["DATAKIT_CLS"] | cls

DataKit instance read from fname

Versions:
  • 2021-07-03 @ddalle: v1.0

read_db_name(dbname=None)

Read datakit from first available module based on a DB name

This utilizes the following parameters:

  • DB_NAME

  • DB_NAME_REGEX_LIST

  • DB_NAME_REGEX_GROUPS

  • MODULE_NAME_TEMPLATE_LIST

Call:
>>> db = dkl.read_db_name(dbname=None)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

dbname: {None} | str

Database name to parse (default: DB_NAME)

Outputs:
db: DataKit

Output of read_db() from module with DB_NAME equal to dbname

Versions:
  • 2021-09-10 @ddalle: v1.0

read_dbfile(fname, ext, **kw)

Read a databook file from DB_DIR

Call:
>>> db = dkl.read_dbfile(fname, ext, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: None | str

Name of file to read from DB_DIR

ext: str

Database file type

ftype: {"mat"} | None | str

Optional specifier to predetermine file type

cls: {None} | type

Class to read fname other than dkl["DATAKIT_CLS"]

Keys:
  • MODULE_DIR

  • DB_DIR

Outputs:
db: dkl["DATAKIT_CLS"] | cls

DataKit instance read from fname

Versions:
  • 2021-06-25 @ddalle: v1.0

read_dbfile_csv(fname, **kw)

Read a .csv file from DB_DIR

Call:
>>> db = dkl.read_dbfile_csv(fname, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: str

Name of file to read from DB_DIR

ftype: {"csv"} | None | str

Optional specifier to predetermine file type

cls: {None} | type

Class to read fname other than dkl["DATAKIT_CLS"]

kw: dict

Additional keyword arguments passed to cls

Outputs:
db: dkl["DATAKIT_CLS"] | cls

DataKit instance read from fname

Versions:
  • 2021-06-25 @ddalle: v1.0

read_dbfile_csv_rbf(fname, **kw)

Read a .csv file of radial basis function (RBF) data from DB_DIR

Call:
>>> db = dkl.read_dbfile_csv_rbf(fname, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: str

Name of file to read from DB_DIR

ftype: {"csv"} | None | str

Optional specifier to predetermine file type

cls: {None} | type

Class to read fname other than dkl["DATAKIT_CLS"]

kw: dict

Additional keyword arguments passed to cls

Outputs:
db: dkl["DATAKIT_CLS"] | cls

DataKit instance read from fname

Versions:
  • 2021-06-25 @ddalle: v1.0

read_dbfile_mat(fname, **kw)

Read a .mat file from DB_DIR

Call:
>>> db = dkl.read_dbfile_mat(fname, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: str

Name of file to read from DB_DIR

ftype: {"mat"} | None | str

Optional specifier to predetermine file type

cls: {None} | type

Class to read fname other than dkl["DATAKIT_CLS"]

kw: dict

Additional keyword arguments passed to cls

Outputs:
db: dkl["DATAKIT_CLS"] | cls

DataKit instance read from fname

Versions:
  • 2021-06-25 @ddalle: v1.0

read_rawdata_json(fname='datakit-sources.json', f=False)

Read datakit-sources.json from package’s raw data folder

Call:
>>> dkl.read_rawdata_json(fname="datakit-sources.json", f=False)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: {"datakit-sources.json"} | str

Relative or absolute file name (rel. to rawdata/)

f: True | {False}

Reread even if dkl.rawdata_sources is nonempty

Effects:
dkl.rawdata_sources: dict

Settings read from JSON file

Versions:
  • 2021-09-01 @ddalle: v1.0

read_rawdatafile(fname, ftype=None, cls=None, **kw)

Read a file from the raw data folder

Call:
>>> db = dkl.read_rawdatafile(fname, ftype=None, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fname: str

Name of file to read from raw data folder

ftype: {None} | str

Optional specifier to predetermine file type

cls: {None} | type

Class to read fname other than dkl["DATAKIT_CLS"]

kw: dict

Additional keyword arguments passed to cls

Outputs:
db: dkl["DATAKIT_CLS"] | cls

DataKit instance read from fname

update_rawdata(**kw)

Update raw data using rawdata/datakit-sources.json

The settings for zero or more “remotes” are read from that JSON file in the package’s rawdata/ folder. Example contents of such a file are shown below:

{
    "hub": [
        "/nobackup/user/",
        "pfe:/nobackupp16/user/git",
        "linux252:/nobackup/user/git"
    ],
    "remotes": {
        "origin": {
            "url": "data/datarepo.git",
            "type": "git-show",
            "glob": "aero_STACK*.csv",
            "regex": [
                "aero_CORE_no_[a-z]+\\.csv",
                "aero_LSRB_no_[a-z]+\\.csv",
                "aero_RSRB_no_[a-z]+\\.csv"
            ],
            "commit": null,
            "branch": "main",
            "tag": null,
            "destination": "."
        }
    }
}
Call:
>>> dkl.update_rawdata(remote=None, remotes=None)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {None} | str

Name of single remote to update

remotes: {None} | list[str]

Name of multiple remotes to update

Versions:
  • 2021-09-02 @ddalle: v1.0

  • 2022-01-18 @ddalle: v1.1; remote(s) kwarg
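
A short usage sketch against the sample JSON above:

dkl.update_rawdata()                 # update all configured remotes
dkl.update_rawdata(remote="origin")  # update a single named remote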

update_rawdata_remote(remote='origin')

Update raw data for one remote

Call:
>>> dkl.update_rawdata_remote(remote="origin")
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

remote: {"origin"} | str

Name of remote

Versions:
  • 2021-09-02 @ddalle: v1.0

write_db_csv(readfunc, f=True, db=None, **kw)

Write (all) canonical db CSV file(s)

Call:
>>> db = dkl.write_db_csv(readfunc, f=True, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

readfunc: callable

Function to read source datakit if needed

f: {True} | False

Overwrite existing data file(s)

db: {None} | DataKit

Existing source datakit to write

cols: {None} | list

If dkl has more than one file, cols must be a list of lists specifying which columns to write to each file

dvc: True | {False}

Option to add and push data file using dvc

Outputs:
db: None | DataKit

If source datakit is read during execution, return it to be used in other write functions

Versions:
  • 2021-09-10 @ddalle: v1.0
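
A hedged sketch of the readfunc pattern; the reader body is hypothetical, and the db return value lets later writers reuse an already-read source datakit:

def read_source():
    # hypothetical reader: build the source DataKit from raw data
    return dkl.read_rawdatafile("datakit.csv")

# the reader is only called if the CSV file(s) must be (re)written
db = dkl.write_db_csv(read_source, f=False)
if db is not None:
    # the source was read; pass it along so the MAT writer can reuse it
    dkl.write_db_mat(read_source, f=False, db=db)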

write_db_mat(readfunc, f=True, db=None, **kw)

Write (all) canonical db MAT file(s)

Call:
>>> db = dkl.write_db_mat(readfunc, f=True, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

readfunc: callable

Function to read source datakit if needed

f: {True} | False

Overwrite existing data file(s)

db: {None} | DataKit

Existing source datakit to write

cols: {None} | list

If dkl has more than one file, cols must be a list of lists specifying which columns to write to each file

dvc: True | {False}

Option to add and push data file using dvc

Outputs:
db: None | DataKit

If source datakit is read during execution, return it to be used in other write functions

Versions:
  • 2021-09-10 @ddalle: v1.0

write_db_xlsx(readfunc, f=True, db=None, **kw)

Write (all) canonical db XLSX file(s)

Call:
>>> db = dkl.write_db_xlsx(readfunc, f=True, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

readfunc: callable

Function to read source datakit if needed

f: {True} | False

Overwrite existing data file(s)

db: {None} | DataKit

Existing source datakit to write

cols: {None} | list

If dkl has more than one file, cols must be a list of lists specifying which columns to write to each file

dvc: True | {False}

Option to add and push data file using dvc

Outputs:
db: None | DataKit

If source datakit is read during execution, return it to be used in other write functions

Versions:
  • 2022-12-14 @ddalle: v1.0

write_dbfile_csv(fcsv, readfunc, f=True, db=None, **kw)

Write a canonical db CSV file

Call:
>>> db = dkl.write_dbfile_csv(fcsv, readfunc, f=True, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fcsv: str

Name of file to write

readfunc: callable

Function to read source datakit if needed

f: {True} | False

Overwrite fcsv if it exists

db: {None} | DataKit

Existing source datakit to write

dvc: True | {False}

Option to add and push data file using dvc

Outputs:
db: None | DataKit

If source datakit is read during execution, return it to be used in other write functions

Versions:
  • 2021-09-10 @ddalle: v1.0

  • 2021-09-15 @ddalle: v1.1; check for DVC stub

  • 2021-09-15 @ddalle: v1.2; add dvc option

write_dbfile_mat(fmat, readfunc, f=True, db=None, **kw)

Write a canonical db MAT file

Call:
>>> db = dkl.write_dbfile_mat(fmat, readfunc, f=True, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fmat: str

Name of file to write

readfunc: callable

Function to read source datakit if needed

f: {True} | False

Overwrite fmat if it exists

db: {None} | DataKit

Existing source datakit to write

dvc: True | {False}

Option to add and push data file using dvc

Outputs:
db: None | DataKit

If source datakit is read during execution, return it to be used in other write functions

Versions:
  • 2021-09-10 @ddalle: v1.0

  • 2021-09-15 @ddalle: v1.1; check for DVC stub

  • 2021-09-15 @ddalle: v1.2; add dvc option

write_dbfile_xlsx(fxls, readfunc, f=True, db=None, **kw)

Write a canonical db XLSX file

Call:
>>> db = dkl.write_dbfile_xlsx(fxls, readfunc, f=True, **kw)
Inputs:
dkl: DataKitLoader

Tool for reading datakits for a specific module

fxls: str

Name of file to write

readfunc: callable

Function to read source datakit if needed

f: {True} | False

Overwrite fxls if it exists

db: {None} | DataKit

Existing source datakit to write

dvc: True | {False}

Option to add and push data file using dvc

Outputs:
db: None | DataKit

If source datakit is read during execution, return it to be used in other write functions

Versions:
  • 2022-12-14 @ddalle: v1.0