Code documentation¶
esgfetchini¶
platform: | Unix |
---|---|
synopsis: | Toolbox to prepare ESGF data for publication. |
-
esgprep.esgfetchini.
get_args
(args=None)[source]¶ Returns parsed command-line arguments.
Returns: The argument parser Return type: argparse.Namespace
platform: | Unix |
---|---|
synopsis: | Constants used in this module. |
platform: | Unix |
---|---|
synopsis: | Processing context used in this module. |
-
class
esgprep.fetchini.context.
ProcessingContext
(args)[source]¶ Encapsulates the processing context/information for main process.
Parameters: args (ArgumentParser) – The command-line arguments parser Returns: The processing context Return type: ProcessingContext
platform: | Unix |
---|---|
synopsis: | Fetches ESGF configuration files from GitHub repository. |
-
esgprep.fetchini.main.
make_outdir
(root)[source]¶ Build the output directory as follows:
Parameters: root (str) – The root directory
-
esgprep.fetchini.main.
run
(args)[source]¶ Main process that:
- Decides whether to fetch depending on file presence/absence and command-line arguments,
- Gets the GitHub file content from full API URL,
- Backs up the old file if desired,
- Writes response into INI file.
Parameters: args (ArgumentParser) – Parsed command-line arguments
esgfetchtables¶
platform: | Unix |
---|---|
synopsis: | Toolbox to prepare ESGF data for publication. |
-
esgprep.esgfetchtables.
get_args
()[source]¶ Returns parsed command-line arguments.
Returns: The argument parser Return type: argparse.Namespace
platform: | Unix |
---|---|
synopsis: | Constants used in this module. |
platform: | Unix |
---|---|
synopsis: | Processing context used in this module. |
-
class
esgprep.fetchtables.context.
ProcessingContext
(args)[source]¶ Encapsulates the processing context/information for main process.
Parameters: args (ArgumentParser) – The command-line arguments parser Returns: The processing context Return type: ProcessingContext
platform: | Unix |
---|---|
synopsis: | Fetches ESGF configuration files from GitHub repository. |
-
esgprep.fetchtables.main.
make_outdir
(tables_dir, repository, reference=None)[source]¶ Build the output directory.
Parameters: - tables_dir (str) – The CMOR tables directory submitted
- repository (str) – The GitHub repository name
- reference (str) – The GitHub reference name (tag or branch)
-
esgprep.fetchtables.main.
get_special_case
(f, url, repo, ref, auth)[source]¶ Get a dictionary of (filename -> file_info) pairs to be used for named files in place of the file info from the general API call done for the directory. file_info should contain at least the elements ‘sha’ and ‘download_url’
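As a rough illustration of the mapping described above, the sketch below shows one possible shape for the returned dictionary and how a caller might prefer it over the general directory listing. The file name, SHA and URL are placeholders, and `file_info_for` is a hypothetical helper that is not part of esgprep.

```python
# Hypothetical example of a (filename -> file_info) mapping.
# All values are placeholders, not real GitHub data.
special_cases = {
    "CMIP6_CV.json": {
        "sha": "0123456789abcdef0123456789abcdef01234567",
        "download_url": "https://example.org/CMIP6_CV.json",
    },
}


def file_info_for(name, general_info, special_cases=None):
    """Prefer the special-case entry for a named file, else the general one."""
    return (special_cases or {}).get(name, general_info)
```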
-
esgprep.fetchtables.main.
fetch_gh_ref
(url, outdir, auth, keep, overwrite, backup_mode, filter, special_cases=None)[source]¶ Fetch all files for a single reference (e.g. tag or branch) of a GitHub repository
-
esgprep.fetchtables.main.
run
(args)[source]¶ Main process that:
- Decides whether to fetch depending on file presence/absence and command-line arguments,
- Gets the GitHub file content from full API URL,
- Backs up the old file if desired,
- Writes response into table file.
Parameters: args (ArgumentParser) – Parsed command-line arguments
esgdrs¶
platform: | Unix |
---|---|
synopsis: | Toolbox to prepare ESGF data for publication. |
-
esgprep.esgdrs.
get_args
()[source]¶ Returns parsed command-line arguments.
Returns: The argument parser Return type: argparse.Namespace
platform: | Unix |
---|---|
synopsis: | Constants used in this module. |
platform: | Unix |
---|---|
synopsis: | Custom exceptions used in this module. |
-
exception
esgprep.drs.custom_exceptions.
DuplicatedDataset
(path, version)[source]¶ Raised if a dataset already exists with submitted version.
-
exception
esgprep.drs.custom_exceptions.
OlderUpgrade
(version, latest)[source]¶ Raised if the submitted version is older than the latest existing version.
-
exception
esgprep.drs.custom_exceptions.
DuplicatedFile
(latest, upgrade)[source]¶ Raised if a NetCDF file already exists in the submitted dataset version.
-
exception
esgprep.drs.custom_exceptions.
UnchangedTrackingID
(latest, latest_id, upgrade, upgrade_id)[source]¶ Raised if a NetCDF file already has the tracking ID of submitted file to upgrade.
-
exception
esgprep.drs.custom_exceptions.
NoVersionPattern
(regex, patterns)[source]¶ Raised if no version facet found in the destination format.
-
exception
esgprep.drs.custom_exceptions.
ReadAccessDenied
(user, path)[source]¶ Raised when user has no read access.
-
exception
esgprep.drs.custom_exceptions.
WriteAccessDenied
(user, path)[source]¶ Raised when user has no write access.
-
exception
esgprep.drs.custom_exceptions.
CrossMigrationDenied
(src, dst, mode)[source]¶ Raised when migration fails for cross-device link.
-
exception
esgprep.drs.custom_exceptions.
MigrationDenied
(src, dst, mode, reason)[source]¶ Raised when migration fails in another case.
-
exception
esgprep.drs.custom_exceptions.
InconsistentDRSPath
(project, path)[source]¶ Raised when DRS path doesn’t start with the project ID.
platform: | Unix |
---|---|
synopsis: | Processing context used in this module. |
-
class
esgprep.drs.context.
ProcessingContext
(args)[source]¶ Encapsulates the processing context/information for main process.
Parameters: args (ArgumentParser) – Parsed command-line arguments Returns: The processing context Return type: ProcessingContext
platform: | Unix |
---|---|
synopsis: | Class to handle dataset directory for DRS management. |
-
class
esgprep.drs.handler.
File
(ffp)[source]¶ Handler providing methods to deal with file processing.
-
get
(key)[source]¶ Returns the attribute value corresponding to the key. The submitted key can refer to File.key or File.attributes[key].
Parameters: key (str) – The key Returns: The corresponding value Return type: str or list or dict depending on the key Raises: Error – If unknown key
-
load_attributes
(root, pattern, set_values)[source]¶ Loads DRS attributes captured from a regular expression match. The root facet is added by default. The dataset version is initially set to None. Values can be overwritten by “set_values” pairs if submitted.
Parameters: - root (str) – The DRS tree root
- pattern (str) – The regular expression to match
- set_values (dict) – Key/value pairs of facet to set for the run
Raises: - Error – If regular expression matching fails
- Error – If invalid NetCDF file.
-
check_facets
(facets, config, set_keys)[source]¶ Checks each facet against the controlled vocabulary. If a DRS attribute is missing from the list of facets, the DRS attributes are completed from the configuration file maptables. For a non-standard attribute, the most similar key among the netCDF attribute names is used. Attributes can be directly mapped with “set_keys” pairs if submitted.
Parameters: - facets (list) – The list of facet to check
- config (ESGConfigParser.SectionParser) – The configuration parser
- set_keys (dict) – Key/Attribute pairs to map for the run
Raises: Error – If one facet checkup fails
-
-
class
esgprep.drs.handler.
DRSPath
(parts)[source]¶ Handler providing methods to deal with paths.
-
get
(key)[source]¶ Returns the attribute value corresponding to the key. The submitted key can refer to the DRS dataset parts or the DRS file parts.
Parameters: key (str) – The key Returns: The value Return type: str or list or dict depending on the key Raises: Error – If unknown key
-
items
(d_part=True, f_part=True, version=True, file_folder=False, latest=False, root=False)[source]¶ Itemizes the facet values along the DRS path. Flags can be combined to obtain different behaviors.
Parameters: - d_part (boolean) – True to append the dataset facets
- f_part (boolean) – True to append the file facets
- version (boolean) – True to append the version facet
- file_folder (boolean) – True to append the folder for physical files
- latest (boolean) – True to switch from upgrade to latest version
- root (boolean) – True to prepend the DRS root directory
Returns: The corresponding facet values
Return type: list
-
path
(**kwargs)[source]¶ Converts a list of facet values into a path. The arguments are the same as esgprep.drs.handler.DRSPath.items().
Returns: The path Return type: str
-
-
class
esgprep.drs.handler.
DRSLeaf
(src, dst, mode, origin)[source]¶ Handler providing methods to deal with DRS file.
-
upgrade
(todo_only=True, commands_file=None)[source]¶ Upgrade the DRS tree.
Parameters: - commands_file (str) – The file to write command-lines statement if submitted
- todo_only (boolean) – True to only print Unix command-lines to apply (i.e., as dry-run)
-
has_permissions
(root)[source]¶ Checks permissions for DRS leaf migration. Discards relative paths.
Parameters: root (str) – The DRS tree root Raises: Error – If missing user privileges
-
migration_granted
(root)[source]¶ Checks if migration mode is allowed by filesystem. Basically, copy or move will always succeed. Only hardlinks could fail depending on the filesystem partition.
Parameters: root (str) – The DRS tree root Raises: Error – If migration is disallowed by filesystem configuration
-
-
class
esgprep.drs.handler.
DRSTree
(root=None, version=None, mode=None, outfile=None)[source]¶ Handler providing methods to deal with DRS tree.
-
create_leaf
(nodes, leaf, label, src, mode, origin=None, force=False)[source]¶ Creates all upstream nodes to a DRS leaf. The esgprep.drs.handler.DRSLeaf() class is added to data leaf nodes.
Parameters: - nodes (list) – The list of node tags to the leaf
- leaf (str) – The leaf name
- label (str) – The leaf label
- src (str) – The source of the leaf
- mode (str) – The migration mode (e.g., ‘copy’, ‘move’, etc.)
- origin (str) – The original file full path used for the DRSLeaf source
- force (boolean) – Overwrite node creation if True and node exists
-
check_uniqueness
()[source]¶ Checks tree upgrade uniqueness. Each data version to upgrade has to be strictly different from the latest version, if one exists.
-
todo
()[source]¶ A dry run of esgprep.drs.handler.DRSTree.upgrade() that only prints the command lines to run.
-
-
esgprep.drs.handler.
print_cmd
(line, commands_file, todo_only, mode='a')[source]¶ Prints a Unix command line depending on the chosen output and DRS action.
Parameters: - line (str) – The command-line to write.
- commands_file (str) – The output file to write command-lines, None if not.
- todo_only (boolean) – True to only print Unix command-lines to apply (i.e., as dry-run)
- mode (str) – File open() mode
platform: | Unix |
---|---|
synopsis: | Manages the filesystem tree according to the project's Data Reference Syntax and versioning. |
-
esgprep.drs.main.
process
(collector_input)[source]¶ File process that:
- Handles files,
- Deduces facet key/value pairs from file attributes,
- Checks facet values against CV,
- Applies the versioning,
- Populates the DRS tree, creating the appropriate leaves,
- Stores dataset statistics.
Parameters: source (str) – The file full path to process
-
esgprep.drs.main.
tree_builder
(fh)[source]¶ Builds the DRS tree according to a source
Parameters: fh (esgprep.drs.handler.File) – The file handler object
-
esgprep.drs.main.
initializer
(keys, values)[source]¶ Initialize process context by setting particular variables as global variables.
Parameters:
-
esgprep.drs.main.
do_scanning
(ctx)[source]¶ Returns True if file scanning is necessary regarding command-line arguments
Parameters: ctx (esgprep.drs.context.ProcessingContext) – New processing context to evaluate Returns: True if file scanning is necessary Return type: boolean
-
esgprep.drs.main.
run
(args)[source]¶ Main process that:
- Instantiates processing context,
- Loads previous program instance,
- Parallelizes file processing with thread pools,
- Applies command-line action to the whole DRS tree,
- Evaluates exit status.
Parameters: args (ArgumentParser) – The command-line arguments parser
esgcheckvocab¶
platform: | Unix |
---|---|
synopsis: | Toolbox to prepare ESGF data for publication. |
-
esgprep.esgcheckvocab.
get_args
()[source]¶ Returns parsed command-line arguments.
Returns: The argument parser Return type: argparse.Namespace
platform: | Unix |
---|---|
synopsis: | Constants used in this module. |
platform: | Unix |
---|---|
synopsis: | Processing context used in this module. |
-
class
esgprep.checkvocab.context.
ProcessingContext
(args)[source]¶ Encapsulates the processing context/information for main process.
Parameters: args (ArgumentParser) – The command-line arguments parser Returns: The processing context Return type: ProcessingContext
platform: | Unix |
---|---|
synopsis: | Custom exceptions used in this module. |
platform: | Unix |
---|---|
synopsis: | Checks DRS vocabulary against configuration files. |
-
esgprep.checkvocab.main.
process
(collector_input)[source]¶ Data process that:
- Retrieves facet key/value pairs from file or directory attributes
Parameters: source (str) – The file full path to process or the dataset ID
-
esgprep.checkvocab.main.
initializer
(keys, values)[source]¶ Initialize process context by setting particular variables as global variables.
Parameters:
-
esgprep.checkvocab.main.
run
(args)[source]¶ Main process that:
- Instantiates processing context,
- Parses the configuration files options and values,
- Deduces facets and values from directories or dataset lists,
- Compares the values of each facet between both,
- Prints or logs the check results.
Parameters: args (ArgumentParser) – The command-line arguments parser
esgmapfile¶
platform: | Unix |
---|---|
synopsis: | Toolbox to prepare ESGF data for publication. |
-
esgprep.esgmapfile.
get_args
()[source]¶ Returns parsed command-line arguments.
Returns: The argument parser Return type: argparse.Namespace
platform: | Unix |
---|---|
synopsis: | Constants used in this module. |
platform: | Unix |
---|---|
synopsis: | Processing context used in this module. |
-
class
esgprep.mapfile.context.
ProcessingContext
(args)[source]¶ Encapsulates the processing context/information for main process.
Parameters: args (ArgumentParser) – The command-line arguments parser Returns: The processing context Return type: ProcessingContext
platform: | Unix |
---|---|
synopsis: | Custom exceptions used in this module. |
-
exception
esgprep.mapfile.custom_exceptions.
InconsistentDatasetID
(project, dset_id)[source]¶ Raised when dataset ID doesn’t start with the project ID.
platform: | Unix |
---|---|
synopsis: | Class to handle files for mapfile generation. |
-
class
esgprep.mapfile.handler.
Source
(source)[source]¶ Handler providing methods to deal with file processing.
-
get
(key)[source]¶ Returns the attribute value corresponding to the key. The submitted key can refer to File.key or File.attributes[key].
Parameters: key (str) – The key Returns: The corresponding value Return type: str or list or dict depending on the key Raises: Error – If unknown key
-
load_attributes
(pattern)[source]¶ Loads DRS attributes captured from a regular expression match. The project facet is always added, in lower case.
Parameters: pattern (str) – The regular expression to match Raises: Error – If regular expression matching fails
-
check_facets
(facets, config)[source]¶ Checks each facet against the controlled vocabulary. If a DRS attribute is missing from the list of facets, the DRS attributes are completed from the configuration file maptables.
Parameters: - facets (list) – The list of facet to check
- config (ESGConfigParser.SectionParser) – The configuration parser
Raises: Error – If one facet checkup fails
-
get_dataset_id
(dataset_format)[source]¶ Builds the dataset identifier from the dataset template interpolation.
Parameters: dataset_format (str) – The dataset template pattern Returns: The resulting dataset identifier Return type: str Raises: Error – If a facet key is missing
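A minimal sketch of template interpolation, assuming the dataset template uses Python's `%(facet)s` placeholders (an assumption, since the template syntax is not stated here). `build_dataset_id` and the facet values are hypothetical and only illustrate how a missing facet key can surface as an error.

```python
def build_dataset_id(dataset_format, attributes):
    """Interpolate facet values into a '%(facet)s'-style template."""
    try:
        return dataset_format % attributes
    except KeyError as exc:
        # A placeholder with no matching facet ends up here.
        raise KeyError("Missing facet key: {}".format(exc.args[0]))


# Illustrative facet values only.
attributes = {"project": "cmip6", "institute": "IPSL", "model": "IPSL-CM6A-LR"}
dataset_id = build_dataset_id("%(project)s.%(institute)s.%(model)s", attributes)
```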
-
get_dataset_version
(no_version=False)[source]¶ Retrieve the dataset version. If the version facet cannot be deduced from full path, it follows the symlink to complete the DRS attributes.
Parameters: no_version (boolean) – True to not append version to the dataset ID Returns: The dataset version Return type: str
-
platform: | Unix |
---|---|
synopsis: | Generates ESGF mapfiles upon a local ESGF node or not. |
-
esgprep.mapfile.main.
get_output_mapfile
(outdir, attributes, mapfile_name, dataset_id, dataset_version, mapfile_drs=None, basename=False)[source]¶ Builds the mapfile full path depending on:
- the --mapfile name using tokens,
- an optional mapfile tree declared in configuration file with mapfile_drs,
- the --outdir output directory.
Parameters: - outdir (str) – The output directory (default is current working directory)
- attributes (dict) – The facet values deduced from the file full path
- mapfile_name (str) – An optional mapfile name from the command-line
- dataset_id (str) – The dataset id
- dataset_version (str) – The dataset version
- mapfile_drs (str) – The optional mapfile tree
- basename (boolean) – True to only get mapfile name without root directory
Returns: The mapfile full path
Return type: str
-
esgprep.mapfile.main.
mapfile_entry
(dataset_id, dataset_version, ffp, size, optional_attrs)[source]¶ Builds the mapfile entry corresponding to a processed file.
Parameters: - dataset_id (str) – The dataset id
- dataset_version (str) – The dataset version
- ffp (str) – The file full path
- size (str) – The file size
- optional_attrs (dict) – Optional attributes to append to mapfile lines
Returns: The mapfile line/entry
Return type: str
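The sketch below shows one way such an entry could be assembled. The ` | ` field separator, the `#` between dataset ID and version, and the `key=value` form of optional attributes are assumptions about the mapfile layout, not confirmed by this page; `build_mapfile_entry` is a hypothetical name.

```python
def build_mapfile_entry(dataset_id, dataset_version, ffp, size, optional_attrs):
    """Assemble one mapfile line from its parts (assumed layout)."""
    if dataset_version:
        fields = ["{}#{}".format(dataset_id, dataset_version)]
    else:
        fields = [dataset_id]
    fields.append(ffp)
    fields.append(str(size))
    # Only optional attributes with a value are appended.
    for key, value in optional_attrs.items():
        if value:
            fields.append("{}={}".format(key, value))
    return " | ".join(fields) + "\n"


entry = build_mapfile_entry(
    "cmip6.IPSL", "v20190101", "/data/tas.nc", 1024,
    {"mod_time": "1546300800", "checksum": None},
)
```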
-
esgprep.mapfile.main.
write
(outfile, entry)[source]¶ Inserts a mapfile entry. A lockfile prevents several threads from writing to the same file at the same time: a LockFile is acquired before writing and released afterwards. Acquiring the LockFile times out if it is held by another thread. Each process adds one line to the appropriate mapfile.
Parameters: - outfile (str) – The output mapfile full path
- entry (str) – The mapfile entry to write
-
esgprep.mapfile.main.
process
(source)[source]¶ File process that:
- Handles file,
- Harvests directory attributes,
- Checks DRS attributes against CV,
- Builds dataset ID,
- Retrieves file size,
- Does checksums,
- Deduces mapfile name,
- Writes the corresponding mapfile entry.
Any error leads to skip the file. It does not stop the process.
Parameters: source (str) – The source to process, either a path or a dataset ID Returns: The output mapfile full path Return type: str
utils¶
platform: | Unix |
---|---|
synopsis: | Useful functions to collect data from directories. |
-
class
esgprep.utils.collectors.
Collector
(sources, spinner=True)[source]¶ Base collector class to yield regular NetCDF files.
Parameters: sources (list) – The list of sources to parse Returns: The data collector Return type: iter
-
class
esgprep.utils.collectors.
PathCollector
(*args, **kwargs)[source]¶ Collector class to yield files from a list of directories to parse.
Parameters: dir_filter (str) – A regular expression to exclude directories from the collection
-
class
esgprep.utils.collectors.
VersionedPathCollector
(project, dir_format, *args, **kwargs)[source]¶ Collector class to yield files from a list of versioned directories to parse.
Parameters: dir_format (str) – The regular expression of the directory format
-
class
esgprep.utils.collectors.
DatasetCollector
(versioned=True, *args, **kwargs)[source]¶ Collector class to yield datasets from a list of files to read.
-
class
esgprep.utils.collectors.
FilterCollection
[source]¶ Regex dictionary with a call method to evaluate a string against several regular expressions. The dictionary values are 2-tuples with the regular expression as a string and a boolean indicating to match (i.e., include) or non-match (i.e., exclude) the corresponding expression.
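To make the idea concrete, here is a minimal sketch of a regex dictionary with a call method. `RegexFilters` is a hypothetical stand-in, and two details are assumptions: a string passes only when every filter agrees, and expressions are applied with `re.search`.

```python
import re


class RegexFilters(dict):
    """Hypothetical stand-in: name -> (regex string, inclusive flag)."""

    def add(self, name, regex, inclusive=True):
        self[name] = (regex, inclusive)

    def __call__(self, string):
        # Each filter passes when "found" equals its inclusive flag.
        return all(
            (re.search(regex, string) is not None) == inclusive
            for regex, inclusive in self.values()
        )


filters = RegexFilters()
filters.add("netcdf", r"\.nc$")                    # must match
filters.add("hidden", r"^\.", inclusive=False)     # must not match
```

Calling `filters("tas.nc")` then evaluates the string against both expressions at once.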
platform: | Unix |
---|---|
synopsis: | Constants used in this package. |
platform: | Unix |
---|---|
synopsis: | Useful functions to use with this package. |
-
class
esgprep.utils.context.
GitHubBaseContext
(args)[source]¶ Base manager class for esgfetch* modules.
-
class
esgprep.utils.context.
MultiprocessingContext
(args)[source]¶ Base manager class for esgmapfile, esgdrs and esgcheckvocab modules.
platform: | Unix |
---|---|
synopsis: | Custom exceptions used in this package. |
-
exception
esgprep.utils.custom_exceptions.
InvalidNetCDFFile
(path)[source]¶ Raised when invalid or corrupted NetCDF file.
-
exception
esgprep.utils.custom_exceptions.
NoNetCDFAttribute
(attribute, path, variable=None)[source]¶ Raised when a NetCDF attribute is missing.
-
exception
esgprep.utils.custom_exceptions.
KeyNotFound
(key, keys=None)[source]¶ Raised when a class key is not found.
-
exception
esgprep.utils.custom_exceptions.
InvalidChecksumType
(client)[source]¶ Raised when checksum type is unknown.
-
exception
esgprep.utils.custom_exceptions.
ChecksumFail
(path, checksum_type=None)[source]¶ Raised when a checksum fails.
-
exception
esgprep.utils.custom_exceptions.
NoFileFound
(paths)[source]¶ Raised when no file is found.
-
exception
esgprep.utils.custom_exceptions.
GitHubException
(msg)[source]¶ Basic exception for GitHub errors.
Raised when no read access on GitHub repo.
-
exception
esgprep.utils.custom_exceptions.
GitHubAPIRateLimit
(reset_time)[source]¶ Raised when GitHub API rate limit exceeded.
-
exception
esgprep.utils.custom_exceptions.
GitHubFileNotFound
[source]¶ Raised when no file found on GitHub repo.
-
exception
esgprep.utils.custom_exceptions.
GitHubConnectionError
[source]¶ Raised when the GitHub request fails.
-
exception
esgprep.utils.custom_exceptions.
GitHubReferenceNotFound
(ref, refs)[source]¶ Raised when invalid GitHub reference requested.
platform: | Unix |
---|---|
synopsis: | Useful functions to use with this package. |
-
class
esgprep.utils.custom_print.
COLOR
(color=None)[source]¶ Defines a color object for print statements. Default is no color (i.e., restore original color).
-
class
esgprep.utils.custom_print.
_TAGS
[source]¶ Tag strings for print statements. These are evaluated as properties, in order to defer evaluation until after enable_colors or disable_colors has been called during initialisation.
-
class
esgprep.utils.custom_print.
Print
[source]¶ Class to manage and dispatch print statement depending on log and debug mode.
platform: | Unix |
---|---|
synopsis: | Useful functions to use with this package. |
-
esgprep.utils.github.
gh_request_content
(url, auth=None)[source]¶ Gets the GitHub content of a file or a directory.
Parameters: - url (str) – The GitHub url to request
- auth (requests.auth.HTTPBasicAuth) – The authenticator object
Returns: The GitHub request content
Return type: requests.models.Response
Raises: - Error – If user not authorized to read GitHub repository
- Error – If user exceeds the GitHub API rate limit
- Error – If the queried content does not exist
- Error – If the GitHub request fails for other reasons
-
esgprep.utils.github.
backup
(f, mode=None)[source]¶ Backup a local file following different modes:
- “one_version” renames the existing file in its source directory adding a “.bkp” extension to the filename.
- “keep_versions” moves the existing file into a child directory called “bkp” and adds a timestamp to the filename.
Parameters: - f (str) – The file to backup
- mode (str) – The backup mode to follow
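The two backup modes can be sketched as follows. This is an illustration only: `backup_file` is a hypothetical name, and the timestamp format and its position in the filename are assumptions.

```python
import os
import shutil
from datetime import datetime


def backup_file(f, mode=None):
    """Back up a file and return the backup path (None if no mode is set)."""
    if mode == "one_version":
        # Same directory, ".bkp" appended to the filename.
        dst = f + ".bkp"
        os.rename(f, dst)
        return dst
    if mode == "keep_versions":
        # Child "bkp" directory, timestamp added to the filename.
        bkp_dir = os.path.join(os.path.dirname(f), "bkp")
        os.makedirs(bkp_dir, exist_ok=True)
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        dst = os.path.join(bkp_dir, "{}.{}".format(stamp, os.path.basename(f)))
        shutil.move(f, dst)
        return dst
    return None
```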
-
esgprep.utils.github.
write_content
(outfile, content)[source]¶ Write GitHub content into a file.
Parameters: - outfile (str) – The output file
- content (str) – The file content to write
-
esgprep.utils.github.
do_fetching
(f, remote_checksum, keep, overwrite)[source]¶ Returns True or False depending on decision schema
Parameters: - f (str) – The file to test
- remote_checksum (str) – The remote file checksum
- overwrite (boolean) – True if overwrite existing files
- keep (boolean) – True if keep existing files
Returns: True depending on the conditions
Return type: boolean
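The decision schema itself is not spelled out here. One plausible reading, given purely as an assumption, is sketched below with a hypothetical `should_fetch` that takes the local checksum directly (None when the local file is absent) so the logic can be followed without touching the filesystem.

```python
def should_fetch(local_checksum, remote_checksum, keep=False, overwrite=False):
    """Assumed decision order; not confirmed against the esgprep source."""
    if overwrite:
        return True      # always refresh when overwriting is requested
    if local_checksum is None:
        return True      # nothing local yet
    if local_checksum == remote_checksum:
        return False     # local copy is already up to date
    return not keep      # contents differ: fetch unless asked to keep
```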
-
esgprep.utils.github.
githash
(outfile)[source]¶ Makes Git checksum (as called by “git hash-object”) of a file
Parameters: outfile (str) – The file to hash Returns: The SHA1 sum
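For reference, "git hash-object" on a regular file computes the SHA-1 of a "blob <size>\0" header followed by the file bytes. The sketch below reproduces that on in-memory bytes; `git_blob_sha1` is a hypothetical helper, not the esgprep function.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """SHA-1 of the Git blob object holding `data`."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# Same value as: printf 'hello\n' | git hash-object --stdin
hello_sha = git_blob_sha1(b"hello\n")
```

Because GitHub reports this blob SHA for each file, a matching local value indicates identical content.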
platform: | Unix |
---|---|
synopsis: | Useful functions to use with this package. |
-
class
esgprep.utils.misc.
ProcessContext
(args)[source]¶ Encapsulates the processing context/information for child process.
Parameters: args (dict) – Dictionary of argument to pass to child process Returns: The processing context Return type: ProcessContext
-
class
esgprep.utils.misc.
ncopen
(path, mode='r')[source]¶ Properly opens a netCDF file
Parameters: path (str) – The netCDF file full path Returns: The netCDF dataset object Return type: netCDF4.Dataset
-
esgprep.utils.misc.
remove
(pattern, string)[source]¶ Removes a substring captured by a regular expression.
Parameters: - pattern (str) – The regular expression to match
- string (str) – The string to test
Returns: The string without the captured substring
Return type: str
-
esgprep.utils.misc.
match
(pattern, string, inclusive=True)[source]¶ Validates a string against a regular expression. Only matches at the beginning of the string. By default the regex is inclusive.
Parameters: - pattern (str) – The regular expression to match
- string (str) – The string to test
- inclusive (boolean) – False if negative matching (i.e., exclude the regex)
Returns: True if it matches
Return type: boolean
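The two regex helpers above can be approximated with the standard `re` module. This is a sketch under assumptions: `remove_match` and `match_start` are hypothetical names, and the real functions may handle flags or edge cases differently.

```python
import re


def remove_match(pattern, string):
    """Drop every substring matched by the pattern."""
    return re.sub(pattern, "", string)


def match_start(pattern, string, inclusive=True):
    """Test the pattern at the start of the string; invert if not inclusive."""
    found = re.match(pattern, string) is not None
    return found if inclusive else not found
```

Note that `re.match` anchors at the start of the string, so `match_start(r"\d+", "tas-2019")` is False even though the digits appear later.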
-
esgprep.utils.misc.
load
(path)[source]¶ Loads data from Pickle file.
Parameters: path (str) – The Pickle file path Returns: The Pickle file content Return type: object
-
esgprep.utils.misc.
store
(path, data)[source]¶ Stores data into a Pickle file.
Parameters: - path (str) – The Pickle file path
- data (list) – A list of data objects to store
-
esgprep.utils.misc.
evaluate
(results)[source]¶ Evaluates a list depending on absence/presence of None values.
Parameters: results (list) – The list to evaluate Returns: True if no blocking errors Return type: boolean
-
esgprep.utils.misc.
checksum
(ffp, checksum_type, include_filename=False, human_readable=True)[source]¶ Computes the checksum through the shell, avoiding Python memory limits.
Parameters: - ffp (str) – The file full path
- checksum_type (str) – Checksum type
- human_readable (boolean) – True to return a human readable digested message
- include_filename (boolean) – True to include filename in hash calculation
Returns: The checksum
Return type: str
Raises: Error – If the checksum fails
-
esgprep.utils.misc.
get_checksum_pattern
(checksum_type)[source]¶ Build the checksum pattern depending on the checksum type.
Parameters: checksum_type (str) – The checksum type Returns: The checksum pattern Return type: re.Object
-
esgprep.utils.misc.
get_tracking_id
(ffp, project)[source]¶ Get and validate tracking_id/PID string from netCDF global attributes of file
Parameters: - ffp (str) – The file full path
- project (str) – The project name
Returns: The tracking_id string
-
esgprep.utils.misc.
is_uuid
(uuid_string, version=4)[source]¶ Returns True if the validated string is a UUID.
Parameters: - uuid_string (str) – The string to validate
- version (int) – The UUID version to use, default is 4
Returns: True if uuid_string is a valid uuid
Return type: boolean
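A common way to perform this check with the standard library is sketched below; it is not necessarily how esgprep does it, and `looks_like_uuid` is a hypothetical name. `uuid.UUID` forces the requested version bits, so comparing the result with the input shows whether the string already carried that version.

```python
from uuid import UUID


def looks_like_uuid(uuid_string, version=4):
    """True if the string parses as a UUID of the given version."""
    try:
        uid = UUID(uuid_string, version=version)
    except ValueError:
        return False
    # UUID() rewrites version/variant bits; a mismatch means wrong version.
    return uid.hex == uuid_string.replace("-", "").lower()
```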
-
esgprep.utils.misc.
load_checksums
(checksum_file)[source]¶ Convert checksums file input as dictionary where (key: value) pairs respectively are the file path and its checksum.
Parameters: checksum_file (FileObject) – The submitted checksum file Returns: The loaded checksums Return type: dict
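A minimal sketch of such a conversion, assuming the checksum file follows the "checksum, whitespace, path" layout produced by tools like sha256sum (the expected layout is not stated here). `parse_checksums` is a hypothetical name.

```python
import io


def parse_checksums(checksum_file):
    """Map each file path to its checksum from an open text file."""
    checksums = {}
    for line in checksum_file:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        checksum, path = line.split(None, 1)
        checksums[path] = checksum
    return checksums


# In-memory stand-in for an opened checksum file.
sample = io.StringIO("abc123  /data/tas.nc\n\ndef456  /data/pr.nc\n")
loaded = parse_checksums(sample)
```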
-
esgprep.utils.misc.
get_checksum
(ffp, checksum_type='sha256', checksums_from_file=None)[source]¶ Gets the file checksum. A dictionary of precomputed checksums ({file: checksum}) can be submitted, as used by the --checksums-from flag.
Parameters: - checksum_type (str) – Checksum type
- checksums_from_file (dict) – Checksums from file
Returns: The checksum
Return type: str
Raises: Error – If the checksum fails
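The lookup-then-compute behavior can be sketched as below. Note the substitution: this sketch hashes the file in chunks with `hashlib` rather than calling the shell as the package's checksum function is described to do. `lookup_or_compute` is a hypothetical name.

```python
import hashlib


def lookup_or_compute(ffp, checksum_type="sha256", checksums_from_file=None):
    """Use a precomputed checksum when available, else hash the file."""
    if checksums_from_file and ffp in checksums_from_file:
        return checksums_from_file[ffp]
    digest = hashlib.new(checksum_type)
    with open(ffp, "rb") as fh:
        # Read in 1 MiB blocks so large files never sit fully in memory.
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()
```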
platform: | Unix |
---|---|
synopsis: | Class and methods used to parse command-line arguments. |
-
class
esgprep.utils.parser.
MultilineFormatter
(prog, default_columns=120)[source]¶ Custom formatter class for argument parser to use with the Python argparse module.
-
class
esgprep.utils.parser.
DirectoryChecker
(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]¶ Custom action class for argument parser to use with the Python argparse module.
-
class
esgprep.utils.parser.
VersionChecker
(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]¶ Custom action class for argument parser to use with the Python argparse module.
-
esgprep.utils.parser.
keyval_converter
(pair)[source]¶ Checks the key value syntax.
Parameters: pair (str) – The key/value pair to check Returns: The key/value pair Return type: list Raises: Error – If invalid pair syntax
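A sketch of such a converter, assuming pairs are written as `key=value` (the separator is an assumption). `parse_keyval` is a hypothetical name; raising `argparse.ArgumentTypeError` lets argparse report the problem as a usage error when the function is passed as `type=`.

```python
import argparse
import re


def parse_keyval(pair):
    """Split 'key=value' into [key, value], rejecting anything else."""
    found = re.fullmatch(r"([^=]+)=([^=]+)", pair)
    if found is None:
        raise argparse.ArgumentTypeError("Bad argument syntax: {}".format(pair))
    return list(found.groups())
```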
-
esgprep.utils.parser.
regex_validator
(string)[source]¶ Validates a Python regular expression syntax.
Parameters: string (str) – The string to check Returns: The Python regex Return type: re.compile Raises: Error – If invalid regular expression
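In the same spirit, a regex validator for argparse might look like this. `validate_regex` is a hypothetical name and the error message wording is illustrative.

```python
import argparse
import re


def validate_regex(string):
    """Compile the expression or raise an argparse-friendly error."""
    try:
        return re.compile(string)
    except re.error as exc:
        raise argparse.ArgumentTypeError(
            "Bad regex syntax: {} ({})".format(string, exc)
        )
```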
-
esgprep.utils.parser.
processes_validator
(value)[source]¶ Validates the max processes number.
Parameters: value (str) – The max processes number submitted Returns:
Module author: Levavasseur Guillaume (CNRS/IPSL) <glipsl@ipsl.fr>