Utilities¶
Client¶
Tasks related to connecting to a Tamr instance
- tamr_toolbox.utils.client.health_check(client)[source]¶
Query the health check API and check if each service is healthy (returns True)
- Parameters
client (Client) – the Tamr client
- Return type
- Returns
True if all services are healthy, False if unhealthy
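As a rough illustration of the check described above, the sketch below reduces a health payload to a single boolean. The payload shape (service name mapped to a dictionary with a `healthy` flag) and the helper name `all_services_healthy` are assumptions made for this example; they are not taken from the toolbox or the Tamr API.

```python
from typing import Mapping


def all_services_healthy(payload: Mapping[str, Mapping[str, bool]]) -> bool:
    """Return True only if every service in the payload reports healthy."""
    # Hypothetical payload shape: {"service_name": {"healthy": True}, ...}
    # A service without a "healthy" flag is treated as unhealthy.
    return all(status.get("healthy", False) for status in payload.values())


sample = {"service_a": {"healthy": True}, "service_b": {"healthy": False}}
print(all_services_healthy(sample))
```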
- tamr_toolbox.utils.client.create(*, username, password, host, port=9100, protocol='http', base_path='/api/versioned/v1/', session=None, store_auth_cookie=False, enforce_healthy=False)[source]¶
Creates a Tamr client from the provided configuration values
- Parameters
username (str) – The username to log in to Tamr as
password (str) – The password for the user
host (str) – The IP address of Tamr
port (Union[str, int, None]) – The port of the Tamr UI. Pass a value of None to specify an address with no port
protocol (str) – https or http
base_path (str) – Optional argument to specify a different base path
session (Optional[Session]) – Optional argument to pass an existing requests Session
store_auth_cookie (bool) – If true, allows the Tamr authentication cookie to be stored and reused
enforce_healthy (bool) – If true, enforces a healthy state upon creation
- Return type
Client
- Returns
Tamr client
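To make the interaction between host, port, protocol and base_path concrete, here is a minimal sketch of how such values could combine into one address. `build_base_url` is a hypothetical helper written for this example; it mirrors the documented defaults but is not how the toolbox builds its client.

```python
from typing import Union


def build_base_url(
    *,
    host: str,
    port: Union[str, int, None] = 9100,
    protocol: str = "http",
    base_path: str = "/api/versioned/v1/",
) -> str:
    """Compose protocol, host, optional port, and base path into one URL."""
    # A port of None yields an address with no port, as the parameter
    # description above states.
    if port is None:
        origin = f"{protocol}://{host}"
    else:
        origin = f"{protocol}://{host}:{port}"
    return origin + base_path


print(build_base_url(host="localhost"))
# http://localhost:9100/api/versioned/v1/
print(build_base_url(host="tamr.example.com", port=None, protocol="https"))
# https://tamr.example.com/api/versioned/v1/
```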
- tamr_toolbox.utils.client.create_with_jwt(*, token, host, port=9100, protocol='http', base_path='/api/versioned/v1/', session=None, store_auth_cookie=False, enforce_healthy=False)[source]¶
Creates a Tamr client from the provided configuration values using a JWT token instead of a username and password. Note that this feature is only available on v2022.010.0 or later.
- Parameters
token (str) – A JWT token to authenticate the client
host (str) – The IP address of Tamr
port (Union[str, int, None]) – The port of the Tamr UI. Pass a value of None to specify an address with no port
protocol (str) – https or http
base_path (str) – Optional argument to specify a different base path
session (Optional[Session]) – Optional argument to pass an existing requests Session
store_auth_cookie (bool) – If true, allows the Tamr authentication cookie to be stored and reused
enforce_healthy (bool) – If true, enforces a healthy state upon creation
- Return type
Client
- Returns
Tamr client
- tamr_toolbox.utils.client.get_with_connection_retry(client, api_endpoint, *, timeout_seconds=600, sleep_seconds=20)[source]¶
Handles exceptions raised when attempting to connect to the Tamr API.
This covers connection issues that occur when Tamr restarts due to a restore.
- Parameters
- Return type
- Returns
A response object from API request.
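The retry behaviour implied by timeout_seconds and sleep_seconds can be sketched with the standard library alone. The helper `call_with_connection_retry`, the use of the built-in `ConnectionError`, and the simulated endpoint are all assumptions for illustration; the toolbox function may catch different exceptions and behave differently at the deadline.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_connection_retry(
    request: Callable[[], T],
    *,
    timeout_seconds: float = 600,
    sleep_seconds: float = 20,
) -> T:
    """Call `request` until it succeeds or `timeout_seconds` have elapsed."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            return request()
        except ConnectionError as error:
            # Give up once the deadline has passed; otherwise wait and retry.
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"No connection after {timeout_seconds} seconds"
                ) from error
            time.sleep(sleep_seconds)


# Simulated endpoint that fails twice before responding.
attempts = {"count": 0}


def flaky_request() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("service restarting")
    return "ok"


result = call_with_connection_retry(
    flaky_request, timeout_seconds=5, sleep_seconds=0.01
)
print(result, attempts["count"])
```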
- tamr_toolbox.utils.client.poll_endpoint(client, api_endpoint, *, poll_interval_seconds=3, polling_timeout_seconds=None, connection_retry_timeout_seconds=600)[source]¶
Waits until job has a state of Canceled, Succeeded, or Failed.
- Parameters
client (Client) – A Tamr client object
api_endpoint (str) – Tamr API endpoint
poll_interval_seconds (int) – Amount of time in between polls of job state.
polling_timeout_seconds (Optional[int]) – Amount of time before a timeout error is thrown.
connection_retry_timeout_seconds (int) – Amount of time before timeout error is thrown during connection retry.
- Return type
- Returns
A response object from API request.
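The polling loop described above can be approximated as follows. This is a sketch under stated assumptions: `poll_until_resolved` is a hypothetical helper, the job is represented as a plain dictionary with a `state` key, and state names are compared case-insensitively because the exact spelling returned by the API is not given here.

```python
import time
from typing import Callable, Dict, Optional

# Terminal states named in the description above, upper-cased for comparison.
TERMINAL_STATES = {"CANCELED", "SUCCEEDED", "FAILED"}


def poll_until_resolved(
    fetch_job: Callable[[], Dict[str, str]],
    *,
    poll_interval_seconds: float = 3,
    polling_timeout_seconds: Optional[float] = None,
) -> Dict[str, str]:
    """Fetch the job repeatedly until it reaches a terminal state."""
    started = time.monotonic()
    while True:
        job = fetch_job()
        if job["state"].upper() in TERMINAL_STATES:
            return job
        # With no timeout configured, polling continues indefinitely.
        if (
            polling_timeout_seconds is not None
            and time.monotonic() - started >= polling_timeout_seconds
        ):
            raise TimeoutError("Job did not resolve before the polling timeout")
        time.sleep(poll_interval_seconds)


# Simulated job that resolves on the third poll.
states = iter(["PENDING", "RUNNING", "SUCCEEDED"])
final = poll_until_resolved(
    lambda: {"state": next(states)}, poll_interval_seconds=0.01
)
print(final["state"])
```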
Configuration¶
Tasks related to loading and using configuration files
Custom Button¶
Helper functions related to creating & managing custom UI buttons as yaml files.
Due to how Tamr custom buttons are configured, these functions will need to be run on the actual server on which Tamr is installed to work as expected.
Important: Custom buttons are only available to versions 2022.008.0 and later
- tamr_toolbox.utils.custom_button.create_redirect_button(*, extension_name, button_id, button_text, page_names, redirect_url, open_in_new_tab, output_dir, button_name)[source]¶
Create yaml file with all required attributes for a ‘REDIRECT’ UI button. Yaml file is saved locally.
Button features are only available to versions 2022.008.0 and later.
- Parameters
extension_name (str) – Name of button extension
button_id (str) – A short identifier for the button to use in the body of a POST call or a redirect URL path substitution.
button_text (str) – The button label to display in the UI.
page_names (List[str]) – The pages of the UI on which to display the button.
redirect_url (str) – The URL that the browser should load
open_in_new_tab (bool) – If true, the specified URL opens in a new browser tab.
output_dir (str) – Directory to save yaml file (absolute path)
button_name (str) – Name of yaml file
- Return type
- Returns
Path to yaml file created
- tamr_toolbox.utils.custom_button.create_post_button(*, extension_name, button_id, button_text, page_names, post_url, post_body_keys, success_message, fail_message, display_response, output_dir, button_name)[source]¶
Create yaml file with all required attributes for a ‘POST’ UI button. Yaml file is saved locally.
Button features are only available to versions 2022.008.0 and later.
- Parameters
extension_name (str) – Name of button extension
button_id (str) – A short identifier for the button to use in the body of a POST call or a redirect URL path substitution.
button_text (str) – The button label to display in the UI.
page_names (List[str]) – The pages of the UI on which to display the button.
post_url (str) – The target URL for a POST API call
post_body_keys (List[str]) – Specifies the keys to request in the body of the POST call
success_message (str) – The message that displays to the user when the POST call succeeds.
fail_message (str) – The message that displays to the user when the POST call fails.
display_response (bool) – Whether the contents of the API response body should display to the user.
output_dir (str) – Directory to save yaml file (absolute path)
button_name (str) – Name of yaml file
- Return type
- Returns
Path to yaml file created
- tamr_toolbox.utils.custom_button.create_button_extension(*, extension_name, buttons, output_dir)[source]¶
Given a list of button yaml files, save them as a grouped extension yaml file. Yaml file is saved locally. Button features are only available to versions 2022.008.0 and later.
- tamr_toolbox.utils.custom_button.create_button_extension_from_list(*, extension_name, output_dir, buttons)[source]¶
Given a list of button dictionaries, save them as a grouped extension yaml file. Yaml file is saved locally.
Button features are only available to versions 2022.008.0 and later.
- Parameters
extension_name (str) – Name of button extension to save
output_dir (str) – Directory in which to save yaml extension file (absolute path)
buttons (List[dict]) – List of button dictionaries, each either a redirect or a post button. Format examples:
redirect:
{"buttonType": "redirectButton", "buttonId": button_id, "buttonText": button_text, "pageNames": page_names, "redirectUrl": redirect_url, "openInNewTab": open_in_new_tab}
post:
{"buttonType": "postButton", "buttonId": button_id, "buttonText": button_text, "pageNames": page_names, "postUrl": post_url, "postBodyKeys": post_body_keys, "successMessage": success_message, "failMessage": fail_message, "displayResponse": display_response}
- Return type
- Returns
Path to yaml file created
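Because the two dictionary formats differ only in a handful of keys, it can help to check a list of buttons before saving it. The sketch below does so using the key names from the formats documented above; the helper `missing_keys`, the page name, the URLs, and the body key are placeholders invented for this example.

```python
from typing import Any, Dict, List

# Key names taken from the redirect and post formats documented above.
COMMON_KEYS = {"buttonType", "buttonId", "buttonText", "pageNames"}
REQUIRED_KEYS = {
    "redirectButton": COMMON_KEYS | {"redirectUrl", "openInNewTab"},
    "postButton": COMMON_KEYS
    | {"postUrl", "postBodyKeys", "successMessage", "failMessage", "displayResponse"},
}


def missing_keys(button: Dict[str, Any]) -> List[str]:
    """Return the documented keys that a button dictionary lacks, sorted."""
    expected = REQUIRED_KEYS[button["buttonType"]]
    return sorted(expected - set(button))


buttons = [
    {
        "buttonType": "redirectButton",
        "buttonId": "docs",
        "buttonText": "Open docs",
        "pageNames": ["Example Page"],  # placeholder page name
        "redirectUrl": "https://example.com/docs",
        "openInNewTab": True,
    },
    {
        "buttonType": "postButton",
        "buttonId": "export",
        "buttonText": "Export",
        "pageNames": ["Example Page"],  # placeholder page name
        "postUrl": "https://example.com/export",
        "postBodyKeys": ["exampleKey"],  # placeholder key
        "successMessage": "Export started",
        "failMessage": "Export failed",
        # "displayResponse" is deliberately omitted to show the check.
    },
]

for button in buttons:
    print(button["buttonId"], missing_keys(button))
```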
- tamr_toolbox.utils.custom_button.register_buttons(*, tamr_client, buttons, tamr_install_dir, remote_client=None, impersonation_username=None, impersonation_password=None)[source]¶
Registers a list of button(s) in a Tamr instance. Requires Tamr restart to display buttons in UI.
- Important: If NOT running this function using a remote client, this function must be run on the server on which Tamr is installed.
Runs in a remote environment if an ssh client is specified otherwise runs in the local shell. If an impersonation_username is provided, the command is run as the provided user. If an impersonation_password is provided, password authentication is used for impersonation, otherwise sudo is used. Button features are only available to versions 2022.008.0 and later.
- Version:
Requires Tamr 2022.008.0 or later
- Parameters
tamr_client (Client) – Tamr Client object
buttons (Union[str, List[str]]) – An individual string or a list of yaml files (absolute paths) with button configs
tamr_install_dir (str) – Full path to directory where Tamr is installed
remote_client (Optional[SSHClient]) – An ssh client providing a remote connection
impersonation_username (Optional[str]) – A bash user to run the command as, this should be the tamr install user
impersonation_password (Optional[str]) – The password for the impersonation_username
Returns:
- tamr_toolbox.utils.custom_button.delete_buttons(*, button_files, tamr_install_dir)[source]¶
Given a list of button yaml files, deletes them, thus removing the buttons from the UI.
- NB: Registered buttons are located in $TAMR_HOME/tamr/auxiliary-services/conf
Requires restart of Tamr to register deletion. Button features are only available to versions 2022.008.0 and later.
- Parameters
Returns:
Logging¶
Tasks related to logging within scripts
- tamr_toolbox.utils.logger.create(name, *, log_to_terminal=True, log_directory=None, log_prefix='', date_format='%Y-%m-%d')[source]¶
Return logger object with pre-defined format. Log file will be located under log_directory with file name <log_prefix>_<date>.log, quashing extra separating underscores. Defaults to <date>.log.
For use in scripts only. To log in module files, use the standard library logging module with a module-level logger and enable package logging. See https://docs.python.org/3/howto/logging.html#advanced-logging-tutorial
>>> log = logging.getLogger(__name__)
- Parameters
name (str) – This sets the name of your logger instance. It does not affect the file name. To change the filename use log_prefix
log_to_terminal (bool) – Boolean indicating whether or not to log messages to the terminal.
log_directory (Optional[str]) – The directory to place log files inside
log_prefix (str) – The string to prepend to the date in the log file name.
date_format (str) – Format string for date suffix on log file name
- Return type
- Returns
Logger object
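The file naming rule described above (`<log_prefix>_<date>.log`, falling back to `<date>.log`) can be sketched as below. `log_file_name` is a hypothetical helper, and reading "quashing extra separating underscores" as stripping trailing underscores from the prefix is an interpretation rather than confirmed toolbox behaviour.

```python
from datetime import date
from typing import Optional


def log_file_name(
    log_prefix: str = "",
    date_format: str = "%Y-%m-%d",
    today: Optional[date] = None,
) -> str:
    """Build '<log_prefix>_<date>.log', or '<date>.log' for an empty prefix."""
    stamp = (today or date.today()).strftime(date_format)
    # Assumed interpretation: drop trailing underscores so the separator
    # between prefix and date is never doubled.
    prefix = log_prefix.rstrip("_")
    return f"{prefix}_{stamp}.log" if prefix else f"{stamp}.log"


print(log_file_name(today=date(2024, 1, 31)))
# 2024-01-31.log
print(log_file_name("nightly_", today=date(2024, 1, 31)))
# nightly_2024-01-31.log
```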
- tamr_toolbox.utils.logger.set_logging_level(logger_name, level)[source]¶
A helper for setting the logging level of a given logger and all of its handlers.
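With only the standard library, the same idea looks roughly like this. `set_level_everywhere` and the logger name `demo_script` are hypothetical names for this sketch; the toolbox function may differ in details such as return value or validation.

```python
import logging


def set_level_everywhere(logger_name: str, level: str) -> logging.Logger:
    """Apply one level to a logger and to each handler attached to it."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    for handler in logger.handlers:
        # Handlers filter independently of the logger, so set both.
        handler.setLevel(level)
    return logger


demo = logging.getLogger("demo_script")
demo.addHandler(logging.StreamHandler())
set_level_everywhere("demo_script", "DEBUG")
print(demo.level == logging.DEBUG)
print(all(h.level == logging.DEBUG for h in demo.handlers))
```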
- tamr_toolbox.utils.logger.enable_package_logging(package_name, *, log_to_terminal=True, log_directory=None, level=None, log_prefix='', date_format='%Y-%m-%d')[source]¶
A helper function to enable package logging for any package following python best practices for logging names (i.e. logger name == package.module.submodule).
- Parameters
package_name (str) – The name of the package for which to enable logging
log_to_terminal (bool) – Boolean indicating whether or not to log messages to the terminal
log_directory (Optional[str]) – Optional log directory to which the package will write logs
level (Optional[str]) – Optional level to specify, default is WARNING (inherited from base logging package)
log_prefix (str) – Optional prefix for log files, if None will be blank string
date_format (str) – Optional date format for log file
- Return type
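The naming convention mentioned above (logger name == package.module.submodule) matters because the standard library propagates records up the dotted hierarchy. The sketch below shows that mechanism with a handler attached to the package-level logger; `my_package` is a hypothetical package name and this is not the toolbox's implementation.

```python
import io
import logging

stream = io.StringIO()

# Configure only the package-level logger.
package_logger = logging.getLogger("my_package")  # hypothetical package name
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
package_logger.addHandler(handler)
package_logger.setLevel(logging.INFO)

# A module inside the package would normally call logging.getLogger(__name__).
module_logger = logging.getLogger("my_package.module.submodule")
module_logger.info("loaded")
module_logger.debug("filtered out at INFO level")

# The module's record reaches the package handler through propagation.
print(stream.getvalue().strip())
```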
- tamr_toolbox.utils.logger.enable_toolbox_logging(*, log_to_terminal=True, log_directory=None, level=None, log_prefix='', date_format='%Y-%m-%d')[source]¶
A simple wrapper to enable_package_logging to give friendly call for users.
- Parameters
log_to_terminal (bool) – Boolean indicating whether or not to log messages to the terminal
log_directory (Optional[str]) – Optional directory to which to write tamr_toolbox logs
level (Optional[str]) – Optional logging level to specify, default is WARNING (inherited from base logging package)
log_prefix (str) – Optional prefix for log files, if None will be blank string
date_format (str) – Optional date format for log file
- Return type
Operation¶
Tasks related to Tamr operations (or jobs)
- tamr_toolbox.utils.operation.enforce_success(operation)[source]¶
Raises an error if an operation fails
- tamr_toolbox.utils.operation.from_resource_id(tamr, *, job_id)[source]¶
Create an operation from a job id
- tamr_toolbox.utils.operation.get_latest(tamr)[source]¶
Get the latest operation
- Parameters
tamr (Client) – A Tamr client
- Return type
- Returns
The latest job
- tamr_toolbox.utils.operation.get_details(*, operation)[source]¶
Return a text describing the information of a job
- tamr_toolbox.utils.operation.wait(operation, *, poll_interval_seconds=3, timeout_seconds=None)[source]¶
Continuously polls for this operation’s server-side state.
- Parameters
- Raises
TimeoutError – If operation takes longer than timeout_seconds to resolve.
- Return type
- tamr_toolbox.utils.operation.monitor(operation, *, poll_interval_seconds=1, timeout_seconds=300)[source]¶
Continuously polls for this operation’s server-side state and returns operation when there is a state change
- Parameters
- Raises
TimeoutError – If operation takes longer than timeout_seconds to resolve.
- Return type
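Returning on the first state change, rather than on completion, can be sketched as follows. `wait_for_state_change` is a hypothetical helper that works on plain state strings supplied by a callable; the toolbox function operates on operation objects, and its exact timing behaviour is not reproduced here.

```python
import time
from typing import Callable


def wait_for_state_change(
    get_state: Callable[[], str],
    *,
    poll_interval_seconds: float = 1,
    timeout_seconds: float = 300,
) -> str:
    """Poll until the state differs from the first observed state."""
    initial = get_state()
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        time.sleep(poll_interval_seconds)
        current = get_state()
        if current != initial:
            return current
    raise TimeoutError(f"State still {initial!r} after {timeout_seconds} seconds")


# Simulated operation: the state changes on the third read.
states = iter(["PENDING", "PENDING", "RUNNING", "SUCCEEDED"])
new_state = wait_for_state_change(lambda: next(states), poll_interval_seconds=0.01)
print(new_state)
```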
Testing¶
Tasks related to testing code
- tamr_toolbox.utils.testing.mock_api(*, response_logs_dir=None, enforce_online_test=False, asynchronous=False)[source]¶
Decorator for pytest tests that mocks API requests by reading a file of pre-generated responses. Will generate responses file based on a real connection if pre-generated responses are not found.
- Parameters
- Return type
- Returns
Decorated function
Downstream¶
- tamr_toolbox.utils.downstream.datasets(dataset, *, include_dependencies_by_name=False)[source]¶
Returns a dataset’s downstream datasets.
- Parameters
dataset (Dataset) – The target dataset.
include_dependencies_by_name (bool) – Whether to include datasets based on name similarity. No dependencies will be found by name if the dataset is not a unified dataset, as determined either from the backend pipeline (if the project still exists) or by regex (dataset name has suffix ‘unified_dataset’).
- Return type
- Returns
- List of Dataset objects ordered by number of downstream dependencies.
Note that dependencies can be bidirectional, so datasets with the same number of dependencies can depend on each other.
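Bidirectional dependencies mean that any traversal of downstream datasets has to guard against cycles. The sketch below shows one way to do that with a breadth-first walk over a plain dictionary graph; the graph, the dataset names, and `downstream_of` are invented for illustration, and the ordering is simply visit order rather than the ordering the toolbox returns.

```python
from collections import deque
from typing import Dict, List


def downstream_of(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Return every dataset reachable from `start`, in breadth-first order."""
    seen = {start}
    order: List[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            # The `seen` set stops the walk from looping on cycles.
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


graph = {
    "source": ["unified"],
    "unified": ["clusters", "published"],
    "clusters": ["published"],
    "published": ["clusters"],  # bidirectional with "clusters"
}
print(downstream_of(graph, "source"))
# ['unified', 'clusters', 'published']
```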
- tamr_toolbox.utils.downstream.projects(dataset, *, include_dependencies_by_name=False)[source]¶
Returns a list of downstream projects for a dataset.
- Parameters
dataset (Dataset) – The target dataset.
include_dependencies_by_name (bool) – Whether to include datasets based on name similarity. No dependencies will be found by name if the dataset is not a unified dataset, as determined either from the backend pipeline (if the project still exists) or by regex (dataset name has suffix ‘unified_dataset’).
- Return type
- Returns
- List of downstream projects in order, including the project the target dataset is part of.
Upstream¶
Functions related to projects upstream of a specified project
- tamr_toolbox.utils.upstream.datasets(dataset)[source]¶
Check for upstream datasets associated with a specified dataset
Version¶
Tasks related to the version of Tamr instances
- tamr_toolbox.utils.version.current(client)[source]¶
Gets the version of Tamr for provided client
- Parameters
client (Client) – Tamr client
- Return type
- Returns
String representation of Tamr version
- tamr_toolbox.utils.version.is_version_condition_met(*, tamr_version, min_version, max_version=None, exact_version=False, raise_error=False)[source]¶
Check if Tamr version is valid.
- Parameters
tamr_version (str) – The version of Tamr being considered
min_version (str) – The earliest version of Tamr
max_version (Optional[str]) – The latest version of Tamr. Default None, in which case no max version is tested for.
exact_version (bool) – Compare against only one release of Tamr. Default is False
raise_error (bool) – If True, raise an error if the version condition is not met. Default is False.
- Raises
ValueError – if min_version is greater than max_version
EnvironmentError – if raise_error is True, and the condition is not met
Notes
Patch versions (major.minor.patch) are excluded from the comparison.
If exact_version is True, max_version will be ignored.
See also
utils.version.func_requires_tamr_version
- Return type
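The comparison rules in the notes above (patch ignored, max_version ignored for exact matches, ValueError when min exceeds max) can be sketched as follows. `major_minor` and `version_condition_met` are hypothetical helpers; they assume purely numeric version components and that an exact match is tested against min_version, neither of which is confirmed by the text above.

```python
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Parse 'major.minor[.patch]' and drop the patch component."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def version_condition_met(
    *,
    tamr_version: str,
    min_version: str,
    max_version: Optional[str] = None,
    exact_version: bool = False,
) -> bool:
    """Check a version against a minimum, an optional maximum, or exactly."""
    current, minimum = major_minor(tamr_version), major_minor(min_version)
    if exact_version:
        # max_version is ignored when an exact match is requested.
        return current == minimum
    if max_version is None:
        return current >= minimum
    maximum = major_minor(max_version)
    if minimum > maximum:
        raise ValueError("min_version is greater than max_version")
    return minimum <= current <= maximum


print(version_condition_met(tamr_version="2022.008.3", min_version="2022.008.0"))
# True
print(version_condition_met(tamr_version="2021.002.0", min_version="2022.008.0"))
# False
```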
- tamr_toolbox.utils.version.requires_tamr_version(min_version, max_version=None, exact_version=False)[source]¶
Pie decorator for Tamr version checking
- Parameters
Examples
>>> @requires_tamr_version(min_version="2021.002")
... def refresh_dataset(tamr_dataset, *args, **kwargs):
...     return tamr_dataset.refresh()
- Raises
ValueError – if min_version is greater than max_version
EnvironmentError – if raise_error is True, and the condition is not met
Notes
This decorator only inspects the Tamr version of arguments going into the function, and not new instances of Tamr referred to within functional code
Patch versions (major.minor.patch) are excluded from the comparison
See also
utils.version.is_version_condition_met
- Return type
Callable
- tamr_toolbox.utils.version.enforce_after_or_equal(client, *, compare_version)[source]¶
Raises an exception if the version of the Tamr client is before the provided compare version.
Will be deprecated in favour of raise_warn_tamr_version()
- Parameters
client (Client) – Tamr client
compare_version (str) – String representation of Tamr version
- Return type
- Returns
None
See also
raise_warn_tamr_version, ensure_tamr_version