Common

Movement

Tasks related to project movement as part of Tamr projects

tamr_toolbox.project._common.movement.export_artifacts(*, project, artifact_directory_path, exclude_artifacts=None, asynchronous=False)

Export project artifacts for project movement

Requires Tamr 2021.005.0 or later

Parameters
  • project (Project) – a Tamr project object

  • artifact_directory_path (str) – export directory for project artifacts

  • exclude_artifacts (Optional[List[str]]) – list of artifacts to exclude

  • asynchronous (bool) – flag to run function asynchronously

Return type

Operation

Returns

Operation for the project export API call
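As a minimal sketch of how the export call might be wrapped, the helper below forwards the documented keyword-only arguments and checks the result. The helper name export_project is hypothetical, the export function is passed in as export_fn so the sketch does not depend on an installed library, and it assumes the returned Operation exposes a succeeded() method, as tamr_unify_client's Operation does.

```python
from typing import Any, Callable, List, Optional


def export_project(
    export_fn: Callable[..., Any],
    project: Any,
    directory: str,
    exclude: Optional[List[str]] = None,
) -> Any:
    """Run a synchronous export and raise if the operation did not succeed."""
    # Every argument is keyword-only in the documented signature.
    operation = export_fn(
        project=project,
        artifact_directory_path=directory,
        exclude_artifacts=exclude,
        asynchronous=False,
    )
    # Assumption: the returned Operation has a succeeded() method.
    if not operation.succeeded():
        raise RuntimeError(f"Export to {directory} did not succeed")
    return operation
```

With the library installed, export_fn would be the export_artifacts function documented above and project a Project object. The succeeded() check is only meaningful for a synchronous call, since an asynchronous operation may still be running when it is returned.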

tamr_toolbox.project._common.movement.import_artifacts(*, project_artifact_path, tamr_client, target_project=None, new_project_name=None, new_unified_dataset_name=None, exclude_artifacts=None, include_additive_artifacts=None, include_destructive_artifacts=None, fail_if_not_present=False, asynchronous=False, overwrite_existing=False)

Import project artifacts into a Tamr instance

Requires Tamr 2021.005.0 or later

Parameters
  • tamr_client (Client) – a Tamr client

  • project_artifact_path (str) – project artifacts zip filepath

  • target_project (Optional[Project]) – an optional target project for migration

  • new_project_name (Optional[str]) – new project name

  • new_unified_dataset_name (Optional[str]) – new unified dataset name

  • exclude_artifacts (Optional[List[str]]) – list of artifacts to exclude in import

  • include_additive_artifacts (Optional[List[str]]) – list of artifacts to import only additively

  • include_destructive_artifacts (Optional[List[str]]) – list of artifacts to import destructively

  • fail_if_not_present (bool) – flag to fail project if not already present in instance

  • asynchronous (bool) – flag to run function asynchronously

  • overwrite_existing (bool) – flag to overwrite existing project artifacts

Return type

Operation

Returns

Operation for the project import API call
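The following sketch shows one way the import call might be used to create a new project from an artifact file. The helper name import_as_new_project is hypothetical, the import function is injected as import_fn, and the ".zip" suffix check is a local sanity check of this sketch rather than a documented library rule.

```python
from pathlib import Path
from typing import Any, Callable


def import_as_new_project(
    import_fn: Callable[..., Any],
    tamr_client: Any,
    artifact_zip: str,
    new_project_name: str,
    new_unified_dataset_name: str,
) -> Any:
    """Import artifacts under a new project name and unified dataset name."""
    # Local check only: the parameter is documented as a zip filepath.
    if Path(artifact_zip).suffix.lower() != ".zip":
        raise ValueError(f"Expected a .zip artifact file, got {artifact_zip!r}")
    # Arguments not listed here keep their documented defaults.
    return import_fn(
        project_artifact_path=artifact_zip,
        tamr_client=tamr_client,
        new_project_name=new_project_name,
        new_unified_dataset_name=new_unified_dataset_name,
        asynchronous=False,
    )
```

With the library installed, import_fn would be the import_artifacts function documented above and tamr_client a Client for the target instance.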

Transformations

Tasks related to transformations with Tamr projects

class tamr_toolbox.project._common.transformations.InputTransformation(transformation, datasets=<factory>)

A transformation scoped to input datasets

Version:

Requires Tamr 2020.009.0 or later

Parameters
  • transformation (str) – The text of a transformations script

  • datasets (List[Dataset]) – The list of input datasets that the script should be applied to

class tamr_toolbox.project._common.transformations.TransformationGroup(input_scope=<factory>, unified_scope=<factory>)

A group of input transformations and unified transformations

Version:

Requires Tamr 2020.009.0 or later

Parameters
  • input_scope (List[InputTransformation]) – A list of transformations to apply to input datasets

  • unified_scope (List[str]) – A list of transformation scripts to apply to the unified dataset
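To illustrate how the two classes fit together, the sketch below defines local stand-ins that mirror only the documented fields and defaults, so it runs without the library. They are not the library classes. The dataset handle and the script text are placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class InputTransformation:
    """Local stand-in mirroring the documented fields (not the library class)."""

    transformation: str
    datasets: List[Any] = field(default_factory=list)


@dataclass
class TransformationGroup:
    """Local stand-in mirroring the documented fields (not the library class)."""

    input_scope: List[InputTransformation] = field(default_factory=list)
    unified_scope: List[str] = field(default_factory=list)


# Placeholder handle; with the real library this would be a Dataset object.
customers = "customers.csv"

group = TransformationGroup(
    # One script scoped to a single input dataset (script text is illustrative).
    input_scope=[InputTransformation("SELECT *, upper(name) AS name;", [customers])],
    # One script applied to the unified dataset.
    unified_scope=["SELECT *, trim(city) AS city;"],
)
```

With the library installed, the real classes would be imported from the transformations module instead of being redefined.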

tamr_toolbox.project._common.transformations.get_all(project)

Get the transformations of a Project

Version:

Requires Tamr 2020.009.0 or later

Parameters

project (Project) – Project containing transformations

Return type

TransformationGroup

Returns

All input transformations and unified transformations of a project

tamr_toolbox.project._common.transformations.set_all(project, tx, *, allow_overwrite=True)

Set the transformations of a Project

Version:

Requires Tamr 2020.009.0 or later

Parameters
  • project (Project) – Project to place transformations within

  • tx (TransformationGroup) – Transformations to put into project

  • allow_overwrite (bool) – Whether existing transformations can be overwritten

Return type

Response

Returns

Response object created when transformations of a project are replaced

Raises
  • RuntimeError – if allow_overwrite is set to False but transformations already exist in the project

  • ValueError – if provided tx are invalid
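A minimal sketch of using allow_overwrite=False as a guard is shown below. The helper name set_if_empty is hypothetical and the setter is injected as set_all_fn. It relies only on the documented behaviour that a RuntimeError is raised when transformations already exist.

```python
from typing import Any, Callable


def set_if_empty(set_all_fn: Callable[..., Any], project: Any, tx: Any) -> bool:
    """Apply tx only when the project has no transformations yet.

    Returns True if the transformations were set, False if they were left alone.
    """
    try:
        set_all_fn(project, tx, allow_overwrite=False)
    except RuntimeError:
        # Documented: raised when transformations already exist and
        # allow_overwrite is False.
        return False
    return True
```

A ValueError from invalid transformations is deliberately not caught here, so that a bad script still surfaces to the caller.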

tamr_toolbox.project._common.transformations.get_all_unified(project)

Get the unified transformations of a Project

Version:

Requires Tamr 2020.009.0 or later

Parameters

project (Project) – Project containing transformations

Return type

List[str]

Returns

All unified transformations of a project

tamr_toolbox.project._common.transformations.set_all_unified(project, tx, *, allow_overwrite=True)

Set the unified transformations of a Project. Any input transformations will not be altered

Version:

Requires Tamr 2020.009.0 or later

Parameters
  • project (Project) – Project to place transformations within

  • tx (List[str]) – Unified transformations to put into project

  • allow_overwrite (bool) – Whether existing unified transformations can be overwritten

Return type

Response

Returns

Response object created when transformations of a project are replaced

Raises

RuntimeError – if allow_overwrite is set to False but transformations already exist in the project
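Because set_all_unified replaces the whole list, adding one script means reading the current list first. The sketch below shows that read-modify-write pattern. The helper name append_unified is hypothetical, and the getter and setter are injected as get_fn and set_fn.

```python
from typing import Any, Callable, List


def append_unified(
    get_fn: Callable[[Any], List[str]],
    set_fn: Callable[..., Any],
    project: Any,
    script: str,
) -> List[str]:
    """Append one unified transformation script unless it is already present."""
    current = list(get_fn(project))
    if script in current:
        # Nothing to do; avoid rewriting the project's transformations.
        return current
    updated = current + [script]
    # Overwriting must be allowed, since the existing list is being replaced.
    set_fn(project, updated, allow_overwrite=True)
    return updated
```

With the library installed, get_fn and set_fn would be get_all_unified and set_all_unified as documented above.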

Schema

Tasks related to schema mapping as part of Tamr projects

tamr_toolbox.project._common.schema.map_attribute(project, *, source_attribute_name, source_dataset_name, unified_attribute_name)

Maps source_attribute in source_dataset to unified_attribute in unified_dataset. If the mapping already exists it will log a warning and return the existing AttributeMapping from the project’s collection.

Parameters
  • source_attribute_name (str) – Source attribute name to map

  • source_dataset_name (str) – Source dataset containing the source attribute

  • unified_attribute_name (str) – Unified attribute to which to map the source attribute

  • project (Project) – The project in which to perform the mapping

Return type

AttributeMapping

Returns

The created AttributeMapping

Raises

ValueError – if source_attribute_name, source_dataset_name, or unified_attribute_name is an empty string; if the dataset source_dataset_name is not found on Tamr; or if source_attribute_name is missing from the attributes of source_dataset_name
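Since map_attribute handles one attribute at a time, mapping several columns of a dataset is a loop over name pairs. The sketch below shows one way to write it. The helper name map_attributes is hypothetical and the mapping function is injected as map_fn.

```python
from typing import Any, Callable, Dict, List


def map_attributes(
    map_fn: Callable[..., Any],
    project: Any,
    source_dataset_name: str,
    mapping: Dict[str, str],
) -> List[Any]:
    """Map each source attribute (key) to its unified attribute (value)."""
    results = []
    for source_attribute, unified_attribute in mapping.items():
        # project is positional; the remaining arguments are keyword-only.
        results.append(
            map_fn(
                project,
                source_attribute_name=source_attribute,
                source_dataset_name=source_dataset_name,
                unified_attribute_name=unified_attribute,
            )
        )
    return results
```

A ValueError raised for any pair stops the loop, so mappings created before the failure remain in place.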

tamr_toolbox.project._common.schema.unmap_attribute(project, *, source_attribute_name, source_dataset_name, unified_attribute_name)

Unmaps a source attribute.

Parameters
  • source_attribute_name (str) – the name of the source attribute to unmap

  • source_dataset_name (str) – the name of the source dataset containing that source attribute

  • unified_attribute_name (str) – the unified attribute from which to unmap

  • project (Project) – the project in which to unmap the attribute

Return type

None

Returns

None

tamr_toolbox.project._common.schema.bootstrap_dataset(project, *, source_dataset, force_add_dataset_to_project=False)

Bootstraps a dataset (i.e. maps all source columns to themselves)

Parameters
  • source_dataset (Dataset) – the source dataset (a Dataset object, not a string)

  • project (Project) – the project in which to do the mapping

  • force_add_dataset_to_project (bool) – whether to add the dataset to the project if it is not already part of it

Return type

List[AttributeMapping]

Returns

List of the AttributeMappings generated

Raises

RuntimeError – if source_dataset is not part of the given project. Set the force_add_dataset_to_project flag to True to add it automatically

tamr_toolbox.project._common.schema.unmap_dataset(project, *, source_dataset, remove_dataset_from_project=False, skip_if_missing=False)

Wholly unmaps a dataset and optionally removes it from a project.

Parameters
  • source_dataset (Dataset) – the source dataset (a Dataset object, not a string) to unmap

  • project (Project) – the project in which to unmap the dataset

  • remove_dataset_from_project (bool) – boolean to also remove the dataset from the project

  • skip_if_missing (bool) – whether to skip the dataset if it is not in the project. If set to False and the dataset is not in the project, a RuntimeError is raised

Return type

None

Returns

None

Raises

RuntimeError – if source_dataset is not in the project and skip_if_missing is not set to True
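The two dataset-level functions can be combined to reset a dataset's mappings. The sketch below is one possible arrangement, not a documented workflow. The helper name rebuild_mappings is hypothetical, and both library functions are injected as unmap_dataset_fn and bootstrap_fn.

```python
from typing import Any, Callable, List


def rebuild_mappings(
    unmap_dataset_fn: Callable[..., Any],
    bootstrap_fn: Callable[..., List[Any]],
    project: Any,
    source_dataset: Any,
) -> List[Any]:
    """Drop all mappings of a dataset, then map every column to itself."""
    # skip_if_missing=True avoids the RuntimeError when the dataset is not
    # in the project yet; the dataset itself is kept in the project.
    unmap_dataset_fn(
        project,
        source_dataset=source_dataset,
        remove_dataset_from_project=False,
        skip_if_missing=True,
    )
    # force_add_dataset_to_project=True covers the case where the dataset
    # was never part of the project.
    return bootstrap_fn(
        project,
        source_dataset=source_dataset,
        force_add_dataset_to_project=True,
    )
```

With the library installed, the two injected functions would be unmap_dataset and bootstrap_dataset as documented above, and source_dataset a Dataset object rather than a name.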