Overview of Data Migration features

Admins and Editors can create new projects.
Project names must be unique.
Users are added to a project as a 'member'. What they can do in a project depends on their role and any specific project restrictions applied.
Only Admins can delete a project.
Projects may be duplicated.
Projects can be given a status of Active, Complete, or Archived.

Files can be added to a project by:
- Using the in-project Upload Files feature (for small numbers of files under 1 GB).
- Adding a connected SFTP service to the platform (speak to customer success) and then bulk importing the files using the SFTP utilities.
Files are used in the creation of project assets and as data sources for conversions.
Files can be deleted by users with appropriate permissions.
Files can be organized into folders and renamed.
Files must be categorized (Schema, Data, Config, Unknown). The category is used to help restrict how files are used within Zengines.
CSV is the preferred file type and format for Schema and Data files.
Files that are categorized as Data can be previewed in the platform depending on the user role and project restrictions.

Defining Sources and Targets

Sources and Targets contain the schema, related metadata, and associated data. They are defined by either:
- Using a schema definition file, which can contain multiple tables and includes data type definitions as well as other constraints and metadata.
- Using a data file that represents a single table. Metadata is inferred and should be reviewed before use.

Sources and Targets must have unique, SQL-compliant names.
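
As a rough illustration of the naming rule, the sketch below checks a proposed name before it is used. It assumes "SQL compliant" means a regular unquoted identifier (a letter or underscore followed by letters, digits, or underscores) and that uniqueness is compared case-insensitively. Both are assumptions, not documented Zengines behaviour, and the function names are hypothetical.

```python
import re

# Assumption: "SQL compliant" is read here as a regular (unquoted) identifier:
# starts with a letter or underscore, then letters, digits, or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_sql_style_name(name: str) -> bool:
    """Return True if name looks like an unquoted SQL identifier."""
    return _IDENTIFIER.fullmatch(name) is not None


def find_duplicate_names(names: list[str]) -> set[str]:
    """Return lower-cased names that occur more than once.

    Assumption: names are compared case-insensitively.
    """
    seen: set[str] = set()
    duplicates: set[str] = set()
    for name in names:
        key = name.lower()
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates
```

A name such as `customer_accounts` passes this check, while `2024 export` does not, because it starts with a digit and contains a space.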
Source and Target schemas can be edited:
- Tables and fields can be modified.
- Tables and fields can be added.
- Tables can be duplicated.
- Tables and fields can be deleted, depending on the user's role and specific project restrictions.
- Batch updates can be made (excluding delete actions).

Sources and Targets have two statuses: Draft and Ready.
- In Draft status, they are not used in Zengines matching processes (although they can be manually set in Mapping).
- In Ready status, they are included in matching processes and conversion jobs.

Associating Data with a Source or Target

Data files can be associated with a table, as long as the Data file headers (column names) match the table field names.
- Files paired with a table can optionally be set as example data to display in the schema/table definition. Example data is a random sample of 10 values extracted from the Data file.
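
The header check and the example-data sample described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: it assumes header comparison is exact and ignores column order, and that the 10 values are sampled per field. The function names are hypothetical.

```python
import csv
import io
import random


def headers_match(csv_text: str, field_names: list[str]) -> bool:
    """True if the CSV header row has the same names as the table fields.

    Assumption: comparison is exact (case-sensitive) and ignores column order.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return sorted(header) == sorted(field_names)


def sample_example_values(csv_text: str, field: str, size: int = 10, seed=None) -> list[str]:
    """Pick up to `size` random values for one field from the Data file."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [row[field] for row in rows]
    rng = random.Random(seed)
    # A file with fewer rows than `size` simply yields all of its values.
    return rng.sample(values, min(size, len(values)))
```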
Associated data files can have their use restricted, so that no part of their content is sent to LLMs.
Associated data files can be profiled and a report generated (statistics on the values in each field).
Associated data files can be validated for conformity against the schema (data types, constraints).
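
To give a feel for what profiling and validation involve, here is a small sketch. The statistics chosen (row count, empty count, distinct count, most common value) and the integer check are illustrative assumptions. The contents of the actual Zengines profile report and validation rules are not specified in this overview.

```python
from collections import Counter


def profile_field(values: list[str]) -> dict:
    """Compute simple per-field statistics over the raw string values."""
    non_empty = [v for v in values if v != ""]
    counts = Counter(non_empty)
    return {
        "count": len(values),
        "empty": len(values) - len(non_empty),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


def conforms_to_integer(value: str) -> bool:
    """Check one value against a hypothetical 'integer' data type constraint."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def nonconforming_rows(values: list[str]) -> list[int]:
    """Return the 0-based positions of values that fail the integer check."""
    return [i for i, v in enumerate(values) if not conforms_to_integer(v)]
```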
Associated data files appear as the default selectable items for data sources in Conversion jobs.
A table can have many associated data files.
A data file can only be associated with one table.

Managing Sources and Targets

Sources and targets can be duplicated.
Sources and targets can be deleted depending on the user's role and specific project restrictions.

All mappings (how the source data is used for the target) are directly linked to a Target field.
A Target field has a 'mapping status' to help manage workflow and testing. A field can have one of the following statuses:
- Not started (no assigned Sources, no action taken on it)
- In progress (is automatically set when a Source is assigned)
- Testing (user can set)
- Complete (user can set, this triggers a validation check)
- Blocked (user can set)
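
The status workflow in the list above can be modelled as a small state sketch. It assumes that only Testing, Complete, and Blocked are chosen directly by a user, and that assigning a Source only changes a field that is still 'Not started'. These readings, and all names in the code, are assumptions for illustration.

```python
from enum import Enum


class MappingStatus(Enum):
    NOT_STARTED = "Not started"
    IN_PROGRESS = "In progress"
    TESTING = "Testing"
    COMPLETE = "Complete"
    BLOCKED = "Blocked"


# Assumption: these are the statuses a user picks directly.
USER_SETTABLE = {MappingStatus.TESTING, MappingStatus.COMPLETE, MappingStatus.BLOCKED}


def status_after_assignment(current: MappingStatus) -> MappingStatus:
    """Assigning a Source moves a 'Not started' field to 'In progress'.

    Assumption: fields already in another status keep it.
    """
    if current is MappingStatus.NOT_STARTED:
        return MappingStatus.IN_PROGRESS
    return current


def set_status(requested: MappingStatus, validation_passed: bool) -> MappingStatus:
    """Apply a user-requested status; 'Complete' requires a passing validation check."""
    if requested not in USER_SETTABLE:
        raise ValueError(f"{requested.value} cannot be set by a user in this sketch")
    if requested is MappingStatus.COMPLETE and not validation_passed:
        raise ValueError("validation check failed; mapping cannot be set to Complete")
    return requested
```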
A mapping can be locked or unlocked to prevent any editing, depending on the user's role and specific project restrictions.
A mapping can have a simple text-based explanation added to it so that complex transformation rules can be easily understood by non-technical observers.
Mappings can be reset (assigned Sources are removed and custom transformations are cleared).

Assigning Sources

When matches are generated for a Target, Zengines analyzes all Sources set to 'Ready', scores the individual fields, and surfaces star-rated recommendations for the best Source field matches.
Assigning a Source field to a Target field is done in the 'Select Sources' view of a selected field on the Mapping grid. Any Source field can be used, whether recommended or not.
A single assigned Source field can be 'auto-transformed' (this will simply try to ensure any Target field data type constraints are met). A 'custom transformation' can be added to override this.
Multiple Source fields from different Sources and their tables can be assigned to a Target field. This requires a 'custom transformation' for use in conversions.
It is not necessary to use all assigned Source fields, although assigned Source fields are used to inform JOIN statement requirements when defining a Conversion Job.

Custom transformations

Using the Zengines Transformation Language, custom SQL-based rules can be written to manipulate the assigned Source fields so that the output value satisfies the Target field requirements.
All rules can be tested using the example data set on the Source fields, or using values entered directly into the example data area.
Any custom transformation must be 'valid' before the mapping can be set to 'Complete'.
The Zengines AI 'Rule generator' can be used to create the code for a SQL transformation rule.
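
The exact syntax of the Zengines Transformation Language is not shown in this overview, so the sketch below uses standard SQL run through Python's built-in sqlite3 module as a stand-in. It illustrates the general idea of a SQL-based rule that combines two assigned Source fields into one Target value and is tried out against a few example rows. The table, field names, and data are hypothetical.

```python
import sqlite3

# Hypothetical Source table and example data; names are for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(" ada ", "Lovelace"), ("Alan", " turing")],
)

# A rule that combines two assigned Source fields into one Target value:
# trim whitespace, join with a space, and upper-case the result.
rule = "UPPER(TRIM(first_name) || ' ' || TRIM(last_name))"

# Apply the rule to the example rows, in insertion order.
full_names = [
    row[0] for row in conn.execute(f"SELECT {rule} FROM customers ORDER BY rowid")
]
conn.close()

print(full_names)  # ['ADA LOVELACE', 'ALAN TURING']
```

Testing a rule against a handful of example values like this makes it easy to confirm that the output meets the Target field's requirements before the mapping is marked complete.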

Conversions are managed through the creation of Conversion Jobs. A job consists of the following:
- A Target, and a selected set of its tables to generate the data for in this job.
- A set of JOINs and optional 'data filters' defining the row-level data relationships across the Source tables used to populate the Target (the Sources assigned in the Mappings).
- A set of data files to use for each of the Source tables used in the Mappings.
Conversion jobs can be modified and deleted, depending on the user's role and specific project restrictions.
Conversion jobs can be run multiple times. Each run contains the following for each Target table it includes:
- A data file named the same as the target table.
- A process log txt file explaining what was done.
- If an error occurs, an error log txt file.
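
The parts of a Conversion Job and the files produced by a run can be pictured with a small data model. This is only a sketch: the class, the field names, the `.csv` extension on the data file, and the log file naming are all assumptions made for illustration. The overview states only that the data file is named after the Target table and that the logs are txt files.

```python
from dataclasses import dataclass, field


@dataclass
class ConversionJob:
    target: str                         # the Target to populate
    target_tables: list[str]            # selected Target tables for this job
    joins: list[str]                    # JOIN definitions across Source tables
    source_files: dict[str, list[str]]  # Source table -> data files to use
    data_filters: list[str] = field(default_factory=list)  # optional filters


def expected_run_files(job: ConversionJob, failed_tables=()) -> dict[str, list[str]]:
    """List the files one run would contain for each Target table.

    Assumption: file extension and log names are hypothetical.
    """
    outputs: dict[str, list[str]] = {}
    for table in job.target_tables:
        files = [f"{table}.csv", f"{table}_process_log.txt"]
        if table in failed_tables:
            # An error log is only present when an error occurred.
            files.append(f"{table}_error_log.txt")
        outputs[table] = files
    return outputs
```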
