Delete Duplicates

This article gives an example of creating and executing an ETL task that removes duplicate values from the output.

The repository must contain three tables: T_Source, T_Destination and T_Duplicate. The tables must have an identical structure and include a field with the Value identifier, which is used to check for duplicates. The repository must also contain an ETL task with the ETLTASKS identifier. Executing the example below creates four objects in the ETL task: a repository source, the Delete Duplicates converter and two consumers (unique data goes to the first consumer, and duplicates go to the second). The required properties and links are set for all objects.

After the objects are created and saved, the ETL task is executed. Code that is repeated for different objects is placed in separate procedures or functions.
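The routing described above (unique records to one consumer, duplicates to the other) can be illustrated outside the platform with a minimal Python sketch. This is not the platform's API: the function name split_duplicates and the sample rows are hypothetical, and the sketch assumes that the first record with a given Value is treated as unique while later records with the same Value are treated as duplicates. The actual converter may apply a different rule.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def split_duplicates(
    rows: Iterable[Row], key_field: str = "Value"
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (unique, duplicates) by the value of key_field.

    Assumption: the first row seen for each key is kept as unique;
    every later row with the same key is routed to duplicates.
    """
    seen = set()
    unique: List[Row] = []
    duplicates: List[Row] = []
    for row in rows:
        key = row[key_field]
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            unique.append(row)
    return unique, duplicates


# Hypothetical sample data standing in for the T_Source table.
t_source = [
    {"Id": 1, "Value": "A"},
    {"Id": 2, "Value": "B"},
    {"Id": 3, "Value": "A"},
]

# The two result lists stand in for T_Destination and T_Duplicate.
t_destination, t_duplicate = split_duplicates(t_source)
```

In this sketch the two returned lists play the role of the two consumers, and the input row order is preserved in both.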

Example

See also:

Examples | IEtlPlainDataDeduplicate