In this article:

Checked Fields

Deduplicator

Deleting Duplicates

The Delete Duplicates transformer is an object that removes duplicate data. It has one provider at its input and two consumers at its output. Duplicate values are identified based on a specified index, and a condition determines which of the duplicate records are deleted.

To ensure efficient duplicate deletion, the provider data should be ordered by index. After the operation is executed, the data remains ordered.
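The following minimal Python sketch illustrates why ordered input helps: when rows are sorted by the index, equal values sit next to each other, so a single pass is enough. It is not the transformer's actual implementation. The function name `dedupe_sorted`, the dictionary row format, and the rule of keeping the first record of each run are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def dedupe_sorted(rows, index_fields):
    """Split rows already sorted by index_fields into (kept, removed).

    Equal index values are adjacent in sorted input, so one pass is
    enough and no lookup table of previously seen keys is needed.
    """
    key = itemgetter(*index_fields)
    kept, removed = [], []
    for _, group in groupby(rows, key=key):
        first, *rest = group
        kept.append(first)    # assumption: the first record of each run is kept
        removed.extend(rest)  # the remaining records are the deleted duplicates
    return kept, removed


# Hypothetical sample data, sorted by "Key".
rows = [
    {"Key": 1, "Date": "Winter", "Value": 2222},
    {"Key": 4, "Date": "Summer", "Value": 1111},
    {"Key": 4, "Date": "Summer", "Value": 1111},
    {"Key": 5, "Date": "Summer", "Value": 3333},
    {"Key": 5, "Date": "Summer", "Value": 3333},
]
kept, removed = dedupe_sorted(rows, ["Key"])
print([r["Key"] for r in kept])     # [1, 4, 5]
print([r["Key"] for r in removed])  # [4, 5]
```

Both result lists keep the order of the input, which matches the statement that the data remains ordered after the operation.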

Using the Delete Duplicates transformer, the table below:

Key Date Value
4 Summer 1111
1 Winter 2222
5 Summer 3333
2 Winter 4444
4 Summer 1111
6 Summer 5555
5 Summer 3333
3 Winter 6666

 can be converted into a table without duplicates:

Key Date Value
4 Summer 1111
1 Winter 2222
5 Summer 3333
2 Winter 4444
6 Summer 5555
3 Winter 6666

 and a table that contains deleted duplicates:

Key Date Value
4 Summer 1111
5 Summer 3333

Thus, a record is deleted as a duplicate only if the values of all its fields match those of another record.
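The example above can be reproduced with a short Python sketch that splits the input into the two outputs. The function name `split_duplicates` and the rule of keeping the first occurrence are assumptions for illustration. The article does not state which occurrence the transformer keeps, and since the duplicate rows here are identical, the result is the same either way.

```python
def split_duplicates(rows):
    """Return (unique_rows, deleted_rows), comparing all fields of each row."""
    seen = set()
    unique_rows, deleted_rows = [], []
    for row in rows:
        if row in seen:
            deleted_rows.append(row)  # every field matches an earlier row
        else:
            seen.add(row)
            unique_rows.append(row)
    return unique_rows, deleted_rows


# Rows are (Key, Date, Value) tuples taken from the source table above.
table = [
    (4, "Summer", 1111),
    (1, "Winter", 2222),
    (5, "Summer", 3333),
    (2, "Winter", 4444),
    (4, "Summer", 1111),
    (6, "Summer", 5555),
    (5, "Summer", 3333),
    (3, "Winter", 6666),
]
unique_rows, deleted_rows = split_duplicates(table)
print(unique_rows)   # six rows with keys 4, 1, 5, 2, 6, 3
print(deleted_rows)  # [(4, 'Summer', 1111), (5, 'Summer', 3333)]
```

The two returned lists correspond to the two consumers at the transformer's output: the table without duplicates and the table of deleted duplicates.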

Checked Fields

On the Checked Fields page, set the input fields whose values should be checked for duplicates.

To create a list of checked fields:

Click the Delete button to remove a selected field from the list of checked fields.

If no checked field is defined, an attempt to go to the next page brings up a confirmation dialog box.
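The choice of checked fields changes which records count as duplicates. The Python sketch below, which is an illustration and not part of the product, counts how many extra records share each combination of checked-field values. The function name `duplicate_counts` and the sample rows are hypothetical.

```python
from collections import Counter


def duplicate_counts(rows, checked_fields):
    """Count the extra records that share each combination of checked-field values."""
    counts = Counter(tuple(row[f] for f in checked_fields) for row in rows)
    return {combo: n - 1 for combo, n in counts.items() if n > 1}


# Hypothetical sample rows using the field names from the example tables.
rows = [
    {"Key": 4, "Date": "Summer", "Value": 1111},
    {"Key": 1, "Date": "Winter", "Value": 2222},
    {"Key": 5, "Date": "Summer", "Value": 3333},
    {"Key": 4, "Date": "Summer", "Value": 1111},
]
print(duplicate_counts(rows, ["Key", "Date", "Value"]))  # {(4, 'Summer', 1111): 1}
print(duplicate_counts(rows, ["Date"]))                  # {('Summer',): 2}
```

Checking all three fields finds one duplicate, while checking only Date treats every additional Summer record as a duplicate, so a narrow list of checked fields can delete far more records.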

Deduplicator

On the Deduplicator page, set the condition used to select the records to be deleted.

The condition is created in the editor dialog box, which opens when the button is clicked.

The rule for selecting duplicates is determined by the radio button chosen in the Selection Rules group:

See also:

Data Transformers