The Delete Duplicates transformer is an object that deletes duplicate records. It has one provider at its input and two consumers at its output. Duplicates are identified by a specified index, and a condition determines which of the duplicate records are deleted.
To ensure efficient duplicate deletion, provider data should be ordered by index. After executing this operation the data remains ordered.
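One plausible reason why ordered input helps: in sorted data, equal index values are adjacent, so duplicates can be found in a single pass by comparing neighbours. The sketch below illustrates this idea in Python; it is not the transformer's actual implementation, and the function name split_sorted and the use of the first column as the index are assumptions made for illustration.

```python
from itertools import groupby

def split_sorted(rows, key):
    """Single pass over rows that are already ordered by `key`.

    Returns (unique, duplicates). Hypothetical helper, for illustration only.
    """
    unique, duplicates = [], []
    # groupby only merges adjacent equal keys, so the input must be sorted.
    for _, group in groupby(rows, key=key):
        first, *rest = group
        unique.append(first)      # first record of each run is kept
        duplicates.extend(rest)   # the remaining records are duplicates
    return unique, duplicates

# Rows ordered by the first column (assumed here to be the index).
sorted_rows = [
    (1, "Winter", 2222),
    (4, "Summer", 1111),
    (4, "Summer", 1111),
    (5, "Summer", 3333),
    (5, "Summer", 3333),
]
sorted_unique, sorted_duplicates = split_sorted(sorted_rows, key=lambda r: r[0])
```

Because records are emitted in the order they are read, the output stays ordered by the same index.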
Using the Delete Duplicates transformer, the table below:
Key | Date | Value |
4 | Summer | 1111 |
1 | Winter | 2222 |
5 | Summer | 3333 |
2 | Winter | 4444 |
4 | Summer | 1111 |
6 | Summer | 5555 |
5 | Summer | 3333 |
3 | Winter | 6666 |
can be converted into a table without duplicates:
Key | Date | Value |
4 | Summer | 1111 |
1 | Winter | 2222 |
5 | Summer | 3333 |
2 | Winter | 4444 |
6 | Summer | 5555 |
3 | Winter | 6666 |
and a table that contains deleted duplicates:
Key | Date | Value |
4 | Summer | 1111 |
5 | Summer | 3333 |
Thus, records are deleted as duplicates only if the values of all their fields are equal at the same time.
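The split shown in the tables above can be sketched in Python as follows. This is a minimal illustration that assumes all three fields are checked and that the first occurrence of each record is kept; the function name split_duplicates is hypothetical and not part of the product.

```python
def split_duplicates(rows):
    """Return (unique, duplicates), comparing all fields of each row."""
    seen = set()
    unique, duplicates = [], []
    for row in rows:
        if row in seen:
            duplicates.append(row)   # repeated record goes to the second output
        else:
            seen.add(row)
            unique.append(row)       # first occurrence goes to the first output
    return unique, duplicates

# The source table from the example: (Key, Date, Value).
rows = [
    (4, "Summer", 1111),
    (1, "Winter", 2222),
    (5, "Summer", 3333),
    (2, "Winter", 4444),
    (4, "Summer", 1111),
    (6, "Summer", 5555),
    (5, "Summer", 3333),
    (3, "Winter", 6666),
]
unique, duplicates = split_duplicates(rows)
```

Here unique holds the six rows of the table without duplicates, in their original order, and duplicates holds the two deleted rows with keys 4 and 5.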
On the Checked Fields page, set the input fields whose values should be checked for duplicates.
To create a list of checked fields:
Drag the selected field from the Source Fields list to the Selected Fields list.
Select a field in the Source Fields list and an input in the Selected Fields list, then click the Add button.
Click the Delete button to delete the selected field from the list of checked fields.
If no checked field is defined, an attempt to go to the next page brings up a confirmation dialog box.
On the Deduplicator page, set the condition used to select the records to be deleted.
The condition is composed in the editor dialog box, which opens when you click the button.
The rule for selecting duplicates is determined by the radio button chosen in the Selection Rules group:
Record Satisfies Condition. The first duplicate record meeting the specified condition is passed to the consumer.
Record does not Satisfy Condition. The first duplicate record not meeting the specified condition is passed to the consumer.
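The two selection rules can be sketched in Python as shown below. This is an interpretation of the description above, not the product's implementation: the function select_duplicates, its parameters, and the sample condition are hypothetical. The behaviour when no record in a group matches the rule is not described here, so the sketch assumes the first record of the group is passed on in that case.

```python
def select_duplicates(rows, checked_fields, condition, keep_matching=True):
    """Split rows into (kept, deleted) within groups of equal checked fields.

    keep_matching=True  models "Record Satisfies Condition".
    keep_matching=False models "Record does not Satisfy Condition".
    """
    groups = {}
    for row in rows:
        key = tuple(row[f] for f in checked_fields)
        groups.setdefault(key, []).append(row)

    kept, deleted = [], []
    for group in groups.values():
        # Index of the first record whose condition result matches the rule.
        # Assumption: fall back to the first record when none matches.
        idx = next(
            (i for i, r in enumerate(group)
             if bool(condition(r)) == keep_matching),
            0,
        )
        kept.append(group[idx])
        deleted.extend(group[:idx] + group[idx + 1:])
    return kept, deleted

records = [
    {"Key": 4, "Date": "Summer", "Value": 1111},
    {"Key": 4, "Date": "Summer", "Value": 9999},
    {"Key": 1, "Date": "Winter", "Value": 2222},
]

def is_large(record):
    # Hypothetical condition: the value exceeds 5000.
    return record["Value"] > 5000

kept, deleted = select_duplicates(records, ["Key"], is_large, keep_matching=True)
kept_not, deleted_not = select_duplicates(records, ["Key"], is_large, keep_matching=False)
```

With keep_matching=True, the record with Value 9999 is kept for Key 4 and the record with Value 1111 is deleted; with keep_matching=False the roles are reversed. The record with Key 1 has no duplicates and is kept in both cases.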