In this article:
The Join transformer is an object used to merge data; before the data is merged, it is checked for correspondence to specified conditions. Using this transformer, it is possible to get in the resulting set the records, which values in one of link fields match, or all records of one provider are supplemented with values from other fields if values of these records match by the key fields.
Each object has several inputs and a single output. All providers should have unique indexes, number and field types determined. The specified attributes should be identical for all providers. Duplicate records are is not allowed (in this case it is necessary to delete duplicates beforehand).
The following conditions ensure effective performance of this algorithm:
Data providers cannot contain duplicates with the same unique index.
All providers must be ordered by their unique index. The order of fields in unique index must be the same for all providers.
On using the Join transformer with the External Connection type, the tables:
Key | Date | Value |
1 | Summer | 2222 |
2 | Winter | 4444 |
3 | Spring | 5555 |
and:
Key | Date | Value2 |
1 | Summer | 1111 |
2 | Winter | 3333 |
can be converted to the following table:
Key | Date | Value | Value2 |
1 | Summer | 2222 | 1111 |
2 | Winter | 4444 | 3333 |
3 | Spring | 5555 |
In basic properties object name, identifier and comment are set.
The page is used to determine connection type.
The following connection types are available:
Internal Connection. The consumer gets only those records of the providers that have matching values of indexed fields.
External Connection. The consumer gets all records from the provider specified in the Main Input field, data from other providers supplement them. Provider objects are linked by a specified index.
To set a list of fields and output link, use the Output Edit page.
The following settings are available on the page:
NOTE. The screenshot represents edit wizard for the Repository data provider.
The Edit Outputs with Error Records page is used to set a link with a consumer, to which information about error records skipped by the transformer is exported:
Specify output identifier with errors and select available link with the consumer object, to which information about error records is exported.
The Settings button enables the user to determine advanced settings used on occurring error records:
Specify the maximum number of error records, information about which is exported. By default the -1 value is set, in this case information about all error records is exported.
NOTE. If there is a large number of error records, information export may slow down ETL task runtime.
If the maximum number of output records is set, select the action executed if this number is exceeded. By default, records output is not stopped.
NOTE. The screenshot represents edit wizard for the Split data transformer.
The Index page is used to determine indexed fields for each of the inputs. Provider data is mapped by the values of indexed fields.
To select an indexed field:
Drag selected field from the Source Fields list to corresponding input in the Selected Fields list.
Select a field in the Source Fields list, and an input in the Selected Fields list. Click the Add button.
Going to the next page the user is prompted to fill in the list of fields from the linked output.
Set correspondence between input and output fields on the Fields Linking page.
Operations on field links:
Creating field links automatically. It enables the user to create automatically links between all input and output fields with matching identifiers. Click the Automatically button to create links automatically.
Creating field links. It enables the user to link input and output fields with different identifiers. To create a link, drag required field from the Source Fields area to the Input and Output Fields Mapping area.
Editing field links. It enables the user to edit transformation formula of output value using editor. To edit link in the Input and Output Fields Mapping area:
Select the required link and click the Edit button.
Double-click the required link with the main mouse button.
Delete Links. To delete link between input and output fields, click the Clear button.
See also: