Aggregation (Basic and Extended)

Aggregation calculates upper-level data from lower-level data. For example, monthly data is summed to obtain quarterly data.
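The monthly-to-quarterly example can be sketched in a few lines of Python. This is only an illustration of the idea, not the product's implementation: the function name `monthly_to_quarterly_sum` and the `(year, month)` key layout are assumptions made for this example.

```python
from collections import defaultdict


def monthly_to_quarterly_sum(monthly):
    """Sum monthly values into quarterly values.

    `monthly` maps (year, month) to a value; the result maps
    (year, quarter) to the sum of the months in that quarter.
    The key layout is a hypothetical choice for this sketch.
    """
    quarterly = defaultdict(float)
    for (year, month), value in monthly.items():
        quarter = (month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        quarterly[(year, quarter)] += value
    return dict(quarterly)


monthly = {(2023, 1): 10.0, (2023, 2): 20.0, (2023, 3): 30.0, (2023, 4): 5.0}
quarterly = monthly_to_quarterly_sum(monthly)
```

Here the three first-quarter months collapse into a single first-quarter value, and the lone April value becomes the second-quarter value.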

Basic and extended aggregation are calculated from the time series stored in a time series database. Each time series is a set of observations (series points). Each observation is described by the time of observation and its value, as well as by a number of additional attributes defined when the time series database is created.

In basic aggregation, the input data are the time series included in the same aggregation group. In extended aggregation, the input data are the time series obtained by calculating a specified expression.

Consider:

Time series: TS(TS1, TS2, …, TSn).
Time series of weights: W(W1, W2, …, Wn).
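As a rough illustration of how the series TS and the weights W combine, the sketch below computes the sum, weighted sum, average, and weighted mean for a single period. It uses the textbook definitions of these quantities, which is an assumption: the platform's exact formulas are not reproduced here. The function names and the choice to skip missing values (represented as `None`) are likewise assumptions of this example.

```python
def _observed(ts, w):
    """Pair each value with its weight, skipping missing (None) values."""
    return [(x, wi) for x, wi in zip(ts, w) if x is not None]


def agg_sum(ts):
    obs = [x for x in ts if x is not None]
    return sum(obs) if obs else None


def agg_weighted_sum(ts, w):
    pairs = _observed(ts, w)
    return sum(x * wi for x, wi in pairs) if pairs else None


def agg_average(ts):
    obs = [x for x in ts if x is not None]
    return sum(obs) / len(obs) if obs else None


def agg_weighted_mean(ts, w):
    pairs = _observed(ts, w)
    total_weight = sum(wi for _, wi in pairs)
    if not pairs or total_weight == 0:
        return None  # nothing to aggregate, or weights cancel out
    return sum(x * wi for x, wi in pairs) / total_weight


# Values of TS1..TS3 and weights W1..W3 at one period; TS2 has no data.
ts = [10.0, None, 30.0]
w = [1.0, 2.0, 3.0]
```

In this sketch the weighted mean divides by the weights of the observed series only; a variant that divides by the sum of all weights would differ whenever data is missing.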

NOTE. If the Level function is used, the value of the aggregated time series Y for the first aggregation period is taken to be equal to 100, that is, Y1 = 100. The value Yt for each subsequent aggregation period, starting from the second, is calculated using the formula (except when an individual formula is defined for a method).

Available aggregation methods:

Sum.
Weighted Sum.
Average.
Weighted Mean.
Mean Calculated Using Percentage Change.

If the Level function is used:

Weighted Mean Calculated Using Percentage Change.

If the Level function is used:

Mean Calculated Using Logarithms Difference.

If the Level function is used:

Weighted Mean Calculated Using Logarithms Difference.

If the Level function is used:

Median.
Median Calculated Using Percentage Change.

If the Level function is used:

Mode.
Weighted Geometric Mean. , where Π is a product.
Percentile. The user defines a percentage value between 0 and 100. Empty values are excluded from aggregation. At each point in time, all the series values used for aggregation are ranked in ascending order. The value whose rank corresponds to the defined percentage is selected at each period; these values form the aggregation result. If the percentile equals 0, the method returns empty values.
The corresponding rank is calculated by the formula: Rank = Round(p*N), where:
- Round. Rounding up.
- p. Percentage value.
- N. The number of aggregated series.
Weights. Calculated from an expression that takes into account all the components that satisfy the aggregation expression and have data.

In the numerator, k, m, z are the numbers of the elements that have data. The denominator is the sum of all weights.
Number of Observations. , where:
- ZeroIfNoData(TSi) = 1. If the TSi observation contains data.
- ZeroIfNoData(TSi) = 0. If the TSi observation does not contain data.
Amount of Missing Data. , where:
- IsObserved(TSi) = 0. If the TSi observation contains data.
- IsObserved(TSi) = 1. If the TSi observation does not contain data.
Composition Relevance. Aggregation that takes missing data into account. It is calculated as follows:
- The Comparator set of series is defined.
- A threshold value is defined (in the range from 0 to 100).
- If the sum of the values of the Comparator elements divided by the sum of all aggregated values is greater than the threshold value, aggregation is calculated; if it is less, aggregation is not calculated.
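
Two of the methods above, Percentile and Composition Relevance, are described procedurally and can be sketched for a single period as follows. This is a hedged illustration, not the platform's implementation. Several points are assumptions of this example: the percentage p is divided by 100 before it is multiplied by N, N counts only the non-empty values at the period, the Comparator share is expressed in percent so that it can be compared with a 0 to 100 threshold, the aggregation applied when the threshold is passed is a plain sum, and a share exactly equal to the threshold is treated as not passing. The function names are hypothetical.

```python
import math


def percentile_at_period(values, p):
    """Pick the value whose rank is ceil(p/100 * N) among non-empty values."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    observed = sorted(v for v in values if v is not None)  # ascending rank
    if p == 0 or not observed:
        return None  # percentile 0 returns an empty value
    rank = math.ceil(p * len(observed) / 100)  # "Round" means rounding up
    return observed[rank - 1]  # ranks are 1-based


def composition_relevance_sum(values, comparator, threshold):
    """Sum the observed values only if the Comparator share exceeds the threshold.

    `comparator` is a set of positions in `values` (assumed representation).
    """
    observed = {i: v for i, v in enumerate(values) if v is not None}
    total = sum(observed.values())
    if not observed or total == 0:
        return None
    comparator_sum = sum(v for i, v in observed.items() if i in comparator)
    share = 100.0 * comparator_sum / total  # share in percent (assumption)
    return total if share > threshold else None


values = [5.0, None, 1.0, 3.0, 2.0]
median_like = percentile_at_period(values, 50)
```

With four non-empty values and p = 50, the rank is 2, so the second-smallest value is selected.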

See also:

Modeling Container: The Aggregation (Basic), Aggregation (Extended) Models | Time Series Analysis: Aggregation (Extended)