Categories:

Aggregate functions (Counting Distinct Values) , Window functions (Semi-structured Data Aggregation)

ARRAY_UNION_AGG

Returns an ARRAY that contains the union of the distinct values from the input ARRAYs in a column. You can use this to aggregate distinct values in ARRAYs produced by ARRAY_UNIQUE_AGG.

See also:

ARRAY_UNIQUE_AGG , Using Arrays to Compute Distinct Values for Hierarchical Aggregations

Syntax

ARRAY_UNION_AGG( <column> )
Copy

Arguments

column

The column containing the ARRAYs with the distinct values (the ARRAYs produced by ARRAY_UNIQUE_AGG).

Returns

The function returns an ARRAY containing the distinct values from the ARRAYs in column. The values in the ARRAY are in no particular order, and the order is not deterministic.

Note that this function uses multiset semantics (link removed), which means that the maximum number of occurrences of an individual value in a single input ARRAY determines the number of occurrences of that value in the output ARRAY. See Examples.

The function ignores NULL values in column and in the ARRAYs in column. If column contains only NULL values or the table containing column is empty, the function returns an empty ARRAY.

Usage notes

  • This function can be used as either of the following types of functions:

  • When this function is called as a window function, it does not support explicit window frames.

  • When you pass a structured array to the function, the function returns a structured array of the same type.

Examples

Aggregation: Union of arrays

The following example illustrates how the function returns the union of distinct values from two ARRAYs:

CREATE TABLE union_test(a array);

INSERT INTO union_test
    SELECT PARSE_JSON('[ 1, 1, 2]')
    UNION ALL
    SELECT PARSE_JSON('[ 1, 2, 3]');

SELECT ARRAY_UNION_AGG(a) FROM union_test;
+-------------------------+
| ARRAY_UNION_AGG(A)      |
+-------------------------+
| [ 1, 1, 2, 3]           |
+-------------------------+
Copy

The operation uses multiset (link removed) semantics. The value 1 appears twice in the output because it appears twice in one of the input arrays.

See Using Arrays to Compute Distinct Values for Hierarchical Aggregations.

Language: English