Description
The UNION function calculates a new table that contains all the rows from each of two or more table expressions.
Usage
UNION(<Table Expression>, <Table Expression2>, <Table Expression3>)
UNION accepts two or more table expressions and returns a new table that contains all the rows from each of them. The following rules apply:
- Each table must have the same number of columns.
- Columns are combined by position in their respective tables.
- The column names in the returned table will match the column names in the first table argument.
- Duplicate rows are retained.
- The returned table has lineage where possible. For example, if the first column of each table expression has lineage to the same base column C1 in the model, the first column in the UNION result will have lineage to C1. However, if combined columns have lineage to different base columns, or if there is an extension column, the resulting column in the UNION result will have no lineage.
- When data types differ, the resulting data type is determined based on the rules for data type coercion.
- The returned table will not contain columns from related tables.
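The positional rules above can be illustrated with a small Python model. This is only a sketch of the semantics described in the list, not the actual implementation of UNION; the `Table` type, the `union` helper, and the sample data are all hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Table(NamedTuple):
    """Hypothetical stand-in for a table expression."""
    columns: tuple  # column names, in order
    rows: list      # one tuple of values per row


def union(first, *others):
    """Model of the rules listed above (an assumption, not the real engine)."""
    if not others:
        raise ValueError("union expects at least two tables")
    rows = list(first.rows)
    for table in others:
        # Each table must have the same number of columns.
        if len(table.columns) != len(first.columns):
            raise ValueError("all tables must have the same number of columns")
        # Columns are combined by position; later column names are ignored.
        rows.extend(table.rows)
    # Column names come from the first table; duplicate rows are retained.
    return Table(first.columns, rows)


trainings = Table(("date", "beneficiary"),
                  [("2024-01-15", "B1"), ("2024-02-03", "B2")])
loans = Table(("disbursement_date", "recipient"),
              [("2024-01-15", "B1"), ("2024-03-20", "B3")])

combined = union(trainings, loans)
print(combined.columns)
print(len(combined.rows))
```

Note that the row `("2024-01-15", "B1")` appears in both inputs and is kept twice in the result, which is why a distinct count is needed when you want unique values.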
Examples
Counting unique beneficiaries across multiple activities
The UNION function is useful for combining similar information that is stored in multiple forms. For example, if you are managing a program that provides training and loans to the same group of people, you might be interested in knowing how many unique individuals you have supported each quarter.
If the details about the training and the loans are stored in different forms, then you will need the UNION function to first combine the list of recipients, and then find the number of distinct beneficiaries.
To do this, combine the date and beneficiary ID from the two forms using UNION. For example:
UNION(
  SELECTCOLUMNS(participants,
    "date", @parent.date,
    "beneficiary", participant),
  SELECTCOLUMNS(loans,
    "date", disbursement_date,
    "beneficiary", recipient)) |>
COUNTDISTINCTX(beneficiary)
In the example above, we first have to reshape the two forms, Participants and Loans, so that they have the same fields in the same order. For the training participants, the relevant date comes from the parent form and is associated with the training. For the Loans, we choose to use the disbursement date.
The UNION function then gives us one long list of beneficiary IDs, and COUNTDISTINCTX counts the number of unique beneficiary IDs in that list.
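To make the reshape, combine, and count steps concrete, here is a rough Python equivalent. It is a sketch under stated assumptions: the record layout, the sample values, and the `select_columns` helper are hypothetical, and the parent form's date is assumed to be already copied onto each participant record as `training_date`.

```python
# Hypothetical sample records; field names are assumptions for illustration.
participants = [
    {"training_date": "2024-01-15", "participant": "B1"},
    {"training_date": "2024-01-15", "participant": "B2"},
    {"training_date": "2024-02-10", "participant": "B1"},
]
loans = [
    {"disbursement_date": "2024-02-01", "recipient": "B2"},
    {"disbursement_date": "2024-03-05", "recipient": "B3"},
]


def select_columns(rows, date_key, beneficiary_key):
    """Reshape records into (date, beneficiary) pairs in a fixed order."""
    return [(row[date_key], row[beneficiary_key]) for row in rows]


# Combine the two reshaped lists; duplicates are kept at this stage.
combined = (select_columns(participants, "training_date", "participant")
            + select_columns(loans, "disbursement_date", "recipient"))

# Count distinct beneficiary IDs across both activities.
unique_beneficiaries = {beneficiary for _, beneficiary in combined}
print(len(combined))
print(len(unique_beneficiaries))
```

With this sample data the combined list has five rows but only three distinct beneficiaries, because B1 and B2 each appear more than once.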