Description
The Split and Unfold activity transforms a single text column containing delimited values into multiple binary (1/0) columns. Each unique value from the original column becomes a separate column, where:
- 1 indicates that the value was present in the original field for that row.
- 0 indicates absence.
This is useful when you have multi-tag fields (e.g., “red, blue, green”) and need to perform one-hot encoding or binary expansion.
If the separator is left empty, each character is treated as a separate value.
Example: Given a column named Tags with values like "A, B, C" and "A, C", this activity creates new columns Tags_A, Tags_B, and Tags_C, and marks each with 1 or 0 based on whether that tag was present in the row.
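The behavior described above can be approximated in plain Python. This is a minimal sketch, not the activity's actual implementation: the function name split_and_unfold is hypothetical, and it assumes that whitespace around each value is trimmed, that empty pieces are ignored, and that the new columns are ordered alphabetically.

```python
def split_and_unfold(rows, column, separator=""):
    """Replace `column` with one 1/0 column per distinct value it contains."""

    def tokens(text):
        # An empty separator means every character is its own value.
        parts = list(text) if separator == "" else text.split(separator)
        # Assumption: surrounding whitespace is trimmed, empty pieces are dropped.
        return {p.strip() for p in parts if p.strip()}

    parsed = [tokens(row[column]) for row in rows]
    # Assumption: new columns are ordered alphabetically by value.
    labels = sorted(set().union(*parsed))

    result = []
    for row, present in zip(rows, parsed):
        # Keep every other column and drop the one being unfolded.
        out = {k: v for k, v in row.items() if k != column}
        for label in labels:
            out[f"{column}_{label}"] = 1 if label in present else 0
        result.append(out)
    return result


# The Tags example from the description, split on a comma.
unfolded = split_and_unfold([{"Tags": "A, B, C"}, {"Tags": "A, C"}], "Tags", ",")

# Empty separator: each character becomes its own column.
by_char = split_and_unfold([{"Code": "ab"}, {"Code": "b"}], "Code")
```

Here unfolded holds one dictionary per input row with the keys Tags_A, Tags_B, and Tags_C, and by_char holds the keys Code_a and Code_b.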
Input
- Data (required): tabular, structured input with at least one column containing delimited string values.
Output
Output Type | Format | Description |
---|
Data | Tabular | The input table, with the split column replaced by binary (1/0) columns, one per unique value. |
Configuration Fields
Field Name | Description |
---|
Column To Split | The name of the input column to split and unfold. Required. |
Separator | The delimiter used to split the string (e.g., a comma, a semicolon, or a space). Optional; if left empty, each character is treated as a separate value. |
Sample Configuration
Field | Value |
---|
ColumnToSplit | Tags |
Separator | , |
Sample Output
ID | Tags_A | Tags_B | Tags_C |
---|
1 | 1 | 1 | 0 |
2 | 0 | 1 | 1 |
3 | 1 | 0 | 1 |
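The sample output is shown without its input. The sketch below reconstructs one input that would produce it under the sample configuration. The Tags values are inferred from the output table rather than taken from the original data, and the trimming of whitespace and alphabetical column order are assumptions.

```python
# Hypothetical input rows; the Tags values are inferred from the sample output.
rows = [
    {"ID": 1, "Tags": "A, B"},
    {"ID": 2, "Tags": "B, C"},
    {"ID": 3, "Tags": "A, C"},
]
column, separator = "Tags", ","  # mirrors the sample configuration

# Split each cell into a set of trimmed, non-empty values.
parsed = [
    {t.strip() for t in row[column].split(separator) if t.strip()}
    for row in rows
]
# Collect every distinct value across all rows (assumed alphabetical order).
labels = sorted(set().union(*parsed))

# Drop the original column and add one 1/0 column per distinct value.
output = [
    {
        **{k: v for k, v in row.items() if k != column},
        **{f"{column}_{label}": int(label in present) for label in labels},
    }
    for row, present in zip(rows, parsed)
]

for row in output:
    print(row)
```

Each printed row carries the ID plus the Tags_A, Tags_B, and Tags_C flags, matching the table above.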