Description
The Find Text activity extracts specific portions of text from selected columns based on a user-defined regex pattern. It is useful for parsing structured tokens, keywords, codes, or patterns from unstructured text.
Use Case
Extract keywords such as product codes, IDs, or tags from a sentence or description field using regular expressions.
Input
| Type | Description |
|---|---|
| Data | Input dataset containing text columns |
Output
| Type | Description |
|---|---|
| Transformed Data | New columns with extracted values from patterns. |
Configuration Fields
| Field Name | Required | Description |
|---|---|---|
| Columns To Find | Yes | Column(s) from which the text will be extracted using regex. |
| Pattern | Yes | Regular expression used to extract matching portions from the column text. |
| Output Columns Prefix | Yes | Prefix used when creating new output columns for extracted matches. |
| Include Original | No | If enabled, original columns will be included in the output. |
Sample Input
| ID | Description |
|---|---|
| 1 | This contains ABC and XYZ |
| 2 | Find CODE inside this text |
| 3 | No pattern matches here |
| 4 | Extract INFO and DATA points |
| 5 | SAMPLE test for extraction |
Sample Configuration
| Field | Value |
|---|---|
| Columns To Find | Description |
| Pattern | `([A-Z]{3,})` |
| Output Columns Prefix | Column_ |
| Include Original | Enabled |
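Explanation: This regex captures every run of three or more consecutive uppercase letters (A-Z). Each match in a row is written to its own output column, numbered in order of appearance (Column_1, Column_2, and so on); rows with fewer matches leave the remaining columns empty.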
Sample Output
| ID | Description | Column_1 | Column_2 |
|---|---|---|---|
| 1 | This contains ABC and XYZ | ABC | XYZ |
| 2 | Find CODE inside this text | CODE | |
| 3 | No pattern matches here | | |
| 4 | Extract INFO and DATA points | INFO | DATA |
| 5 | SAMPLE test for extraction | SAMPLE | |
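The sample output above can be approximated outside the tool with a short Python sketch. This is not the activity's implementation: `find_text` is a hypothetical helper, rows are modeled as plain dictionaries, and Python's `re` engine is assumed to behave like the activity's regex engine for this pattern. It always keeps the original columns, matching the sample configuration where Include Original is enabled.

```python
import re

def find_text(rows, column, pattern, prefix):
    """Hypothetical stand-in for the activity: one output column per match."""
    regex = re.compile(pattern)
    # With a single capture group, findall returns the captured text of each match.
    found = [regex.findall(row[column]) for row in rows]
    # The number of output columns is the largest match count in any row.
    width = max((len(matches) for matches in found), default=0)
    result = []
    for row, matches in zip(rows, found):
        new_row = dict(row)  # keep the original columns
        for i in range(width):
            # Rows with fewer matches get empty strings in the remaining columns.
            new_row[f"{prefix}{i + 1}"] = matches[i] if i < len(matches) else ""
        result.append(new_row)
    return result

rows = [
    {"ID": 1, "Description": "This contains ABC and XYZ"},
    {"ID": 2, "Description": "Find CODE inside this text"},
    {"ID": 3, "Description": "No pattern matches here"},
    {"ID": 4, "Description": "Extract INFO and DATA points"},
    {"ID": 5, "Description": "SAMPLE test for extraction"},
]

output = find_text(rows, "Description", r"([A-Z]{3,})", "Column_")
for row in output:
    print(row)
```

Running a pattern through a sketch like this before configuring the activity is a quick way to check how many output columns to expect.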
Use capture-group patterns such as `(\d{4})` to extract four-digit numeric codes, or `#(\w+)` to extract hashtags.
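As a quick check of how these two patterns behave, the sketch below runs them with Python's `re` module on a made-up sentence. The sample text and variable names are illustrative, and the activity's regex engine is assumed to follow the same matching rules. One thing worth noticing: without word boundaries, `(\d{4})` also captures the first four digits of a longer number.

```python
import re

# Made-up sample text for illustration only.
text = "Order 2024 shipped, ref 123456 #sale #new_arrivals"

# (\d{4}) captures any four consecutive digits,
# including the first four digits of a longer run.
loose_codes = re.findall(r"(\d{4})", text)

# Adding word boundaries restricts matches to standalone four-digit numbers.
strict_codes = re.findall(r"\b(\d{4})\b", text)

# #(\w+) captures the word after '#'; the '#' itself is outside the group.
hashtags = re.findall(r"#(\w+)", text)

print(loose_codes)   # ['2024', '1234']
print(strict_codes)  # ['2024']
print(hashtags)      # ['sale', 'new_arrivals']
```

If only exact-length codes are wanted, a bounded pattern such as `\b(\d{4})\b` is likely the safer choice.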