Split HTTP Query
Description
The Split HTTP Query activity is used to extract and convert HTTP query strings embedded within URLs into structured, column-based data. This is particularly useful in cases where user input, system requests, or web-tracking data are passed as URL query parameters and stored in a single column.
This activity scans each value in the selected column for a standard query string, that is, the part that follows the ? in a URL, for example: https://domain.com/resource?key1=value1&key2=value2
Once detected, it parses each parameter (key-value pair) and creates new columns in the dataset, assigning values from the query string accordingly. This enables users to access and manipulate data like id, name, department, etc., as individual fields rather than dealing with them inside an encoded string.
Example Use Case:
You may have a dataset of employee links where each URL includes parameters like ?id=E003&name=Carlos+Gomez&department=Engineering. This activity will extract E003, Carlos+Gomez, and Engineering into separate columns named Key_id, Key_name, and Key_department, making the data easier to read, filter, or aggregate.
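The parsing step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the activity's actual implementation: the function name split_http_query and its prefix parameter are hypothetical, and values are deliberately left undecoded (a + stays a +) to match the extracted values shown in the example.

```python
from urllib.parse import urlsplit


def split_http_query(value, prefix="Key_"):
    """Hypothetical helper: map each query parameter to a prefixed column name."""
    # For a full URL (or a string starting with "?"), keep only the query part;
    # otherwise assume the value already is a raw query string.
    if "?" in value or "://" in value:
        query = urlsplit(value).query
    else:
        query = value

    columns = {}
    for pair in query.split("&"):
        key, _, raw_value = pair.partition("=")
        if key:  # skip empty segments such as a trailing "&"
            columns[prefix + key] = raw_value  # left undecoded on purpose
    return columns


url = "https://company.com/employee?id=E003&name=Carlos+Gomez&department=Engineering"
print(split_http_query(url))
# {'Key_id': 'E003', 'Key_name': 'Carlos+Gomez', 'Key_department': 'Engineering'}
```

The same function accepts a raw query string such as id=E003&name=Carlos+Gomez, since the activity is described as handling both complete URLs and raw query strings.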
Input
| Type | Description |
|---|---|
| Data | A dataset containing a column with HTTP URLs or raw query strings. Each value must follow the standard format of key=value pairs separated by &. |
Output
| Type | Description |
|---|---|
| Transformed Data | A dataset where the selected column’s query parameters are expanded into individual columns. |
Configuration Fields
| Field Name | Description |
|---|---|
| Column Name | The column to be parsed. It should contain complete URLs or raw query strings containing key-value pairs separated by &. Only a single column can be selected. |
| Prefix | A custom prefix for the newly created columns. Each extracted key will be appended to this prefix. For example, a key id with prefix Key_ becomes Key_id. This is useful to prevent naming collisions or to group similar fields. |
| Include Original | Determines whether the original columns are retained in the output. Enabled: keeps all the original columns, including the raw HTTP query column. Disabled: outputs only the parsed query parameter columns. |
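To make the effect of Prefix and Include Original concrete, here is a minimal Python sketch that applies the same idea to rows represented as dictionaries. The helper expand_rows and its parameters are hypothetical names chosen for illustration, the sketch assumes the column holds complete URLs, and it does not claim to reproduce how the activity works internally.

```python
from urllib.parse import urlsplit


def expand_rows(rows, column, prefix="", include_original=True):
    """Hypothetical helper: expand the query parameters found in `column`."""
    result = []
    for row in rows:
        parsed = {}
        for pair in urlsplit(row[column]).query.split("&"):
            key, _, value = pair.partition("=")
            if key:
                parsed[prefix + key] = value
        # Enabled: original columns first, then the new ones.
        # Disabled: only the parsed parameter columns.
        result.append({**row, **parsed} if include_original else parsed)
    return result


rows = [{
    "employee_id": "E001",
    "name": "John Doe",
    "http_query": "https://company.com/employee?id=E001&name=John+Doe&department=Sales",
}]

kept = expand_rows(rows, "http_query", prefix="Key_", include_original=True)
only_new = expand_rows(rows, "http_query", prefix="Key_", include_original=False)
print(list(kept[0]))      # original columns followed by Key_id, Key_name, Key_department
print(list(only_new[0]))  # Key_id, Key_name, Key_department only
```

With an empty prefix, the parsed name parameter would overwrite the existing name column in this sketch, which is the kind of naming collision the Prefix field is meant to prevent. How the activity itself resolves such a collision is not stated here.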
Sample Input
| employee_id | name | http_query |
|---|---|---|
| E001 | John Doe | https://company.com/employee?id=E001&name=John+Doe&department=Sales |
| E002 | Marie Dupont | https://company.com/employee?id=E002&name=Marie+Dupont&department=Marketing |
| E003 | Carlos Gómez | https://company.com/employee?id=E003&name=Carlos+Gómez&department=Engineering |
Sample Configuration
| Field | Value |
|---|---|
| Column Name | http_query |
| Prefix | Key_ |
| Include Original | Enabled |
Sample Output
| employee_id | name | http_query | Key_id | Key_name | Key_department |
|---|---|---|---|---|---|
| E001 | John Doe | https://company.com/employee?id=E001&name=John+Doe&department=Sales | E001 | John+Doe | Sales |
| E002 | Marie Dupont | https://company.com/employee?id=E002&name=Marie+Dupont&department=Marketing | E002 | Marie+Dupont | Marketing |
| E003 | Carlos Gómez | https://company.com/employee?id=E003&name=Carlos+Gómez&department=Engineering | E003 | Carlos+Gómez | Engineering |