Description
The Extract HTML activity extracts tabular data from HTML files and converts it into a structured dataset. This is especially useful for processing reports, web-scraped content, or embedded tables from web pages or system-generated HTML files.
Use case:
Use this activity when the data you need is embedded in HTML tables, such as downloaded web reports, email digests, or content management system exports.
Input
| Type | Description |
|---|---|
| File | HTML document (.html, .htm) |
Output
| Type | Description |
|---|---|
| Data | Structured tabular data extracted from HTML |
Configuration Fields
| Field Name | Required | Description |
|---|---|---|
| Add HTML Extract | Yes | Defines extraction rule(s) to identify and parse one or more HTML tables. |
Sample Input
Not applicable: input is provided via uploaded HTML files.
Sample Configuration
| Field | Value |
|---|---|
| Add HTML Extract | Table selector that identifies the table to parse |
Sample Output
| Name | Age | Country |
|---|---|---|
| John Doe | 28 | USA |
| Alice | 31 | Canada |
| Bob | 25 | Australia |
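
To illustrate the kind of transformation described here, the following is a minimal Python sketch, not the activity's actual implementation. It assumes the extraction rule selects a table by its `id` attribute; the names `TableExtractor`, `extract_table`, and `SAMPLE_HTML`, as well as the `people` and `nav` ids, are hypothetical. It uses only the standard-library `html.parser` module, treats the first row as the header, and does not handle nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the cell text of one <table>, chosen by its id attribute."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id  # hypothetical selector: the table's id
        self.in_table = False     # True while inside the selected table
        self.in_cell = False      # True while inside a <td> or <th>
        self.rows = []            # list of rows, each a list of cell strings
        self._cell = []           # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.table_id:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.rows.append([])
        elif self.in_table and tag in ("td", "th"):
            if not self.rows:     # tolerate a cell that appears without a <tr>
                self.rows.append([])
            self.in_cell = True
            self._cell = []

    def handle_data(self, data):
        if self.in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if self.in_cell and tag in ("td", "th"):
            self.rows[-1].append("".join(self._cell).strip())
            self.in_cell = False
        elif tag == "table":
            self.in_table = False


def extract_table(html, table_id):
    """Return the selected table as a list of dicts keyed by the header row."""
    parser = TableExtractor(table_id)
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical input: a layout table plus the data table we want.
SAMPLE_HTML = """
<table id="nav"><tr><td>Home</td></tr></table>
<table id="people">
  <tr><th>Name</th><th>Age</th><th>Country</th></tr>
  <tr><td>John Doe</td><td>28</td><td>USA</td></tr>
  <tr><td>Alice</td><td>31</td><td>Canada</td></tr>
  <tr><td>Bob</td><td>25</td><td>Australia</td></tr>
</table>
"""

records = extract_table(SAMPLE_HTML, "people")
for record in records:
    print(record)
```

All values come back as strings (for example `"28"` for Age); any type conversion would be a separate step. Selecting by `id` is only one possible rule; a real selector might match on position, class, or a CSS path instead.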