Column Uniqueness Check
The Column Uniqueness Check rule ensures that all values in a specified column are distinct within a dataset.
This rule is commonly used to:
- Validate primary key or identifier columns
- Ensure unique fields like email addresses, SKUs, or serial numbers are not duplicated
- Maintain data integrity for critical business attributes
Example Usage:
- Ensure all ProductCode values are distinct
- Verify SerialNumber is unique for each product entry
- Confirm Email addresses have no duplicates in a user database
Configuration Fields
Success Criteria Configuration
This section defines how the rule’s outcome is measured against expected thresholds.
| Field Name | Description | Required | Options / Format |
|---|---|---|---|
| Operator | Comparison operation for the unique value count | Yes | GreaterThan, LessThan, EqualTo, Between |
| Threshold Value | Value to compare against (used by GreaterThan, LessThan, and EqualTo) | Conditional | Number |
| Threshold Min | Minimum value (for Between operator) | Conditional | Number |
| Threshold Max | Maximum value (for Between operator) | Conditional | Number |
| Is Percentage | Whether the threshold represents a percentage of total rows | No | true / false (default: false) |
| Allow Nulls | Whether null values should count as unique | No | true / false (default: false) |
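The way these fields combine can be sketched in a few lines of Python. This is only an illustration, not the rule's actual implementation: the function name `within_threshold` and its parameter names are hypothetical, and treating Between as inclusive on both ends is an assumption that the table above does not confirm.

```python
from typing import Optional


def within_threshold(
    unique_count: int,
    total_rows: int,
    operator: str,
    threshold_value: Optional[float] = None,
    threshold_min: Optional[float] = None,
    threshold_max: Optional[float] = None,
    is_percentage: bool = False,
) -> bool:
    """Compare the unique value count (or its share of total rows) to the threshold."""
    # With Is Percentage = true, compare the share of total rows, not the raw count.
    if is_percentage:
        measured = 100.0 * unique_count / total_rows if total_rows else 0.0
    else:
        measured = float(unique_count)

    if operator == "Between":
        # Assumption: both bounds are inclusive.
        if threshold_min is None or threshold_max is None:
            raise ValueError("Between requires Threshold Min and Threshold Max")
        return threshold_min <= measured <= threshold_max

    # The remaining operators all need the single Threshold Value.
    if threshold_value is None:
        raise ValueError(f"{operator} requires Threshold Value")
    if operator == "GreaterThan":
        return measured > threshold_value
    if operator == "LessThan":
        return measured < threshold_value
    if operator == "EqualTo":
        return measured == threshold_value
    raise ValueError(f"Unknown operator: {operator}")
```

For instance, `within_threshold(4, 6, "GreaterThan", threshold_value=50, is_percentage=True)` compares roughly 66.7% against 50 and returns `True`.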
Sample Input Data
| ID | ProductCode | SerialNumber |
|---|---|---|
| 1 | PC-100 | SN-001 |
| 2 | PC-101 | SN-001 |
| 3 | PC-100 | SN-002 |
| 4 | PC-102 | NULL |
| 5 | PC-103 | NULL |
| 6 | NULL | SN-003 |
Sample Configurations
Example 1: Strict Uniqueness Check
| Configuration Field | Value |
|---|---|
| Column | ProductCode |
| Operator | EqualTo |
| Threshold Value | 4 |
| Is Percentage | false |
| Allow Nulls | false |
Explanation:
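Requires the ProductCode column to contain exactly 4 unique values. The sample data holds four distinct non-null codes (PC-100, PC-101, PC-102, PC-103), but PC-100 appears twice, so only three codes occur exactly once. Because Allow Nulls is false, the null in row 6 is not counted as unique.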
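To make the counting concrete, here is a minimal Python sketch applied to the ProductCode column of the sample input. The helper `uniqueness_counts` is hypothetical, and its row-based semantics are an assumption inferred from the Sample Output table: a row succeeds when its value occurs exactly once, fails when the value is repeated, and is tallied separately when it is null and Allow Nulls is false.

```python
from collections import Counter

# ProductCode column from the sample input (None stands in for NULL).
product_codes = ["PC-100", "PC-101", "PC-100", "PC-102", "PC-103", None]


def uniqueness_counts(values, allow_nulls=False):
    """Return (success, failure, nulls) row counts for one column."""
    occurrences = Counter(v for v in values if v is not None)
    success = failure = nulls = 0
    for v in values:
        if v is None:
            # Assumption: nulls count as unique only when Allow Nulls is true.
            if allow_nulls:
                success += 1
            else:
                nulls += 1
        elif occurrences[v] == 1:
            success += 1
        else:
            failure += 1
    return success, failure, nulls


success, failure, nulls = uniqueness_counts(product_codes, allow_nulls=False)
print(success, failure, nulls)  # 3 2 1
print(success == 4)             # False: EqualTo 4 is not satisfied
```

Under these assumed semantics the counts agree with the ProductCode row of the sample output (3 successes, 2 failures, 1 null, not within threshold).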
Example 2: Percentage-Based Uniqueness Check
| Configuration Field | Value |
|---|---|
| Column | SerialNumber |
| Operator | GreaterThan |
| Threshold Value | 50 |
| Is Percentage | true |
| Allow Nulls | true |
Explanation:
Requires more than 50% of the rows in SerialNumber to hold a unique value, with each null value counted as unique.
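The percentage arithmetic for this example can be checked with a short Python sketch. It assumes, based on the Sample Output table rather than on any documented formula, that the percentage is the number of unique rows divided by the total row count, and that every null row counts as unique when Allow Nulls is true.

```python
from collections import Counter

# SerialNumber column from the sample input (None stands in for NULL).
serial_numbers = ["SN-001", "SN-001", "SN-002", None, None, "SN-003"]

occurrences = Counter(v for v in serial_numbers if v is not None)

# Allow Nulls = true: each null row counts as unique (assumed semantics).
unique_rows = sum(1 for v in serial_numbers if v is None or occurrences[v] == 1)
unique_pct = 100.0 * unique_rows / len(serial_numbers)

print(unique_rows, round(unique_pct, 1))  # 4 66.7
print(unique_pct > 50)                    # True: GreaterThan 50 is satisfied
```

Four of the six rows count as unique (SN-002, SN-003, and the two nulls), which is about 66.7% and therefore above the 50% threshold.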
Sample Output
| Column Name | Rule Name | Success Count | Failure Count | Null Count | Within Threshold |
|---|---|---|---|---|---|
| ProductCode | Column Uniqueness Check | 3 | 2 | 1 | No |
| SerialNumber | Column Uniqueness Check | 4 | 2 | 0 | Yes |