Consistent Casing
In data quality, Consistent Casing means ensuring that text data uses a standardized letter casing (uppercase vs. lowercase) across a dataset. This consistency matters for data integrity, because variations in casing (e.g., “John Doe” vs. “john doe” or “USA” vs. “usa”) can lead to errors in data processing, matching, and analysis.
Rule configurations
A value is marked as a success when it matches the configured Case type. If the value is unique and fits within the defined set, the rule is considered passed.
Case type
In data quality, case type refers to the specific formatting of text where letter casing is applied in distinct patterns. These patterns dictate how characters are capitalized or formatted in a string to ensure consistency and improve readability across datasets.
Upper Case
All letters are capitalized.
Lower Case
All letters are in lowercase.
Title Case
The first letter of each word is capitalized.
Sentence Case
Only the first letter of the first word is capitalized.
Camel Case
The first word is lowercase, and each subsequent word starts with an uppercase letter without spaces.
Pascal Case
Similar to camel case, but the first letter of the first word is also capitalized.
Kebab Case
Words are lowercase and separated by hyphens.
Snake Case
Words are lowercase and separated by underscores.
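The case types above can be approximated with regular expressions. The following is a minimal sketch, not the product's implementation: the function name `matches_case`, the case-type keys, and the patterns themselves are hypothetical. It assumes ASCII letters and digits, single spaces between words where spaces apply, and at least one lowercase character after each capital in Camel and Pascal Case.

```python
import re

# Hypothetical patterns, one per case type described above (ASCII only).
_PATTERNS = {
    # Every space-separated word starts with a capital letter.
    "title": re.compile(r"[A-Z][a-z0-9]*(?: [A-Z][a-z0-9]*)*"),
    # Only the first word starts with a capital letter.
    "sentence": re.compile(r"[A-Z][a-z0-9]*(?: [a-z0-9]+)*"),
    # First word lowercase, later words capitalized, no separators.
    "camel": re.compile(r"[a-z][a-z0-9]*(?:[A-Z][a-z0-9]+)*"),
    # Every word capitalized, no separators.
    "pascal": re.compile(r"(?:[A-Z][a-z0-9]+)+"),
    # Lowercase words joined by hyphens.
    "kebab": re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*"),
    # Lowercase words joined by underscores.
    "snake": re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*"),
}


def matches_case(value: str, case_type: str) -> bool:
    """Return True if `value` conforms to the named case type."""
    if case_type == "upper":
        # True when every cased character is uppercase (and one exists).
        return value.isupper()
    if case_type == "lower":
        # True when every cased character is lowercase (and one exists).
        return value.islower()
    pattern = _PATTERNS.get(case_type)
    if pattern is None:
        raise ValueError(f"unknown case type: {case_type!r}")
    # fullmatch requires the whole string to fit the pattern.
    return pattern.fullmatch(value) is not None


print(matches_case("DropDown", "pascal"))  # True
print(matches_case("dropDown", "pascal"))  # False
```

Under these patterns a single capitalized word such as "France" satisfies Title, Sentence, and Pascal Case at once, so a value can match more than one case type.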
Success criteria
- The success condition depends on how the Case Type is configured.
- For example, when Case Type is set to Pascal Case, only inputs with each word starting with an uppercase letter are valid, e.g., “DropDown” (but not “dropDown”).
Configuration fields
- Operator options
  - Greater than
  - Less than
  - Equal to
  - Between (requires specifying a start and end range)
- Operator: Defines the comparison operation (Greater than, Less than, Equal to, or Between).
- Value: The threshold value used for success criteria. Required for the Greater than, Less than, and Equal to operators.
- Value range: Required only when the Between operator is selected, specifying the start and end range.
- Threshold type: Indicates whether the Value or Value range is to be considered as a percentage or an absolute count.
- Allow null values: Determines if null values are permitted.
- Check for match: Determines if data values align with predefined standards, formats, or reference values to ensure accuracy, consistency, and integrity.
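How the operator, value, and threshold type might combine can be sketched as below. This is an illustration under stated assumptions rather than the product's logic: the function `within_threshold`, its parameter names, and the operator strings are hypothetical. It also assumes that the measured quantity is the number (or percentage) of successful values, and that Between is inclusive at both ends.

```python
from typing import Optional, Tuple


def within_threshold(
    success_count: int,
    total_count: int,
    operator: str,
    value: Optional[float] = None,
    value_range: Optional[Tuple[float, float]] = None,
    threshold_type: str = "percentage",
) -> bool:
    """Compare the success measure against the configured threshold."""
    # Threshold type decides whether we compare a percentage or a raw count.
    if threshold_type == "percentage":
        measured = 100.0 * success_count / total_count if total_count else 0.0
    elif threshold_type == "absolute":
        measured = float(success_count)
    else:
        raise ValueError(f"unknown threshold type: {threshold_type!r}")

    # Between needs a (start, end) range; assumed inclusive here.
    if operator == "between":
        if value_range is None:
            raise ValueError("'between' requires a value range")
        start, end = value_range
        return start <= measured <= end

    # The remaining operators need a single threshold value.
    if value is None:
        raise ValueError(f"{operator!r} requires a value")
    if operator == "greater_than":
        return measured > value
    if operator == "less_than":
        return measured < value
    if operator == "equal_to":
        return measured == value
    raise ValueError(f"unknown operator: {operator!r}")
```

For instance, `within_threshold(2, 5, "greater_than", value=75)` compares 40% against 75% and returns `False`.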
Sample Input
ID | Customer | Country |
---|---|---|
1 | Fallon | greatBritain |
2 | FranklynFryer | France |
3 | Kathleen | unitedStates |
4 | JudieGreen | |
5 | JohnDoe | France |
Sample rule configuration
Case type: Pascal Case
Sample success criteria configuration
- Operator: Greater than
- Value: 75%
- Threshold type: Percentage
- Allow null values: False
- Check for match: True
Sample output
Column Name | Rule Name | Success Count | Failure Count | Within Threshold | Null Count |
---|---|---|---|---|---|
Customer | Consistent Casing check | 5 | 0 | Yes | 0 |
Country | Consistent Casing check | 2 | 3 | No | 1 |
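The counts in the sample output can be reproduced with a short script. This is a sketch only: `check_column` and the Pascal Case pattern are hypothetical, the empty Country cell in row 4 is treated as a null, and nulls are counted as failures because Allow null values is False (which is what the Country row's totals suggest). The 75 threshold is read as a percentage of rows.

```python
import re

# Hypothetical Pascal Case pattern: one or more capitalized ASCII words.
PASCAL = re.compile(r"(?:[A-Z][a-z0-9]+)+")

# Sample input; None stands for the empty Country cell in row 4.
rows = [
    {"Customer": "Fallon", "Country": "greatBritain"},
    {"Customer": "FranklynFryer", "Country": "France"},
    {"Customer": "Kathleen", "Country": "unitedStates"},
    {"Customer": "JudieGreen", "Country": None},
    {"Customer": "JohnDoe", "Country": "France"},
]


def check_column(rows, column, threshold_pct=75.0):
    """Count Pascal Case matches in one column and apply 'Greater than'."""
    values = [row[column] for row in rows]
    null_count = sum(1 for v in values if v is None)
    success = sum(1 for v in values if v is not None and PASCAL.fullmatch(v))
    # Nulls are not allowed in this configuration, so they fall into failures.
    failure = len(values) - success
    success_pct = 100.0 * success / len(values) if values else 0.0
    return {
        "success": success,
        "failure": failure,
        "within_threshold": success_pct > threshold_pct,
        "null_count": null_count,
    }


for column in ("Customer", "Country"):
    print(column, check_column(rows, column))
```

Customer reaches 100% (5 of 5) and passes, while Country reaches 40% (2 of 5) and falls below the 75% threshold.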