Consistent Casing

In data quality, Consistent Casing refers to ensuring that text data is standardized in terms of letter casing (uppercase vs. lowercase) across a dataset. This consistency is important for data integrity, as variations in casing (e.g., “John Doe” vs “john doe” or “USA” vs “usa”) could lead to errors in data processing, matching, and analysis.

Rule configurations

A value is marked as a success when it matches the configured Case type. If the value is unique and fits within the defined set, the rule is considered passed.

Case type In data quality, case type refers to the specific formatting of text where letter casing is applied in distinct patterns. These patterns dictate how characters are capitalized or formatted in a string to ensure consistency and improve readability across datasets.

Upper Case All letters are capitalized.

Lower Case All letters are in lowercase.

Title Case The first letter of each word is capitalized.

Sentence Case Only the first letter of the first word is capitalized.

Camel Case The first word is lowercase, and each subsequent word starts with an uppercase letter, with no spaces between words.

Pascal Case Similar to camel case, but the first letter of the first word is also capitalized.

Kebab Case Words are lowercase and separated by hyphens.

Snake Case Words are lowercase and separated by underscores.
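The case types above can be sketched as simple pattern checks. This is only an illustration: the regular expressions, the `CASE_PATTERNS` table, and the `matches_case` helper are assumptions made for this example, not the product's actual matching rules (for instance, how digits, acronyms, or punctuation are treated is not documented here).

```python
import re

# Hypothetical patterns, one per case type described above.
# Assumption: words consist of ASCII letters and digits only.
CASE_PATTERNS = {
    "Sentence Case": re.compile(r"[A-Z][^A-Z]*"),
    "Camel Case": re.compile(r"[a-z][a-z0-9]*(?:[A-Z][a-z0-9]+)*"),
    "Pascal Case": re.compile(r"(?:[A-Z][a-z0-9]+)+"),
    "Kebab Case": re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*"),
    "Snake Case": re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*"),
}


def matches_case(value: str, case_type: str) -> bool:
    """Return True if `value` follows `case_type` (illustrative rules only)."""
    if case_type == "Upper Case":
        return value.isupper()   # all cased characters are uppercase
    if case_type == "Lower Case":
        return value.islower()   # all cased characters are lowercase
    if case_type == "Title Case":
        return value.istitle()   # each word starts with an uppercase letter
    pattern = CASE_PATTERNS.get(case_type)
    if pattern is None:
        raise ValueError(f"Unknown case type: {case_type}")
    # fullmatch requires the whole string to fit the pattern
    return pattern.fullmatch(value) is not None


print(matches_case("DropDown", "Pascal Case"))  # True
print(matches_case("dropDown", "Pascal Case"))  # False
```

Note that a single lowercase word such as "dropdown" satisfies the camel, kebab, and snake patterns at once in this sketch; a real implementation may need stricter rules.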

Success criteria

  • The success condition depends on how the Case Type is configured.
  • For example, when Case Type is set to Pascal Case, only inputs in which each word starts with an uppercase letter are valid, e.g., “DropDown” (but not “dropDown”).

Configuration fields

  • Operator Defines the comparison operation. The available options are:

    Greater than

    Less than

    Equal to

    Between (requires specifying a start and end range)

  • Value The threshold value used for success criteria. Required for Greater than, Less than, and Equal to operators.

  • Value range Required only when the Between operator is selected, specifying the start and end range.

  • Threshold type Indicates whether the Value or Value range is interpreted as a percentage or as an absolute count.

  • Allow null values Determines if null values are permitted.

  • Check for match Determines if data values align with predefined standards, formats, or reference values to ensure accuracy, consistency, and integrity.
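To make the interaction between Operator, Value, Value range, and Threshold type concrete, here is a minimal sketch of how a success count could be compared against the configured threshold. The `SuccessCriteria` class and `within_threshold` function are hypothetical names for this example. It also assumes that a percentage is computed as successes over total evaluated rows and that Between is inclusive at both ends; neither detail is stated in this page.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SuccessCriteria:
    """Hypothetical container mirroring the configuration fields above."""
    operator: str                    # "Greater than", "Less than", "Equal to", "Between"
    threshold_type: str              # "Percentage" or "Absolute Count"
    value: Optional[float] = None    # required unless operator is "Between"
    value_range: Optional[Tuple[float, float]] = None  # required for "Between"


def within_threshold(success_count: int, total_count: int,
                     criteria: SuccessCriteria) -> bool:
    # Convert the success count to the unit the threshold is expressed in.
    if criteria.threshold_type == "Percentage":
        measured = 100.0 * success_count / total_count if total_count else 0.0
    else:
        measured = float(success_count)

    if criteria.operator == "Between":
        if criteria.value_range is None:
            raise ValueError("Between requires a value range")
        low, high = criteria.value_range
        return low <= measured <= high   # assumption: inclusive bounds

    if criteria.value is None:
        raise ValueError(f"{criteria.operator} requires a value")
    if criteria.operator == "Greater than":
        return measured > criteria.value
    if criteria.operator == "Less than":
        return measured < criteria.value
    if criteria.operator == "Equal to":
        return measured == criteria.value
    raise ValueError(f"Unknown operator: {criteria.operator}")


criteria = SuccessCriteria("Greater than", "Percentage", value=75)
print(within_threshold(5, 5, criteria))  # True  (100% > 75%)
print(within_threshold(2, 5, criteria))  # False (40% is not > 75%)
```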

Sample Input

| ID | Customer | Country |
| 1 | Fallon | greatBritain |
| 2 | FranklynFryer | France |
| 3 | Kathleen | unitedStates |
| 4 | JudieGreen | |
| 5 | JohnDoe | France |

Sample rule configuration

Case type Pascal Case

Sample success criteria configuration

  • Operator Greater than
  • Value 75%
  • Threshold type Percentage
  • Allow null values False
  • Check for match True

Sample output

| Column Name | Rule Name | Success Count | Failure Count | Within Threshold | Null Count |
| Customer | Consistent Casing check | 5 | 0 | Yes | 0 |
| Country | Consistent Casing check | 2 | 3 | No | 1 |
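The sample output can be reproduced with a short script. This is a sketch under stated assumptions rather than the product's implementation: the `PASCAL` pattern and `check_column` helper are hypothetical, a null value is counted as a failure when nulls are not allowed (and as a success when they are), and the 75 threshold is read as a percentage of all rows.

```python
import re

# Hypothetical Pascal Case pattern: one or more words, each starting with an
# uppercase letter followed by lowercase letters or digits.
PASCAL = re.compile(r"(?:[A-Z][a-z0-9]+)+")

# Sample input from above; the missing Country for ID 4 is modeled as None.
rows = [
    {"ID": 1, "Customer": "Fallon", "Country": "greatBritain"},
    {"ID": 2, "Customer": "FranklynFryer", "Country": "France"},
    {"ID": 3, "Customer": "Kathleen", "Country": "unitedStates"},
    {"ID": 4, "Customer": "JudieGreen", "Country": None},
    {"ID": 5, "Customer": "JohnDoe", "Country": "France"},
]


def check_column(rows, column, allow_null=False, min_percent=75.0):
    """Count Pascal Case successes and failures for one column."""
    success = failure = nulls = 0
    for row in rows:
        value = row[column]
        if value is None or value == "":
            nulls += 1
            # Assumption: a disallowed null counts as a failure.
            if allow_null:
                success += 1
            else:
                failure += 1
        elif PASCAL.fullmatch(value):
            success += 1
        else:
            failure += 1
    total = success + failure
    percent = 100.0 * success / total if total else 0.0
    return {
        "success": success,
        "failure": failure,
        "within_threshold": percent > min_percent,  # Operator: Greater than
        "null_count": nulls,
    }


for column in ("Customer", "Country"):
    print(column, check_column(rows, column))
```

Under these assumptions, Customer yields 5 successes (100%, within threshold), while Country yields 2 successes and 3 failures (40%, not within threshold), with the one null value counted among the failures.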