Numerical Binning

Description

The Numerical Binning activity groups numerical data into “bins” or “intervals” based on a chosen rule. This helps to make large sets of continuous numbers easier to analyze by simplifying them into smaller, predefined ranges.

Input

Data Only

Output

Transformed Data

Configuration Fields

  • Column

Choose the column with numerical data that you want to bin.

  • Binning mode: How the system decides how to split the numbers into bins. Choose one of the following:

    • Sturges: Good for smaller datasets; calculates the number of bins from the number of values (log2(n) + 1, rounded up).
    • Freedman-Diaconis: Works well when there are outliers in the data; calculates the bin width from the interquartile range.
    • Scott: Calculates the bin width from the standard deviation, aiming to minimize the binning error for roughly normally distributed data.
    • Square Root: A simple method where the number of bins is the square root of the number of values.
    • Fixed Size Intervals: You define how wide the bins should be, and that width is used for every bin.
    • Custom: You define the exact bins yourself.

  • Output column: The new column that will store the binned values.

  • Include original: Whether to keep the original numbers alongside the new binned values.

    • Enabled: Keep the original values.
    • Disabled: Only show the binned values.
  • Number Of Bins

How many bins you want to divide the data into.

  • Minimum Value (shown only when Binning mode is Fixed Size Intervals)

The smallest number that should be included in the bins.

  • Maximum Value (shown only when Binning mode is Fixed Size Intervals)

The largest number that should be included in the bins.
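
The automatic binning modes are named after standard histogram rules. As a rough illustration of how such rules turn a column into a bin count or bin width, the sketch below applies the textbook formulas to the Age values from the sample input. The function names are hypothetical, and the activity's own implementation may differ in rounding, quartile convention, or edge-case handling.

```python
import math
import statistics


def sturges_bins(n):
    """Sturges' rule: number of bins = ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1


def sqrt_bins(n):
    """Square-root rule: number of bins = ceil(sqrt(n))."""
    return math.ceil(math.sqrt(n))


def scott_width(values):
    """Scott's rule: bin width = 3.49 * sample std dev * n^(-1/3)."""
    return 3.49 * statistics.stdev(values) * len(values) ** (-1 / 3)


def freedman_diaconis_width(values):
    """Freedman-Diaconis rule: bin width = 2 * IQR * n^(-1/3).

    Assumption: quartiles use the default ("exclusive") method of
    statistics.quantiles; other tools may use a different convention.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2 * (q3 - q1) * len(values) ** (-1 / 3)


def bins_from_width(values, width):
    """Convert a bin width into a bin count covering the data range."""
    return max(1, math.ceil((max(values) - min(values)) / width))


ages = [23, 35, 47, 59, 72, 89]
print(sturges_bins(len(ages)))  # 4
print(sqrt_bins(len(ages)))     # 3
print(bins_from_width(ages, scott_width(ages)))
print(bins_from_width(ages, freedman_diaconis_width(ages)))
```

Sturges and Square Root depend only on the number of values, while Scott and Freedman-Diaconis also depend on how spread out the values are, which is why the latter two react differently to outliers.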

Sample Input

Age
23
35
47
59
72
89

Sample Configuration

[Image: sample configuration]

Sample Output

Age    Age_Binned
23     20-40
35     20-40
47     40-60
59     40-60
72     60-80
89     80-100
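
The labels in the sample output are consistent with equal-width bins between 20 and 100. The following minimal sketch reproduces them under assumed settings: Minimum Value 20, Maximum Value 100, and Number Of Bins 4. These values are inferred from the output rather than taken from the configuration. The function name, the treatment of boundary values, and the handling of out-of-range values are also assumptions.

```python
def fixed_interval_label(value, minimum, maximum, num_bins):
    """Return a 'low-high' label for value, or None if it is out of range."""
    if not minimum <= value <= maximum:
        return None  # assumption: values outside the range get no bin
    width = (maximum - minimum) / num_bins
    # Assumption: bins are half-open [low, high), except that the
    # maximum itself is placed in the last bin.
    index = min(int((value - minimum) // width), num_bins - 1)
    low = minimum + index * width
    high = low + width
    return f"{low:g}-{high:g}"


ages = [23, 35, 47, 59, 72, 89]
for age in ages:
    print(age, fixed_interval_label(age, 20, 100, 4))
# 23 20-40
# 35 20-40
# 47 40-60
# 59 40-60
# 72 60-80
# 89 80-100
```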