Binning Numeral Columns in Power BI without writing DAX
Don’t play it until you get it right, play it until you can’t play it wrong — Unknown
Every time I practice, I uncover new aspects of the Power BI interface. I built a report on the USA Police shootings data after studying data analysis in Power BI in February. I never settled on a design for the Report, so in the spirit of “completing abandoned projects,” I decided to recreate the entire analysis.
While at it, I discovered a new approach for handling number columns, i.e., binning. As a tutorial, I’ll share my recent findings on binning a numerical column in Power BI without coding DAX in this post!
What is Binning?
In data manipulation, binning means grouping continuous data into categories. Better explained, generating categorical values from continuous data. Binning helps contextualize the data into clusters and also makes data visualization easier.
For context, there is a column in the data that houses the ages of the victims. Visualizing that column would mean 4894 different ages from 6 to 90 on one chart!
Beginning the binning process, I drafted my ideal bins on paper. We will find out that the data has more victims in their adulthood, so I decided to split the data into the following categories:
My first instinct after this was to look for a DAX equivalent that would create a column with the specifications above. However, a quick Google search taught me two (2) in-built Power BI features for data binning. Let’s find out!
Alternative — 1
- Right-click on the column of interest.
- Select New Group from the drop-down menu.
3. Then, you could specify the name of the new column to be created in the Name field on your left.
4. You can now specify the Size of bins or the Number of bins for the column to be generated.
Specifying the Size of bins
When you specify the bin size, Power BI picks the lowest and maximum values and creates bins on the data points in between. The bins will be multiples of the bin size specified.
Specifying the Number of Bins
The picture below means that the data will be divided into 34 sections with an interval of 2.5 between the data points.
Specifying the number and size of bin methods does not simplify the data for visualization. Let’s explore a better alternative that gives me more control over my bins.
Alternative — 2
- On the home tab of the Power BI interface, select Transform data. The click leads you to the Power Query Editor.
2. In the Power Query Editor, select the “Add Column tab. Then click on the Conditional Column section within the menu.
3. The box below appears.
- Type the column name in the New Column Name field.
2. Select the column under scrutiny from the Column Name drop-down.
3. Select the desired operator. The options are varied and include greater than, less than, equal to, etc.
4. Specify the numerical value for that condition in the Value field.
5. Input a name or tag for that category in the output field.
6. You can specify the Else condition for outliers that may not satisfy any of the conditions.
7. Lock in those conditions with the OK button.
My final conditional column dialogue box looked like this:
These were the bins created after I clicked on the OK button.
4. Remember to click the Close & Apply button to effect the changes made to the data.
Final Look at the binned column :D
Without binning the data, is there a way to visualise this information? You can share your knowledge with me. In my first attempt, I used a bar chart, and then I filtered on the top 9 number of deaths with regard to age.
Resources
- US Police Shooting Dataset