
“From Choices to Insights: Unleash the Power of Encoding”

DataMantra
11 min read · Aug 25, 2024


“Transforming Categorical Features for Enhanced Machine Learning Performance by Turning Your ‘Red, Green, and Blue’ into Powerful Predictions”

Introduction

Almost all real-world datasets come with multiple types of features.

These primarily include:

  • Categorical
  • Numerical
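
To make the two feature types concrete, here is a minimal sketch that sorts the columns of a small table into numerical and categorical groups. The dataset, its column names, and the helper `split_feature_types` are hypothetical, invented only for illustration, and the rule used (a column is numerical if every value is a number) is a simplifying assumption.

```python
# Toy rows; the column names and values are made up for illustration.
rows = [
    {"color": "Red", "size_cm": 12.5, "in_stock": 3},
    {"color": "Green", "size_cm": 7.0, "in_stock": 0},
    {"color": "Blue", "size_cm": 9.2, "in_stock": 8},
]


def split_feature_types(rows):
    """Group column names by whether every value in the column is a number."""
    numerical, categorical = [], []
    for name in rows[0]:
        values = [row[name] for row in rows]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        (numerical if is_numeric else categorical).append(name)
    return numerical, categorical


numerical, categorical = split_feature_types(rows)
print("numerical:", numerical)
print("categorical:", categorical)
```

In practice a column of integer codes (for example, ZIP codes) can still be categorical in meaning, so a type check like this is only a first pass.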

The first step in building almost any machine learning program is usually preprocessing, sometimes called data massaging.


While numerical features can be directly used in most ML models without any additional preprocessing, categorical features require encoding to be represented as numerical values.
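
As a rough illustration of what "represented as numerical values" can look like, the sketch below applies one-hot encoding, one common encoding scheme, to a list of color labels using only the standard library. The function name `one_hot_encode` and the sample data are assumptions made for this example, and sorting the categories alphabetically is just one possible way to fix the column order.

```python
def one_hot_encode(values):
    """Return (categories, encoded_rows) for a list of category labels.

    Each label becomes a list of 0s with a single 1 in the position
    of its category. Categories are ordered alphabetically.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column that matches this label
        encoded.append(row)
    return categories, encoded


colors = ["Red", "Green", "Blue", "Green"]
categories, encoded = one_hot_encode(colors)

print(categories)
for label, row in zip(colors, encoded):
    print(label, row)
```

Each color ends up as its own 0/1 column, so a model sees no artificial ordering between "Red", "Green", and "Blue".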

Preprocessing means preparing data so that your program can analyze it. After all, you can’t just throw a spreadsheet or some pictures into your program and expect it to know what to do.

A big part of preprocessing is something called encoding. This means representing each piece of data in a way that the computer can


Written by DataMantra

DataMantra is an edtech platform founded by Tarun Sachdeva, who is based in Belgium and specialises in Tableau, SQL, ML, Python, Deep Learning, and Gen AI.
