Concepts

Encoders

One-Hot encoder

One-hot encoding converts a categorical variable into one new binary column per category. For each row, the column that matches the row's category is set to 1 and all other columns are set to 0.

Before one-hot encoding > After one-hot encoding

Color > Yellow Green Red
Yellow > 1 0 0
Green > 0 1 0
Red > 0 0 1
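
The table above can be reproduced with a few lines of plain Python. This is a minimal sketch, not a reference implementation: the function name `one_hot_encode` is hypothetical, and the column order is assumed to follow the order in which categories first appear, so that it matches the table.

```python
def one_hot_encode(values, categories=None):
    """Return (categories, rows); each row contains a single 1."""
    if categories is None:
        # Assumption: columns follow first-seen order, as in the table above.
        categories = list(dict.fromkeys(values))
    # Map each category to the index of its column.
    column_of = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column_of[value]] = 1
        rows.append(row)
    return categories, rows


columns, encoded = one_hot_encode(["Yellow", "Green", "Red"])
print(columns)
for row in encoded:
    print(row)
```

A value that is not in `categories` raises a `KeyError` in this sketch; real encoders usually offer a policy for unknown categories.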

Label encoder

Label encoding converts the labels in a column into integers so that they are machine-readable. A label encoder can normalize labels, and it can also transform non-numerical labels into numerical labels as long as the non-numerical labels are hashable and comparable.

Before label encoding > After label encoding

Color > Color
Yellow > 1
Green > 2
Red > 3
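
The mapping in the table above can be sketched as a dictionary built from the data. The helper names `fit_label_encoder` and `transform_labels` are hypothetical, and the numbering (first-seen order, starting at 1) is an assumption chosen to match the table. Library encoders may number differently; scikit-learn's `LabelEncoder`, for instance, assigns codes starting at 0 in sorted order.

```python
def fit_label_encoder(values, start=1):
    """Build a label -> integer mapping in first-seen order."""
    mapping = {}
    for value in values:
        if value not in mapping:
            # Assumption: codes start at `start` to mirror the table above.
            mapping[value] = len(mapping) + start
    return mapping


def transform_labels(values, mapping):
    """Replace each label with its integer code."""
    return [mapping[value] for value in values]


mapping = fit_label_encoder(["Yellow", "Green", "Red"])
codes = transform_labels(["Red", "Yellow", "Red"], mapping)
print(mapping)
print(codes)
```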

Run length encoder

Run-length encoding (RLE) is a form of data compression that replaces each run of identical consecutive values with a code giving the value and the number of times it repeats. RLE is lossless: decoding recovers the original data exactly. For example: FFFQQQC -> 3F3Q1C.
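
The example above, including the lossless round trip, can be sketched as follows. The function names are hypothetical, and the decoder assumes the count-then-symbol format shown above, with symbols that are never digits.

```python
from itertools import groupby


def rle_encode(text):
    """Encode each run of identical characters as <count><char>."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


def rle_decode(encoded):
    """Invert rle_encode; assumes symbols are non-digit characters."""
    pieces = []
    count = ""
    for char in encoded:
        if char.isdigit():
            count += char  # counts may span several digits, e.g. "12A"
        else:
            pieces.append(char * int(count))
            count = ""
    return "".join(pieces)


encoded = rle_encode("FFFQQQC")
decoded = rle_decode(encoded)
print(encoded)
print(decoded)
```

Note that data with few repeated runs can grow under this scheme: a run of length one, such as the final C, costs two characters instead of one.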

Classification tasks

Binary

Binary classification is a task with exactly two class labels. A single class label is predicted for each example.

In other words, the target is a single column with 0/1 values.

Multi-class

Multi-class classification is a task with more than two class labels. As in binary classification, a single class label is predicted for each example; the only difference is that there are more than two classes to choose from.

In other words, the target has multiple columns, and exactly one column is 1 in each row.

Multi-label

Multi-label classification is a task with two or more class labels in which one or more labels may be predicted for each example. This contrasts with binary and multi-class classification, where a single class label is predicted for each example.

In other words, the target has multiple columns, and each column can independently be 0 or 1.
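
The three target layouts described in this section can be told apart by a simple check on the 0/1 columns. This is only an illustrative sketch: `infer_task` is a hypothetical helper, and it applies the column rules stated above literally, so a multi-label target in which every row happens to contain exactly one 1 would be reported as multi-class.

```python
def infer_task(rows):
    """Guess the task type from a non-empty list of 0/1 target rows."""
    n_columns = len(rows[0])
    if n_columns == 1:
        # A single 0/1 column.
        return "binary"
    if all(sum(row) == 1 for row in rows):
        # Several columns, exactly one of them set per row.
        return "multi-class"
    # Several columns, any of which can be set independently.
    return "multi-label"


print(infer_task([[0], [1], [1]]))
print(infer_task([[1, 0, 0], [0, 0, 1]]))
print(infer_task([[1, 1, 0], [0, 0, 0]]))
```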
