Concepts¶
Encoders¶
One-Hot encoder¶
One-hot encoding converts a categorical variable into a set of new binary columns, one per category. For each row, the column matching its category is assigned 1 and all other columns are assigned 0.
Before One-Hot Encode > After One-Hot Encode
Color | > | Yellow | Green | Red |
---|---|---|---|---|
Yellow | > | 1 | 0 | 0 |
Green | > | 0 | 1 | 0 |
Red | > | 0 | 0 | 1 |
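As a minimal sketch of the conversion shown in the table, the plain-Python function below builds one 0/1 column per category. The name `one_hot_encode` and the choice to order columns by first appearance are assumptions made for this illustration, not part of H2O Hydrogen Torch.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row holds one 0/1 flag per category."""
    # Assumption: columns follow the order in which categories first appear.
    categories = list(dict.fromkeys(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


categories, rows = one_hot_encode(["Yellow", "Green", "Red"])
print(categories)
for row in rows:
    print(row)
```

Each row contains exactly one 1, so the original category can always be read back from the position of that 1.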
Label encoder¶
Label encoding converts the labels in a column into numeric form so that they are machine-readable. A label encoder can normalize labels, and it can also transform non-numerical labels into numerical labels as long as the non-numerical labels are hashable and comparable.
Before Label Encoder > After Label Encoder
Color | > | Color |
---|---|---|
Yellow | > | 1 |
Green | > | 2 |
Red | > | 3 |
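The mapping in the table can be sketched in a few lines of plain Python. The function name `label_encode` and its `start` parameter are hypothetical; numbering from 1 in order of first appearance is an assumption chosen to reproduce the table. Other encoders may number differently (for example, scikit-learn's `LabelEncoder` sorts the classes and starts at 0).

```python
def label_encode(values, start=1):
    """Map each distinct label to an integer and return (mapping, codes)."""
    mapping = {}
    for value in values:
        # Assumption: a new label gets the next integer in order of appearance.
        if value not in mapping:
            mapping[value] = start + len(mapping)
    return mapping, [mapping[value] for value in values]


mapping, codes = label_encode(["Yellow", "Green", "Red", "Green"])
print(mapping)
print(codes)
```

Labels only need to be hashable here because they are used as dictionary keys; an encoder that sorts its classes would also need them to be comparable.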
Run length encoder¶
Run-length encoding (RLE) is a form of data compression that replaces each run of identical values with a code giving the value and the number of times it occurs in the run. RLE is lossless: decoding recovers all of the original data (the string). For example: FFFQQQC -> 3F3Q1C.
Note
- For more information, see Run-Length Encoding (RLE).
- To learn how to decode RLEs, see Run Length Decoding - Quick Start.
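The FFFQQQC example can be reproduced with a short sketch. The functions `rle_encode` and `rle_decode` are hypothetical helpers written for this page, and they assume the count-then-value text format of the example and an input that contains no digits; they are not the RLE format or API of any particular tool.

```python
import re
from itertools import groupby


def rle_encode(text):
    """Replace each run of identical characters with '<count><char>'."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))


def rle_decode(encoded):
    """Expand '<count><char>' pairs back into the original string."""
    # Assumption: the values themselves are never digits.
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(char * int(count) for count, char in pairs)


encoded = rle_encode("FFFQQQC")
decoded = rle_decode(encoded)
print(encoded)
print(decoded)
```

Decoding the encoded string returns the input unchanged, which is what lossless means in practice.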
Classification tasks¶
Binary¶
Binary classification refers to the task that has two class labels. A single class label is predicted for each example in this classification task.
In other words, the label is a single column with 0/1 values.
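A minimal sketch of how a single 0/1 label column can be produced from predicted probabilities. The function name `predict_binary` and the 0.5 threshold are assumptions for illustration only.

```python
def predict_binary(probabilities, threshold=0.5):
    """Turn per-example probabilities of the positive class into 0/1 labels."""
    # Assumption: a probability at or above the threshold counts as class 1.
    return [1 if p >= threshold else 0 for p in probabilities]


binary_labels = predict_binary([0.1, 0.8, 0.5])
print(binary_labels)
```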
Multi-class¶
Multi-class classification refers to the task that has more than two class labels. As in binary classification, a single class label is predicted for each example; the difference is that there are more than two classes to choose from.
In other words, multiple columns where exactly one column has to be 1 for each example.
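As a sketch under stated assumptions, the function below picks the highest-scoring class and writes it as a row with exactly one 1. The name `predict_multiclass` and the use of a plain argmax (first maximum wins on ties) are choices made for this illustration.

```python
def predict_multiclass(scores):
    """Return a 0/1 row with a single 1 at the highest-scoring class."""
    # Assumption: ties are broken in favor of the first maximum.
    best = max(range(len(scores)), key=scores.__getitem__)
    return [1 if index == best else 0 for index in range(len(scores))]


multiclass_row = predict_multiclass([0.2, 0.7, 0.1])
print(multiclass_row)
```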
Multi-label¶
Multi-label classification refers to the task with two or more class labels, where one or more class labels may be predicted for each example. This contrasts with binary and multi-class classification, where a single class label is predicted for each example.
In other words, multiple columns where any column can be 0/1.
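In a multi-label layout each column is decided independently, so a row may contain several 1s or none. The sketch below assumes per-class scores between 0 and 1 and a 0.5 threshold; `predict_multilabel` is a hypothetical name.

```python
def predict_multilabel(scores, threshold=0.5):
    """Return a 0/1 row where every class is thresholded on its own."""
    # Assumption: each class score is compared to the same threshold.
    return [1 if score >= threshold else 0 for score in scores]


several = predict_multilabel([0.9, 0.2, 0.6])
none = predict_multilabel([0.1, 0.2, 0.3])
print(several)
print(none)
```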