Import dataset settings¶
Below are all the import dataset settings that need to be defined when importing a dataset for one of the supported problem types in H2O Hydrogen Torch.
Note
To learn how to import a dataset to H2O Hydrogen Torch, see Import dataset.
Image regression¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
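As an illustration of what combining records into mini-batches means, the following minimal sketch groups records into fixed-size batches. The function name `iter_mini_batches`, the record fields, and the batch size are hypothetical; this only shows the idea and is not how H2O Hydrogen Torch batches data internally.

```python
from typing import Iterator, List, Sequence


def iter_mini_batches(records: Sequence[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `records`, each holding at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


# Five hypothetical training records split into batches of two.
records = [{"image": f"img_{i}.jpg", "label": float(i)} for i in range(5)]
batches = list(iter_mini_batches(records, batch_size=2))
print([len(batch) for batch in batches])
```

The last batch is smaller than the others whenever the number of records is not a multiple of the batch size.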
Data folder¶
Defines the folder location of the images to use for the experiment. When the experiment is running, H2O Hydrogen Torch will load images from this folder.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use to test the model. H2O Hydrogen Torch will load images from this folder when testing the model.
Label columns¶
Defines the name(s) of the dataframe column(s) that refer to the target value(s) H2O Hydrogen Torch will aim to predict.
Note
You can specify more than one label column, so the target to predict can span a single column or multiple columns.
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the data folder and data folder test when training and testing the model.
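A train dataframe for image regression could be laid out as in the minimal sketch below, which writes one image column and two label columns as CSV text using only the Python standard library. The column names (`image_name`, `weight_kg`, `height_cm`) and the records are hypothetical; use whatever names your dataset has and select them in the Image column and Label columns settings.

```python
import csv
import io

# Hypothetical column names; the real names are whatever your dataset uses.
IMAGE_COLUMN = "image_name"
LABEL_COLUMNS = ["weight_kg", "height_cm"]


def build_train_csv(records):
    """Serialize records into CSV text with the image column followed by the label columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[IMAGE_COLUMN, *LABEL_COLUMNS], lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


records = [
    {"image_name": "img_001.jpg", "weight_kg": 4.2, "height_cm": 31.0},
    {"image_name": "img_002.jpg", "weight_kg": 5.1, "height_cm": 33.5},
]
train_csv = build_train_csv(records)
print(train_csv)
```

Each value in the image column is a file name that is expected to exist in the data folder.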
Image classification¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Data folder¶
Defines the folder location of the images to use for the experiment. When the experiment is running, H2O Hydrogen Torch will load images from this folder.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use to test the model. H2O Hydrogen Torch will load images from this folder when testing the model.
Label columns¶
Defines the name(s) of the dataframe column(s) that refer to the target value(s) H2O Hydrogen Torch will aim to predict.
Note
- You can specify more than one label column, so the target to predict can span a single column or multiple columns.
- Image classification supports multiclass and multilabel classification.
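The difference between multiclass and multilabel targets can be sketched as follows. The column names and the encodings used here (a single class-name column for multiclass, one 0/1 column per class for multilabel) are assumptions made for illustration; they are not confirmed as the exact layout H2O Hydrogen Torch expects.

```python
# Hypothetical rows; column names and encodings are illustrative assumptions.
multiclass_rows = [
    {"image": "img_001.jpg", "label": "cat"},
    {"image": "img_002.jpg", "label": "dog"},
]
multilabel_rows = [
    {"image": "img_001.jpg", "cat": 1, "dog": 0, "outdoor": 1},
    {"image": "img_002.jpg", "cat": 0, "dog": 1, "outdoor": 1},
]


def active_labels(row, label_columns):
    """Return the label columns whose value is 1 for a multilabel row."""
    return [column for column in label_columns if row[column] == 1]


# Multiclass: exactly one class per record, stored in a single label column.
print(sorted({row["label"] for row in multiclass_rows}))
# Multilabel: any number of classes per record, one label column per class.
print(active_labels(multilabel_rows[0], ["cat", "dog", "outdoor"]))
```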
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the data folder and data folder test when training and testing the model.
Image metric learning¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Data folder¶
Defines the folder location of the images to use for the experiment. When the experiment is running, H2O Hydrogen Torch will load images from this folder.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use to test the model. H2O Hydrogen Torch will load images from this folder when testing the model.
Label columns¶
Defines the name of the dataframe column that refers to the target value H2O Hydrogen Torch will aim to predict.
Note
You can specify more than one label column, so the target to predict can span a single column or multiple columns.
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the data folder and data folder test when training and testing the model.
Image object detection¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Data folder¶
Defines the folder location of the images to use for the experiment. When the experiment is running, H2O Hydrogen Torch will load images from this folder.
Validation dataframe¶
A .pq file containing a dataframe with validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
A .pq file containing a dataframe with test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use to test the model. H2O Hydrogen Torch will load images from this folder when testing the model.
Class name column¶
Defines the dataset column containing a list of class names that H2O Hydrogen Torch will use for each bounding box.
X Min column¶
Defines the dataset column containing a list of minimum X positions H2O Hydrogen Torch will use for each bounding box.
Y Min column¶
Defines the dataset column containing a list of minimum Y positions H2O Hydrogen Torch will use for each bounding box.
X Max column¶
Defines the dataset column containing a list of maximum X positions H2O Hydrogen Torch will use for each bounding box.
Y Max column¶
Defines the dataset column containing a list of maximum Y positions H2O Hydrogen Torch will use for each bounding box.
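Because each of these columns holds a list with one entry per bounding box, the lists in a record have to line up. The sketch below shows one hypothetical record and a small consistency check. The column names, the coordinate values, and the `validate_boxes` helper are assumptions for illustration, and whether coordinates are absolute pixels or normalized values is not stated here. Writing such records to a .pq file would additionally need a Parquet library, which is not shown.

```python
# Hypothetical column names; in the import settings you map your own names to each column.
row = {
    "image": "street_001.jpg",
    "class_name": ["car", "person"],
    "x_min": [10.0, 120.0],
    "y_min": [20.0, 40.0],
    "x_max": [110.0, 160.0],
    "y_max": [80.0, 150.0],
}


def validate_boxes(row):
    """Check that all list columns align and that each box has positive width and height."""
    columns = ["class_name", "x_min", "y_min", "x_max", "y_max"]
    lengths = {len(row[column]) for column in columns}
    if len(lengths) != 1:
        raise ValueError("all bounding box columns must have the same number of entries")
    boxes = list(zip(*(row[column] for column in columns)))
    for name, x_min, y_min, x_max, y_max in boxes:
        if x_min >= x_max or y_min >= y_max:
            raise ValueError(f"degenerate box for class {name!r}")
    return boxes


boxes = validate_boxes(row)
print(boxes[0])
```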
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the data folder and data folder test when training and testing the model.
Image semantic segmentation¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Data folder¶
Defines the folder location of the images to use for the experiment. When the experiment is running, H2O Hydrogen Torch will load images from this folder.
Validation dataframe¶
A .pq file containing a dataframe with validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
A .pq file containing a dataframe with test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use to test the model. H2O Hydrogen Torch will load images from this folder when testing the model.
Class name column¶
Defines the dataset column containing a list of class names that H2O Hydrogen Torch will use during model training.
RLE mask column¶
Defines the dataset column containing a list of run-length encoded (RLE) masks that H2O Hydrogen Torch will use for each class.
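Run-length encoding stores a binary mask as runs of foreground pixels instead of one value per pixel. Several RLE conventions exist, and this page does not say which one H2O Hydrogen Torch expects, so the sketch below picks one for illustration only: space-separated "start length" pairs over the row-major flattened mask, with 1-indexed starts. The `rle_encode` helper is hypothetical.

```python
def rle_encode(mask_rows):
    """Encode a binary mask as 'start length' pairs over the row-major flattened mask.

    Starts are 1-indexed. This convention is an assumption chosen for illustration.
    """
    flat = [value for row in mask_rows for value in row]
    runs = []
    run_start = None
    for index, value in enumerate(flat):
        if value and run_start is None:
            # A new run of foreground pixels begins here.
            run_start = index
        elif not value and run_start is not None:
            # The current run ended on the previous pixel.
            runs.extend([run_start + 1, index - run_start])
            run_start = None
    if run_start is not None:
        # Close a run that reaches the end of the mask.
        runs.extend([run_start + 1, len(flat) - run_start])
    return " ".join(str(number) for number in runs)


mask = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
print(rle_encode(mask))  # 2 2 7 2
```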
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the data folder and data folder test when training and testing the model.
Image instance segmentation¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Data folder¶
Defines the folder location of the images to use for the experiment. When the experiment is running, H2O Hydrogen Torch will load images from this folder.
Validation dataframe¶
A .pq file containing a dataframe with validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
A .pq file containing a dataframe with test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use to test the model. H2O Hydrogen Torch will load images from this folder when testing the model.
Class name column¶
Defines the dataset column containing a list of class names that H2O Hydrogen Torch will use for each instance mask.
RLE mask column¶
Defines the dataset column containing a list of run-length encoded (RLE) masks that H2O Hydrogen Torch will use for each instance.
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the data folder and data folder test when training and testing the model.
Text regression¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Label columns¶
Defines the name(s) of the dataframe column(s) that refer to the target value(s) H2O Hydrogen Torch will aim to predict.
Note
You can specify more than one label column, so the target to predict can span a single column or multiple columns.
Text column¶
Defines the dataset column containing the input text H2O Hydrogen Torch will use during model training.
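The relationship between the train and test dataframes for text regression can be sketched as below: the test records keep the text column and may omit the label columns. The column names (`review_text`, `rating`), the records, and the `to_test_rows` helper are hypothetical.

```python
# Hypothetical column names; select your own in the Text column and Label columns settings.
TEXT_COLUMN = "review_text"
LABEL_COLUMNS = ["rating"]

train_rows = [
    {"review_text": "Arrived quickly and works well.", "rating": 4.5},
    {"review_text": "Stopped working after a week.", "rating": 1.0},
]


def to_test_rows(rows, label_columns):
    """Copy rows without their label columns, mirroring a test dataframe with no labels."""
    return [
        {key: value for key, value in row.items() if key not in label_columns}
        for row in rows
    ]


test_rows = to_test_rows(train_rows, LABEL_COLUMNS)
print(test_rows[0])
```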
Text classification¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Label columns¶
Defines the name(s) of the dataframe column(s) that refer to the target value(s) H2O Hydrogen Torch will aim to predict.
Note
You can specify more than one label column, so the target to predict can span a single column or multiple columns.
Text column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during model training.
Text sequence to sequence¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Label columns¶
Defines the name of the dataframe column that refers to the target value H2O Hydrogen Torch will aim to predict.
Text column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during model training.
Text span prediction¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Question column¶
Defines the dataset column containing the question text H2O Hydrogen Torch will use during model training.
Context column¶
Defines the dataset column containing the context text in which the answer to the question in the question column can be found. H2O Hydrogen Torch will use the context column during model training.
Answer column¶
Defines the dataset column containing the answer text that H2O Hydrogen Torch will use during model training.
Answer start column¶
Defines the dataset column that indicates where the answer text starts in the context text. If not set, H2O Hydrogen Torch will use the first occurrence of the answer text in the context text as the start of the answer.
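The first-occurrence fallback can be sketched as below. The `resolve_answer_start` helper is hypothetical, and two details are assumptions rather than documented behavior: the start is treated as a zero-based character offset, and an answer that does not occur in the context raises an error.

```python
def resolve_answer_start(context, answer, answer_start=None):
    """Return the given start, or fall back to the first occurrence of `answer` in `context`."""
    if answer_start is not None:
        return answer_start
    position = context.find(answer)
    if position == -1:
        # Assumption: treat a missing answer as an error; the actual behavior is not documented here.
        raise ValueError("answer text not found in context")
    return position


context = "Paris is the capital of France. Paris is also its largest city."
print(resolve_answer_start(context, "Paris"))                   # 0
print(resolve_answer_start(context, "Paris", answer_start=32))  # 32
```

When the answer text appears more than once in the context, as in this example, an explicit answer start column removes the ambiguity about which occurrence is meant.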
Text token classification¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Label columns¶
Defines the name of the dataframe column that refers to the target value H2O Hydrogen Torch will aim to predict.
Note
You can specify more than one label column, so the target to predict can span a single column or multiple columns.
Text column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during model training.
Text metric learning¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Label columns¶
Defines the name of the dataframe column that refers to the target value H2O Hydrogen Torch will aim to predict.
Text column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during model training.
Audio regression¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Data folder¶
Defines the folder location of the audio files to use for the experiment. When the experiment is running, H2O Hydrogen Torch will load audio files from this folder.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Defines the folder location of the audio files H2O Hydrogen Torch will use to test the model. H2O Hydrogen Torch will load audio files from this folder when testing the model.
Label columns¶
Defines the name(s) of the dataframe column(s) that refer to the target value(s) H2O Hydrogen Torch will aim to predict.
Note
You can specify more than one label column, so the target to predict can span a single column or multiple columns.
Audio column¶
Defines the dataframe column storing the names of the audio files H2O Hydrogen Torch will load from the data folder and data folder test when training and testing the model.
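Since the audio column only stores file names, every name has to resolve to a file inside the data folder. The sketch below checks this before import using a temporary folder as a stand-in. The `find_missing_files` helper, the column names, and the file names are hypothetical, and the check is not part of H2O Hydrogen Torch.

```python
import tempfile
from pathlib import Path


def find_missing_files(rows, file_column, data_folder):
    """Return names from `file_column` that do not exist as files inside `data_folder`."""
    folder = Path(data_folder)
    return [
        row[file_column]
        for row in rows
        if not (folder / row[file_column]).is_file()
    ]


# A temporary folder stands in for the data folder; only one of the two clips exists.
with tempfile.TemporaryDirectory() as data_folder:
    (Path(data_folder) / "clip_001.wav").write_bytes(b"")
    rows = [
        {"audio": "clip_001.wav", "label": 0.7},
        {"audio": "clip_002.wav", "label": 0.1},
    ]
    missing = find_missing_files(rows, "audio", data_folder)

print(missing)  # ['clip_002.wav']
```

The same check applies to the image column and its data folder in the image problem types.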
Audio classification¶
Dataset name¶
Name of the dataset.
Problem type¶
Defines the problem type the dataset is intended for.
Train dataframe¶
A .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch will use to train the model. The records will be combined into mini-batches when training the model.
Data folder¶
Defines the folder location of the audio files to use for the experiment. When the experiment is running, H2O Hydrogen Torch will load audio files from this folder.
Validation dataframe¶
Defines the .csv or .pq file containing validation records that H2O Hydrogen Torch will use to evaluate the model during training.
Test dataframe¶
Defines the .csv or .pq file containing test records that H2O Hydrogen Torch will use to test the model.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Defines the folder location of the audio files H2O Hydrogen Torch will use to test the model. H2O Hydrogen Torch will load audio files from this folder when testing the model.
Label columns¶
Defines the name(s) of the dataframe column(s) that refer to the target value(s) H2O Hydrogen Torch will aim to predict.
Note
- You can specify more than one label column, so the target to predict can span a single column or multiple columns.
- Audio classification supports multiclass and multilabel classification.
Audio column¶
Defines the dataframe column storing the names of the audio files H2O Hydrogen Torch will load from the data folder and data folder test when training and testing the model.