Prediction settings: Dataset, prediction, and environment settings¶
Image regression¶
Dataset settings¶
Dataset¶
Specifies the dataset to use for scoring.
Test Dataframe¶
Specifies a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
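As a minimal sketch of the note above, the following check compares the header of a test .csv against the header of a train .csv and confirms that only label columns are missing. The column names (image, label) and the in-memory CSV strings are hypothetical and only serve as an illustration.

```python
import csv
import io

# Hypothetical train dataframe: an image-name column plus a label column.
train_csv = "image,label\ncat_001.jpg,0.7\ndog_002.jpg,0.1\n"
# Hypothetical test dataframe: same layout, label column omitted.
test_csv = "image\nbird_003.jpg\nfish_004.jpg\n"


def read_columns(text):
    """Return the header row of a CSV document given as a string."""
    return next(csv.reader(io.StringIO(text)))


train_columns = read_columns(train_csv)
test_columns = read_columns(test_csv)

# Assumed name of the label column; it may be absent from the test dataframe.
label_columns = {"label"}

# Any non-label train column that the test dataframe lacks is a format mismatch.
missing = [
    column
    for column in train_columns
    if column not in label_columns and column not in test_columns
]
print("Missing non-label columns:", missing)
```

An empty list means the test dataframe matches the train format apart from the labels.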
Data folder test¶
Specifies the folder location of the images H2O Hydrogen Torch will use for scoring. During scoring, H2O Hydrogen Torch will load images from this folder.
Image column¶
Specifies the dataframe column storing the names of images that H2O Hydrogen Torch will load from the Data folder test during scoring.
Prediction settings¶
Metric¶
Specifies the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Test Time Augmentations¶
Specifies the test time augmentation(s) to apply during inference. Test time augmentations are applied when the model makes predictions on new data. The final prediction is an average of the predictions for all the augmented versions of an image.
Note
This technique can improve the model accuracy.
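The averaging described above can be sketched in a few lines. This is an illustration only: toy_model is a hypothetical stand-in for a trained model, and the horizontal flip is just one example of an augmentation; neither is taken from H2O Hydrogen Torch itself.

```python
def identity(image):
    """Return the image unchanged (the non-augmented view)."""
    return image


def horizontal_flip(image):
    """Mirror a 2D image (a list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def toy_model(image):
    """Hypothetical regressor that weights the left column more than the right."""
    return sum(0.75 * row[0] + 0.25 * row[-1] for row in image) / len(image)


def predict_with_tta(model, image, augmentations):
    """Average the model's predictions over all augmented versions of the image."""
    predictions = [model(augment(image)) for augment in augmentations]
    return sum(predictions) / len(predictions)


image = [[1.0, 0.0], [1.0, 0.0]]
plain_prediction = toy_model(image)
tta_prediction = predict_with_tta(toy_model, image, [identity, horizontal_flip])
print(plain_prediction, tta_prediction)
```

Because the toy model is sensitive to left/right orientation, the averaged prediction differs from the single-view prediction; this is the kind of variance that test time augmentation is meant to smooth out.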
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Image classification¶
Dataset settings¶
Dataset¶
Specifies the dataset to use for scoring.
Test Dataframe¶
Specifies a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
Data folder test¶
Specifies the folder location of the images H2O Hydrogen Torch will use for scoring. During scoring, H2O Hydrogen Torch will load images from this folder.
Image column¶
Specifies the dataframe column storing the names of images that H2O Hydrogen Torch will load from the Data folder test during scoring.
Prediction settings¶
Metric¶
Specifies the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Probability Threshold¶
Specifies a threshold for threshold-dependent classification metrics (e.g., F1). For multi-class classification, argmax is used instead of a threshold.
Note
This threshold is used as a default threshold when showing all other threshold-dependent metrics in the validation plots.
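To illustrate how a threshold-dependent metric reacts to this setting, the sketch below computes F1 for a binary problem at two thresholds and shows the argmax rule for a multi-class prediction. The data is made up, and treating a probability strictly larger than the threshold as positive is an assumption, not a documented detail.

```python
def f1_at_threshold(labels, probabilities, threshold):
    """Compute binary F1 after turning probabilities into 0/1 predictions."""
    # Assumption: a sample is positive when its probability exceeds the threshold.
    predictions = [1 if p > threshold else 0 for p in probabilities]
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def argmax_class(class_probabilities):
    """Return the index of the most probable class (multi-class case)."""
    return max(range(len(class_probabilities)), key=class_probabilities.__getitem__)


labels = [1, 0, 1, 1, 0]
probabilities = [0.9, 0.6, 0.55, 0.3, 0.1]

f1_default = f1_at_threshold(labels, probabilities, threshold=0.5)
f1_strict = f1_at_threshold(labels, probabilities, threshold=0.7)
predicted_class = argmax_class([0.2, 0.5, 0.3])
print(f1_default, f1_strict, predicted_class)
```

Raising the threshold removes one false positive but also loses a true positive here, so F1 drops; the threshold that maximizes the metric depends on the dataset.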
Test Time Augmentations¶
Defines the test time augmentation(s) to apply during inference. Test time augmentations are applied when the model makes predictions on new data. The final prediction is an average of the predictions for all the augmented versions of an image.
Note
This technique can improve the model accuracy.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Image metric learning¶
Dataset settings¶
Dataset¶
Specifies the dataset to use for scoring.
Test Dataframe¶
Specifies a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Data folder test¶
Specifies the folder location of the images H2O Hydrogen Torch will use for scoring. During scoring, H2O Hydrogen Torch will load images from this folder.
Image column¶
Specifies the dataframe column storing the names of images that H2O Hydrogen Torch will load from the Data folder test during scoring.
Prediction settings¶
Metric¶
Specifies the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Top K Similar¶
Specifies the number (k) of similar predictions to keep for each record.
Note
Defining this setting impacts output predictions and metrics (metrics that rely on some top-k selection) but not the training process.
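A top-k selection of this kind can be sketched as follows. The example assumes that similarity is measured as cosine similarity between embedding vectors; the embeddings are invented, and the actual similarity measure used by H2O Hydrogen Torch is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(embeddings, k):
    """For each record, return the indices of its k most similar other records."""
    result = []
    for i, query in enumerate(embeddings):
        scored = [
            (cosine_similarity(query, other), j)
            for j, other in enumerate(embeddings)
            if j != i  # a record is not its own neighbour
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        result.append([j for _, j in scored[:k]])
    return result


# Hypothetical 2D embeddings: records 0 and 1 point one way, 2 and 3 another.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
neighbours = top_k_similar(embeddings, k=2)
print(neighbours)
```

Changing k only changes how many neighbours are kept per record, which is why the setting affects the output predictions and top-k metrics but not the trained model.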
Test Time Augmentations¶
Defines the test time augmentation(s) to apply during inference. Test time augmentations are applied when the model makes predictions on new data. The final prediction is an average of the predictions for all the augmented versions of an image.
Note
This technique can improve the model accuracy.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Image object detection¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use for scoring. During scoring, H2O Hydrogen Torch will load images from this folder.
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the Data folder test during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Metric Iou Threshold¶
Defines the Intersection Over Union (IoU) threshold to calculate the selected metric for Object Detection.
Note
When calculating metrics, predicted bounding boxes with an IoU (with the true boxes) above the specified IoU threshold will be treated as true positives.
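The IoU check described in the note can be sketched as below. The box format (x_min, y_min, x_max, y_max) and the example coordinates are assumptions made for the illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(predicted_box, true_box, metric_iou_threshold):
    """A prediction counts as a true positive when its IoU exceeds the threshold."""
    return iou(predicted_box, true_box) > metric_iou_threshold


true_box = (0, 0, 10, 10)
good_prediction = (0, 0, 10, 5)    # covers half of the true box
poor_prediction = (5, 5, 15, 15)   # overlaps only one corner

print(iou(good_prediction, true_box), iou(poor_prediction, true_box))
print(is_true_positive(good_prediction, true_box, 0.4))
print(is_true_positive(poor_prediction, true_box, 0.4))
```

A higher threshold demands tighter localization before a box is counted as correct, so the same predictions can score lower on the selected metric.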
Nms Iou Threshold¶
Defines the Intersection Over Union (IoU) threshold when calculating post-processing NMS.
Note
Non-maximum suppression (NMS) is a post-processing step that reduces the number of bounding boxes predicted by the model. The NMS algorithm will remove boxes that overlap with each other based on the selected IoU threshold. NMS will keep the higher scoring box.
Max Det Per Image¶
Defines the maximum number of detections per image to be returned by the model.
Probability Threshold¶
Only the predicted boxes with confidence larger than the defined Probability Threshold will be added to the validation or test .csv files that come with the model predictions.
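The three post-processing settings above (Nms Iou Threshold, Max Det Per Image, and Probability Threshold) can be illustrated together with a simple greedy NMS. This is a sketch under assumptions: the order in which the steps are applied, the box format (x_min, y_min, x_max, y_max), and the function names are illustrative and not taken from H2O Hydrogen Torch.

```python
def box_iou(a, b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    width = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    height = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = width * height
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - intersection
    return intersection / union if union > 0 else 0.0


def postprocess(detections, nms_iou_threshold, probability_threshold, max_det_per_image):
    """Filter by confidence, apply greedy NMS, then cap the number of detections."""
    # Keep only boxes whose confidence is larger than the probability threshold.
    candidates = [d for d in detections if d[1] > probability_threshold]
    # Visit boxes from highest to lowest score so the higher-scoring box wins.
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        # Drop a box that overlaps an already kept box by more than the threshold.
        if all(box_iou(box, kept_box) <= nms_iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept[:max_det_per_image]


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),     # heavily overlaps the first box
    ((20, 20, 30, 30), 0.7),
    ((40, 40, 50, 50), 0.05),  # below the probability threshold
    ((60, 60, 70, 70), 0.6),
]
result = postprocess(
    detections,
    nms_iou_threshold=0.5,
    probability_threshold=0.1,
    max_det_per_image=2,
)
print(result)
```

In this example the second box is suppressed by NMS, the fourth is removed by the probability threshold, and the fifth is cut by the per-image limit.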
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Image semantic segmentation¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use for scoring. During scoring, H2O Hydrogen Torch will load images from this folder.
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the Data folder test during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Probability Threshold¶
Defines the probability threshold; a predicted pixel will be treated as positive if its probability is larger than the probability threshold.
Test Time Augmentations¶
Defines the test time augmentation(s) to apply during inference. Test time augmentations are applied when the model makes predictions on new data. The final prediction is an average of the predictions for all the augmented versions of an image.
Note
This technique can improve the model accuracy.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Image instance segmentation¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
Data folder test¶
Defines the folder location of the images H2O Hydrogen Torch will use for scoring. During scoring, H2O Hydrogen Torch will load images from this folder.
Image column¶
Defines the dataframe column storing the names of images that H2O Hydrogen Torch will load from the Data folder test during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy. An appropriate graph will be available when the experiment is running based on the selected evaluation metric.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Probability Threshold¶
Defines the probability threshold; a predicted pixel will be treated as positive if its probability is larger than the probability threshold.
Max Instances¶
Defines the maximum number of instances to use during the evaluation.
Test Time Augmentations¶
Defines the test time augmentation(s) to apply during inference. Test time augmentations are applied when the model makes predictions on new data. The final prediction is an average of the predictions for all the augmented versions of an image.
Note
This technique can improve the model accuracy.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Text regression¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
Text Column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Text classification¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
Text Column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Probability Threshold¶
Defines a threshold for threshold-dependent classification metrics (e.g., F1). For multi-class classification, argmax is used instead of a threshold.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Text sequence to sequence¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Text Column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Max Length¶
Defines the max length value H2O Hydrogen Torch will use for the generated text.
Note
- Similar to the Max Length setting in the Tokenizer Settings section, this setting specifies the maximum number of tokens to predict for a given prediction sample.
- This setting impacts predictions and the evaluation metrics and should depend on the dataset and average output sequence length that is expected to be predicted.
Do Sample¶
Determines whether to sample from the next token distribution instead of choosing the token with the highest probability. If turned On, the next token in a predicted sequence is sampled based on the probabilities. If turned Off, the token with the highest probability is always chosen.
Num Beams¶
Defines the number of beams to use for beam search. The default value is 1 (a single beam), which means no beam search.
Note
Using multiple beams increases prediction runtime while potentially improving accuracy.
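The effect of the number of beams can be illustrated with a toy beam search. Everything in this sketch is hypothetical: toy_next_token_probs stands in for a sequence-to-sequence model, and the probabilities are chosen so that a single beam misses the best sequence.

```python
def beam_search(next_token_probs, num_beams, max_length):
    """Keep the num_beams most probable partial sequences at every step."""
    beams = [((), 1.0)]  # (sequence so far, cumulative probability)
    for _ in range(max_length):
        candidates = []
        for sequence, score in beams:
            for token, probability in next_token_probs(sequence).items():
                candidates.append((sequence + (token,), score * probability))
        candidates.sort(key=lambda candidate: candidate[1], reverse=True)
        beams = candidates[:num_beams]
    return beams[0]


def toy_next_token_probs(sequence):
    """Hypothetical model: next-token probabilities given the tokens so far."""
    if not sequence:
        return {"A": 0.5, "B": 0.4, "C": 0.1}
    if sequence[-1] == "B":
        return {"A": 0.9, "B": 0.05, "C": 0.05}
    return {"A": 0.4, "B": 0.3, "C": 0.3}


greedy_sequence, greedy_score = beam_search(toy_next_token_probs, num_beams=1, max_length=2)
beam_sequence, beam_score = beam_search(toy_next_token_probs, num_beams=2, max_length=2)
print(greedy_sequence, greedy_score)
print(beam_sequence, beam_score)
```

With one beam the search commits to the locally best first token; with two beams it also explores the runner-up and finds a more probable sequence overall, at the cost of scoring twice as many candidates per step.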
Temperature¶
Defines the temperature to use for sampling from the next token distribution during validation and inference. In other words, the defined temperature controls the randomness of predictions by scaling the logits before applying softmax. A higher temperature makes the distribution more random.
Note
- Modify the temperature value if you have the Do Sample setting enabled (On).
- To learn more about this setting, refer to the following article: How to generate text: using different decoding methods for language generation with Transformers.
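The interaction between Do Sample and Temperature can be sketched as follows. The logits are made up and the function names are illustrative; the sketch only shows the general technique of scaling logits by the temperature before applying softmax.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Scale logits by 1 / temperature, then apply a numerically stable softmax."""
    scaled = [logit / temperature for logit in logits]
    largest = max(scaled)
    exponentials = [math.exp(value - largest) for value in scaled]
    total = sum(exponentials)
    return [value / total for value in exponentials]


def choose_next_token(logits, do_sample, temperature=1.0, rng=None):
    """Pick the next token index greedily or by sampling from the distribution."""
    if not do_sample:
        # Do Sample off: always take the token with the highest probability.
        return max(range(len(logits)), key=logits.__getitem__)
    probabilities = softmax_with_temperature(logits, temperature)
    rng = rng or random.Random()
    return rng.choices(range(len(logits)), weights=probabilities, k=1)[0]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens

cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
print("temperature 0.5:", [round(p, 3) for p in cold])
print("temperature 2.0:", [round(p, 3) for p in hot])

greedy_token = choose_next_token(logits, do_sample=False)
sampled_token = choose_next_token(logits, do_sample=True, temperature=2.0, rng=random.Random(0))
print(greedy_token, sampled_token)
```

A lower temperature concentrates the probability mass on the top token, while a higher temperature flattens the distribution, so sampled outputs become more varied.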
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Text span prediction¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Text token classification¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Text Column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Text metric learning¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require a label column.
Text Column¶
Defines the column name with the input text that H2O Hydrogen Torch will use during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Top K Similar¶
Defines the number (k) of similar predictions to keep for each record.
Note
Defining this setting impacts output predictions and metrics (metrics that rely on some top-k selection) but not the training process.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Audio regression¶
Dataset settings¶
Dataset¶
Specifies the dataset to use for scoring.
Test Dataframe¶
Specifies a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
Data folder test¶
Specifies the folder location of the audio files H2O Hydrogen Torch will use for scoring. During scoring, H2O Hydrogen Torch will load audio files from this folder.
Audio column¶
Specifies the dataframe column storing the names of the audio files that H2O Hydrogen Torch will load from the Data folder test during scoring.
Prediction settings¶
Metric¶
Specifies the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).
Audio classification¶
Dataset settings¶
Dataset¶
Defines the dataset to use for scoring.
Test Dataframe¶
Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note
The test dataframe should have the same format as the train dataframe but does not require label columns.
Data folder test¶
Defines the folder location of the audio files H2O Hydrogen Torch will use for scoring. During scoring, H2O Hydrogen Torch will load audio files from this folder.
Audio column¶
Defines the dataframe column storing the names of the audio files that H2O Hydrogen Torch will load from the Data folder test during scoring.
Prediction settings¶
Metric¶
Defines the evaluation metric to use to evaluate the model's accuracy.
Note
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Probability Threshold¶
Specifies a threshold for threshold-dependent classification metrics (e.g., F1). For multi-class classification, argmax is used instead of a threshold.
Note
This threshold is used as a default threshold when showing all other threshold-dependent metrics in the validation plots.
Environment settings¶
GPUs¶
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).