Here is a training config sample:
```json
{
  "task": "intents",
  "model_name": "best_model_ever",
  "training_arguments": {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "learning_rate": 0.0001,
    "warmup_ratio": 0.1,
    "lr_scheduler_type": "cosine",
    "label_smoothing_factor": 0.15,
    "do_eval": true,
    "save_strategy": "epoch",
    "evaluation_strategy": "epoch",
    "load_best_model_at_end": true,
    "metric_for_best_model": "eval_loss",
    "greater_is_better": false,
    "no_cuda": false,
    "report_to": "none"
  },
  "transformer_model_name": "bert-base-multilingual-cased",
  "separator": ",",
  "do_split": false,
  "split_ratio": 0.2,
  "placeholders": false
}
```
Let's break down the different components of this JSON configuration:
task (string) The task for which the model is being trained. In this example, it is set to "intents".
model_name (string) The name of the model being trained. In this case, it is set to "best_model_ever".
training_arguments (object) This section contains the training hyperparameters. They map to the fields of the TrainingArguments data class.
num_train_epochs (integer): The number of training epochs.
per_device_train_batch_size (integer): The batch size for training.
per_device_eval_batch_size (integer): The batch size for evaluation.
learning_rate (float): The learning rate for training.
warmup_ratio (float): The warmup ratio used in training.
lr_scheduler_type (string): The type of learning rate scheduler.
label_smoothing_factor (float): Factor for label smoothing.
do_eval (boolean): Whether to perform evaluation during training.
save_strategy (string): When model checkpoints are saved ("no", "steps", or "epoch").
evaluation_strategy (string): When evaluation is run during training ("no", "steps", or "epoch").
load_best_model_at_end (boolean): Whether to load the best model at the end of training.
metric_for_best_model (string): The metric used to determine the best model.
greater_is_better (boolean): Whether higher values of the metric are considered better.
no_cuda (boolean): Whether to disable CUDA support for training.
report_to (string): The integration(s) to report training logs to (e.g., "tensorboard", "wandb"); "none" disables reporting.
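To make warmup_ratio and lr_scheduler_type concrete, here is a minimal sketch of how linear warmup followed by cosine decay shapes the learning rate over a run. This is a simplified illustration of the schedule the config requests, not the library's exact implementation; the function name and step counts are hypothetical.

```python
import math

def lr_at_step(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    """Learning rate at `step` under linear warmup + cosine decay.

    Simplified sketch of warmup_ratio=0.1 with
    lr_scheduler_type="cosine" from the config above.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup: ramp from 0 up to base_lr
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay: fall from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at_step(100, 1000))  # → 0.0001 (the peak, right after warmup ends)
```

The rate climbs for the first 10% of steps, peaks at learning_rate, then decays smoothly to zero by the final step.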
transformer_model_name (string, optional) The name of the transformer-based model to be used. If not provided, it defaults to "bert-base-multilingual-cased".
separator (string, optional) A separator character, typically used in data processing. If not provided, it defaults to ",".
do_split (boolean, optional) A boolean flag indicating whether data should be split. If not provided, it defaults to false.
split_ratio (float, optional) The ratio at which data should be split, if splitting is enabled. If not provided, it defaults to 0.2.
placeholders (boolean, optional) A boolean flag indicating whether placeholders are used. If not provided, it defaults to false.
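The optional fields above can be filled in with their documented defaults when missing. The sketch below shows one way a loader might do this; `load_config` and `DEFAULTS` are illustrative names for this example, not part of the tool itself.

```python
import json

# Documented defaults for the optional top-level fields
DEFAULTS = {
    "transformer_model_name": "bert-base-multilingual-cased",
    "separator": ",",
    "do_split": False,
    "split_ratio": 0.2,
    "placeholders": False,
}

def load_config(raw: str) -> dict:
    """Parse a JSON config string and apply defaults for absent optional keys."""
    cfg = json.loads(raw)
    for key, default in DEFAULTS.items():
        cfg.setdefault(key, default)
    return cfg

# A minimal config that omits every optional field
minimal = '{"task": "intents", "model_name": "demo", "training_arguments": {}}'
cfg = load_config(minimal)
print(cfg["transformer_model_name"])  # → bert-base-multilingual-cased
```

Explicit values in the JSON always win, since `setdefault` only fills keys that are absent.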
This JSON configuration allows you to specify various settings and parameters for training your NLP model, making it a flexible and customizable tool for your natural language processing tasks.