---
library_name: transformers
license: apache-2.0
base_model: google/umt5-xl
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: bb84e5956fa4843c4c1fa82337692137
    results: []
---

bb84e5956fa4843c4c1fa82337692137

This model is a fine-tuned version of google/umt5-xl on the fr-it subset of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0099
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 205.5781 seconds
  • BLEU: 8.8433
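
For quick reference, here is a minimal inference sketch. The repo id below is an assumption based on this card's model name, and umT5 is used without a task prefix; adjust both to your setup.

```python
# Minimal inference sketch; the repo id is assumed, not confirmed by this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/bb84e5956fa4843c4c1fa82337692137"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# French source sentence in, Italian translation out (fr-it fine-tune).
inputs = tokenizer("Le chat dort sur le canapé.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```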

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

The card header indicates training on the fr-it pair of Helsinki-NLP/opus_books; details of the train/validation split are not provided.
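
As a sketch, that dataset can be loaded with the 🤗 Datasets library. The 90/10 holdout below is an assumption for illustration (opus_books ships only a train split), not the split actually used for this model.

```python
# Sketch: load the fr-it pair of opus_books and carve out a validation set.
# The 90/10 split and seed are assumptions, not taken from this card.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "fr-it")
splits = raw["train"].train_test_split(test_size=0.1, seed=42)
print(splits["train"][0]["translation"])  # {'fr': '...', 'it': '...'}
```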

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
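
As a hedged reconstruction, the hyperparameters above map onto Seq2SeqTrainingArguments roughly as follows; the output directory and predict_with_generate setting are assumptions, and the 4-GPU distributed launch is handled outside these arguments.

```python
# Sketch of the training configuration implied by the list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-xl-opus-books-fr-it",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumed; needed for BLEU at eval time
)
```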

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU    |
|--------------:|------:|-----:|----------------:|----------:|------------------:|--------:|
| No log        | 0     | 0    | 5.2927          | 0         | 13.5941           | 2.2924  |
| No log        | 1     | 367  | 4.3096          | 0.0078    | 15.5771           | 5.5344  |
| No log        | 2     | 734  | 3.5125          | 0.0156    | 23.7748           | 10.6850 |
| No log        | 3     | 1101 | 2.9758          | 0.0312    | 31.8804           | 14.5616 |
| No log        | 4     | 1468 | 2.6384          | 0.0625    | 40.5064           | 15.9381 |
| 0.1899        | 5     | 1835 | 2.3644          | 0.125     | 52.6170           | 6.5791  |
| 2.7147        | 6     | 2202 | 2.1766          | 0.25      | 75.6631           | 7.1049  |
| 2.4397        | 7     | 2569 | 2.0553          | 0.5       | 117.8008          | 7.3811  |
| 2.1857        | 8     | 2936 | 1.9766          | 1.0       | 206.8776          | 8.0779  |
| 1.9886        | 9     | 3303 | 1.9395          | 1.0       | 206.6883          | 8.4148  |
| 1.761         | 10    | 3670 | 1.9255          | 1.0       | 205.8516          | 8.6401  |
| 1.6096        | 11    | 4037 | 1.9284          | 1.0       | 203.4900          | 8.7878  |
| 1.5012        | 12    | 4404 | 1.9467          | 1.0       | 205.5317          | 8.7877  |
| 1.4092        | 13    | 4771 | 1.9821          | 1.0       | 205.3684          | 8.8410  |
| 1.2512        | 14    | 5138 | 2.0099          | 1.0       | 205.5781          | 8.8433  |
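
As a minimal sketch of how the BLEU column is typically computed for generated_from_trainer cards, assuming the `evaluate` sacrebleu wrapper (the exact metric implementation is not stated on this card):

```python
# Sketch: corpus BLEU via the `evaluate` sacrebleu wrapper; strings are made up.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Il gatto dorme sul divano."]
references = [["Il gatto dorme sul divano."]]  # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])  # 100.0
```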

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1