492afd7004f78277b6d22c05796c196e

This model is a fine-tuned version of google/umt5-xl on the Helsinki-NLP/opus_books [en-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1946
  • Data Size: 1.0
  • Epoch Runtime: 1573.6064
  • Bleu: 15.9065

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
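
The totals above follow from the per-device values: 8 examples per device across 4 devices gives 32. The sketch below restates that arithmetic and collects the listed values under the keyword names that `transformers.Seq2SeqTrainingArguments` accepts. It assumes no gradient accumulation (none is listed in this card), and the `output_dir` value is a hypothetical placeholder, not the path used for this run.

```python
# Per-device values and device count as listed in this card.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4

# Effective batch sizes, assuming gradient_accumulation_steps == 1
# (an assumption: the card does not list gradient accumulation).
total_train_batch_size = per_device_train_batch_size * num_devices
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Keyword arguments that could be passed as
# transformers.Seq2SeqTrainingArguments(**training_kwargs).
training_kwargs = {
    "output_dir": "umt5-xl-opus-books-en-fr",  # hypothetical placeholder
    "learning_rate": 5e-5,
    "per_device_train_batch_size": per_device_train_batch_size,
    "per_device_eval_batch_size": per_device_eval_batch_size,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```

The multi-GPU setup itself is not part of these arguments; it would come from however the script is launched (for example with `torchrun` or `accelerate launch`).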

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 4.9540 | 0 | 109.0665 | 2.6147 |
| No log | 1 | 3177 | 2.5131 | 0.0078 | 120.3525 | 17.7039 |
| 0.0599 | 2 | 6354 | 1.8654 | 0.0156 | 142.3404 | 24.2251 |
| 2.0544 | 3 | 9531 | 1.5550 | 0.0312 | 166.5225 | 10.5876 |
| 1.815 | 4 | 12708 | 1.4500 | 0.0625 | 209.0970 | 11.4431 |
| 1.6555 | 5 | 15885 | 1.3777 | 0.125 | 304.3525 | 12.2800 |
| 1.5215 | 6 | 19062 | 1.2914 | 0.25 | 487.2187 | 13.1397 |
| 1.3872 | 7 | 22239 | 1.2222 | 0.5 | 852.6690 | 14.0849 |
| 1.254 | 8.0 | 25416 | 1.1497 | 1.0 | 1578.2169 | 14.9966 |
| 1.1105 | 9.0 | 28593 | 1.1241 | 1.0 | 1574.6522 | 15.4534 |
| 1.0375 | 10.0 | 31770 | 1.1162 | 1.0 | 1577.5926 | 15.6800 |
| 0.9277 | 11.0 | 34947 | 1.1180 | 1.0 | 1581.7450 | 15.7313 |
| 0.8186 | 12.0 | 38124 | 1.1412 | 1.0 | 1586.2745 | 15.8256 |
| 0.7382 | 13.0 | 41301 | 1.1606 | 1.0 | 1581.3892 | 15.8825 |
| 0.6694 | 14.0 | 44478 | 1.1946 | 1.0 | 1573.6064 | 15.9065 |
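
The results reported at the top of this card match the last row (epoch 14), which is not the row with the lowest validation loss. A minimal sketch for reading the table, with the values copied from the rows above: it finds the epoch with the lowest validation loss and the best Bleu among epochs with Data Size 1.0, and checks the observation that Data Size roughly doubles each epoch until it reaches 1.0. The tuple layout and variable names are illustrative choices, not part of the training code.

```python
# (epoch, validation_loss, data_size, bleu), copied from the results table.
rows = [
    (0, 4.9540, 0.0, 2.6147),
    (1, 2.5131, 0.0078, 17.7039),
    (2, 1.8654, 0.0156, 24.2251),
    (3, 1.5550, 0.0312, 10.5876),
    (4, 1.4500, 0.0625, 11.4431),
    (5, 1.3777, 0.125, 12.2800),
    (6, 1.2914, 0.25, 13.1397),
    (7, 1.2222, 0.5, 14.0849),
    (8, 1.1497, 1.0, 14.9966),
    (9, 1.1241, 1.0, 15.4534),
    (10, 1.1162, 1.0, 15.6800),
    (11, 1.1180, 1.0, 15.7313),
    (12, 1.1412, 1.0, 15.8256),
    (13, 1.1606, 1.0, 15.8825),
    (14, 1.1946, 1.0, 15.9065),
]

# Row with the lowest validation loss over the whole run.
best_loss = min(rows, key=lambda r: r[1])

# Best Bleu restricted to epochs with Data Size 1.0.
full_data = [r for r in rows if r[2] == 1.0]
best_bleu_full = max(full_data, key=lambda r: r[3])

# Observation: Data Size is close to 2**(epoch - 8) for epochs 1 through 8.
doubling = all(
    abs(size - 2.0 ** (epoch - 8)) < 1e-4
    for epoch, _, size, _ in rows
    if 1 <= epoch <= 8
)

print(best_loss[0], best_loss[1])            # 10 1.1162
print(best_bleu_full[0], best_bleu_full[3])  # 14 15.9065
print(doubling)                              # True
```

Validation loss bottoms out at epoch 10 and rises afterwards, while Bleu keeps improving slightly through epoch 14, so which checkpoint counts as best depends on the metric used for selection.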

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1