3048e4c2ab0d9efc3daa857ab2714c1b

This model is a fine-tuned version of google/umt5-xl on the Helsinki-NLP/opus_books [es-ru] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8672
  • Data Size: 1.0
  • Epoch Runtime: 220.4552
  • BLEU: 8.1452

Model description

More information needed

Intended uses & limitations

More information needed
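
Until this section is filled in, the sketch below shows one plausible way to run the model for Spanish-to-Russian translation. It assumes the checkpoint exposes the standard umT5 sequence-to-sequence interface and takes raw source text with no task prefix; the repo id is taken from this card's hub page, and the example sentence is arbitrary.

```python
# Minimal es->ru inference sketch (not from the card itself).
# Assumptions: repo id from the hub page; no task prefix needed,
# as is typical for umT5 fine-tunes.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo_id = "contemmcm/3048e4c2ab0d9efc3daa857ab2714c1b"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

inputs = tokenizer("¿Dónde está la biblioteca?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```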

Training and evaluation data

More information needed
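
The underlying corpus can at least be inspected directly; the sketch below assumes the standard opus_books schema (a single train split with a translation dict per row). The train/eval split used for this fine-tune is not documented here.

```python
# Inspect the es-ru pair of opus_books. How this checkpoint's train
# and eval sets were carved out of it is not documented in this card.
from datasets import load_dataset

ds = load_dataset("Helsinki-NLP/opus_books", "es-ru")
print(ds["train"][0]["translation"])  # {'es': '...', 'ru': '...'}
```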

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
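
As a rough reconstruction, these settings map onto transformers Seq2SeqTrainingArguments as sketched below. The output directory and predict_with_generate flag are assumptions (the latter is the usual way to get BLEU during evaluation); the remaining values are copied from the list, with per-device batch size 8 across 4 GPUs yielding the total of 32.

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed above; output_dir and predict_with_generate
# are assumptions, everything else is copied from the list.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-xl-opus-books-es-ru",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumed; needed for BLEU eval
)
```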

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | BLEU    |
|---------------|-------|------|-----------------|-----------|---------------|---------|
| No log        | 0     | 0    | 5.4846          | 0         | 15.2137       | 1.9323  |
| No log        | 1     | 419  | 4.0330          | 0.0078    | 18.1488       | 5.8458  |
| No log        | 2     | 838  | 3.3093          | 0.0156    | 23.2445       | 9.1132  |
| 0.1266        | 3     | 1257 | 2.8515          | 0.0312    | 29.8031       | 11.9876 |
| 0.1266        | 4     | 1676 | 2.3942          | 0.0625    | 35.7858       | 14.4652 |
| 0.1895        | 5     | 2095 | 2.0690          | 0.125     | 53.0679       | 5.8884  |
| 0.3167        | 6     | 2514 | 1.9319          | 0.25      | 73.9145       | 6.4234  |
| 2.1187        | 7     | 2933 | 1.8440          | 0.5       | 123.6452      | 7.0374  |
| 1.896         | 8.0   | 3352 | 1.7652          | 1.0       | 221.2216      | 7.6377  |
| 1.7204        | 9.0   | 3771 | 1.7368          | 1.0       | 216.4819      | 7.9181  |
| 1.5795        | 10.0  | 4190 | 1.7352          | 1.0       | 217.6422      | 8.0089  |
| 1.3851        | 11.0  | 4609 | 1.7597          | 1.0       | 216.3156      | 8.1047  |
| 1.2893        | 12.0  | 5028 | 1.7809          | 1.0       | 216.2436      | 8.1628  |
| 1.1339        | 13.0  | 5447 | 1.8185          | 1.0       | 217.2292      | 8.2207  |
| 1.0763        | 14.0  | 5866 | 1.8672          | 1.0       | 220.4552      | 8.1452  |
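
Note that validation loss bottoms out at epoch 10 (1.7352) and BLEU peaks at epoch 13 (8.2207), so the final-epoch numbers quoted at the top of the card are not the best observed. The card does not say which BLEU implementation produced these scores; below is a minimal sketch using the sacrebleu wrapper from the evaluate library, which is only an assumption.

```python
# Hedged BLEU sketch; the exact metric implementation used for this
# card is undocumented, sacrebleu via `evaluate` is an assumption.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Где находится библиотека?"]   # model outputs
references = [["Где находится библиотека?"]]  # gold Russian references
print(bleu.compute(predictions=predictions, references=references)["score"])
```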

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
