d371d4d8be4be5eb4e90652fdaac9fee

This model is a fine-tuned version of google/umt5-xl on the Helsinki-NLP/opus_books [es-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6929
  • Data Size: 1.0
  • Epoch Runtime: 410.6452
  • Bleu: 11.5005
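
As a minimal sketch of how this checkpoint might be used, the snippet below loads it through the generic seq2seq classes in Transformers. The repository ID is taken from this model page. The Spanish-to-Dutch direction, the absence of a task prefix, and the generation settings are assumptions, since the card does not document the input format used during fine-tuning. `translate` is a hypothetical helper name.

```python
MODEL_ID = "contemmcm/d371d4d8be4be5eb4e90652fdaac9fee"  # repository ID shown on the model page


def translate(texts, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of sentences with the fine-tuned checkpoint.

    Assumption: inputs are plain Spanish sentences with no task prefix;
    the card does not say how inputs were formatted during training.
    """
    # Imported lazily so that defining this helper does not require
    # transformers/torch to be installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the batch, padding to the longest sentence.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(translate(["El libro está sobre la mesa."]))
```

If the outputs look poor, the first thing to check is whether the fine-tuning run expected a prefix or a different source language, since neither is stated here.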

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
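
The listed values can be written down as keyword arguments in the naming that the Transformers `Seq2SeqTrainingArguments` class uses. This is a sketch, not the original training script: it assumes that `train_batch_size` and `eval_batch_size` are per-device values and that no gradient accumulation was used, which is consistent with the reported totals of 32 on 4 devices.

```python
# Number of GPUs reported in the card (num_devices).
NUM_DEVICES = 4

# Hyperparameters from the list above, keyed by the argument names of
# transformers.Seq2SeqTrainingArguments (assumed mapping).
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

# Effective batch sizes across all devices, assuming no gradient accumulation.
total_train_batch_size = training_kwargs["per_device_train_batch_size"] * NUM_DEVICES
total_eval_batch_size = training_kwargs["per_device_eval_batch_size"] * NUM_DEVICES

print(total_train_batch_size, total_eval_batch_size)
```

With Transformers installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder path.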

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 5.7046 | 0 | 28.5595 | 1.1816 |
| No log | 1 | 806 | 4.0089 | 0.0078 | 32.0883 | 5.4944 |
| No log | 2 | 1612 | 3.1023 | 0.0156 | 42.0395 | 9.7847 |
| No log | 3 | 2418 | 2.6106 | 0.0312 | 52.0374 | 14.2410 |
| 0.1211 | 4 | 3224 | 2.2476 | 0.0625 | 64.4661 | 7.1959 |
| 2.5793 | 5 | 4030 | 1.9795 | 0.125 | 87.4268 | 7.8127 |
| 2.2415 | 6 | 4836 | 1.8527 | 0.25 | 131.6558 | 8.7641 |
| 2.0493 | 7 | 5642 | 1.7388 | 0.5 | 225.0196 | 9.4379 |
| 1.8549 | 8.0 | 6448 | 1.6425 | 1.0 | 412.5192 | 10.3146 |
| 1.6668 | 9.0 | 7254 | 1.6054 | 1.0 | 412.9333 | 10.7869 |
| 1.5133 | 10.0 | 8060 | 1.5867 | 1.0 | 412.3394 | 11.0634 |
| 1.4061 | 11.0 | 8866 | 1.5829 | 1.0 | 412.4480 | 11.2502 |
| 1.276 | 12.0 | 9672 | 1.6010 | 1.0 | 411.0898 | 11.3751 |
| 1.1394 | 13.0 | 10478 | 1.6242 | 1.0 | 411.5901 | 11.4016 |
| 1.0515 | 14.0 | 11284 | 1.6554 | 1.0 | 411.0146 | 11.4973 |
| 0.9416 | 15.0 | 12090 | 1.6929 | 1.0 | 410.6452 | 11.5005 |
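
The results above can be scanned programmatically to see where validation loss and BLEU peak. The sketch below copies the epoch, validation loss, Data Size, and BLEU columns from the table and compares only the rows with Data Size 1.0. Restricting to those rows is a choice made here, on the assumption that epochs run with a smaller Data Size are not directly comparable. `Row` is a hypothetical helper type.

```python
from collections import namedtuple

Row = namedtuple("Row", ["epoch", "val_loss", "data_size", "bleu"])

# Values copied from the training results table.
ROWS = [
    Row(0, 5.7046, 0.0, 1.1816),
    Row(1, 4.0089, 0.0078, 5.4944),
    Row(2, 3.1023, 0.0156, 9.7847),
    Row(3, 2.6106, 0.0312, 14.2410),
    Row(4, 2.2476, 0.0625, 7.1959),
    Row(5, 1.9795, 0.125, 7.8127),
    Row(6, 1.8527, 0.25, 8.7641),
    Row(7, 1.7388, 0.5, 9.4379),
    Row(8, 1.6425, 1.0, 10.3146),
    Row(9, 1.6054, 1.0, 10.7869),
    Row(10, 1.5867, 1.0, 11.0634),
    Row(11, 1.5829, 1.0, 11.2502),
    Row(12, 1.6010, 1.0, 11.3751),
    Row(13, 1.6242, 1.0, 11.4016),
    Row(14, 1.6554, 1.0, 11.4973),
    Row(15, 1.6929, 1.0, 11.5005),
]

# Keep only epochs that report Data Size 1.0.
full_data = [r for r in ROWS if r.data_size == 1.0]

best_loss = min(full_data, key=lambda r: r.val_loss)  # lowest validation loss
best_bleu = max(full_data, key=lambda r: r.bleu)      # highest BLEU

print(f"lowest validation loss: epoch {best_loss.epoch} ({best_loss.val_loss})")
print(f"highest BLEU: epoch {best_bleu.epoch} ({best_bleu.bleu})")
```

Among the full-data rows, validation loss is lowest at epoch 11 (1.5829) and rises afterwards, while BLEU keeps increasing slowly up to the final reported epoch 15 (11.5005), which matches the headline numbers at the top of this card.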

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
