fd96cd898bba8dbe3bfa5a5b7b1a56c3

This model is a fine-tuned version of google/umt5-xl on the Helsinki-NLP/opus_books [en-no] dataset. It achieves the following results on the evaluation set:

Loss: 1.8440
Data Size: 1.0
Epoch Runtime: 56.4380
Bleu: 11.8935
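
The card does not document how to run the model, so the following is only a minimal sketch of loading it with the standard Transformers seq2seq classes. The repository id `contemmcm/fd96cd898bba8dbe3bfa5a5b7b1a56c3` is taken from the model page. The helper names `build_input`, `load_model` and `translate`, the empty default `prefix`, and the English-to-Norwegian direction (read off the `[en-no]` dataset pair) are assumptions, since the prompt format used during fine-tuning is not stated.

```python
MODEL_ID = "contemmcm/fd96cd898bba8dbe3bfa5a5b7b1a56c3"


def build_input(text, prefix=""):
    # Hypothetical helper: the card does not say whether a task prefix
    # was used during fine-tuning, so the default is no prefix at all.
    return f"{prefix}{text}"


def load_model(model_id=MODEL_ID):
    # Imported lazily so that this module can be imported without
    # transformers installed; loading downloads the full checkpoint.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def translate(text, tokenizer, model, prefix="", max_new_tokens=128):
    # Assumed direction: English in, Norwegian out.
    inputs = tokenizer(build_input(text, prefix), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `tokenizer, model = load_model()` followed by `translate("The house was quiet.", tokenizer, model)`. With a reported BLEU of 11.8935, the output should be treated as rough.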

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
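
The total batch sizes listed above follow from the per-device sizes and the device count. The sketch below reproduces that arithmetic. It assumes that `train_batch_size` and `eval_batch_size` are per-device values and that no gradient accumulation was used, since the card lists none. The `total_batch_size` helper and its `grad_accum_steps` parameter are illustrative names, not part of the card.

```python
# Values copied from the hyperparameter list above.
hparams = {
    "learning_rate": 5e-05,
    "train_batch_size": 8,  # assumed to be per device
    "eval_batch_size": 8,   # assumed to be per device
    "num_devices": 4,
    "seed": 42,
    "lr_scheduler_type": "constant",
    "num_epochs": 50,
}


def total_batch_size(per_device, num_devices, grad_accum_steps=1):
    # Effective batch size per optimizer step across all devices.
    # grad_accum_steps=1 is an assumption; the card lists no accumulation.
    return per_device * num_devices * grad_accum_steps


total_train = total_batch_size(hparams["train_batch_size"], hparams["num_devices"])
total_eval = total_batch_size(hparams["eval_batch_size"], hparams["num_devices"])
print(total_train, total_eval)  # prints "32 32"
```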

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	5.7519	0	3.8324	2.6667
No log	1	87	5.3766	0.0078	4.5080	3.6279
No log	2	174	4.5978	0.0156	9.4975	6.6975
No log	3	261	3.9958	0.0312	14.2208	9.8415
No log	4	348	3.2172	0.0625	21.8229	15.4143
0.2201	5	435	2.8475	0.125	23.4467	17.9165
0.8228	6	522	2.3381	0.25	28.2647	20.7722
0.9515	7	609	2.0036	0.5	42.3805	9.1580
1.3167	8.0	696	1.7552	1.0	62.3441	10.7515
1.7823	9.0	783	1.7107	1.0	59.5657	11.3962
1.5709	10.0	870	1.6947	1.0	56.8749	11.6802
1.3463	11.0	957	1.7127	1.0	61.4083	11.8069
1.1895	12.0	1044	1.7487	1.0	54.1224	11.9382
1.0372	13.0	1131	1.7962	1.0	58.4049	11.7635
0.9109	14.0	1218	1.8440	1.0	56.4380	11.8935
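
The metrics reported at the top of the card match the last row of this table (epoch 14), which is not the best row by either validation loss or BLEU. The sketch below picks out the best epochs directly from the table values. The tuple layout and variable names are chosen here for illustration only.

```python
# (epoch, validation_loss, data_size, bleu), copied from the table above.
rows = [
    (0, 5.7519, 0.0, 2.6667),
    (1, 5.3766, 0.0078, 3.6279),
    (2, 4.5978, 0.0156, 6.6975),
    (3, 3.9958, 0.0312, 9.8415),
    (4, 3.2172, 0.0625, 15.4143),
    (5, 2.8475, 0.125, 17.9165),
    (6, 2.3381, 0.25, 20.7722),
    (7, 2.0036, 0.5, 9.1580),
    (8, 1.7552, 1.0, 10.7515),
    (9, 1.7107, 1.0, 11.3962),
    (10, 1.6947, 1.0, 11.6802),
    (11, 1.7127, 1.0, 11.8069),
    (12, 1.7487, 1.0, 11.9382),
    (13, 1.7962, 1.0, 11.7635),
    (14, 1.8440, 1.0, 11.8935),
]

# Epoch with the lowest validation loss.
best_loss = min(rows, key=lambda r: r[1])
# Epoch with the highest BLEU over all rows.
best_bleu = max(rows, key=lambda r: r[3])
# Highest BLEU restricted to rows where Data Size is 1.0.
best_bleu_full = max((r for r in rows if r[2] == 1.0), key=lambda r: r[3])

print("lowest validation loss:", best_loss[0], best_loss[1])
print("highest BLEU overall:", best_bleu[0], best_bleu[3])
print("highest BLEU at data size 1.0:", best_bleu_full[0], best_bleu_full[3])
```

Validation loss is lowest at epoch 10 (1.6947) and rises afterwards while training loss keeps falling. The highest BLEU in the table (20.7722) is at epoch 6, where Data Size is 0.25; among rows with Data Size 1.0 the highest is 11.9382 at epoch 12.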

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1
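
To compare a local environment against the versions listed above, a small check with the standard library's `importlib.metadata` might look like the sketch below. The distribution names (for example `torch` for Pytorch) and the `check_versions` helper are assumptions. An exact string match is deliberately strict, and the `+cu128` build suffix will differ on other CUDA or CPU builds.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; keys are assumed distribution names.
EXPECTED = {
    "transformers": "4.57.0",
    "torch": "2.8.0+cu128",
    "datasets": "4.2.0",
    "tokenizers": "0.22.1",
}


def check_versions(expected):
    # Map each package to (installed_version_or_None, exact_match).
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (installed, installed == wanted)
    return report


for name, (installed, ok) in check_versions(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: installed={installed} expected={EXPECTED[name]} ({status})")
```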

Safetensors

Model size: 0.9B params
Tensor type: F32
