# dewata_bert_gelu
This model is a fine-tuned version of pijarcandra22/dewata_bert_gelu on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.3142
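The card does not include a usage snippet. Below is a minimal, hypothetical sketch that assumes the checkpoint is a BERT-style masked-language model hosted on the Hugging Face Hub; the `fill-mask` task choice and the helper name are assumptions, not something this card confirms.

```python
def predict_masks(text, model_id="pijarcandra22/dewata_bert_gelu", top_k=5):
    """Fill [MASK] tokens using the fine-tuned checkpoint.

    Hypothetical helper: requires `pip install transformers` and network
    access to the Hub. The import is deferred so the function can be
    defined without either being available.
    """
    from transformers import pipeline  # version assumed per "Framework versions" below

    fill_mask = pipeline("fill-mask", model=model_id, top_k=top_k)
    return fill_mask(text)
```

`transformers.pipeline("fill-mask", ...)` is the standard Hub inference entry point for masked-LM checkpoints; whether this model was trained with a masked-LM head is inferred from its BERT lineage, not stated on the card.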
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 80
- eval_batch_size: 80
- seed: 42
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
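Taken together with the step counts in the results table below, these hyperparameters pin down the size of the run. A small arithmetic sketch that reconciles them, assuming zero warmup for the linear schedule since none is listed:

```python
# Figures taken from the hyperparameter list and results table.
train_batch_size = 80
steps_per_epoch = 2153   # epoch 1 ends at step 2153 in the results table
num_epochs = 100

total_steps = steps_per_epoch * num_epochs  # matches the final table row (215300)
# Upper bound on training-set size; the last batch per epoch may be partial.
approx_train_examples = steps_per_epoch * train_batch_size

# A linear schedule with no warmup decays the LR from 5e-5 to 0 over total_steps.
def linear_lr(step, base_lr=5e-5, total=total_steps):
    return base_lr * max(0.0, 1.0 - step / total)

print(total_steps, approx_train_examples, linear_lr(total_steps // 2))
```

At the halfway point the learning rate under this schedule is exactly half the initial 5e-5, which is the expected behavior of `lr_scheduler_type: linear` without warmup.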
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 3.0632 | 1.0 | 2153 | 2.7589 |
| 2.6534 | 2.0 | 4306 | 2.5378 |
| 2.4108 | 3.0 | 6459 | 2.4042 |
| 2.2169 | 4.0 | 8612 | 2.2892 |
| 2.0708 | 5.0 | 10765 | 2.2492 |
| 1.9505 | 6.0 | 12918 | 2.1956 |
| 1.8635 | 7.0 | 15071 | 2.1887 |
| 1.7806 | 8.0 | 17224 | 2.1658 |
| 1.7143 | 9.0 | 19377 | 2.1419 |
| 1.6472 | 10.0 | 21530 | 2.1155 |
| 1.5848 | 11.0 | 23683 | 2.1398 |
| 1.5304 | 12.0 | 25836 | 2.1151 |
| 1.4656 | 13.0 | 27989 | 2.1178 |
| 1.4215 | 14.0 | 30142 | 2.1389 |
| 1.375 | 15.0 | 32295 | 2.1276 |
| 1.3319 | 16.0 | 34448 | 2.1258 |
| 1.2921 | 17.0 | 36601 | 2.1140 |
| 1.2578 | 18.0 | 38754 | 2.1110 |
| 1.2087 | 19.0 | 40907 | 2.1127 |
| 1.17 | 20.0 | 43060 | 2.1248 |
| 1.1395 | 21.0 | 45213 | 2.1059 |
| 1.1081 | 22.0 | 47366 | 2.0860 |
| 1.0752 | 23.0 | 49519 | 2.1249 |
| 1.0412 | 24.0 | 51672 | 2.1316 |
| 1.0119 | 25.0 | 53825 | 2.1182 |
| 0.9852 | 26.0 | 55978 | 2.1331 |
| 0.9573 | 27.0 | 58131 | 2.1385 |
| 0.9291 | 28.0 | 60284 | 2.1583 |
| 0.9052 | 29.0 | 62437 | 2.1286 |
| 0.8812 | 30.0 | 64590 | 2.1635 |
| 0.8592 | 31.0 | 66743 | 2.1526 |
| 0.8311 | 32.0 | 68896 | 2.1488 |
| 0.8095 | 33.0 | 71049 | 2.1478 |
| 0.794 | 34.0 | 73202 | 2.1496 |
| 0.7727 | 35.0 | 75355 | 2.1671 |
| 0.7533 | 36.0 | 77508 | 2.1542 |
| 0.7335 | 37.0 | 79661 | 2.1502 |
| 0.714 | 38.0 | 81814 | 2.1448 |
| 0.6985 | 39.0 | 83967 | 2.1553 |
| 0.684 | 40.0 | 86120 | 2.1365 |
| 0.6648 | 41.0 | 88273 | 2.1962 |
| 0.652 | 42.0 | 90426 | 2.1734 |
| 0.6348 | 43.0 | 92579 | 2.1742 |
| 0.6184 | 44.0 | 94732 | 2.1718 |
| 0.6066 | 45.0 | 96885 | 2.1597 |
| 0.5907 | 46.0 | 99038 | 2.1823 |
| 0.5777 | 47.0 | 101191 | 2.1935 |
| 0.7347 | 48.0 | 103344 | 0.6094 |
| 0.7113 | 49.0 | 105497 | 0.6030 |
| 0.6928 | 50.0 | 107650 | 0.6279 |
| 0.6682 | 51.0 | 109803 | 0.6459 |
| 0.6589 | 52.0 | 111956 | 0.6777 |
| 0.639 | 53.0 | 114109 | 0.7150 |
| 0.6256 | 54.0 | 116262 | 0.7474 |
| 0.6079 | 55.0 | 118415 | 0.7618 |
| 0.5924 | 56.0 | 120568 | 0.7921 |
| 0.5813 | 57.0 | 122721 | 0.8130 |
| 0.5693 | 58.0 | 124874 | 0.8569 |
| 0.557 | 59.0 | 127027 | 0.8834 |
| 0.5445 | 60.0 | 129180 | 0.8872 |
| 0.5325 | 61.0 | 131333 | 0.8912 |
| 0.519 | 62.0 | 133486 | 0.9274 |
| 0.5115 | 63.0 | 135639 | 0.9433 |
| 0.5035 | 64.0 | 137792 | 0.9394 |
| 0.4905 | 65.0 | 139945 | 0.9600 |
| 0.4867 | 66.0 | 142098 | 0.9607 |
| 0.4712 | 67.0 | 144251 | 0.9685 |
| 0.4671 | 68.0 | 146404 | 0.9959 |
| 0.4574 | 69.0 | 148557 | 0.9948 |
| 0.4467 | 70.0 | 150710 | 1.0065 |
| 0.4391 | 71.0 | 152863 | 1.0075 |
| 0.4325 | 72.0 | 155016 | 1.0138 |
| 0.4233 | 73.0 | 157169 | 1.0218 |
| 0.4159 | 74.0 | 159322 | 1.0246 |
| 0.4127 | 75.0 | 161475 | 1.0275 |
| 0.4059 | 76.0 | 163628 | 1.0405 |
| 0.3993 | 77.0 | 165781 | 1.0482 |
| 0.3912 | 78.0 | 167934 | 1.0344 |
| 0.3858 | 79.0 | 170087 | 1.0452 |
| 0.3812 | 80.0 | 172240 | 1.0316 |
| 0.3778 | 81.0 | 174393 | 1.0595 |
| 0.3719 | 82.0 | 176546 | 1.0594 |
| 0.3658 | 83.0 | 178699 | 1.0671 |
| 0.3641 | 84.0 | 180852 | 1.0492 |
| 0.3574 | 85.0 | 183005 | 1.0752 |
| 0.3502 | 86.0 | 185158 | 1.0538 |
| 0.3491 | 87.0 | 187311 | 1.0669 |
| 0.3428 | 88.0 | 189464 | 1.0670 |
| 0.3409 | 89.0 | 191617 | 1.0697 |
| 0.3381 | 90.0 | 193770 | 1.0716 |
| 0.3344 | 91.0 | 195923 | 1.0750 |
| 0.4222 | 92.0 | 198076 | 0.3014 |
| 0.4101 | 93.0 | 200229 | 0.3137 |
| 0.4044 | 94.0 | 202382 | 0.3089 |
| 0.4022 | 95.0 | 204535 | 0.2992 |
| 0.3985 | 96.0 | 206688 | 0.3129 |
| 0.3979 | 97.0 | 208841 | 0.3149 |
| 0.3938 | 98.0 | 210994 | 0.3168 |
| 0.3923 | 99.0 | 213147 | 0.3147 |
| 0.3935 | 100.0 | 215300 | 0.3142 |
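The validation-loss column is easy to audit programmatically. A minimal sketch, using a few rows transcribed from the table above, that locates the lowest evaluation loss among the final checkpoints:

```python
# (epoch, validation_loss) pairs transcribed from the results table above.
val_loss = {
    47: 2.1935,
    48: 0.6094,
    95: 0.2992,
    98: 0.3168,
    100: 0.3142,  # the final row, matching the loss reported in the card header
}

best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])
```

Among the transcribed rows, epoch 95 (0.2992) edges out the final epoch-100 checkpoint (0.3142) that the header reports.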
### Framework versions
- Transformers 4.57.1
- Pytorch 2.9.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1
Note: a model tree cannot be built for this checkpoint because its listed base model points back to the model itself.