# t5-small-pirate-style 🏴‍☠️
This model is a fine-tuned version of t5-small on the KafeisM/pirate-speak-dataset. It specializes in Style Transfer, rewriting standard Modern English text into stereotypical Pirate English.
It achieves the following results on the evaluation set:
- Loss: 0.1254
- Rouge1: 0.8776
- Bleu: 0.8680
## Model Description
- Model type: T5 (Text-to-Text Transfer Transformer)
- Language(s): English (Modern & Pirate Style)
- Task: Sequence-to-Sequence (Style Transfer)
- Fine-tuning approach: Supervised Fine-Tuning with `Seq2SeqTrainer`
This model was developed as an academic project for a "Deep Learning for NLP" course. It demonstrates how a small, general-purpose model like T5-small can be adapted to a specific niche domain using a small, high-quality synthetic dataset.
## Intended Uses & Limitations

### How to use
You must use the prefix `translate English to Pirate: ` for the model to work correctly.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "KafeisM/t5-small-pirate-style"
tokenizer = AutoTokenizer.from_pretrained(model_name)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)

def translate_to_pirate(text):
    # The task prefix is required for the model to apply the pirate style.
    input_text = "translate English to Pirate: " + text
    inputs = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
    outputs = model.generate(inputs, max_length=64, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(translate_to_pirate("The server is down."))
print(translate_to_pirate("Hello friend, can you help me?"))
print(translate_to_pirate("I need to sleep now."))
# Output: "The server is down, matey."
# Output: "Hallo, friend, can ye help me?"
# Output: "I need to sleep now, matey."
```
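Alternatively, the model can be loaded through the `pipeline` API, which handles tokenization and decoding for you; the task prefix is still required. The generation settings below mirror the ones above:

```python
from transformers import pipeline

# Alternative loading path: the pipeline wraps tokenization, generation, and decoding.
pipe = pipeline("text2text-generation", model="KafeisM/t5-small-pirate-style")

result = pipe(
    "translate English to Pirate: The server is down.",
    max_length=64,
    num_beams=4,
)
print(result[0]["generated_text"])
```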
### Limitations and Bias
- Synthetic Data: The model was trained on ~500 synthetic examples generated by an LLM. It mimics stereotypical pirate speech (Hollywood style), not historical maritime dialect.
- Modern Vocabulary: As observed in stress testing, the model struggles with complex modern terms (e.g., "Quantum mechanics", "drivers"). It tends to employ a "conservative copying" strategy: preserving the noun and appending a pirate suffix (e.g., ", matey").
- Repetition: The model has a learned bias towards ending sentences with specific catchphrases like ", matey" or ", arr".
## Training and Evaluation Data
The model was trained on KafeisM/pirate-speak-dataset, a corpus of 500 English-Pirate pairs generated specifically for this project to ensure domain consistency; a sketch for reproducing the split follows the list below.
- Train split: 450 examples (90%).
- Test split: 50 examples (10%).
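A minimal sketch of how an equivalent 90/10 split can be reproduced with the `datasets` library is shown below. This assumes the dataset is published as a single train split; the exact split code used for the project may differ.

```python
from datasets import load_dataset

# Load the 500-pair corpus and carve out a 10% evaluation split (seed 42, as in training).
dataset = load_dataset("KafeisM/pirate-speak-dataset", split="train")
splits = dataset.train_test_split(test_size=0.1, seed=42)

print(len(splits["train"]), len(splits["test"]))  # 450 / 50 for a 500-pair corpus
```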
## Training Procedure

### Training hyperparameters
The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW
- lr_scheduler_type: linear
- num_epochs: 5
- fp16: False (trained on a T4 GPU)
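For reference, the settings above correspond roughly to the following `Seq2SeqTrainingArguments`. This is a reconstruction, not the original training script: the `output_dir`, evaluation cadence, and `predict_with_generate` flag are assumptions, while AdamW and the linear scheduler are the `Trainer` defaults.

```python
from transformers import Seq2SeqTrainingArguments

# Approximate reconstruction of the configuration listed above.
# output_dir, eval cadence, and predict_with_generate are illustrative assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-pirate-style",
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",    # Trainer default; listed for clarity
    num_train_epochs=5,
    fp16=False,
    eval_strategy="epoch",
    predict_with_generate=True,    # generate text at eval time so ROUGE/BLEU can be computed
)
```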
### Training results
| Epoch | Training Loss | Validation Loss | Rouge1 | Bleu |
|---|---|---|---|---|
| 1.0 | No log | 0.6660 | 0.0* | 0.0* |
| 2.0 | No log | 0.2254 | 0.7016 | 0.6968 |
| 3.0 | No log | 0.1544 | 0.8005 | 0.8027 |
| 4.0 | No log | 0.1326 | 0.8701 | 0.8647 |
| 5.0 | No log | 0.1254 | 0.8776 | 0.8680 |
Note: the 0.0 scores at epoch 1 were caused by an initial configuration behavior in which the model generated empty strings or padding tokens; this resolved as the model converged in subsequent epochs.
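The ROUGE-1 and BLEU scores in the table can be computed with the `evaluate` library via a `compute_metrics` callback passed to `Seq2SeqTrainer`. The sketch below is an assumed implementation (it reuses the `tokenizer` from the usage example above), not the project's exact metric code:

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Labels are padded with -100 by the data collator; restore the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    rouge_scores = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    bleu_scores = bleu.compute(predictions=decoded_preds,
                               references=[[ref] for ref in decoded_labels])
    return {"rouge1": rouge_scores["rouge1"], "bleu": bleu_scores["bleu"]}
```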
### Framework versions
- Transformers 4.57.1
- Pytorch 2.9.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1
### Base model
- google-t5/t5-small