# t5-small-pirate-style 🏴‍☠️
This model is a fine-tuned version of t5-small on the KafeisM/pirate-speak-dataset. It specializes in Style Transfer, rewriting standard Modern English text into stereotypical Pirate English.
It achieves the following results on the evaluation set:
- Loss: 0.1254
- Rouge1: 0.8776
- Bleu: 0.8680
## Model Description
- Model type: T5 (Text-to-Text Transfer Transformer)
- Language(s): English (Modern & Pirate Style)
- Task: Sequence-to-Sequence (Style Transfer)
- Fine-tuning approach: Supervised Fine-Tuning with `Seq2SeqTrainer`
This model was developed as an academic project for a "Deep Learning for NLP" course. It demonstrates how a small, general-purpose model like T5-small can be adapted to a specific niche domain using a small, high-quality synthetic dataset.
## Intended Uses & Limitations

### How to use
You must use the prefix `translate English to Pirate: ` for the model to work correctly.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "KafeisM/t5-small-pirate-style"
tokenizer = AutoTokenizer.from_pretrained(model_name)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)

def translate_to_pirate(text):
    # The task prefix is required for the model to apply the pirate style.
    input_text = "translate English to Pirate: " + text
    inputs = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
    outputs = model.generate(inputs, max_length=64, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(translate_to_pirate("The server is down."))
print(translate_to_pirate("Hello friend, can you help me?"))
print(translate_to_pirate("I need to sleep now."))
# Output: "The server is down, matey."
# Output: "Hallo, friend, can ye help me?"
# Output: "I need to sleep now, matey."
```
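Alternatively, the model can be loaded through the `pipeline` API, which handles tokenization and decoding for you; the task prefix is still required. The generation settings below mirror the ones above:

```python
from transformers import pipeline

# Alternative loading path: the pipeline wraps tokenization, generation, and decoding.
pipe = pipeline("text2text-generation", model="KafeisM/t5-small-pirate-style")

result = pipe(
    "translate English to Pirate: The server is down.",
    max_length=64,
    num_beams=4,
)
print(result[0]["generated_text"])
```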
### Limitations and Bias
- Synthetic Data: The model was trained on ~500 synthetic examples generated by an LLM. It mimics stereotypical pirate speech (Hollywood style), not historical maritime dialect.
- Modern Vocabulary: As observed in stress testing, the model struggles with complex modern terms (e.g., "Quantum mechanics", "drivers"). It tends to employ a "conservative copying" strategy: preserving the noun and appending a pirate suffix (e.g., ", matey").
- Repetition: The model has a learned bias towards ending sentences with specific catchphrases like ", matey" or ", arr".
## Training and Evaluation Data
The model was trained on KafeisM/pirate-speak-dataset, a corpus of 500 English-Pirate pairs generated specifically for this project to ensure domain consistency; a sketch for reproducing the split follows the list below.
- Train split: 450 examples (90%).
- Test split: 50 examples (10%).
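A minimal sketch of how an equivalent 90/10 split can be reproduced with the `datasets` library is shown below. This assumes the dataset is published as a single train split; the exact split code used for the project may differ.

```python
from datasets import load_dataset

# Load the 500-pair corpus and carve out a 10% evaluation split (seed 42, as in training).
dataset = load_dataset("KafeisM/pirate-speak-dataset", split="train")
splits = dataset.train_test_split(test_size=0.1, seed=42)

print(len(splits["train"]), len(splits["test"]))  # 450 / 50 for a 500-pair corpus
```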
## Training Procedure

### Training hyperparameters
The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW
- lr_scheduler_type: linear
- num_epochs: 5
- fp16: False (trained on a T4 GPU)
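For reference, the settings above correspond roughly to the following `Seq2SeqTrainingArguments`. This is a reconstruction, not the original training script: the `output_dir`, evaluation cadence, and `predict_with_generate` flag are assumptions, while AdamW and the linear scheduler are the `Trainer` defaults.

```python
from transformers import Seq2SeqTrainingArguments

# Approximate reconstruction of the configuration listed above.
# output_dir, eval cadence, and predict_with_generate are illustrative assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-pirate-style",
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",    # Trainer default; listed for clarity
    num_train_epochs=5,
    fp16=False,
    eval_strategy="epoch",
    predict_with_generate=True,    # generate text at eval time so ROUGE/BLEU can be computed
)
```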
### Training results
| Epoch | Training Loss | Validation Loss | Rouge1 | Bleu |
|---|---|---|---|---|
| 1.0 | No log | 0.6660 | 0.0* | 0.0* |
| 2.0 | No log | 0.2254 | 0.7016 | 0.6968 |
| 3.0 | No log | 0.1544 | 0.8005 | 0.8027 |
| 4.0 | No log | 0.1326 | 0.8701 | 0.8647 |
| 5.0 | No log | 0.1254 | 0.8776 | 0.8680 |
Note: the 0.0 scores at epoch 1 were caused by an initial configuration behavior in which the model generated empty strings or padding tokens; this resolved as the model converged in subsequent epochs.
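The ROUGE-1 and BLEU scores in the table can be computed with the `evaluate` library via a `compute_metrics` callback passed to `Seq2SeqTrainer`. The sketch below is an assumed implementation (it reuses the `tokenizer` from the usage example above), not the project's exact metric code:

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Labels are padded with -100 by the data collator; restore the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    rouge_scores = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    bleu_scores = bleu.compute(predictions=decoded_preds,
                               references=[[ref] for ref in decoded_labels])
    return {"rouge1": rouge_scores["rouge1"], "bleu": bleu_scores["bleu"]}
```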
### Framework versions
- Transformers 4.57.1
- Pytorch 2.9.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1
### Base model
- google-t5/t5-small