t5-small-pirate-style 🏴‍☠️

This model is a fine-tuned version of t5-small on the KafeisM/pirate-speak-dataset. It specializes in Style Transfer, rewriting standard Modern English text into stereotypical Pirate English.

It achieves the following results on the evaluation set:

  • Loss: 0.1254
  • Rouge1: 0.8776
  • Bleu: 0.8680

Model Description

  • Model type: T5 (Text-to-Text Transfer Transformer)
  • Language(s): English (Modern & Pirate Style)
  • Task: Sequence-to-Sequence (Style Transfer)
  • Fine-tuning approach: Supervised fine-tuning with Seq2SeqTrainer.

This model was developed as an academic project for a "Deep Learning for NLP" course. It demonstrates how a small, general-purpose model like T5-small can be adapted to a niche domain using a small, high-quality synthetic dataset.

Intended Uses & Limitations

How to use

You must prepend the prefix "translate English to Pirate: " to the input for the model to work correctly.

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "KafeisM/t5-small-pirate-style"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)

def translate_to_pirate(text):
    # The task prefix is required; without it the model will not apply the pirate style.
    input_text = "translate English to Pirate: " + text
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)

    outputs = model.generate(input_ids, max_length=64, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


print(translate_to_pirate("The server is down."))
print(translate_to_pirate("Hello friend, can you help me?"))
print(translate_to_pirate("I need to sleep now."))

# Output: "The server is down, matey."
# Output: "Hallo, friend, can ye help me?"
# Output: "I need to sleep now, matey.

Limitations and Bias

  • Synthetic Data: The model was trained on ~500 synthetic examples generated by an LLM. It mimics stereotypical pirate speech (Hollywood style), not historical maritime dialect.
  • Modern Vocabulary: As observed in stress testing, the model struggles with complex modern terms (e.g., "Quantum mechanics", "drivers"). It tends to employ a "conservative copying" strategy: preserving the noun and appending a pirate suffix (e.g., ", matey").
  • Repetition: The model has a learned bias towards ending sentences with specific catchphrases like ", matey" or ", arr".

Training and Evaluation Data

The model was trained on KafeisM/pirate-speak-dataset, a corpus of 500 English-Pirate pairs generated specifically for this project to ensure domain consistency.

  • Train split: 450 examples (90%).
  • Test split: 50 examples (10%).
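
A minimal sketch of loading the corpus and reproducing the 90/10 split with the datasets library. It assumes the dataset is published as a single train split and that the split uses the training seed reported below; neither detail is confirmed by the project.

from datasets import load_dataset

# Load the synthetic English -> Pirate parallel corpus (500 pairs).
dataset = load_dataset("KafeisM/pirate-speak-dataset")

# Assumed: a 90/10 train/test split with seed 42.
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
train_data, test_data = splits["train"], splits["test"]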

Training Procedure

Training hyperparameters

The following hyperparameters were used during training; a minimal Seq2SeqTrainer configuration sketch using these values follows the list:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW
  • lr_scheduler_type: linear
  • num_epochs: 5
  • fp16: False (Trained on T4 GPU)
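
The sketch below continues from the loading example above and wires these hyperparameters into Seq2SeqTrainingArguments / Seq2SeqTrainer. The column names "english" and "pirate" and the preprocessing details are assumptions, not confirmed by the project.

from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq,
                          Seq2SeqTrainingArguments, Seq2SeqTrainer)

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small")

def preprocess(batch):
    # Assumed column names "english" / "pirate"; adjust to the actual dataset schema.
    inputs = ["translate English to Pirate: " + t for t in batch["english"]]
    model_inputs = tokenizer(inputs, max_length=64, truncation=True)
    labels = tokenizer(text_target=batch["pirate"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-pirate-style",
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    seed=42,
    fp16=False,
    predict_with_generate=True,
    eval_strategy="epoch",
)  # AdamW is the Trainer default optimizer.

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_data.map(preprocess, batched=True),
    eval_dataset=test_data.map(preprocess, batched=True),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    processing_class=tokenizer,
)
trainer.train()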

Training results

Epoch   Training Loss   Validation Loss   Rouge1   Bleu
1.0     No log          0.6660            0.0*     0.0*
2.0     No log          0.2254            0.7016   0.6968
3.0     No log          0.1544            0.8005   0.8027
4.0     No log          0.1326            0.8701   0.8647
5.0     No log          0.1254            0.8776   0.8680

Note: The 0.0* scores in epoch 1 occurred because the initial configuration led the model to generate empty strings or padding tokens; the scores recovered as the model converged in subsequent epochs.
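
The ROUGE-1 and BLEU scores can be computed with the evaluate library. A minimal sketch over decoded prediction and reference strings; which exact ROUGE/BLEU implementations the project used is an assumption.

import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_text_metrics(predictions, references):
    # predictions / references: lists of decoded strings.
    rouge_scores = rouge.compute(predictions=predictions, references=references)
    bleu_scores = bleu.compute(predictions=predictions,
                               references=[[r] for r in references])
    return {"rouge1": rouge_scores["rouge1"], "bleu": bleu_scores["bleu"]}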

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.9.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1