Floppa-12B-Gemma3 (Uncensored)

Floppa-12B is a fine-tuned multimodal model based on Google's Gemma 3 12B Instruct.

It has been specialized for high-fidelity uncensored translation (Japanese <-> English) and unrestricted image description. Unlike the base model, Floppa does not refuse to describe explicit, violent, or "spicy" imagery, and it renders slang, profanity, and cultural nuance faithfully, with zero sanitization.

Model Description

  • Base Model: google/gemma-3-12b-it
  • Architecture: 12B Dense Transformer with SigLIP Vision Encoder (Hybrid Attention).
  • Training Method: 4-bit QLoRA with Unsloth, merged to bfloat16 (see the sketch after this list).
  • Objective: Behavior modification (Removal of refusal guardrails + Preservation of Vision).
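
The full recipe is not published, so the following is only a minimal sketch of what a 4-bit QLoRA run with Unsloth followed by a bfloat16 merge typically looks like. The base checkpoint name, LoRA rank, and output path are illustrative assumptions, not the actual configuration.

# Illustrative sketch only -- hyperparameters and paths are assumptions,
# not the actual Floppa training configuration.
from unsloth import FastVisionModel

# Load Gemma 3 12B in 4-bit for QLoRA fine-tuning
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/gemma-3-12b-it",  # assumed base checkpoint name
    load_in_4bit=True,
)

# Attach LoRA adapters to both the language and vision stacks,
# keeping vision trainable so it stays anchored during the behavior edit
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    r=16,           # assumed LoRA rank
    lora_alpha=16,
)

# ... supervised fine-tuning over the "Floppa Mix" goes here ...

# Merge the adapters into a single bfloat16 checkpoint for release
model.save_pretrained_merged(
    "Floppa-12B-Gemma3-Uncensored",
    tokenizer,
    save_method="merged_16bit",
)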

Capabilities

  1. "Spicy" Vision: Capable of describing NSFW, violent, or explicit anime illustrations without triggering safety refusals.
  2. Uncensored Translation: Translates dialogue exactly as written. It will use slurs, profanity, and aggressive language if the source text demands it.
  3. Multimodal Context: Can translate text inside an image or describe a scene to supply translation context (prompt patterns for both modes are sketched after this list).
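
For concreteness, both translation modes can be driven with ordinary chat messages. The wording and the Japanese sample line below are purely illustrative assumptions, not a required template.

# Hypothetical prompt patterns for the two translation modes.
# Any clear instruction works; nothing here is a fixed format.

# 1. Plain-text translation (JA -> EN), asking for a faithful register:
translation_messages = [
    {"role": "user",
     "content": "Translate to English. Keep slang, profanity, and tone intact:\n"
                "「クソッタレ、さっさと消えろ!」"},
]

# 2. Image-grounded translation: the image carries the dialogue and the
#    text asks for transcription plus translation (multimodal content
#    entries follow the usual chat-template convention).
vision_messages = [
    {"role": "user",
     "content": [
         {"type": "image"},  # the page/panel image is attached here
         {"type": "text",
          "text": "Transcribe the Japanese text in this image, then translate it to English."},
     ]},
]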

Training Data (The "Floppa Mix")

The model was fine-tuned on a balanced dataset (~10.5k rows) designed to break refusals while maintaining intelligence:

  • 20% Toxic/Uncensored Text: Custom dataset of explicit dialogue and "harmful" instruction following.
  • 20% Translation Skill: Unbabel/TowerBlocks-v0.2 (High-quality multilingual pairs).
  • 40% General Reasoning: mlabonne/FineTome-100k (Logic and conversation).
  • 20% Vision Anchors: merve/vqav2-small plus a custom "spicy" anime dataset (SmilingWolf/camie-tagger-vs-wd-tagger-val), included to prevent catastrophic forgetting in the vision pathway. A sketch of the mix follows this list.
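
For reference, a mix with these proportions can be assembled with the Hugging Face datasets library; interleave_datasets samples from each source at the stated rates. Column harmonization is omitted for brevity, the toxic split is a hypothetical local file, and split names for the public sets may differ.

# Sketch of the ~10.5k-row "Floppa Mix" proportions.
from datasets import load_dataset, interleave_datasets

toxic       = load_dataset("json", data_files="floppa_toxic.jsonl", split="train")  # hypothetical local file
translation = load_dataset("Unbabel/TowerBlocks-v0.2", split="train")
reasoning   = load_dataset("mlabonne/FineTome-100k", split="train")
vision      = load_dataset("merve/vqav2-small", split="validation")  # split name may differ

# In practice every source must first be mapped to a shared column schema
# before the four splits can be interleaved.
mix = interleave_datasets(
    [toxic, translation, reasoning, vision],
    probabilities=[0.2, 0.2, 0.4, 0.2],  # matches the stated ratios
    seed=42,
    stopping_strategy="all_exhausted",
)
mix = mix.shuffle(seed=42).select(range(10_500))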

Usage (vLLM)

This model is optimized for vLLM.

from vllm import LLM, SamplingParams
from PIL import Image

# Load the merged bfloat16 checkpoint
llm = LLM(
    model="YOUR_USERNAME/Floppa-12B-Gemma3-Uncensored",
    trust_remote_code=True,
    dtype="bfloat16"
)

# Prepare input. Gemma 3 uses its own chat format, and its image
# placeholder token is <start_of_image> (not a generic <image> tag).
image = Image.open("test_image.jpg").convert("RGB")
prompt = (
    "<start_of_turn>user\n"
    "<start_of_image>Describe this image in detail.<end_of_turn>\n"
    "<start_of_turn>model\n"
)

# Generate. temperature=1.0, top_p=0.95, top_k=64 follow Gemma 3's
# recommended sampling settings.
inputs = {"prompt": prompt, "multi_modal_data": {"image": image}}
params = SamplingParams(temperature=1.0, top_p=0.95, top_k=64, max_tokens=512)

outputs = llm.generate([inputs], sampling_params=params)
print(outputs[0].outputs[0].text)
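
The same engine handles text-only translation. Recent vLLM builds also expose llm.chat, which applies the model's chat template automatically; the instruction wording and sample line here are illustrative.

# Text-only translation through the same LLM instance.
# llm.chat (available in recent vLLM versions) formats the chat template for you.
messages = [
    {"role": "user",
     "content": "Translate to English, keeping slang and profanity intact:\n"
                "「黙れ、この野郎!」"},
]
chat_out = llm.chat(messages, sampling_params=params)
print(chat_out[0].outputs[0].text)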

License & Safety

This model is built upon Gemma 3 technology from Google. Use of this model is subject to the Gemma Terms of Use.

Disclaimer: This model produces uncensored content. It may generate output that is offensive, explicit, or factually incorrect. User discretion is advised. This model is intended for research, translation assistance, and creative writing workflows where content filtering is undesirable.
