# Model Card
A lightweight Qwen2.5-0.5B model fine-tuned using Unsloth + LoRA (PEFT) for efficient text-generation tasks. This model is optimized for low-VRAM systems, fast inference, and rapid experimentation.
## Model Details

### Model Description
This model is a parameter-efficient fine-tuned version of the base model:
- Base model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
- Fine-tuning method: LoRA (PEFT)
- Quantization: 4-bit (bnb-4bit; see the configuration sketch at the end of this section)
- Pipeline: text-generation
- Libraries: PEFT, Transformers, TRL, Unsloth
It is intended as a compact research model for text generation, instruction following, and as a baseline for custom SFT/RLHF projects.
- Developer: @Sriramdayal
- Repository: https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1
- License: inherits the Qwen2.5-0.5B base model license (Apache 2.0)
- Languages: English (primary), with multilingual capability inherited from Qwen2.5
- Finetuned from: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
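For reference, the quantization entry above corresponds to a bitsandbytes 4-bit setup along these lines (a sketch only; the quant type and compute dtype here are assumptions, and the pre-quantized Unsloth checkpoint already embeds its own settings):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative bnb 4-bit config. Only needed when quantizing a
# full-precision base yourself; the Unsloth "-bnb-4bit" checkpoint
# ships pre-quantized and loads its config automatically.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # assumed quant type
    bnb_4bit_compute_dtype=torch.bfloat16,   # assumed compute dtype
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B",  # full-precision base, for illustration
    quantization_config=bnb_config,
    device_map="auto",
)
```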
### Model Sources

- GitHub repo (training code): https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1
- Base model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
## Uses

### Direct Use
- Instruction-style text generation
- Chatbot prototyping
- Educational or research experiments
- Low-VRAM inference (4–6 GB GPU)
- Fine-tuning starter model for custom tasks
### Downstream Use

- Domain-specific SFT (see the adapter sketch after this list)
- Dataset distillation
- RLHF training
- Task-specific adapters (classifiers, generators, reasoning tasks)
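As a sketch of the first item above, a fresh LoRA adapter can be attached to the base model with plain PEFT (the target modules and ranks are illustrative assumptions, not necessarily the values behind this checkpoint):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit",
    device_map="auto",
)

# Illustrative adapter config; tune r/alpha/target_modules per task.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # adapter weights are a tiny fraction of 0.5B
```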
### Out-of-Scope / Avoid
- High-accuracy medical/legal decisions
- Safety-critical systems
- Long-context reasoning competitive with large LLMs
- Harmful or malicious use cases
## Bias, Risks & Limitations
This model inherits all biases from Qwen2.5 training data and may generate:
- Inaccurate or hallucinated information
- Social, demographic, or political biases
- Unsafe or harmful recommendations if misused
### Recommendations

Users must implement:

- Output filtering (a minimal sketch follows this list)
- Safety moderation
- Human verification for critical tasks
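A minimal, illustrative output-filtering wrapper to build on (a naive keyword blocklist; production systems should use a proper moderation model or API):

```python
BLOCKLIST = {"example_banned_term"}  # placeholder; populate per your policy

def filter_output(text: str) -> str:
    """Naive keyword filter; swap in a real moderation model for production."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        return "[response withheld by safety filter]"
    return text

# Wrap every generation before it reaches users.
safe_text = filter_output("model output goes here")
```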
## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"
adapter = "black279/Qwen_LeetCoder"

# Load the 4-bit base model and attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

# Generate a short completion.
inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
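For instruction-style prompts, the tokenizer's chat template (inherited from Qwen2.5) usually works better than a raw string. A minimal sketch, continuing from the snippet above:

```python
# Assumes the tokenizer ships a chat template (Qwen2.5 tokenizers do).
messages = [{"role": "user", "content": "Write a haiku about GPUs."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```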
## Training Details

### Training Data
The model was trained using custom datasets prepared through:
- Instruction datasets
- Synthetic Q&A
- Formatting for chat templates
(Placeholder: replace with the actual dataset description for a more accurate card.)
### Training Procedure

- Framework: Unsloth + TRL + PEFT
- Training type: Supervised Fine-Tuning (SFT)
- Precision: bnb-4bit quantization during training
- LoRA config (insert actual values if different): r=16, alpha=32, dropout=0.05
#### Hyperparameters

- Batch size: 2–8 (depending on VRAM)
- Gradient accumulation: 8–16
- Learning rate: 2e-4
- Epochs: 1–3
- Optimizer: AdamW / paged optimizers (Unsloth)
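A minimal end-to-end SFT sketch in the spirit of this setup (the dataset, sequence length, and prompt formatting are illustrative assumptions; TRL and Unsloth APIs change between releases, so check current docs):

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model through Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen2.5-0.5b-unsloth-bnb-4bit",
    max_seq_length=2048,  # assumed
    load_in_4bit=True,
)

# Attach LoRA adapters using the config listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
)

# Placeholder dataset; format rows into a single "text" column.
def format_example(ex):
    return {"text": f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['output']}"}

dataset = load_dataset("yahma/alpaca-cleaned", split="train").map(format_example)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=1,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()
model.save_pretrained("lora_adapter")  # LoRA weights only, a few MB
```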
#### Speeds & Compute

- Hardware: 1× RTX 4090 / A100 / local GPU
- Training time: approximately 1–3 hours
- Checkpoint size: tiny (LoRA weights only; rough estimate below)
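For intuition on the checkpoint size, a back-of-envelope estimate (layer count and hidden size are assumed Qwen2.5-0.5B shapes; the k/v projections are narrower under GQA, so this slightly overestimates):

```python
# LoRA adds r * (d_in + d_out) params per adapted linear layer.
r, hidden, layers, adapted_per_layer = 16, 896, 24, 4  # assumed shapes
params = layers * adapted_per_layer * r * (hidden + hidden)
print(f"~{params / 1e6:.1f}M LoRA params, ~{2 * params / 1e6:.0f} MB in fp16")
# -> "~2.8M LoRA params, ~6 MB in fp16"
```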
## Evaluation

(Placeholder: update after running eval benchmarks.)

- Evaluated informally on small reasoning and text-generation samples
- Performs well on short instructions
- Limited long-context and deep reasoning ability
## Environmental Impact
- Hardware: 1 GPU (consumer or cloud)
- Carbon estimate: Low (small model + LoRA)
## Technical Specs
- Architecture: Qwen2.5 0.5B
- Objective: Causal LM
- Adapters: LoRA (PEFT)
- Quantization: bnb 4-bit
## Citation

```bibtex
@misc{Sriramdayal2025QwenLoRA,
  title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune},
  author={Sriram Dayal},
  year={2025},
  howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}},
}
```
## Model Card Author
@Sriramdayal
## Framework versions
- PEFT 0.18.0