# Model Card
A lightweight Qwen2.5-0.5B model fine-tuned using Unsloth + LoRA (PEFT) for efficient text-generation tasks. This model is optimized for low-VRAM systems, fast inference, and rapid experimentation.
## Model Details

### Model Description
This model is a parameter-efficient fine-tuned version of the base model:
- Base model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
- Fine-tuning method: LoRA (PEFT)
- Quantization: 4-bit (bnb-4bit; see the configuration sketch at the end of this section)
- Pipeline: text-generation
- Libraries: PEFT, Transformers, TRL, Unsloth
It is intended as a compact research model for text generation, instruction following, and as a baseline for custom SFT/RLHF projects.
- Developer: @Sriramdayal
- Repository: https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1
- License: inherits the Qwen2.5-0.5B base model license (Apache 2.0)
- Languages: English (primary), with multilingual capability inherited from Qwen2.5
- Finetuned from: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
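For reference, the quantization entry above corresponds to a bitsandbytes 4-bit setup along these lines (a sketch only; the quant type and compute dtype here are assumptions, and the pre-quantized Unsloth checkpoint already embeds its own settings):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative bnb 4-bit config. Only needed when quantizing a
# full-precision base yourself; the Unsloth "-bnb-4bit" checkpoint
# ships pre-quantized and loads its config automatically.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # assumed quant type
    bnb_4bit_compute_dtype=torch.bfloat16,   # assumed compute dtype
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B",  # full-precision base, for illustration
    quantization_config=bnb_config,
    device_map="auto",
)
```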
### Model Sources

- GitHub repo (training code): https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1
- Base model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
## Uses

### Direct Use
- Instruction-style text generation
- Chatbot prototyping
- Educational or research experiments
- Low-VRAM inference (4–6 GB GPU)
- Fine-tuning starter model for custom tasks
### Downstream Use

- Domain-specific SFT (see the adapter sketch after this list)
- Dataset distillation
- RLHF training
- Task-specific adapters (classifiers, generators, reasoning tasks)
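As a sketch of the first item above, a fresh LoRA adapter can be attached to the base model with plain PEFT (the target modules and ranks are illustrative assumptions, not necessarily the values behind this checkpoint):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit",
    device_map="auto",
)

# Illustrative adapter config; tune r/alpha/target_modules per task.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # adapter weights are a tiny fraction of 0.5B
```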
### Out-of-Scope / Avoid
- High-accuracy medical/legal decisions
- Safety-critical systems
- Long-context reasoning competitive with large LLMs
- Harmful or malicious use cases
## Bias, Risks & Limitations
This model inherits all biases from Qwen2.5 training data and may generate:
- Inaccurate or hallucinated information
- Social, demographic, or political biases
- Unsafe or harmful recommendations if misused
### Recommendations

Users must implement:

- Output filtering (a minimal sketch follows this list)
- Safety moderation
- Human verification for critical tasks
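A minimal, illustrative output-filtering wrapper to build on (a naive keyword blocklist; production systems should use a proper moderation model or API):

```python
BLOCKLIST = {"example_banned_term"}  # placeholder; populate per your policy

def filter_output(text: str) -> str:
    """Naive keyword filter; swap in a real moderation model for production."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        return "[response withheld by safety filter]"
    return text

# Wrap every generation before it reaches users.
safe_text = filter_output("model output goes here")
```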
## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"
adapter = "black279/Qwen_LeetCoder"

# Load the 4-bit base model and attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

# Generate a short completion.
inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
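For instruction-style prompts, the tokenizer's chat template (inherited from Qwen2.5) usually works better than a raw string. A minimal sketch, continuing from the snippet above:

```python
# Assumes the tokenizer ships a chat template (Qwen2.5 tokenizers do).
messages = [{"role": "user", "content": "Write a haiku about GPUs."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```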
## Training Details

### Training Data
The model was trained using custom datasets prepared through:
- Instruction datasets
- Synthetic Q&A
- Formatting for chat templates
(Placeholder: replace with the actual dataset description for a more accurate card.)
### Training Procedure

- Framework: Unsloth + TRL + PEFT
- Training type: Supervised Fine-Tuning (SFT)
- Precision: bnb-4bit quantization during training
- LoRA config (insert actual values if different): r=16, alpha=32, dropout=0.05
#### Hyperparameters

- Batch size: 2–8 (depending on VRAM)
- Gradient accumulation: 8–16
- Learning rate: 2e-4
- Epochs: 1–3
- Optimizer: AdamW / paged optimizers (Unsloth)
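A minimal end-to-end SFT sketch in the spirit of this setup (the dataset, sequence length, and prompt formatting are illustrative assumptions; TRL and Unsloth APIs change between releases, so check current docs):

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model through Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen2.5-0.5b-unsloth-bnb-4bit",
    max_seq_length=2048,  # assumed
    load_in_4bit=True,
)

# Attach LoRA adapters using the config listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
)

# Placeholder dataset; format rows into a single "text" column.
def format_example(ex):
    return {"text": f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['output']}"}

dataset = load_dataset("yahma/alpaca-cleaned", split="train").map(format_example)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=1,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()
model.save_pretrained("lora_adapter")  # LoRA weights only, a few MB
```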
#### Speeds & Compute

- Hardware: 1× RTX 4090 / A100 / local GPU
- Training time: approximately 1–3 hours
- Checkpoint size: tiny (LoRA weights only; rough estimate below)
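For intuition on the checkpoint size, a back-of-envelope estimate (layer count and hidden size are assumed Qwen2.5-0.5B shapes; the k/v projections are narrower under GQA, so this slightly overestimates):

```python
# LoRA adds r * (d_in + d_out) params per adapted linear layer.
r, hidden, layers, adapted_per_layer = 16, 896, 24, 4  # assumed shapes
params = layers * adapted_per_layer * r * (hidden + hidden)
print(f"~{params / 1e6:.1f}M LoRA params, ~{2 * params / 1e6:.0f} MB in fp16")
# -> "~2.8M LoRA params, ~6 MB in fp16"
```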
## Evaluation

(Placeholder: update after running eval benchmarks.)

- Evaluated informally on small reasoning and text-generation samples
- Performs well on short instructions
- Limited long-context and deep reasoning ability
## Environmental Impact
- Hardware: 1 GPU (consumer or cloud)
- Carbon estimate: Low (small model + LoRA)
## Technical Specs
- Architecture: Qwen2.5 0.5B
- Objective: Causal LM
- Adapters: LoRA (PEFT)
- Quantization: bnb 4-bit
## Citation

```bibtex
@misc{Sriramdayal2025QwenLoRA,
  title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune},
  author={Sriram Dayal},
  year={2025},
  howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}},
}
```
## Model Card Author
@Sriramdayal
## Framework versions
- PEFT 0.18.0