view article Article Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face Feb 11, 2025 • 93
view article Article KV Caching Explained: Optimizing Transformer Inference Efficiency Jan 30, 2025 • 209
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12, 2025 • 480
view article Article From PyTorch DDP to Accelerate to Trainer, mastery of distributed training with ease Oct 21, 2022 • 42
view article Article Efficient LLM Pretraining: Packed Sequences and Masked Attention Oct 7, 2024 • 64
view article Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Feb 7, 2025 • 267