Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 4 days ago • 48
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published Jan 29 • 59
Kandinsky 5.0 Video Lite Collection Kandinsky 5.0 Video Lite is a lightweight 2B model that generates up to 10-second SD videos from English and Russian prompts with high visual quality. • 9 items • Updated 13 days ago • 7
Kandinsky 5.0 Video Lite Diffusers Collection Kandinsky 5.0 Video Lite is a lightweight 2B model that generates up to 10-second SD videos from English and Russian prompts with high visual quality. • 8 items • Updated 13 days ago • 4
Kandinsky 5.0 Video Pro Diffusers Collection Kandinsky 5.0 Video Pro is a 19B model that generates high-quality HD videos from English and Russian prompts with controllable camera motion. • 2 items • Updated 13 days ago • 6
Kandinsky 5.0 Video Pro Collection Kandinsky 5.0 Video Pro is a 19B model that generates high-quality HD videos from English and Russian prompts with controllable camera motion. • 5 items • Updated 13 days ago • 15
Kandinsky 5.0 Image Lite Collection Kandinsky 5.0 Image Lite is a 6B DiT-based model that generates and edits HD images from English and Russian text prompts with high visual quality. • 4 items • Updated 13 days ago • 13
MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism Paper • 2511.11373 • Published 23 days ago • 12
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published Oct 29 • 76
Learning to Reason as Action Abstractions with Scalable Mid-Training RL Paper • 2509.25810 • Published Sep 30 • 5