Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 2 days ago • 36
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 5 days ago • 223
MobileLLM-R1 Collection MobileLLM-R1, a series of sub-billion parameter reasoning models • 10 items • Updated 14 days ago • 27
Article huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning +2 Oct 27 • 69
Jan-v2-VL Collection Jan-v2-VL: an 8B VLM focused on reliable, many-step task execution. • 6 items • Updated 23 days ago • 36
Article Optimizing Mixture-of-Experts Training: A Cost-Effective, Two-Sided Approach Sep 30 • 3