Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL +4 Jun 3 • 96
InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28 • 103
Article How to train a new language model from scratch using Transformers and Tokenizers Feb 14, 2020 • 56
Article Introducing HELMET: Holistically Evaluating Long-context Language Models +5 Apr 16 • 40
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control +2 Feb 4 • 185
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents Paper • 2505.20411 • Published May 26 • 91
AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference Paper • 2504.10326 • Published Apr 14 • 25
Running 29 Llama-4-Maverick-03-26-Experimental Battles 🔥 29 Display and filter chat conversations between models