RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments Paper • 2511.07317 • Published 26 days ago • 13
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 5 days ago • 223
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 137
view article Article Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation Sep 16 • 15
An efficient probabilistic hardware architecture for diffusion-like models Paper • 2510.23972 • Published Oct 28 • 3
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning Paper • 2510.25992 • Published Oct 29 • 44
view article Article huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning +2 Oct 27 • 69
gpt-oss-safeguard Collection gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built-upon gpt-oss • 2 items • Updated Oct 29 • 58
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs Paper • 2402.12030 • Published Feb 19, 2024 • 3
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 247
Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine Paper • 2510.21614 • Published Oct 24 • 22