Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective Paper • 2509.22613 • Published Sep 26 • 9
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always! Paper • 2509.26495 • Published Sep 30 • 10
Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention Paper • 2509.23610 • Published Sep 28 • 13
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications Paper • 2509.26490 • Published Sep 30 • 19
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners Paper • 2509.26226 • Published Sep 30 • 32
Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training Paper • 2509.25758 • Published Sep 30 • 22
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning Paper • 2509.23873 • Published Sep 28 • 67