SDPO: Segment-Level Direct Preference Optimization for Social Agents • Paper • 2501.01821 • Published Jan 3 • 20
Psychology_LLM_GGUF • Collection • GGUF-format psychological counseling LLMs, convenient for PC use • 2 items • Updated Mar 17 • 1
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning • Paper • 2502.19634 • Published Feb 26 • 63
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws • Paper • 2404.05405 • Published Apr 8, 2024 • 10
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction • Paper • 2309.14316 • Published Sep 25, 2023 • 8
Physics of Language Models: Part 3.2, Knowledge Manipulation • Paper • 2309.14402 • Published Sep 25, 2023 • 7
Physics of Language Models: Part 1, Context-Free Grammar • Paper • 2305.13673 • Published May 23, 2023 • 7