ReviewScore: Misinformed Peer Review Detection with Large Language Models Paper • 2509.21679 • Published Sep 25
PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning Paper • 2402.12842 • Published Feb 20, 2024
Offline-GRPO Collection Collection of LLMs continually post-trained via offline GRPO to enhance mathematical reasoning capabilities. • 3 items • Updated Aug 7
PromptKD Collection Distilled model checkpoints produced with the PromptKD method (https://arxiv.org/abs/2402.12842). • 7 items • Updated Jun 19, 2024