zuijiang

AI & ML interests

None yet

Recent Activity

upvoted a paper 15 days ago

Coupled Variational Reinforcement Learning for Language Model General Reasoning

Paper • 2512.12576 • Published 20 days ago • 2

upvoted a paper 29 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 244

upvoted a paper 4 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 83

upvoted 2 papers 7 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263

GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks

Paper • 2504.12764 • Published Apr 17, 2025 • 41

upvoted a paper 9 months ago

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14, 2025 • 85

authored 4 papers 9 months ago

Scalable Oversight for Superhuman AI via Recursive Self-Critiquing

Paper • 2502.04675 • Published Feb 7, 2025 • 1

Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models

Paper • 2503.18034 • Published Mar 23, 2025 • 1

SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency

Paper • 2502.02458 • Published Feb 4, 2025 • 1

ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers

Paper • 2504.00502 • Published Apr 1, 2025 • 26

liked a Space 10 months ago

The Ultra-Scale Playbook

🌌

3.62k

The ultimate guide to training LLM on large GPU Clusters

upvoted a paper 11 months ago

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

Paper • 2502.01142 • Published Feb 3, 2025 • 24

upvoted a paper 12 months ago

Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models

Paper • 2501.01830 • Published Jan 3, 2025 • 17

commented a paper 12 months ago

Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models

Paper • 2501.01830 • Published Jan 3, 2025 • 17

authored a paper about 1 year ago

Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

Paper • 2411.11504 • Published Nov 18, 2024 • 24

upvoted a paper about 1 year ago

Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

Paper • 2411.11504 • Published Nov 18, 2024 • 24

updated 2 datasets over 1 year ago

zuijiang/alpaca-alpaca-clean

Viewer • Updated Aug 26, 2024 • 51.8k • 14

zuijiang/mistral-alpaca-clean

Viewer • Updated Aug 25, 2024 • 51.8k • 46

liked a dataset over 1 year ago

AIcell/MOSSBench

Updated Mar 4, 2025 • 1.1k • 5

liked a Space over 1 year ago

Voice Clone

🗣

2.56k

Clone a voice to say custom text