Tobias Völzing
wumingshi
AI & ML interests
None yet
Recent Activity
updated
a collection
5 days ago
Reasoning
updated
a collection
14 days ago
Agents
updated
a collection
22 days ago
REL
Organizations
None yet
Agents
- ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks
  Paper • 2508.15804 • Published • 15
- StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
  Paper • 2510.02209 • Published • 53
- Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
  Paper • 2511.16043 • Published • 105
FLLM
LLM
- LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery
  Paper • 2310.18356 • Published • 24
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
  Paper • 2401.01325 • Published • 27
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 50
Code Generation
- Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation
  Paper • 2310.18628 • Published • 8
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
  Paper • 2310.17680 • Published • 73
- Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models
  Paper • 2401.00788 • Published • 23
- mistralai/Codestral-22B-v0.1
  22B • Updated • 7.8k • 1.32k
Training
Fine-Tuning
- PockEngine: Sparse and Efficient Fine-tuning in a Pocket
  Paper • 2310.17752 • Published • 15
- Instruction-tuning Aligns LLMs to the Human Brain
  Paper • 2312.00575 • Published • 15
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
  Paper • 2401.01325 • Published • 27
- Secrets of RLHF in Large Language Models Part II: Reward Modeling
  Paper • 2401.06080 • Published • 28
3D
REL
- Controlled Decoding from Language Models
  Paper • 2310.17022 • Published • 15
- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 18
- Inpainting-Guided Policy Optimization for Diffusion Large Language Models
  Paper • 2509.10396 • Published • 15
- The Path Not Taken: RLVR Provably Learns Off the Principals
  Paper • 2511.08567 • Published • 32
Small
Reverse Engineering
Fundamental
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 21
- Extending Context Window of Large Language Models via Semantic Compression
  Paper • 2312.09571 • Published • 16
- PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation
  Paper • 2312.17276 • Published • 16
Hallucination
RAG
Reasoning
- On the Diagram of Thought
  Paper • 2409.10038 • Published • 13
- We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
  Paper • 2508.10433 • Published • 144
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
  Paper • 2511.06221 • Published • 128
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
  Paper • 2512.02556 • Published • 184