- Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models (arXiv:2511.08577)
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence (arXiv:2511.07384)
- DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation (arXiv:2511.06307)
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free (arXiv:2505.06708, published May 10, 2025)