Eugene Oskin

eoskin

AI & ML interests

None yet

Recent Activity

liked a model 10 days ago

google/gemma-3n-E2B-it

upvoted a collection 10 days ago

Gemma 3n

liked a Space 11 days ago

Supertone/supertonic

Organizations

None yet

upvoted a collection 10 days ago

Gemma 3n

Collection

4 items • Updated Jul 10 • 248

upvoted 2 papers 26 days ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 151

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29 • 219

upvoted a paper 28 days ago

Gated Delta Networks: Improving Mamba2 with Delta Rule

Paper • 2412.06464 • Published Dec 9, 2024 • 14

upvoted a paper 29 days ago

TinyBERT: Distilling BERT for Natural Language Understanding

Paper • 1909.10351 • Published Sep 23, 2019 • 1

upvoted 2 papers about 1 month ago

The Prompt Report: A Systematic Survey of Prompting Techniques

Paper • 2406.06608 • Published Jun 6, 2024 • 68

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 493

upvoted an article about 2 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 735

upvoted 2 papers about 2 months ago

Hybrid Architectures for Language Models: Systematic Analysis and Design Insights

Paper • 2510.04800 • Published Oct 6 • 36

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

Paper • 2510.05034 • Published Oct 6 • 48

upvoted a collection about 2 months ago

BERT release

Collection

Regroups the original BERT models released by the Google team. Except for the models marked otherwise, the checkpoints support English. • 8 items • Updated Jul 10 • 35

upvoted 4 papers 4 months ago

OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use

Paper • 2508.04482 • Published Aug 6 • 9

Hidden Dynamics of Massive Activations in Transformer Training

Paper • 2508.03616 • Published Aug 5 • 18

InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization

Paper • 2508.05731 • Published Aug 7 • 25

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8 • 192

upvoted 2 articles 4 months ago

Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval

Article • Mar 22, 2024 • 103

You could have designed state of the art positional encoding

Article • Nov 25, 2024 • 404

upvoted a collection 4 months ago

ModernBERT

Collection

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 151

upvoted an article 4 months ago

Finally, a Replacement for BERT: Introducing ModernBERT

Article • Dec 19, 2024 • 709

upvoted a paper 6 months ago

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Paper • 2506.09513 • Published Jun 11 • 100
