AI & ML interests

The Hugging Face Inference Endpoints Images repository allows AI builders to collaborate and engage in creating awesome inference deployments.

Recent Activity

AdinaY posted an update about 1 month ago
Kimi K2 Thinking is now live on the hub 🔥

moonshotai/Kimi-K2-Thinking

✨ 1T MoE for deep reasoning & tool use
✨ Native INT4 quantization = 2× faster inference
✨ 256K context window
✨ Modified MIT license
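
As a minimal sketch (not part of the post), this is how one might pull the released checkpoint from the Hub with huggingface_hub; the repo id comes from the post, the local directory is an illustrative assumption:

```python
from huggingface_hub import snapshot_download

# Download the Kimi-K2-Thinking repository from the Hub.
# The INT4 1T-parameter checkpoint is very large, so make sure disk space allows it.
local_path = snapshot_download(
    repo_id="moonshotai/Kimi-K2-Thinking",
    local_dir="./kimi-k2-thinking",  # hypothetical target directory
)
print("downloaded to", local_path)
```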
AdinaY posted an update about 1 month ago
Chinese open-source AI in October wasn’t about bigger models; it was about real-world impact 🔥

https://huggingface.co/collections/zh-ai-community/october-2025-china-open-source-highlights

✨ Vision-Language & OCR wave 🌊
- DeepSeek-OCR: 3B
- PaddleOCR-VL: 0.9B
- Qwen3-VL: 2B / 4B / 8B / 32B / 30B-A3B
- Open-Bee: Bee-8B-RL
- Z.ai Glyph: 10B

OCR is industrializing; the real game now is understanding the (long-context) document, not just reading it.

✨ Text generation: scale or innovation?
- MiniMax-M2: 229B
- Antgroup Ling-1T & Ring-1T
- Moonshot Kimi-Linear: linear-attention challenger
- Kwaipilot KAT-Dev

Efficiency is the key.

✨ Any-to-Any & World-Model: a step closer to the real world
- BAAI Emu 3.5
- Antgroup Ming-flash-omni
- HunyuanWorld-Mirror: 3D

Aligning with the global push toward “world models”

✨ Audio & Speech + Video & Visual: releases spanning entertainment labs to delivery platforms
- SoulX-Podcast TTS
- LongCat-Audio-Codec & LongCat-Video by Meituan, the delivery platform
- xiabs DreamOmni 2

Looking forward to what's next 🚀
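
As a minimal sketch (not from the post), the collection linked above can be enumerated programmatically with huggingface_hub; the slug is taken from the URL and may need the full form shown on the collection page, since collection slugs usually carry a short hash suffix:

```python
from huggingface_hub import get_collection

# Fetch the October 2025 highlights collection and list its items.
# If this shortened slug does not resolve, copy the exact slug (with its
# hash suffix) from the collection page.
collection = get_collection("zh-ai-community/october-2025-china-open-source-highlights")

for item in collection.items:
    # item_type is e.g. "model", "dataset" or "paper"; item_id is the repo or paper id.
    print(item.item_type, item.item_id)
```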
abidlabs posted an update about 1 month ago
Why I think local, open-source models will eventually win.

The most useful AI applications are moving toward multi-turn agentic behavior: systems that take hundreds or even thousands of iterative steps to complete a task, e.g. Claude Code, computer-control agents that click, type, and test repeatedly.

In these cases, the power of the model lies not in how smart it is per token, but in how quickly it can interact with its environment and tools across many steps. In that regime, model quality becomes secondary to latency.

An open-source model that can call tools quickly, check that the right thing was clicked, or verify that a code change actually passes tests can easily outperform a slightly “smarter” closed model that has to make remote API calls for every move.

Eventually, the balance tips: it becomes impractical for an agent to rely on remote inference for every micro-action. Just as no one would tolerate a keyboard that required a network request per keystroke, users won’t accept agent workflows bottlenecked by latency. All devices will ship with local, open-source models that are “good enough” and the expectation will shift toward everything running locally. It’ll happen sooner than most people think.
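
To make the latency point concrete, here is a back-of-the-envelope sketch; all numbers are illustrative assumptions, not measurements from the post:

```python
# Assume an agent takes 1,000 tool-calling steps to finish a task.
# Assume ~500 ms of network/queueing overhead per remote API call
# versus ~50 ms of overhead per local inference step.
steps = 1_000
remote_overhead_s = 0.5
local_overhead_s = 0.05

remote_total_min = steps * remote_overhead_s / 60  # ~8.3 minutes spent waiting
local_total_min = steps * local_overhead_s / 60    # ~0.8 minutes

print(f"remote: {remote_total_min:.1f} min, local: {local_total_min:.1f} min")
```

Under these assumed numbers, the remote workflow spends roughly ten times longer just waiting on round trips, which is the regime the post describes.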
AdinaY posted an update about 1 month ago
Ming-flash-omni Preview 🚀 Multimodal foundation model from AntGroup

inclusionAI/Ming-flash-omni-Preview

✨ Built on Ling-Flash-2.0: 10B total/6B active
✨ Generative segmentation-as-editing
✨ SOTA contextual & dialect ASR
✨ High-fidelity image generation
AdinaY posted an update about 1 month ago

Glyph 🔥 a framework that scales context length by compressing text into images and processing them with vision–language models, released by Z.ai.

Paper: https://huggingface.co/papers/2510.17800
Model: https://huggingface.co/zai-org/Glyph

✨ Compresses long sequences visually to bypass token limits
✨ Reduces computational and memory costs
✨ Preserves meaning through multimodal encoding
✨ Built on GLM-4.1V-9B-Base
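
As a minimal sketch of the underlying idea (not Glyph's actual preprocessing; the model card defines the real rendering and prompt format), one might render a long passage into an image for a vision-language model to read, trading many text tokens for far fewer vision tokens:

```python
import textwrap
from PIL import Image, ImageDraw, ImageFont

def render_text_to_image(text: str, width: int = 1024, line_chars: int = 120) -> Image.Image:
    """Render a long passage as a single white-background image (illustrative only)."""
    font = ImageFont.load_default()
    lines = textwrap.wrap(text, width=line_chars)
    line_height = 14  # rough height for the default bitmap font
    img = Image.new("RGB", (width, line_height * (len(lines) + 2)), "white")
    draw = ImageDraw.Draw(img)
    for i, line in enumerate(lines):
        draw.text((10, 10 + i * line_height), line, fill="black", font=font)
    return img

# A VLM such as the Glyph checkpoint would then consume this image (plus a question)
# instead of the raw tokenized text; the exact prompting follows the model card.
page = render_text_to_image("some very long document " * 200)
page.save("page_0.png")
```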
AdinaY posted an update about 2 months ago
HunyuanWorld-Mirror 🔥 a versatile feed-forward model for universal 3D world reconstruction by Tencent

tencent/HunyuanWorld-Mirror

✨ Any prior in → 3D world out
✨ Mix camera poses, intrinsics & depth as priors
✨ Predict point clouds, normals, Gaussians & more in one pass
✨ Unified architecture for all 3D tasks
AdinaY posted an update about 2 months ago
PaddleOCR-VL 🔥 0.9B multilingual VLM by Baidu

PaddlePaddle/PaddleOCR-VL

✨ Ultra-efficient NaViT + ERNIE-4.5 architecture
✨ Supports 109 languages 🤯
✨ Accurately recognizes text, tables, formulas & charts
✨ Fast inference and lightweight for deployment
AdinaY posted an update about 2 months ago
Bee-8B 🐝 an open 8B multimodal LLM built on high-quality data, released by TencentHunyuan

Paper: Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs (2510.13795)
Model: https://huggingface.co/collections/Open-Bee/bee-8b-68ecbf10417810d90fbd9995

✨ Trained on Honey-Data-15M, a 15M-sample SFT corpus with dual-level CoT reasoning
✨ Backed by HoneyPipe, a transparent & reproducible open data curation suite
AdinaY posted an update about 2 months ago
Ring-1T 🔥 the trillion-parameter thinking model released by Ant Group, the company behind Alipay

inclusionAI/Ring-1T

✨ 1T params (50B active), MIT license
✨ 128K context (YaRN)
✨ RLVR, Icepop, and ASystem make trillion-scale RL stable
AdinaY posted an update about 2 months ago
KAT-Dev-72B-Exp 🔥 Kuaishou's (the company behind Kling AI) new open model for software engineering

Kwaipilot/KAT-Dev-72B-Exp

✨ 72B, Apache 2.0
✨ Redesigned attention kernel & training engine for efficient context-aware RL
✨ 74.6% accuracy on SWE-Bench Verified
AdinaY posted an update 2 months ago
At the close of the National Holiday 🇨🇳, Ant Group drops a new SoTA model.

Ling-1T 🔥 the trillion-parameter flagship of the Ling 2.0 series.

inclusionAI/Ling-1T

✨ 1T total / 50B active params per token
✨ 20T+ reasoning-dense tokens (Evo-CoT)
✨ 128K context via YaRN
✨ FP8 training: 15%+ faster, same precision as BF16
✨ Hybrid Syntax-Function-Aesthetics reward for front-end & visual generation
evijit posted an update 2 months ago
AI for Scientific Discovery Won't Work Without Fixing How We Collaborate.

My co-author @cgeorgiaw and I just published a paper challenging a core assumption: that the main barriers to AI in science are technical. They're not. They're social.

Key findings:

🚨 The "AI Scientist" myth delays progress: Waiting for AGI devalues human expertise and obscures science's real purpose: cultivating understanding, not just outputs.
📊 Wrong incentives: Datasets have 100x longer impact than models, yet data curation is undervalued.
⚠️ Broken collaboration: Domain scientists want understanding. ML researchers optimize performance. Without shared language, projects fail.
🔍 Fragmentation costs years: Harmonizing just 9 cancer files took 329 hours.

Why this matters: Upstream bottlenecks like efficient PDE solvers could accelerate discovery across multiple sciences. CASP mobilized a community around protein structure, enabling AlphaFold. We need this for dozens of challenges.

Thus, we're launching Hugging Science! A global community addressing these barriers through collaborative challenges, open toolkits, education, and community-owned infrastructure. Please find all the links below!

Paper: AI for Scientific Discovery is a Social Problem (2509.06580)
Join: hugging-science
Discord: https://discord.com/invite/VYkdEVjJ5J
AdinaY posted an update 2 months ago