Hey @Tonic, I'm absolutely not affiliated with the Liquid AI team, but I'm happy to chat anytime (you can PM me on X maybe)!
Julien BLANCHON PRO
blanchon
AI & ML interests
None yet
Recent Activity
upvoted a collection
about 21 hours ago
rnj-1
liked a dataset
about 22 hours ago
malaysia-ai/Multilingual-TTS
liked a Space
about 24 hours ago
anycoderapps/LongCat-Image-Edit
Organizations
reacted to vikhyatk's post with 🔥
18 days ago
Post
3528
Announcing RefCOCO-M, a refreshed RefCOCO with pixel-accurate masks and the problematic prompts removed.
moondream/refcoco-m
moondream/refcoco-m
reacted to Kseniase's post with 🔥
about 1 month ago
Post
11127
11 Fascinating new Policy Optimization techniques
Policy optimization (PO) algorithms are central to training AI models with preference-based feedback. In recent weeks, numerous new PO methods have emerged that build on or replace the popular PPO and GRPO, solving their issues. Here are 11 of them:
1. BAlanced Policy Optimization (BAPO) → BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping (2510.18927)
Dynamically adjusts the clipping bounds in PPO-style updates to balance positive and negative gradients and prevent entropy collapse (see the sketch after this list)
2. Training-Free GRPO → Training-Free Group Relative Policy Optimization (2510.08191)
Instead of using numeric rewards, it compares rollouts semantically to distill useful knowledge as a token prior, which is then applied during inference to guide the model’s behavior
3. Asymmetric Importance Sampling Policy Optimization (ASPO) → ASPO: Asymmetric Importance Sampling Policy Optimization (2510.06062)
Fixes imbalanced token weighting in LLM training. It flips the importance sampling ratios for positive tokens to correct over- and under-updates, and adds a soft dual-clipping step to keep gradients stable
4. In-Context Steered Policy Optimization (ICPO) → https://arxiv.org/abs/2510.26519
Uses a model’s own in-context learning ability to guide training with existing data. It combines Mixed-Policy GRPO with Implicit Expert Forcing to expand exploration and adds Expert Region Reject Sampling and Annealed Expert-Bonus Reward Shaping to ensure stability and balanced expert influence
5. Graph-Enhanced Policy Optimization (GEPO) → https://arxiv.org/abs/2510.26270
Builds a graph of an agent’s experiences to understand how different states connect, guide exploration and assign rewards more effectively
6. Information Gain-based Policy Optimization (IGPO) → Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents (2510.14967)
Uses the model’s own belief updates to create dense, informative feedback for smoother multi-turn learning
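To make items 1 and 3 above a bit more concrete, here is a minimal PyTorch sketch of the PPO-style clipped surrogate that these methods modify. It only illustrates where the importance-sampling ratio and the clipping bounds sit in the loss; the fixed clip_low/clip_high values, the function name, and the toy tensors are assumptions for illustration, not the actual BAPO or ASPO objectives.

```python
import torch

def clipped_surrogate(logp_new, logp_old, advantages, clip_low=0.8, clip_high=1.2):
    """PPO-style clipped surrogate loss with (possibly asymmetric) clip bounds.

    Methods like BAPO adapt clip_low/clip_high during training to keep positive
    and negative gradient contributions balanced; here they are fixed constants
    purely for illustration.
    """
    ratio = torch.exp(logp_new - logp_old)               # importance-sampling ratio
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, clip_low, clip_high) * advantages
    # Pessimistic per-token objective, as in PPO; negated to form a loss.
    return -torch.min(unclipped, clipped).mean()

# Toy usage with hypothetical per-token log-probs and advantages.
logp_old = torch.randn(8)
logp_new = (logp_old + 0.1 * torch.randn(8)).requires_grad_()
advantages = torch.randn(8)
loss = clipped_surrogate(logp_new, logp_old, advantages)
loss.backward()
print(loss.item())
```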
Read further below ⬇️
If you like this, also subscribe to the Turing post: https://www.turingpost.com/subscribe
Amazing! Any Spaces to try this out quickly?
reacted to piercus's post with 🔥👍
about 1 month ago
Post
3895
Starts erasing! 🎉 🎉 🎉
This is made with a one-step SD1.5 LBM [1] eraser!
Data is open. Data pipeline is open. Training code is open.
On our LBM fork: https://github.com/finegrain-ai/LBM
[1] LBM: Latent Bridge Matching for Fast Image-to-Image Translation (2503.07535)
replied to mrfakename's post
about 1 month ago
LAION data is all you need xd
reacted to mrfakename's post with 🔥
about 1 month ago
Post
5971
Trained a model for emotion-controllable TTS based on MiMo-Audio on LAION's dataset.
Still very early in the training run and it does have an issue with hallucinating, but results seem pretty good so far.
Will probably kick off a new run later with some settings tweaked.
Put up a demo here: https://huggingface.co/spaces/mrfakename/EmoAct-MiMo
(Turn 🔊 on to hear audio samples)
reacted to AdinaY's post with 🔥
about 2 months ago
Post
4420
At the close of the National Holiday 🇨🇳, Ant Group drops a new SoTA model.
Ling-1T 🔥 the trillion-parameter flagship of the Ling 2.0 series.
inclusionAI/Ling-1T
✨1T total / 50B active params per token
✨20T+ reasoning-dense tokens (Evo-CoT)
✨128K context via YaRN
✨FP8 training: 15%+ faster, same precision as BF16
✨Hybrid Syntax-Function-Aesthetics reward for front-end & visual generation
reacted to AdinaY's post with 🔥
4 months ago
Post
1710
Qwen is on fire this week 🔥
They just released Qwen3-MT 🌍, a translation model that supports 92 languages.
Demo is available on the hub.
Qwen/Qwen3-MT-Demo
✨ Highly Customizable: Supports custom terms, domain prompts, and translation memory for accurate, context-aware results.
✨ Fast and affordable: $0.5 per million tokens.
reacted to AdinaY's post with 🚀
5 months ago
Post
2015
The Chinese Open Source Heatmap is live 🔥
You can now track the companies/ research labs/ communities powering China’s open source AI movement.
zh-ai-community/model-release-heatmap-zh
Some highlights:
✨Giant tech companies are investing more in open source.
-Alibaba: Full-stack open ecosystem
-Tencent: Hunyuan image/video/3D
-ByteDance: Catching up fast in 2025
-Baidu: New player in open LLMs
✨New players emerging post–DeepSeek moment.
-Xiaomi
-Red Note
-Bilibili
-MiniMax
-Moonshot AI
✨Startup list is shifting fast! Those who find a direction aligned with their strengths are the ones who endure.
-DeepSeek
-MiniMax
-StepFun
-Moonshot AI
-Zhipu AI
-OpenBMB
✨Research labs & communities are making key contributions.
-BAAI
-Shanghai AI Lab
-OpenMOSS
-MAP
reacted to ArturoNereu's post with 🔥❤️
7 months ago
Post
4463
I’ve been learning AI for several years (coming from the games industry), and along the way, I curated a list of the tools, courses, books, papers, and models that actually helped me understand things.
I turned this into a GitHub repo:
https://github.com/ArturoNereu/AI-Study-Group
If you’re just getting started, I recommend:
📘 Deep Learning – A Visual Approach: https://www.glassner.com/portfolio/deep-learning-a-visual-approach
🎥 Dive into LLMs with Andrej Karpathy: https://youtu.be/7xTGNNLPyMI?si=aUTq_qUzyUx36BsT
🧠 The 🤗 Agents course: https://huggingface.co/learn/agents-course/
The repo has grown with help from the community (Reddit, Discord, etc.) and I’ll keep updating it.
If you have any favorite resources, I’d love to include them.
replied to dylanebert's post
10 months ago
I really like the style of your 1-minute video. I still remember the one you did for 3DGS a long time ago.
reacted to dylanebert's post with 🔥
10 months ago
Post
3394
I made a 1-minute video explaining the DeepSeek situation
R1: deepseek-ai/DeepSeek-R1
Janus Pro: deepseek-ai/Janus-Pro-7B
reacted to hexgrad's post with 🔥
11 months ago
Post
21685
📣 Looking for labeled, high-quality synthetic audio/TTS data 📣 Have you been or are you currently calling API endpoints from OpenAI, ElevenLabs, etc? Do you have labeled audio data sitting around gathering dust? Let's talk! Join https://discord.gg/QuGxSWBfQy or comment down below.
If your data exceeds quantity & quality thresholds and is approved into the next hexgrad/Kokoro-82M training mix, and you permissively DM me the data under an effective Apache license, then I will DM back the corresponding voicepacks for YOUR data if/when the next Apache-licensed Kokoro base model drops.
What does this mean? If you've been calling closed-source TTS or audio API endpoints to:
- Build voice agents
- Make long-form audio, like audiobooks or podcasts
- Handle customer support, etc
Then YOU can contribute to the training mix and get useful artifacts in return. ❤️
More details at hexgrad/Kokoro-82M#21
reacted to Xenova's post with 🔥❤️
12 months ago
Post
4710
Introducing Moonshine Web: real-time speech recognition running 100% locally in your browser!
🚀 Faster and more accurate than Whisper
🔒 Privacy-focused (no data leaves your device)
⚡️ WebGPU accelerated (w/ WASM fallback)
🔥 Powered by ONNX Runtime Web and Transformers.js
Demo: webml-community/moonshine-web
Source code: https://github.com/huggingface/transformers.js-examples/tree/main/moonshine-web
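If you'd rather poke at the model outside the browser, here is a tiny sketch. It assumes the Moonshine checkpoints are also loadable through the Python transformers ASR pipeline; the UsefulSensors/moonshine-tiny model id is an assumption not taken from the post, and the demo above instead runs entirely in the browser via Transformers.js with WebGPU.

```python
from transformers import pipeline

# Assumed checkpoint id; this just sketches running the same model from Python
# rather than in the browser with Transformers.js.
asr = pipeline(
    "automatic-speech-recognition",
    model="UsefulSensors/moonshine-tiny",
)

result = asr("sample.wav")  # path to any local audio clip
print(result["text"])
```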
reacted to toshas's post with 😎🔥
12 months ago
Post
1430
Introducing ⇆ Marigold-DC — our training-free zero-shot approach to monocular Depth Completion with guided diffusion! If you have ever wondered how else a long denoising diffusion schedule can be useful, we have an answer for you!
Depth Completion addresses sparse, incomplete, or noisy measurements from photogrammetry or sensors like LiDAR. Sparse points aren’t just hard for humans to interpret — they also hinder downstream tasks.
Traditionally, depth completion was framed as image-guided depth interpolation. We leverage Marigold, a diffusion-based monodepth model, to reframe it as sparse-depth-guided depth generation. How the turntables! Check out the paper anyway 👇
🌎 Website: https://marigolddepthcompletion.github.io/
🤗 Demo: prs-eth/marigold-dc
📕 Paper: https://arxiv.org/abs/2412.13389
👾 Code: https://github.com/prs-eth/marigold-dc
Team ETH Zürich: Massimiliano Viola ( @mviola ), Kevin Qu ( @KevinQu7 ), Nando Metzger ( @nandometzger ), Bingxin Ke ( @Bingxin ), Alexander Becker, Konrad Schindler, and Anton Obukhov ( @toshas ). We thank Hugging Face for their continuous support.
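To make the "sparse-depth-guided depth generation" idea a bit more concrete, here is a deliberately toy PyTorch sketch of test-time guidance: at each denoising step the latent is nudged so the decoded dense depth agrees with the sparse measurements. The denoiser, decoder, shapes, and optimizer settings are all placeholders and assumptions; this is not the Marigold-DC implementation (see the linked code repo for that).

```python
import torch

# Toy stand-ins so the loop actually runs. In Marigold-DC these would be the
# pretrained Marigold latent diffusion model and its depth decoder.
def toy_denoise_step(latent, t):
    return latent * 0.95  # placeholder for one reverse-diffusion step

def toy_decode_depth(latent):
    return latent.mean(dim=0)  # placeholder for latent -> dense depth map

sparse_depth = torch.full((16, 16), float("nan"))
sparse_depth[::4, ::4] = 1.0                      # a few "LiDAR" measurements
mask = ~torch.isnan(sparse_depth)

latent = torch.randn(4, 16, 16, requires_grad=True)
opt = torch.optim.Adam([latent], lr=0.05)

for t in range(50, 0, -1):
    # Guidance: make the decoded dense depth agree with the sparse points.
    opt.zero_grad()
    dense = toy_decode_depth(latent)
    loss = ((dense[mask] - sparse_depth[mask]) ** 2).mean()
    loss.backward()
    opt.step()
    # Continue the (toy) denoising trajectory with the guided latent.
    with torch.no_grad():
        latent.copy_(toy_denoise_step(latent, t))

print("final guidance loss:", loss.item())
```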