107 514

Shikhar Singh

AxAI · axe--

AI & ML interests

Commonsense & Language Grounding

Recent Activity

liked a Space about 5 hours ago

fffiloni/expression-editor

liked a Space about 6 hours ago

yanze/PuLID-FLUX

liked a dataset 1 day ago

juppy44/gbif-plants-raw

Organizations

None yet

upvoted an article 6 days ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • 8 days ago • 224

upvoted an article 10 days ago

Welcome GPT OSS, the new open-source model family from OpenAI!

Article • Aug 5 • 509

upvoted a collection 12 days ago

DINOv3

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 398

upvoted an article 13 days ago

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

Article • Jul 29 • 202

upvoted a collection 14 days ago

olmOCR

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 12 items • Updated 9 days ago • 140

upvoted an article 19 days ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • Feb 20 • 315

upvoted a paper about 1 month ago

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 95

upvoted an article about 1 month ago

Fine-Tune a Semantic Segmentation Model with a Custom Dataset

Article • Mar 17, 2022 • 30

upvoted 3 articles about 2 months ago

Fine-Tune ViT for Image Classification with 🤗 Transformers

Article • Feb 11, 2022 • 55

PP-OCRv5 on Hugging Face: A Specialized Approach to OCR

Article • Sep 10 • 108

Supercharge your OCR Pipelines with Open Models

Article • Oct 21 • 273

upvoted an article 3 months ago

nanoVLM: The simplest repository to train your VLM in pure PyTorch

Article • May 21 • 234

upvoted a collection 3 months ago

August 29 Releases

40 items • Updated Sep 1 • 7

upvoted a paper 3 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 208

upvoted an article 4 months ago

Make LLM Fine-tuning 2x faster with Unsloth and 🤗 TRL

Article • Jan 10, 2024 • 74

upvoted 2 collections 4 months ago

Qwen2.5-VL (All Versions)

All versions of Qwen2.5-VL including the new 32B version and 4-bit, 16-bit and more! • 16 items • Updated 6 days ago • 22

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 549

upvoted 3 articles 5 months ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Article • Mar 12 • 473

Welcome the NVIDIA Llama Nemotron Nano VLM to Hugging Face Hub

Article • Jun 27 • 29

Vision Language Models (Better, faster, stronger)

Article • May 12 • 568