jinaai/jina-vlm
Image-Text-to-Text
Transformers
Safetensors
30 languages
jvlm
text-generation
multimodal
vlm
vision-language
qwen3
siglip2
conversational
custom_code
arxiv: 2512.04032
License: cc-by-nc-4.0
🇪🇺 Region: EU
fix-dtype #1 by florian-hoenicke - opened 18 days ago
base: refs/heads/main ← from: refs/pr/1
Files changed: +436 -876
This PR is in draft mode.
florian-hoenicke (Jina AI org), 18 days ago
No description provided.