LongVie 2: Multimodal Controllable Ultra-Long Video World Model • Paper • 2512.13604 • Published 17 days ago • 72
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models • Paper • 2511.11007 • Published Nov 14, 2025 • 15
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow • Paper • 2509.21789 • Published Sep 26, 2025 • 9
StrandDesigner: Towards Practical Strand Generation with Sketch Guidance • Paper • 2508.01650 • Published Aug 3, 2025 • 6
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation • Paper • 2504.18087 • Published Apr 25, 2025 • 5
When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO • Paper • 2503.16921 • Published Mar 21, 2025 • 6