Rongyao Fang

LucasFang

https://rongyaofang.github.io/

rongyaofang

AI & ML interests

Multimodal Large Language Model targeting AGI

Recent Activity

upvoted a paper 5 days ago

DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation

upvoted a paper 6 days ago

Qwen3-VL Technical Report

liked a model about 1 month ago

Qwen/Qwen3-VL-32B-Thinking

authored a paper 3 months ago

FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

Paper • 2509.09680 • Published Sep 11 • 43

authored a paper 7 months ago

GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning

Paper • 2505.17022 • Published May 22 • 27

authored a paper 9 months ago

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13 • 53

authored a paper 12 months ago

StreamChat: Chatting with Streaming Video

Paper • 2412.08646 • Published Dec 11, 2024 • 18

authored 3 papers about 1 year ago

Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking

Paper • 2303.05475 • Published Mar 9, 2023

RBGNet: Ray-based Grouping for 3D Object Detection

Paper • 2204.02251 • Published Apr 5, 2022 • 1

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Paper • 2410.13861 • Published Oct 17, 2024 • 56

authored a paper over 1 year ago

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Paper • 2403.12963 • Published Mar 19, 2024 • 8