😃 Greetings!
I have joined Alibaba DAMO Academy as a research scientist via the Alibaba Star (‘阿里星’) program, working on cutting-edge problems in foundation models. I also hold a research position at Zhejiang University, working with Prof. Yi Yang. I obtained my PhD from Zhejiang University in the beautiful summer of 2024 (having started in Sept. 2019), under the supervision of Prof. Dong Ni, Prof. Samuel Albanie (University of Cambridge/DeepMind), Deli Zhao (Alibaba DAMO) and Shiwei Zhang (Alibaba Tongyi Wan Team). I was also a visiting PhD student at MMLab@NTU, supervised by Prof. Ziwei Liu and Dr. Chenyang Si.
My representative projects include VGen (including InstructVideo), VideoComposer, the Lumos series (including Lumos-1 and UniLumos), the RLIP series (v1 and v2), the DreamVideo series (v1 and v2) and ModelScopeT2V.
I am currently interested in Video World Models, Multi-modal Large Language Models and Video Unified Models. My ultimate goal is to achieve interactive intelligence that benefits human well-being. I lead a small research team working on these topics. Feel free to contact me if you need research guidance.
Generally, my current research interests include:
- 1️⃣ 🌟🌟🌟 Generative models: video world models, visual generation, visual autoregressive models, and visual generation alignment;
- 2️⃣ Representation learning: vision-language models, video understanding, visual relation detection (HOI detection/scene graph generation);
- 3️⃣ AI for science and engineering.
📧 I am recruiting interns to work on cutting-edge problems in foundation models. If you are interested in collaborating with me as a full-time researcher or intern, on site or remotely, feel free to drop me an email at hj.yuan@zju.edu.cn. You can refer to this Job Description (though it might be outdated).