Yi Liu (刘熠)

Google Scholar | CV

Senior Engineer at Honor Device Co., Ltd.

Personal Email: yiliu61richard@gmail.com

Research interests: Video-Language Modeling, Video Understanding

About Me

I currently work at Honor Device Co., Ltd. as the project leader (PL) of the On-device VLM Group, focusing on Vision-Language Models (VLMs) and video understanding. I received my Ph.D. degree in 2024 from MMLab@SIAT, University of Chinese Academy of Sciences, supervised by Prof. Yu Qiao and Prof. Yali Wang. I was a research intern at Shanghai AI Laboratory from 2022 to 2023. I received my B.Eng. degree from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2019.

Publications

Multimodal-LLM:
MagicGen: A Universal Multimodal Data Synthesis Agent for Domain-Specific Vision-Language Model Tuning, arXiv 2025 (in progress, first corresponding author)
VideoCap-R1: Enhancing MLLMs for Video Captioning via Structured Thinking, arXiv 2025 (under review at NeurIPS 2025, second corresponding author)
MLLM-TA: Leveraging Multimodal Large Language Models for Precise Temporal Video Grounding, IEEE Signal Processing Letters, 2024 (SPL, IF=3.9, first author)
Traditional Video Understanding:
F2S-Net: Learning Frame-To-Segment Prediction for Online Action Detection, Journal of Real-Time Image Processing, 2024 (JRTIP, IF=3.0, first author)
Dual Masked Modeling for Weakly-Supervised Temporal Boundary Discovery, IEEE Transactions on Multimedia, 2023 (TMM, IF=9.7, second author)
Learning Discriminative Feature Representation for Open Set Action Recognition, ACM International Conference on Multimedia, 2023 (ACM MM, CCF-A, second author)
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization, IEEE Transactions on Image Processing, 2022 (TIP, IF=13.7, first author)
VideoPipe 2022 Challenge: Real-World Video Understanding for Urban Pipe Inspection, International Conference on Pattern Recognition, 2022 (ICPR, CCF-C, first author)

Experience

Workshops

  • Student organizer of ECCV 2022 DeeperAction Challenge, Track 1: Temporal Action Localization
  • Student organizer of ICPR 2022 VideoPipe Challenge, Track 2: Temporal Defect Localization
  • Student organizer of ICCV 2021 DeeperAction Challenge, Track 1: Temporal Action Localization
  • 1st Prize in ECCV 2022 Ego4D Episodic Memory Challenge, Moments Queries Track
  • 1st Prize in ECCV 2022 Ego4D Episodic Memory Challenge, Looking At Me Track

Journal Reviewer

  • Neural Networks, Journal of Visual Communication and Image Representation
  • Pattern Recognition, International Journal of Computer Vision
  • IEEE Transactions on Pattern Analysis and Machine Intelligence