Boqiang Zhang 张博强

Senior Research Engineer at Tencent AI Lab

Email: arvidzhang@tencent.com

About Me

I hold a Master's degree from the University of Science and Technology of China (USTC), where I worked under the guidance of Prof. Hongtao Xie. I completed my Bachelor's degree at Northwestern Polytechnical University (NWPU).

My current research interests focus on multimodal large language models and unified understanding and generation. Earlier in my career, I concentrated on scene text recognition and editing, exploring both self-supervised and semi-supervised learning approaches.

Publications & Preprints

VideoLLaMA 3
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
Boqiang Zhang, Kehan Li, Zesen Cheng, Zhiqiang Hu, Yuqian Yuan, Guanzheng Chen, Sicong Leng, Yuming Jiang, Hang Zhang, Xin Li, Peng Jin, Wenqi Zhang, Fan Wang, Lidong Bing, Deli Zhao
ArXiv, 2025
TextGen
How Control Information Influences Multilingual Text Image Generation and Editing?
Boqiang Zhang, Zuan Gao, Yadong Qu, Hongtao Xie*
NeurIPS, 2024
CVPR 2024 paper
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing
Boqiang Zhang, Hongtao Xie*, Zuan Gao, Yuxin Wang
CVPR, 2024
LPV paper
Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition
Boqiang Zhang, Hongtao Xie*, Yuxin Wang, Jianjun Xu, Yongdong Zhang
IJCAI, 2023
CLIPSTR paper
Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition
Zixiao Wang, Hongtao Xie*, Yuxin Wang, Jianjun Xu, Boqiang Zhang, Yongdong Zhang
ACM MM, 2023
SSM paper
Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition
Zuan Gao, Yuxin Wang*, Yadong Qu, Boqiang Zhang, Zixiao Wang, Jianjun Xu, Hongtao Xie
IJCAI, 2024
I2CL paper
Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition
Bangbang Zhou, Yadong Qu, Zixiao Wang, Zicheng Li, Boqiang Zhang, Hongtao Xie*
IJCAI, 2024

Working Experience

  • Tencent AI Lab | Shenzhen | Jul. 2025 - Present
    Senior Research Engineer
    Topic: Multi-modal Large Language Model, Image/Video Understanding and Generation
  • Alibaba DAMO Academy | Hangzhou | Jun. 2024 - Jul. 2025
    Research Intern
    Topic: Multi-modal Large Language Model, Image/Video Understanding, Embodied AI

Services

  • Reviewer: NeurIPS, ACM MM, ICLR, ICML, TMM

Honors

  • Outstanding Graduate of USTC and Anhui Province, 2025
  • Huawei Scholarship, 2023
  • Outstanding Graduate of NWPU, 2022 (top 5%)
  • National Scholarship, 2024, 2021, 2020, 2019
  • Outstanding Student of NWPU, 2020 (top 1%)