About Me
I am currently an LLM researcher at JD Explore Academy (京东探索研究院) through the TGT Program, working with Jiaqi Wang and Nan Duan. My work focuses on Multimodal Understanding, Post-training (RL & SFT), and Inference Efficiency.
Previously, I was with Huawei ICT BG as a member of the TopMinds program. I received my Ph.D. from the Institute of Information Engineering, Chinese Academy of Sciences in 2024, advised by Prof. Zheng Lin and Prof. Weiping Wang.
I have published 30+ peer-reviewed papers in top-tier AI conferences including ACL, EMNLP, ICLR, NeurIPS, ICRA, IROS, AAAI, IJCAI, MM. My open-source projects have accumulated 3000+ stars on GitHub.
We're hiring interns for the TGT program! If you have top-tier publications and are passionate about video understanding and VLMs, we'd love to hear from you.
Recent News
[2026.05] Two papers accepted to IJCAI 2026.
[2026.05] One paper accepted to ICML 2026.
[2026.04] One paper accepted to ACL 2026.
[2026.02] One paper accepted to CVPR 2026.
[2026.01] Two papers accepted to ICLR 2026.
[2025.11] Three papers accepted to AAAI 2026.
[2025.10] One paper accepted to IROS workshop 2025.
[2025.09] Two papers accepted to NeurIPS 2025.
[2025.06] Received the Major Contribution Special Award at Huawei.
[2025.01] One paper accepted to ACL 2025.
Selected Publications
* indicates co-first author, † indicates corresponding author. Full list available on Google Scholar.
Report
Report
EasyVideoR1: Easier RL for Video Understanding
Preprint
Report
JoyAI-Image: Awakening Spatial Intelligence in Unified Multimodal Understanding and Generation
Joy Future Academy
Preprint
Post-training
Preprint
Self-Distilled RLVR
Chenxu Yang*, Chuanyu Qin*, Qingyi Si*, Minghui Chen, Naibin Gu, Dingyu Yao, Zheng Lin, Weiping Wang, Jiaqi Wang, Nan Duan
Preprint
Preprint
Near-Future Policy Optimization
Chuanyu Qin*, Chenxu Yang*, Qingyi Si*, Naibin Gu, Dingyu Yao, Zheng Lin, Peng Fu, Nan Duan, Jiaqi Wang
Preprint
Preprint
Co-Evolving Policy Distillation
Naibin Gu*, Chenxu Yang*, Qingyi Si*, Chuanyu Qin, Dingyu Yao, Peng Fu, Zheng Lin, Weiping Wang, Nan Duan, Jiaqi Wang
Preprint
ICML'26
IRPM: Intergroup Relative Preference Modeling for Pointwise Generative Reward Models
Haonan Song, Qingchen Xie, Huan Zhu, Feng Xiao, Luxi Xing, Liu Kang, Fuzhen Li, Zhiyong Zheng, Feng Jiang, Ziheng Li, Kun Yan, Qingyi Si, Yanghua Xiao, Hongcheng Guo, Fan Yang
ICML 2026
ACL'26
Outcome-Grounded Advantage Reshaping for Fine-Grained Credit Assignment in Mathematical Reasoning
Ziheng Li, Liu Kang, Feng Xiao, Luxi Xing, Qingyi Si, Zhuoran Li, Weikang Gong, Deqing Yang, Yanghua Xiao, Hongcheng Guo
ACL 2026
NeurIPS'25
S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models
Muzhi Dai*, Chenxu Yang*, Qingyi Si†
NeurIPS 2025
NeurIPS'25 workshop
Stable Reinforcement Learning for Efficient Reasoning
Muzhi Dai*, Shixuan Liu*, Qingyi Si†
NeurIPS workshop 2025
ACL'24
Multimodal Table Understanding
Mingyu Zheng*, Xinwei Feng*, Qingyi Si*, Qiaoqiao She, Zheng Lin, Wenbin Jiang, Weiping Wang
ACL 2024
EMNLP'23
An Empirical Study of Instruction-tuning Large Language Models in Chinese
Qingyi Si*, Tong Wang*, Zheng Lin, Xu Zhang, Yanan Cao, Weiping Wang
EMNLP 2023
ACL'23
Combo of Thinking and Observing for Outside-Knowledge VQA
Qingyi Si, Yuchen Mo, Zheng Lin, Huishan Ji, Weiping Wang
ACL 2023
Test-time Optimization
Preprint
System 1&2 Synergy via Dynamic Model Interpolation
Chenxu Yang*, Qingyi Si*, Chong Tian, Xiyu Liu, Dingyu Yao, Chuanyu Qin, Zheng Lin, Weiping Wang, Jiaqi Wang
Preprint
AAAI'26
Test-time Prompt Intervention
Chenxu Yang*, Qingyi Si*, Mz Dai*, Dingyu Yao, Mingyu Zheng, Minghui Chen, Zheng Lin, Weiping Wang
AAAI 2026
ICLR'26
Dynamic Early Exit in Reasoning Models
Chenxu Yang*, Qingyi Si*, Yongjie Duan, Zheliang Zhu, Chenyu Zhu, Qiaowei Li, Minghui Chen, Zheng Lin, Weiping Wang
ICLR 2026
Long Context and Sparse Attention
ICLR'26
LouisKV: Efficient KV Cache Retrieval for Long Input-Output Sequences
Wenbo Wu*, Qingyi Si*, Xiurui Pan, Ye Wang, Jie Zhang
ICLR 2026
AAAI'26
Sparse Attention across Multiple-context KV Cache
Ziyi Cao*, Qingyi Si*, Jingbin Zhang, Bingquan Liu
AAAI 2026
ACL'25
AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding
Xiao Wang*, Qingyi Si*, Shiyu Zhu, Jianlong Wu, Li Cao, Liqiang Nie
ACL 2025