Bio

Research Scientist, Qingyun Talent Program
Tencent Hunyuan Team, Shenzhen

I am an AI researcher focused on reinforcement learning and large language models, with broader interests in large-scale training, scaling, and applications. My work has received a Best Paper Runner-up award (NeurIPS 2024 FITML Workshop), oral presentations (UAI 2023, ICLR 2024), and a spotlight presentation (NeurIPS 2023).

Experience

02/2026 - Present. Research Scientist, Tencent Hunyuan LLM, Shenzhen.

05/2025 - 01/2026. Top Seed Intern, ByteDance Seed, Beijing.

10/2021 - 08/2022. Intern, Tencent AI Lab, Shenzhen.

07/2019 - 06/2020. Research Assistant, Nanjing University, Nanjing.

Education

08/2020 - 01/2026. Ph.D. in Data Science, The Chinese University of Hong Kong, Shenzhen. Advisor: Prof. Tom Luo.

08/2015 - 06/2019. B.E. in Electrical Engineering, Xi'an Jiaotong University. Advisor: Prof. Zhiyuan Liu.

Contributions

I have contributed to large-scale model efforts including Seed-OSS, Seed 1.8, and Hy3 Models, with work spanning reinforcement learning, post-training systems, and evaluation for agentic and reasoning models.

My work has been adopted and widely used in open-source training libraries including verl and LLaMA-Factory, and has been deployed in production settings at Huawei, Alibaba, ByteDance, and Tencent.

Research areas

RL training stability, training efficiency, algorithm-infrastructure co-design, and theoretical analysis for large-scale post-training.

My Ph.D. advisor, Prof. Tom Luo, is a prominent applied mathematician in optimization and signal processing. My academic lineage extends to Prof. John Tsitsiklis at MIT, a pioneer of reinforcement learning and co-inventor of actor-critic algorithms.