Publications (* indicates equal contribution or alphabetical ordering)

- CoRT: Code-integrated Reasoning within Thinking. Chengpeng Li*, Zhengyang Tang*, Ziniu Li*, Mingfeng Xue, Keqin Bao, Tian Ding, Ruoyu Sun, Benyou Wang, Xiang Wang, Junyang Lin, Dayiheng Liu. arXiv:2506.09820
- Quality-Diversity Red-Teaming: Automated Generation of High-Quality and Diverse Attackers for Large Language Models. Ren-Jian Wang, Ke Xue, Zeyu Qin, Ziniu Li, Sheng Tang, Hao-Tian Li, Shengcai Liu, Chao Qian. arXiv:2506.07121
- Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO. Peter Chen, Xiaopeng Li, Ziniu Li, Xi Chen, Tianyi Lin. arXiv:2505.11595
- Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment. Xueyao Zhang, Yuancheng Wang, Chaoren Wang, Ziniu Li, Zhuo Chen, Zhizheng Wu. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025
- Controlling Large Language Model with Latent Actions. Chengxing Jia, Ziniu Li, Pengyuan Wang, Yi-Chen Li, Zhenyu Hou, Yuxiao Dong, Yang Yu. The 42nd International Conference on Machine Learning (ICML), 2025
- Adam-mini: Use Fewer Learning Rates To Gain More. Yushun Zhang, Congliang Chen, Ziniu Li, Tian Ding, Chenwei Wu, Diederik P. Kingma, Yinyu Ye, Zhi-Quan Luo, Ruoyu Sun. The 13th International Conference on Learning Representations (ICLR), 2025
- Preserving Diversity in Supervised Fine-tuning of Large Language Models. Ziniu Li, Congliang Chen, Tian Xu, Zeyu Qin, Jiancong Xiao, Zhi-Quan Luo, Ruoyu Sun. The 13th International Conference on Learning Representations (ICLR), 2025
- Understanding and Mitigating Hallucination in Large Vision-Language Models via Modular Attribution and Intervention. Tianyun Yang, Ziniu Li, Juan Cao, Chang Xu. The 13th International Conference on Learning Representations (ICLR), 2025
- Enabling Scalable Oversight via Self-Evolving Critic. Zhengyang Tang*, Ziniu Li*, Zhenyang Xiao*, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin. arXiv:2501.05727
- RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques. Zhengyang Tang*, Ziniu Li*, Zhenyang Xiao*, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin. arXiv:2501.14492
- Pruning for Robust Concept Erasing in Diffusion Models. Tianyun Yang, Ziniu Li, Juan Cao, Chang Xu. NeurIPS Workshop on Safe Generative AI, 2024
- Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization. Heshen Zhan, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun. Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2024
- Sensing Jamming Strategy from Limited Observations: An Imitation Learning Perspective. Youlin Fan, Bo Jiu, Wenqiang Pu, Ziniu Li, Kang Li, Hongwei Liu. IEEE Transactions on Signal Processing (TSP)
- ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models. Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo. The 41st International Conference on Machine Learning (ICML), 2024
- On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization. Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily Getzen, Cong Fang, Qi Long, Weijie J. Su. arXiv:2405.16455
- Why Transformers Need Adam: A Hessian Perspective. Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo. Conference on Neural Information Processing Systems (NeurIPS) 38, 2024
- When is RL better than DPO in RLHF? A Representation and Optimization Perspective. Ziniu Li*, Tian Xu*, Yang Yu. The 12th International Conference on Learning Representations (ICLR) (Tiny Paper Track), 2024
- Imitation Learning from Imperfection: Theoretical Justifications and Algorithms. Ziniu Li*, Tian Xu*, Zeyu Qin, Yang Yu, Zhi-Quan Luo. Conference on Neural Information Processing Systems (NeurIPS) 37, 2023
- Provably Efficient Adversarial Imitation Learning with Unknown Transitions. Tian Xu*, Ziniu Li*, Yang Yu, Zhi-Quan Luo. The 39th Conference on Uncertainty in Artificial Intelligence (UAI), 2023
- Deploying Offline Reinforcement Learning with Human Feedback. Ziniu Li, Ke Xu, Liu Liu, Lanqing Li, Deheng Ye, Peilin Zhao. arXiv:2303.07046
- Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis. Tian Xu*, Ziniu Li*, Yang Yu, Zhi-Quan Luo. arXiv:2208.01899
- Rethinking ValueDice: Does It Really Improve Performance? Ziniu Li*, Tian Xu*, Yang Yu, Zhi-Quan Luo. The 10th International Conference on Learning Representations (ICLR) (Blog Track), 2022
- A Note on Target Q-learning for Solving Finite MDPs with A Generative Oracle. Ziniu Li*, Tian Xu*, Yang Yu. arXiv:2203.11489
- HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning. Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo. The 10th International Conference on Learning Representations (ICLR), 2022
- A Concise Introduction to Imitation Learning (in Chinese). Tian Xu, Ziniu Li, Yang Yu. Available online
- Error Bounds of Imitating Policies and Environments for Reinforcement Learning. Tian Xu, Ziniu Li, Yang Yu. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
- Error Bounds of Imitating Policies and Environments. Tian Xu, Ziniu Li, Yang Yu. Conference on Neural Information Processing Systems (NeurIPS) 34, 2020
- Efficient Exploration by Novelty-pursuit. Ziniu Li*, Xiong-Hui Chen*. The 2nd International Conference on Distributed Artificial Intelligence (DAI), 2020
- Self-Guided Evolution Strategies with Historical Estimated Gradients. Fei-yu Liu, Ziniu Li, Chao Qian. The 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020