More

Some information about me


Education

Nanjing University
  • Major in Automation
  • 2018.9 - 2022.9
Tsinghua University
  • PhD student, Department of Automation
  • 2022.9 - Present



Awards

  • 2021.10 Hengfang Scholarship
  • 2020.10 National Scholarship



Publications

2026

  1. Tech Report
    Seed2.0: A Large-Scale Production-Ready Foundation Model Series
    ByteDance Seed
    2026
  2. Tech Report
    Seed1.8 Model Card: Towards Generalized Real-World Agency
    ByteDance Seed
    2026
  3. ICLR 2026
    GlobeDiff: State Diffusion Process for Partial Observability in Multi-Agent Systems
    Yiqin Yang, Xinyu Yang, Yuhua Jiang, N. Mu, Hao Hu, R. Xie, Z. Zhang, S. Li, Y. Ni, Qianchuan Zhao, and others
    In ICLR 2026
  4. ICLR 2026
    OPRIDE: Efficient Offline Preference-based Reinforcement Learning via In-Dataset Exploration
    Yiqin Yang, Hao Hu, Y. Mao, J. Zhang, C. Wu, Yuhua Jiang, Xinyu Yang, R. Xie, Y. Fan, and others
    In ICLR 2026
  5. ICLR 2026
    Risk-Sensitive RL for Alleviating Exploration Dilemmas in Large Language Models
    Yuhua Jiang, J. Huang, Y. Yuan, X. Mao, Y. Yue, Qianchuan Zhao, and L. Yan
    In ICLR 2026

2025

  1. CogSci 2025
    DPMT: Dual Process Multi-scale Theory of Mind Framework for Real-time Human-AI Collaboration
    X. Li, Y. Ding, Yuhua Jiang, Y. Zhao, R. Xie, S. Xu, Y. Ni, Yiqin Yang, and Bo Xu
    In CogSci 2025
  2. MATH-AI’25
    PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier
    Yuhua Jiang, Y. Xiong, Y. Yuan, C. Xin, W. Xu, Y. Yue, Qianchuan Zhao, and L. Yan
    In NeurIPS 2025 MATH-AI Workshop
  3. RA-L 2025
    Maximum Next-State Entropy for Efficient Reinforcement Learning
    Dianyu Zhong, Yiqin Yang, Z. Zhang, Yuhua Jiang, Bo Xu, and Qianchuan Zhao
    IEEE Robotics and Automation Letters 2025
  4. Tech Report
    Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning
    ByteDance Seed
    2025
  5. ICLR 2025
    Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
    Yiqin Yang, Q. Wang, Chenghao Li, Hao Hu, C. Wu, Yuhua Jiang, Dianyu Zhong, Z. Zhang, Qianchuan Zhao, and others
    In ICLR 2025
  6. ICLR 2025
    Episodic Novelty Through Temporal Distance
    Yuhua Jiang*, Qihan Liu*, Yiqin Yang, Xiaoteng Ma, Dianyu Zhong, Hao Hu, Jun Yang, Bin Liang, Bo Xu, Chongjie Zhang, and Qianchuan Zhao
    In ICLR 2025
    Oral @ IMOL@NeurIPS 2024 (3/47, Top 7%)

2024

  1. NeurIPS 2024
    NeuralPlane: An Efficiently Parallelizable Platform for Fixed-wing Aircraft Control with Reinforcement Learning
    C. Xue, Qihan Liu, Xiaoteng Ma, X. Qin, Yuhua Jiang, G. Ning, Y. Qi, J. Ren, Bin Liang, and Jun Yang
    In NeurIPS 2024
  2. AAAI 2024
    Learning Diverse Risk Preferences in Population-based Self-play
    Yuhua Jiang*, Qihan Liu*, Xiaoteng Ma, Chenghao Li, Yiqin Yang, Jun Yang, Bin Liang, and Qianchuan Zhao
    In AAAI 2024
    Oral Presentation (Top 3%)

2022

  1. GitHub
    Light Aircraft Game: A Lightweight, Scalable, Gym-Wrapped Aircraft Competitive Environment with Baseline Reinforcement Learning Algorithms
    Qihan Liu, Yuhua Jiang, and Xiaoteng Ma
    2022