M.Eng. student at Peking University. Foundation models, agentic RL, diffusion LMs.

I am Haojin Yang (杨昊锦), an M.Eng. student in Software Engineering at Peking University (2024–2027). I received my B.Eng. in Software Engineering from Nanjing University in 2024.

My research focuses on reinforcement learning for large language models, in particular multi-turn and agentic RL, credit assignment under sparse business rewards, and diffusion language models with adaptive decoding schedules.

I am currently a research intern at StepFun Foundation Models (since 2026-04, advised by Ruihang Miao). Previously, I worked on multi-turn RL for industrial sales agents at Meituan Longcat Interaction (2025-10 to 2026-04, advised by Jingqing Ruan), on RLHF for Excel agents at Microsoft Research Asia DKI (advised by Ran Jia), and on GraphRAG at Baidu TPG (advised by Ziwei Jin).

Feel free to reach out at yhj [at] stu [dot] pku [dot] edu [dot] cn.


School of Software & Microelectronics

Peking University

Beijing, China

news

Apr 2026 Started as Agent RL Research Intern at StepFun Foundation Models.
Apr 2026 Harmonizing Dense and Sparse Signals in Multi-turn RL (DuCA) was accepted to ACL 2026 (Poster).
Mar 2026 VADE was accepted to CVPR 2026 Findings.
Jan 2026 WavefrontDiffusion was accepted to ICLR 2026 (Poster). 🎉
Oct 2025 Started as Foundation Algorithm Research Intern at Meituan Longcat Interaction.

selected publications

  1. ICLR’26
    WavefrontDiffusion: Dynamic Decoding Schedule for Improved Reasoning
    Haojin Yang, R. Hu, Z. Sun, and 3 more authors
    In International Conference on Learning Representations (ICLR), 2026
    Poster
  2. ACL’26
    Harmonizing Dense and Sparse Signals in Multi-turn RL: Dual-Horizon Credit Assignment for Industrial Sales Agents
    Haojin Yang, A. Jian, X. Huang, and 5 more authors
    In Annual Meeting of the Association for Computational Linguistics (ACL), 2026
    Poster
  3. EMNLP
    Asymmetric On-Policy Distillation: Bridging Exploitation and Imitation at the Token Level
    N. Jia, Haojin Yang, X. Ma, and 6 more authors
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2026
    Under Review (co-first author)