25 05

  • 7 papers accepted to ACL 2025.
  • We release a survey on learning from rewards, covering reinforcement learning (RLHF, DPO, and GRPO), reward-guided decoding, and post-hoc correction.
  • One paper accepted to ICML 2025.