The paper "Pipe-RLHF: A Computation Mode-Aware Parallel Framework for RLHF" has been selected as the cover feature of the latest issue of Journal of Computer Science and Development, a prestigious CCF-A-ranked journal.

Pipe-RLHF is a parallel acceleration framework for Reinforcement Learning from Human Feedback (RLHF) training. It proposes a computation mode-aware adaptive parallel strategy that removes the efficiency bottleneck caused by executing the RLHF training stages strictly one after another.

By using asynchronous Proximal Policy Optimization (PPO) to exploit parallelism between stages, the framework significantly improves training efficiency without compromising model performance. Compared with existing methods, Pipe-RLHF achieves up to a 3.7× speedup.
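To make the idea of inter-stage parallelism concrete, here is a minimal toy sketch of overlapping a rollout stage with a training stage through a bounded queue. It is not Pipe-RLHF's implementation. The function `run_pipelined`, the `queue_depth` parameter, and the integer "policy version" counter are all hypothetical names introduced for illustration, and both stages are simulated rather than running a real model.

```python
import queue
import threading


def run_pipelined(num_batches: int, queue_depth: int = 1):
    """Overlap a simulated rollout stage with a simulated training stage.

    Hypothetical illustration only. Returns one
    (batch_id, rollout_version, train_version) tuple per batch.
    """
    rollouts = queue.Queue(maxsize=queue_depth)  # limits how far generation runs ahead
    lock = threading.Lock()
    state = {"policy_version": 0}
    history = []

    def generate():
        for batch_id in range(num_batches):
            with lock:
                version = state["policy_version"]  # policy snapshot used for this rollout
            rollouts.put((batch_id, version))  # blocks while the queue is full
        rollouts.put(None)  # sentinel: no more batches

    def train():
        while True:
            item = rollouts.get()
            if item is None:
                break
            batch_id, rollout_version = item
            with lock:
                history.append((batch_id, rollout_version, state["policy_version"]))
                state["policy_version"] += 1  # stands in for one PPO update

    workers = [threading.Thread(target=generate), threading.Thread(target=train)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return history
```

In this toy model, the generator may produce a batch with a policy that is a few updates old, which is the trade-off any asynchronous scheme has to manage. The bounded queue keeps the gap between the version that produced a batch and the version that trains on it to at most `queue_depth + 1`.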