Training Methods

Papers on training techniques, distillation, and optimization methods.

Overview

This section contains three papers:

  • Generalized Knowledge Distillation - On-policy distillation with flexible divergences
  • DeepSpeed-Chat - End-to-end RLHF system with Hybrid Engine
  • ReST_EM - Self-training with binary feedback loops