Training Methods

Papers on training techniques, distillation, and optimization methods.

Overview

This section contains three papers:

Generalized Knowledge Distillation - On-policy distillation with flexible divergences
DeepSpeed-Chat - End-to-end RLHF system with Hybrid Engine
ReST_EM - Self-training with binary feedback loops