Alignment
Papers on alignment techniques and fine-tuning methods for large language models.
Overview
This section contains six papers:
- LIMA - Minimal alignment with 1,000 examples
- Large Reasoning Models - Learning alignment from flawed thinking
- RECAP - Mitigating capability forgetting during RL
- PPO-max - Stabilizing RLHF with careful PPO engineering
- Reducing Sycophancy - Synthetic data to reduce agreement bias
- TÜLU 2 - Advanced instruction-tuning dataset mixture