Pretraining
Papers on pretraining methods, data quality, and scaling laws.
Overview
This section covers three papers:
- PaLM 2 - Compute-optimal scaling with multilingual data
- phi-1.5 - Textbook-quality synthetic data for small models
- In-Context Pretraining - Document ordering for better learning