Architecture
Papers on novel architectures and efficiency improvements.
Overview
This section covers 10 papers:
- Ring-linear - Hybrid attention mechanisms for long context
- Core Attention Disaggregation - Efficient long-context training
- Looped Language Models - Latent reasoning via recurrence
- AutoDeco - Learned dynamic decoding parameters
- Mamba - Selective state space models with input-dependent parameters
- LongNet - Dilated attention for billion-token sequences
- YaRN - RoPE extension for long contexts
- ReLU Attention - Softmax-free attention for Vision Transformers
- LongLoRA - Efficient context extension with shifted sparse attention
- Relax - ML compiler unifying computational graphs