Attention & Memory

FlashAttention, IO-aware algorithms, and why materializing attention matrices kills performance.

Coming soon.