Knowledge Transfer

Distillation, supervised fine-tuning, QLoRA, and fitting large-model capabilities into smaller models.