Hacker News
Apriel-H1: Towards Efficient Enterprise Reasoning Models (arxiv.org)
1 point by guiriduro 177 days ago
1 comment

Apriel-H1-15b-Thinker-SFT uses incremental distillation from Apriel-Nemotron-15B-Thinker, selectively replacing less critical attention layers with linear Mamba blocks to reduce computational complexity while preserving reasoning quality.
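
To make the layer-replacement idea concrete, here is a rough sketch of how one might choose which attention layers to swap. It is not the paper's method: the per-layer importance scores, the function name plan_hybrid_layers, and the toy 6-layer model are all hypothetical, and the distillation step that would follow each swap is left out entirely.

```python
from typing import List


def plan_hybrid_layers(importance: List[float], num_replace: int) -> List[str]:
    """Return a per-layer plan, swapping the least important attention layers for Mamba.

    `importance` is a hypothetical per-layer score (higher = more critical);
    how such a score would be measured is an assumption, not taken from the paper.
    """
    if not 0 <= num_replace <= len(importance):
        raise ValueError("num_replace must be between 0 and the number of layers")
    # Layer indices ordered from least to most important (ties broken by index).
    ranked = sorted(range(len(importance)), key=lambda i: (importance[i], i))
    to_replace = set(ranked[:num_replace])
    return ["mamba" if i in to_replace else "attention" for i in range(len(importance))]


# Made-up importance scores for a 6-layer toy model.
scores = [0.9, 0.2, 0.7, 0.1, 0.8, 0.3]

# "Incremental" here just means replacing more layers at each stage;
# in practice each stage would be followed by distillation from the teacher.
for k in (0, 2, 4):
    print(k, plan_hybrid_layers(scores, k))
```

The point of ranking first is that the layers judged most critical keep full attention the longest, which matches the "selectively replacing less critical attention layers" description above.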