| Hey HN, I'm a 20-year-old solo builder from India. I got frustrated that every capable AI model
assumes you have a GPU, a credit card, or
reliable internet. None of those are true for
most of the world, including me. So I started digging into the compression literature for ways to solve this problem.

What I found:
- DeepSeek distilled its 671B reasoning model into a 1.5B model that runs on a laptop
- TRM (Samsung, 2025) beat DeepSeek R1 on ARC-AGI with 7M parameters by iterating instead of scaling
- RWKV runs in constant memory with no quadratic attention cost
- GRPO lets you specialize a tiny model on a narrow domain in hours on CPU

The techniques exist. What doesn't exist:
a systematic effort to apply all of them together,
specifically for low-resource languages and
low-end hardware, and give the results away free.

I'm building this. Calling it KIRO.

The goal is simple: take every major open-source
frontier model, compress it into domain-specific
versions under 500MB, and deploy them offline
on the cheapest Android hardware available.

Starting with math/physics education
because that's the problem I know personally.
Expanding to healthcare triage, legal aid,
and agricultural advisory.

Currently running my first experiment on my i3:
R1-1.5B vs Qwen-7B on Hindi math problems.
Will post results when training finishes.

Two honest questions for HN:

1. Is anyone else working on this specific
intersection: compression + low-resource
languages + offline deployment?

2. What would make this genuinely useful vs
just technically interesting to you?

Everything will be open source. |
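For a sense of scale on the under-500MB target, here is a back-of-the-envelope sketch. It assumes the weights dominate the file size and ignores quantization scales, any layers kept at higher precision, and the tokenizer. The parameter counts are the ones mentioned in the post; the bit widths and the helper names `weight_size_mb` and `max_params_within` are illustrative assumptions.

```python
BUDGET_MB = 500  # size target stated in the post


def weight_size_mb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


def max_params_within(budget_mb: float, bits_per_weight: float) -> float:
    """Largest parameter count whose weights fit in the given budget."""
    return budget_mb * 1e6 * 8 / bits_per_weight


if __name__ == "__main__":
    # Parameter counts mentioned in the post; bit widths are illustrative.
    for name, params in [("7M", 7e6), ("1.5B", 1.5e9), ("7B", 7e9)]:
        for bits in (16, 8, 4, 2):
            size = weight_size_mb(params, bits)
            verdict = "fits" if size <= BUDGET_MB else "over budget"
            print(f"{name:>5} @ {bits:>2}-bit: {size:>9.1f} MB ({verdict})")
```

Under these assumptions a 1.5B-parameter model at 4 bits per weight comes to about 750 MB, so fitting it in 500 MB would take fewer than roughly 2.7 bits per weight on average, or a smaller parameter count (about 1B at 4-bit).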
I would publish: 1) your domain-specific eval set, 2) your model's results on that eval set, and 3) a big lab's model's results on that same eval set.
That would give users a way to determine whether your model is actually capable in that reduced domain.
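The three artifacts suggested above could come out of one small harness. This is a minimal sketch, assuming exact-match scoring on a normalized final answer. The `EvalItem` type, the `evaluate` and `compare` helpers, the toy questions, and the two stand-in model functions are all hypothetical placeholders, not part of KIRO or any library; real models would be wrapped as functions from prompt to answer string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Model = Callable[[str], str]  # maps a question to an answer string


@dataclass(frozen=True)
class EvalItem:
    question: str
    answer: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting does not count as an error."""
    return " ".join(text.strip().lower().split())


def evaluate(model: Model, items: List[EvalItem]) -> float:
    """Exact-match accuracy of one model on the eval set."""
    if not items:
        raise ValueError("eval set is empty")
    correct = sum(normalize(model(it.question)) == normalize(it.answer) for it in items)
    return correct / len(items)


def compare(models: Dict[str, Model], items: List[EvalItem]) -> Dict[str, float]:
    """Score every model on the same items so the numbers are directly comparable."""
    return {name: evaluate(fn, items) for name, fn in models.items()}


# Toy eval set, made up for illustration only.
EVAL_SET = [
    EvalItem("2 + 3 = ?", "5"),
    EvalItem("10 / 4 = ?", "2.5"),
    EvalItem("7 * 6 = ?", "42"),
]


def small_model(question: str) -> str:
    """Stand-in for the compressed model (hypothetical canned answers)."""
    return {"2 + 3 = ?": "5", "7 * 6 = ?": " 42 "}.get(question, "unknown")


def baseline_model(question: str) -> str:
    """Stand-in for the large reference model (hypothetical canned answers)."""
    return {"2 + 3 = ?": "5", "10 / 4 = ?": "2.5", "7 * 6 = ?": "42"}.get(question, "unknown")


if __name__ == "__main__":
    scores = compare({"small": small_model, "baseline": baseline_model}, EVAL_SET)
    for name, accuracy in scores.items():
        print(f"{name}: {accuracy:.2%} exact match on {len(EVAL_SET)} items")
```

Exact match is a strict choice; for math problems with free-form working, extracting the final answer before comparison would likely be needed, and publishing the per-item outputs alongside the scores lets readers check the grading themselves.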