by binyang_qiu 39 days ago
For me, the interesting part here isn't GPT-2; it's the memory discipline. I feel like most inference runtimes slowly end up with ad hoc allocations scattered everywhere as features pile up.