by pavelstoev 836 days ago
We build software acceleration for LLMs, effectively running smaller Llama 2 models on several L4s at the same performance as on a single A100.