by DougBTX 1067 days ago
There's info about that here: https://replicate.com/a16z-infra/llama13b-v2-chat

> Run time and cost

> Predictions run on Nvidia A100 (40GB) GPU hardware. Predictions typically complete within 9 seconds.