Hacker News
by dilipray 790 days ago
I ran into latency problems with llama2, so I haven't used it.

Usually an OpenAI response arrives within 2 seconds, but llama2 takes 20+ seconds for the same request.

How are they tackling the performance?