Hacker News
by kromem 1048 days ago
Ghost Attention (GAtt) is only used in the 70B model in Llama 2, FWIW.

So you'd need to make sure you're comparing apples to apples.