by jezzarax, 741 days ago
I checked some logs from my past experiments: prompt processing ran at about 400 tps over a ~3k token query, so about 7 seconds to process it, and then generation ran at about 28 tokens per second.
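For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic in Python. The function names are made up for illustration, and the 500-token reply length is a hypothetical value, not something from the logs; only the 400 tps, ~3k tokens, and 28 tokens per second figures come from the comment above.

```python
def prefill_seconds(prompt_tokens: int, prefill_tps: float) -> float:
    """Time to process the prompt before the first output token appears."""
    return prompt_tokens / prefill_tps


def total_seconds(prompt_tokens: int, prefill_tps: float,
                  output_tokens: int, generation_tps: float) -> float:
    """Prompt processing time plus time to generate the reply."""
    generation = output_tokens / generation_tps
    return prefill_seconds(prompt_tokens, prefill_tps) + generation


# Figures from the comment: ~3k token prompt at ~400 tps.
prefill = prefill_seconds(3000, 400)

# Hypothetical 500-token reply at the reported ~28 tokens per second.
total = total_seconds(3000, 400, 500, 28)

print(f"prompt processing: {prefill:.1f} s")
print(f"total for a 500-token reply: {total:.1f} s")
```

With these inputs the prompt processing comes out at 7.5 s, which matches the "about 7 seconds" estimate; the total depends entirely on how long the reply is.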