Hacker News
by alekandreev, 726 days ago
This is mostly about inference speed while maintaining long-context performance.