Hacker News
by ubermenchh, 164 days ago
Yes, it does continuous batching along with paged attention and prefix caching. I am also going to be adding some more inference techniques.
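
For anyone unfamiliar with the term, here is a rough sketch of the idea behind continuous batching: finished requests leave the batch after each decode step and waiting requests take their slots right away, instead of the whole batch waiting on its longest member. This is a toy illustration, not the project's actual code. `Request`, `fake_decode_step`, and `continuous_batching` are hypothetical names, and the "model" is a stand-in that emits placeholder tokens.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Request:
    # Hypothetical request record: an id and a generation budget.
    rid: str
    max_new_tokens: int
    generated: list = field(default_factory=list)


def fake_decode_step(batch):
    # Stand-in for one model forward pass: one placeholder token per active request.
    return [f"{r.rid}-t{len(r.generated)}" for r in batch]


def continuous_batching(requests, max_batch_size):
    waiting = deque(requests)
    running = []
    finished_order = []
    steps = 0
    while waiting or running:
        # Refill free slots at every step, not only when the whole batch drains.
        while waiting and len(running) < max_batch_size:
            running.append(waiting.popleft())

        # One decode step for every request currently in the batch.
        for req, tok in zip(running, fake_decode_step(running)):
            req.generated.append(tok)
        steps += 1

        # Evict requests that hit their budget so their slots open up.
        still_running = []
        for req in running:
            if len(req.generated) >= req.max_new_tokens:
                finished_order.append(req.rid)
            else:
                still_running.append(req)
        running = still_running
    return finished_order, steps


if __name__ == "__main__":
    reqs = [Request("a", 3), Request("b", 1), Request("c", 2)]
    print(continuous_batching(reqs, max_batch_size=2))
```

With a batch size of 2, request `c` joins as soon as `b` finishes, so the three requests complete in 3 decode steps. A static scheduler that runs `[a, b]` to completion before starting `[c]` would need 5.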