Hacker News
by
dmarwicke
164 days ago
does this do continuous batching or just static? couldn't tell from the code
1 comment
ubermenchh
162 days ago
yes it does continuous batching along with paged attention and prefix caching. i am also going to be adding some more inference techniques
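for anyone unsure what the static vs. continuous distinction means in practice, here is a toy sketch. it is not taken from the project's code: the function names, the "one token per request per step" model, and the example lengths are all assumptions made for illustration. it only counts decode steps under each scheduling policy and ignores paged attention and prefix caching entirely.

```python
from collections import deque


def static_batching(lengths, batch_size):
    """Decode steps when each batch runs until its longest request finishes.

    `lengths` holds the (hypothetical) number of tokens each request generates.
    """
    steps = 0
    for i in range(0, len(lengths), batch_size):
        # The whole batch is held until its slowest member is done.
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching(lengths, batch_size):
    """Decode steps when a finished request's slot is refilled right away."""
    waiting = deque(lengths)
    running = []  # remaining tokens for each active request
    steps = 0
    while waiting or running:
        # Admit waiting requests into any free slots before the next step.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        # One decode step: every active request emits one token;
        # requests with nothing left are dropped, freeing their slot.
        running = [r - 1 for r in running if r - 1 > 0]
        steps += 1
    return steps


if __name__ == "__main__":
    lengths = [2, 8, 3, 1]
    print(static_batching(lengths, batch_size=2))
    print(continuous_batching(lengths, batch_size=2))
```

with these made-up lengths the static version waits on the 8-token request before starting the second batch, while the continuous version slots the short requests in next to it as slots free up, so it needs fewer total steps.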