by dmarwicke 164 days ago
does this do continuous batching or just static? couldn't tell from the code

yes, it does continuous batching, along with paged attention and prefix caching. i am also going to be adding some more inference techniques
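
For readers unsure what separates the two modes asked about, here is a toy sketch. It is not taken from this project's code: the function names, the example request lengths, and the one-token-per-step model are all assumptions made for illustration. It counts decode steps when a batch must wait for its longest request (static) versus when a freed slot is refilled right away (continuous).

```python
from collections import deque


def static_batching(lengths, batch_size):
    """Decode steps when each batch runs until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        # The whole batch is held until its longest member is done.
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching(lengths, batch_size):
    """Decode steps when a finished request's slot is refilled immediately."""
    waiting = deque(lengths)  # requests not yet scheduled (tokens to generate)
    running = []              # remaining tokens for each active request
    steps = 0
    while waiting or running:
        # Before every step, admit waiting requests into any free slots.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        # One step generates one token per active request; drop finished ones.
        running = [r - 1 for r in running if r - 1 > 0]
        steps += 1
    return steps


# Hypothetical workload: tokens to generate per request (all positive).
lengths = [2, 8, 3, 1]
static_steps = static_batching(lengths, batch_size=2)
continuous_steps = continuous_batching(lengths, batch_size=2)
print(static_steps, continuous_steps)
```

With these made-up lengths and a batch size of 2, the static version takes 11 steps and the continuous version takes 8, because short requests no longer wait behind the long one. Real schedulers also have to handle prefill, memory limits, and preemption, none of which are modeled here.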