tarruda 148 days ago
I'm only interested in the local, single-user use case. Plus, I use a Mac Studio for inference, so vLLM is not an option for me.
mycall 147 days ago
You can still get concurrency gains [0] from vLLM on your Mac Studio in a local, single-user setup, for example when you run multiple agents at once.
[0] https://youtu.be/Ze5XLooTt6g?t=658
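To make the multi-agent point concrete, here is a rough sketch of what "concurrency as a single user" looks like from the client side. Everything in it is an assumption for illustration: `fake_generate` is a hypothetical stand-in that just sleeps, not a real vLLM or server call, and the timings say nothing about actual Mac Studio performance. Whether a real backend overlaps the requests depends on the server; the sleep only simulates one that does.

```python
import asyncio
import time


async def fake_generate(prompt: str) -> str:
    # Hypothetical stand-in for one request to a local inference server.
    # It only sleeps; no real API is called.
    await asyncio.sleep(0.05)
    return f"reply to: {prompt}"


async def run_sequential(prompts):
    # One agent at a time: each request waits for the previous one to finish.
    return [await fake_generate(p) for p in prompts]


async def run_concurrent(prompts):
    # All agents submit at once; the requests are in flight together.
    return await asyncio.gather(*(fake_generate(p) for p in prompts))


def timed(coro_fn, prompts):
    # Run one strategy and return (results, elapsed seconds).
    start = time.perf_counter()
    results = asyncio.run(coro_fn(prompts))
    return results, time.perf_counter() - start


prompts = [f"agent {i} task" for i in range(4)]
seq_results, seq_time = timed(run_sequential, prompts)
con_results, con_time = timed(run_concurrent, prompts)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

With a backend that batches in-flight requests, the concurrent version is where a single user running several agents would see the benefit; with one that serializes them, the two would take about the same time.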