by diego898 551 days ago
This is great, thank you!

Does anyone know of something similar in Python? I want to share something like this with my team: a resource that covers (almost) everything needed, at least conceptually, to serve an LLM efficiently.

It doesn’t actually need to be performant, mind you (it’s in Python). I just need something “conceptually complete” that is more “tutorial style” and concise than the vLLM codebase.

1 comment

Using JAX, you should be able to get good performance out of the box.