by cootsnuck
300 days ago
Super helpful to see actual examples of what it (roughly) can look like to deploy production inference workloads, and also the latest optimization efforts. I consult in this space and clients still don't fully understand how complex it can get to just "run your own LLM".