by jpreagan
345 days ago
I've been using LLMPerf for a while to evaluate the performance of our inference servers (vLLM, SGLang, etc.). It works great, but I ran into memory constraints when testing a large number of concurrent users on some servers, and I didn't always find its specific Python version requirements convenient. So I rewrote the benchmarking part of the tool in Rust to get an easy single-line install. I hope it's useful to others as well, and I'd love to hear feedback if you have any suggestions for improvement.