Hacker News new | ask | show | jobs
by jpmcb 762 days ago
Original author here: thanks for posting.

I'm glad this is making the rounds since I haven't seen alot on the "AI-DevOps" or infrastructure side of actually running an at-scale AI service. Many of the AI inference engines that offer an OpenAI compatible API (like vLLM, llama.cpp, etc.) make it very approachable and cost effective. Today, this vLLM AI service handles all of our batching micro-services which scrape for content to generate text on over 40,000+ repos on GitHub.

I'm happy to answer any / all questions anyone might have!

1 comments

great read. I hope more folks show off the infrastructure of AI. Its cool to see demos of products, but its a breath of fresh air to see how its made.