
https://github.com/persimmon-ai-labs/adept-inference/issues/...

It’s funny you say production, because all of the errors I ran into suggest the container is expecting your production architecture.

My advice is to stream first, then build synchronous convenience wrappers on top of that. Also, lean on community standards for the PoC. I’m guessing your investors are interested in making this scale as cheaply as possible, but that is probably the least important feature for people evaluating your model’s quality locally.
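
To make the stream-first point concrete, here is a rough Python sketch of what I mean. Everything in it is made up for illustration: `stream_generate` and `generate` are hypothetical names, not anything from the adept-inference repo, and the "model" is a stand-in that just echoes words from the prompt. The only point is the layering: the generator is the primitive, and the blocking call is a thin wrapper that drains it.

```python
from typing import Iterator


def stream_generate(prompt: str, max_tokens: int = 8) -> Iterator[str]:
    """Hypothetical streaming primitive: yields one token at a time.

    A real implementation would run a decoding step per iteration;
    this stand-in simply echoes whitespace-separated words of the prompt.
    """
    for word in prompt.split()[:max_tokens]:
        yield word


def generate(prompt: str, max_tokens: int = 8) -> str:
    """Synchronous convenience wrapper: drains the stream and joins the tokens."""
    return " ".join(stream_generate(prompt, max_tokens))


if __name__ == "__main__":
    # Streaming callers consume tokens as they arrive.
    for token in stream_generate("evaluate the model locally"):
        print(token, flush=True)

    # Callers who only want the final text use the wrapper.
    print(generate("evaluate the model locally", max_tokens=2))
```

Going in this direction is cheap. Going the other way, bolting streaming onto an API that only returns the finished string, usually means reworking the core loop.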