by sjnair96 1724 days ago
Ahh, we don't have access to the server. It's closed: an NVIDIA inference engine that talks to their Triton engine under the hood. Unfortunately, while Triton allows configuring the limit, the layer in front of it eats our channel options, which are where the message size configuration lives.
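
For anyone unfamiliar with what "channel options" means here, this is a rough sketch of how the message size limits are normally set on a gRPC client, assuming the Python `grpcio` package. The two option keys are the standard gRPC channel arguments; the helper names, the 64 MiB value, and the target address are made up for illustration and are not from our setup.

```python
from typing import List, Tuple

# Standard gRPC channel-argument keys for message size limits.
MAX_SEND = "grpc.max_send_message_length"
MAX_RECV = "grpc.max_receive_message_length"


def message_size_options(max_bytes: int) -> List[Tuple[str, int]]:
    """Build channel options that set both message size limits to max_bytes."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return [(MAX_SEND, max_bytes), (MAX_RECV, max_bytes)]


def open_channel(target: str, max_bytes: int):
    """Open an insecure channel with the size options applied (needs grpcio)."""
    import grpc  # imported lazily so the helper above works without grpcio

    return grpc.insecure_channel(target, options=message_size_options(max_bytes))


if __name__ == "__main__":
    # Hypothetical 64 MiB limit, chosen only as an example value.
    print(message_size_options(64 * 1024 * 1024))
```

Something like `open_channel("localhost:8001", 64 * 1024 * 1024)` would be the usual call, with a placeholder address. These options only change the client side of the channel. If a wrapper builds the channel itself and does not forward the options, they never take effect, which is roughly the situation described above.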