Just read through and was wondering if you could shed a bit more light on how you’re decreasing latency? I'd also like to learn more about the types of prompting you found to be effective.
Self-hosting is huge - you don't have to wait in queues like everyone else, and you can host in a region close to your other servers. Similarly, just getting on the enterprise tiers of the services you use can guarantee lower latency.
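
If you want to check which region is actually closest before committing, a rough sketch like the one below can help. It is only an illustration: `median_latency`, `rank_regions`, and the region names are hypothetical, and each probe is assumed to be a zero-argument callable that makes one small round trip to a deployment (for example, a tiny request to your own health endpoint).

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple


def median_latency(probe: Callable[[], None], trials: int = 5) -> float:
    """Return the median wall-clock time, in seconds, of calling probe()."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        probe()  # one round trip to the deployment being measured
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def rank_regions(
    probes: Dict[str, Callable[[], None]], trials: int = 5
) -> List[Tuple[str, float]]:
    """Return (region, median_seconds) pairs, fastest first."""
    timings = {
        region: median_latency(probe, trials) for region, probe in probes.items()
    }
    return sorted(timings.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Stand-in probes that only sleep; swap in real requests to your own hosts.
    fake_probes = {
        "near-region": lambda: time.sleep(0.001),
        "far-region": lambda: time.sleep(0.02),
    }
    for region, seconds in rank_regions(fake_probes, trials=3):
        print(f"{region}: {seconds * 1000:.1f} ms")
```

Using the median rather than the mean keeps a single slow request from skewing the comparison. Run it from the machine that will actually be calling the model, since that is the hop you care about.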