Just read through and wondering if you could shed a bit more light onto how you’re decreasing latency? Also would like to learn more about the types of prompting you found to be effective.
Self hosting is huge - you don't have to wait in queues like everyone else, and can host in a region close to other servers. Similarly, just getting on the enterprise tiers of the services you use can also guarantee lower latency.
Yeah, all the safety concerns are very valid. Easiest immediate fixes that we're building into our system are 1) preventing people from scaling before doing KYC, 2) creating a content moderation filter on the AI's prompts