Nice idea. Can I choose a token-reduction strategy based on what I'm optimizing for? For example, I might be OK with a quality drop in exchange for a big cost savings.
Yeah, you can set an approximate target ratio through the API (for example, target_ratio=0.3). The API will still try to maximize quality for that target ratio, and it might keep a couple more tokens to do so.
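To put rough numbers on that, here is a back-of-the-envelope sketch. It is not the actual API: `kept_tokens` and the `slack` parameter are hypothetical helpers made up for illustration, and it assumes target_ratio means the fraction of tokens kept (not the fraction removed).

```python
def kept_tokens(n_tokens: int, target_ratio: float, slack: int = 0) -> int:
    """Estimate how many tokens survive compression.

    Assumption: target_ratio is the fraction of tokens KEPT.
    `slack` is a hypothetical stand-in for the "couple more tokens"
    the service might keep to protect quality.
    """
    if not 0.0 < target_ratio <= 1.0:
        raise ValueError("target_ratio must be in (0, 1]")
    if n_tokens < 0 or slack < 0:
        raise ValueError("n_tokens and slack must be non-negative")
    # Never report more tokens than the original prompt had.
    return min(n_tokens, round(n_tokens * target_ratio) + slack)


def tokens_saved(n_tokens: int, target_ratio: float, slack: int = 0) -> int:
    """Tokens you no longer pay for under the same assumptions."""
    return n_tokens - kept_tokens(n_tokens, target_ratio, slack)


if __name__ == "__main__":
    # A 1,000-token prompt at target_ratio=0.3, with 5 extra tokens kept.
    print(kept_tokens(1000, 0.3, slack=5))
    print(tokens_saved(1000, 0.3, slack=5))
```

So if the ratio works the way I assumed, a 1,000-token prompt at 0.3 lands somewhere just above 300 tokens, and the savings scale with whatever you pay per input token.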
Fair criticism. I replaced that borrowed hero treatment and redeployed the homepage with a native Rose/Adola hero today. It was a bad first impression for a technical audience, especially for something asking people to trust an API in their LLM pipeline.