|
|
|
|
|
by defytonofficial
11 hours ago
|
|
This matches my experience. I've been using OpenRouter with
GPT-4o for an image verification service, and the prompt
engineering choices have a measurable impact on cost. One thing I found: asking the model to respond in structured
JSON (with a strict schema) vs free-form text cuts token output
by ~40% on average. The model stops "explaining itself" and just
gives you the answer. Also noticed that including a reference image in vision calls
roughly doubles the input cost but improves accuracy enough that
you save on retries. Net cost ended up lower for my use case. Curious if you've measured the difference between asking for
"concise" output vs actually constraining the response format. |
|