HN Mirror



	by karpatic 599 days ago
	Great! I wish there was a "bang for the buck" value: some way to know the cheapest model I could use to reliably create structured data from unstructured text. I'm using GPT-4o mini, which is cheap, but I wouldn't know if anything cheaper could do the job too.

3 comments

jampa 599 days ago

Take a look at Gemini 1.5 Flash. I had videos I needed to turn into structured notes, and the result was satisfactory (even better than Gemini 1.5 Pro, for some reason): https://jampauchoa.substack.com/i/151329856/ai-studio

According to this website, it costs about half as much as GPT-4o mini: 0.07 vs. 0.15 per 1M tokens.


nostrebored 599 days ago

Seconding Gemini flash for structured outputs. Have had some quite large jobs I’ve been happy with.


sdesol 599 days ago

I haven't found a model at the price point of GPT-4o mini that is as capable. Based on the hype surrounding Llama 3.3 70B, it might be that one, though. On Deepinfra, its input tokens are more expensive than GPT-4o mini's, but its output tokens are cheaper, so I would say they are probably equivalent in price.
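
To make the input/output trade-off concrete, here is a minimal sketch of how one might compare two models for a given job. The prices below are made-up placeholders, not actual GPT-4o mini or Deepinfra rates, and `Pricing`, `job_cost`, and `break_even_output_ratio` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices for one model (placeholder values only)."""
    input_per_million: float
    output_per_million: float


def job_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Total cost of a job with the given token counts."""
    return (
        input_tokens * p.input_per_million
        + output_tokens * p.output_per_million
    ) / 1_000_000


def break_even_output_ratio(a: Pricing, b: Pricing) -> float:
    """Output/input token ratio at which models a and b cost the same.

    Assumes a has cheaper input and b has cheaper output (or vice versa),
    so the output prices differ.
    """
    return (b.input_per_million - a.input_per_million) / (
        a.output_per_million - b.output_per_million
    )


# Hypothetical prices: model_a has cheaper input, model_b has cheaper output.
model_a = Pricing(input_per_million=0.15, output_per_million=0.60)
model_b = Pricing(input_per_million=0.25, output_per_million=0.40)

if __name__ == "__main__":
    ratio = break_even_output_ratio(model_a, model_b)
    print(f"Costs match at {ratio:.2f} output tokens per input token")
    # An extraction-style job: lots of input text, little structured output.
    print(f"model_a: {job_cost(model_a, 2_000_000, 200_000):.3f}")
    print(f"model_b: {job_cost(model_b, 2_000_000, 200_000):.3f}")
```

With these placeholder numbers, the model with cheaper input wins whenever the job produces fewer output tokens per input token than the break-even ratio, which is typical when extracting short structured records from long text.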

Also, best bang for the buck is very subjective: one person might need the model to work for a single use case, while somebody else needs it for several.


mcbuilder 599 days ago

I always plug openrouter.ai for making cross-model comparisons. It's my general go-to for random stuff. (I am not affiliated, just a user.)


pickettd 599 days ago

I love the idea of openrouter. I hadn't realized until recently, though, that you don't necessarily know what quantization a certain provider is running. And of course context size can vary widely from provider to provider for the same model. This blog post had great food for thought: https://aider.chat/2024/11/21/quantization.html


avereveard 598 days ago

To expand a little, some providers may apply more aggressive optimization in periods of high load.
