| HN Mirror


	by sebzim4500 686 days ago
	You can specify the max length of the response, which presumably includes the hidden tokens. I don't see why this is qualitatively different from a cost perspective than using CoT prompting on existing models.

2 comments

BoorishBears 686 days ago

For one, you don't get to see any output at all if you run out of tokens during thinking.

If you set a limit, once it's hit you just get a failed request, with no insight into where or why the CoT went off the rails.
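To make that failure mode concrete, here is a minimal sketch. It assumes a single cap shared by hidden reasoning tokens and visible output tokens, and that a request whose reasoning is cut off returns no visible text but is still billed. The function name `split_budget` and all numbers are hypothetical, not any vendor's API.

```python
def split_budget(max_tokens: int, reasoning_needed: int, answer_needed: int):
    """Return (reasoning_used, visible_used, billed) under one shared token cap.

    Assumption: hidden reasoning is generated first and counts against the cap;
    if the cap is hit before reasoning finishes, no visible answer is produced.
    """
    reasoning_used = min(reasoning_needed, max_tokens)
    if reasoning_used < reasoning_needed:
        # Cap hit mid-reasoning: nothing visible comes back.
        visible_used = 0
    else:
        # Whatever is left of the cap is available for the visible answer.
        visible_used = min(answer_needed, max_tokens - reasoning_used)
    billed = reasoning_used + visible_used
    return reasoning_used, visible_used, billed


# Reasoning fits: the answer comes back and 800 tokens are billed.
print(split_budget(1000, 600, 200))
# Reasoning overruns the cap: 1000 tokens billed, zero visible output.
print(split_budget(1000, 1500, 200))
```

Under these assumptions the second call is the case described above: the full cap is spent and there is nothing to inspect.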

Aeolun 686 days ago

Why would I pay for zero output? That’s essentially throwing money down the drain.

dartos 686 days ago

You can’t verify that you’re paying what you should be if you can’t see the hidden tokens.

sebzim4500 686 days ago

With the conventional models you don't get the activations or the logits even though those would be useful.

Ultimately, if the output of the model is worth what you end up paying for it, then great; I don't see why it really matters to you whether OpenAI is lying about token counts or not.

dartos 686 days ago

As a single user, it doesn’t really, but as a SaaS operator I want tractable, hopefully predictable pricing.

I wouldn’t just implicitly trust a vendor when they say “yeah we’re just going to charge you for what we feel like when we feel like. You can trust us.”
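For what it's worth, the cap mentioned at the top of the thread does give an upper bound, even if it can't be audited. A rough sketch of that bound, assuming hidden tokens count against the cap and are billed at the output rate. The function name `worst_case_cost` and the prices are made up for illustration, not real pricing.

```python
def worst_case_cost(input_tokens: int, max_output_tokens: int,
                    price_in_per_million: float,
                    price_out_per_million: float) -> float:
    """Upper bound on the cost of one request, in dollars.

    Assumption: hidden reasoning tokens count against max_output_tokens
    and are billed at the output rate, so the cap bounds the bill.
    """
    cost_in = input_tokens * price_in_per_million / 1_000_000
    cost_out = max_output_tokens * price_out_per_million / 1_000_000
    return cost_in + cost_out


# Hypothetical prices: $15 per million input tokens, $60 per million output tokens.
per_request = worst_case_cost(2_000, 4_000, 15.0, 60.0)
monthly_ceiling = per_request * 10_000  # 10,000 requests per month
print(f"per request <= ${per_request:.2f}, per month <= ${monthly_ceiling:.2f}")
```

This only bounds the bill. It says nothing about whether the billed hidden-token count is accurate, which is the part you can't verify.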
