| HN Mirror



	by XenophileJKO 818 days ago
	So my experience creating products on GPT3.5-Turbo is that there is an upper limit to how much instructional complexity the model can handle at a time. It isn't really about "adding computation", though you are doing that too. The key is to construct the process so that the model only has to focus on a limited scope for each decision. In effect you are creating a tree structure of decisions that build off of each other. By generating intermediate tokens, the model only has to pay attention to the smaller set of already-collapsed decisions. It is a little more complicated than that, because the model exhibits anticipatory behavior: intermediate steps can get biased toward an incorrect result that the model is already anticipating.
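
	A minimal sketch of the decision-tree idea in Python, not anything from an actual product. Everything here is an assumption for illustration: `ask` stands in for whatever chat-completion call you use, and `classify_ticket`, `fake_model`, and the ticket-triage task are hypothetical. Each call asks one narrow question with a closed answer set, and later calls only see the answers already settled.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a chat-completion call: prompt in, text out.
Model = Callable[[str], str]


def classify_ticket(ticket: str, ask: Model) -> Dict[str, str]:
    """Resolve one narrow decision per call; later steps build on earlier answers."""
    decisions: Dict[str, str] = {}

    # Step 1: a single question with a closed answer set.
    decisions["category"] = ask(
        f"Ticket: {ticket}\nIs this about billing or technical? Answer with one word."
    ).strip().lower()

    # Step 2: branch on the collapsed decision, so this prompt carries
    # only the instruction relevant to that branch.
    if decisions["category"] == "billing":
        question = "Is the customer asking for a refund? Answer yes or no."
    else:
        question = "Is the service fully down? Answer yes or no."
    decisions["urgent"] = ask(f"Ticket: {ticket}\n{question}").strip().lower()

    # Step 3: the final step sees only the collapsed decisions,
    # not the earlier instructions or the raw ticket.
    summary = ", ".join(f"{key}={value}" for key, value in decisions.items())
    decisions["action"] = ask(
        f"Decisions so far: {summary}\nReply with 'escalate' or 'queue'."
    ).strip().lower()
    return decisions


def fake_model(prompt: str) -> str:
    """Canned replies so the sketch runs without any API (illustrative only)."""
    if "billing or technical" in prompt:
        return "Billing"
    if "refund" in prompt:
        return "yes"
    if "fully down" in prompt:
        return "no"
    return "escalate" if "urgent=yes" in prompt else "queue"


result = classify_ticket("I was charged twice this month.", fake_model)
print(result)
```

	With a real model you would swap `fake_model` for your own API wrapper. Note that this does nothing about the anticipatory bias mentioned above; it only keeps each prompt's scope small.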

1 comment

Also, I should say it isn't just instructional complexity; it is also ambiguity that creates the upper limit on capability.