| HN Mirror



	by itsumoiru 704 days ago
	As you mentioned, the data processing inequality[1] applies here, but I imagine synthetic data could help training squeeze out more from the existing data. [1] https://en.m.wikipedia.org/wiki/Data_processing_inequality
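	To make the inequality concrete, here is a small numerical sketch. It assumes a toy Markov chain X -> Y -> Z of single bits, where each arrow is a binary symmetric channel; the flip probabilities (0.1 and 0.2) and all function names are made up for illustration. It only checks that I(X;Z) <= I(X;Y), i.e. that a second processing step cannot add information about X.

```python
import math

def mutual_information(joint):
    """I(A;B) in bits, for a joint distribution given as joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (pa[a] * pb[b]))
    return mi

def push_through(joint, channel):
    """Given joint[x][y] and channel[y][z] = P(z|y), return joint[x][z]."""
    nz = len(channel[0])
    return [
        [sum(row[y] * channel[y][z] for y in range(len(row))) for z in range(nz)]
        for row in joint
    ]

def binary_symmetric(flip):
    """Transition matrix of a channel that flips a bit with probability `flip`."""
    return [[1 - flip, flip], [flip, 1 - flip]]

# Assumed toy setup: X is a fair bit, Y is a noisy copy of X,
# and Z is a further-processed (noisier) copy of Y.
p_x = [0.5, 0.5]
ch_xy = binary_symmetric(0.1)
joint_xy = [[p_x[x] * ch_xy[x][y] for y in range(2)] for x in range(2)]
joint_xz = push_through(joint_xy, binary_symmetric(0.2))

i_xy = mutual_information(joint_xy)
i_xz = mutual_information(joint_xz)
print(f"I(X;Y) = {i_xy:.3f} bits, I(X;Z) = {i_xz:.3f} bits")
```

	The inequality only bounds the information content; it says nothing about how easy that information is to extract, which is where synthetic data might still help.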

2 comments

worstspotgain 704 days ago

It's neat how a longer "digestive tract" loses entropy, but can make up for it by making more sense of things. It's akin to adding an NN layer, to using a more computationally intensive lossy compression algorithm, or to asking an LLM to explain the problem domain and the relevant variables (populating attention) before getting to the point.

It's probably true for people too. Instead of asking an expert for an opinion right away, ask them to discuss the options out loud first.


skybrian 704 days ago

There are probably a lot of applications where the LLM could rely more on data that's supplied to it just-in-time in the context window, and less on specialist knowledge from its training set.
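A rough sketch of what "supplied just-in-time" could look like: pick the most relevant snippets for a question and put them in the prompt. Everything here is hypothetical (the function names, the word-overlap scoring, the prompt wording), and no real LLM API is called; a real system would likely use a proper retriever instead of word overlap.

```python
import re

def tokenize(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_prompt(question, snippets, k=2):
    """Rank snippets by word overlap with the question and keep the top k."""
    q = tokenize(question)
    ranked = sorted(snippets, key=lambda s: len(q & tokenize(s)), reverse=True)
    context = "\n".join(f"- {s}" for s in ranked[:k])
    return (
        "Use only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Made-up snippets standing in for data fetched at query time.
snippets = [
    "The return window is 30 days.",
    "Shipping takes 5 business days.",
    "Our office is in Lisbon.",
]
prompt = build_prompt("How long is the return window?", snippets, k=1)
print(prompt)
```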

Also, "natural" data taken from the Internet is probably quite inefficient as training material. It's going to have a lot of duplication. You only need each fact once to be able to synthesize more examples of it.
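As a minimal sketch of removing that duplication, assuming "duplicate" just means identical text after lowercasing and collapsing whitespace (near-duplicates and paraphrases would need something fuzzier). The function names and the sample corpus are invented for illustration.

```python
import hashlib

def normalize(text):
    """Lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())

def dedupe(docs):
    """Keep the first occurrence of each document, comparing normalized text."""
    seen = set()
    kept = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(doc)
    return kept

# Made-up corpus: the second entry differs only in case and spacing.
corpus = [
    "Water boils at 100 C at sea level.",
    "water boils at 100 C   at sea level.",
    "The Moon orbits the Earth.",
]
unique = dedupe(corpus)
print(len(corpus), "->", len(unique))
```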
