| HN Mirror

	by stri8ed 951 days ago
	Do existing LLMs not already train on this data?

2 comments

brlewis 951 days ago

The linked tweet has a diagram that makes it pretty clear this isn't just about using Wikidata as a training set. The paper linked from the tweet also gives a good summary on its first page.

kfrzcode 951 days ago

Nope. Training data for the big LLMs is a corpus of text, not structured data. As far as I understand, structured data would involve much more dimensionality in terms of parameterization.
