by stri8ed 951 days ago
Do existing LLMs not already train on this data?
2 comments

The linked tweet has a diagram that makes it pretty clear this isn't just about using Wikidata as a training set. The paper linked from the tweet also gives a good summary on its first page.
Nope. Training data for the big LLMs is a corpus of text, not structured data. As far as I understand, structured data would involve much more dimensionality with regard to parameterization.