by seanwilson
310 days ago
> What animal is featured on a flag of a country where the first small British colony was established in the same year that Sweden's King Gustav IV Adolf declared war on France? ... My point is that if all knowledge were stored in a structured way with rich semantic linking, then very primitive natural language processing algorithms could parse question like the example at the beginning of the article, and could find the answer using orders of magnitude fewer computational resources.

So as well as people writing posts in English, they would need to provide semantic markup for all the information like dates, flags, animals, people, and countries? It's difficult enough to get people to use basic HTML tags and accessible markup properly, so what was the plan for how this would scale, specifically to non-techy people creating content?
This has actually happened already, and it's part of why LLMs are so smart. I haven't tested this, but I'd venture a guess that without Wikipedia, Wikidata, Wikipedia clones, and stolen articles, LLMs would be quite a lot dumber. You can only get so far with Reddit posts and the basic information embedded in higher-order articles.
My guess is that when fine-tuning and modifying weights, the lowest-hanging fruit is to overweight Wikipedia sources and reduce the weight of sources like Reddit.