by speedgoose
984 days ago
I would have Wikipedia and a dump of some of the most important research papers (from sci-hub?). If size isn't a limit, a copy of the latest Common Crawl dataset. If size is really restricted and it's only one file, then I would seriously consider LLaMa2 70B. It hallucinates, but I don't think you can find anything that packs more knowledge into about 100GB.

Your LLaMa2 suggestion was thought-provoking and has merit. There might be some path forward with something like that, even if only with a neutral knowledge-steward AI acting as the Interface of the Database.
AIDBI?