by asboans 311 days ago
It would be fun to train an LLM with a knowledge cutoff of 1900 or something
4 comments

Someone tried this; I saw it on one of the Reddit AI subs. They were training a local model on whatever they could find that was written before $cutoffDate.

Found the GitHub: https://github.com/haykgrigo3/TimeCapsuleLLM
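For illustration only, here is a rough sketch of the "only text written before $cutoffDate" filtering step. It is not taken from the linked repo. The `Document` type, the `filter_by_cutoff` helper, and the sample entries are all hypothetical, and it assumes each document comes with a reliable publication year, which is usually the hard part in practice.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical record type; a real corpus would need its own metadata.
    title: str
    year: int  # publication year, assumed known and trustworthy
    text: str


def filter_by_cutoff(docs, cutoff_year):
    """Keep only documents published strictly before cutoff_year."""
    return [d for d in docs if d.year < cutoff_year]


# Made-up sample entries, purely for demonstration.
corpus = [
    Document("sample_a", 1859, "..."),
    Document("sample_b", 1895, "..."),
    Document("sample_c", 1916, "..."),
]

pre_1900 = filter_by_cutoff(corpus, 1900)
print([d.title for d in pre_1900])
```

Anything with an uncertain date (later reprints, modern introductions, OCR'd editions with editor's notes) would leak post-cutoff text, so a stricter pipeline would probably drop those rather than guess.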

That’s been done to see if it could extrapolate and predict the future. Can’t find the link right now to the paper.
This one? "Mind the Gap: Assessing Temporal Generalization in Neural Language Models" https://arxiv.org/abs/2102.01951
The idea matches, but 2019 is a far cry from, say, 1930.
In 1930 there was not enough information in the world for consciousness to develop.
You mean information in digestible form.
I think this is a meta-allusion to the theory that human consciousness developed recently, i.e. that people who lived before [written] language did not have language because they actually did not think. It's a potentially useful thought experiment, because we've all grown up not only knowing highly performant languages, but also knowing how to read / write.

However, primitive languages were... primitive. Were they primitive because people didn't know / understand the nuances their languages lacked? Or were those things simply not communicated (effectively)?

Of course, spoken language predates writing, which is part of the point. We know an individual can have a "conscious" conception of an idea if they communicate it, but that consciousness was limited to the individual. Once we have written language, we can perceive a level of communal consciousness of certain ideas. You could say that the community itself had a level of shared consciousness.

With GPTs regurgitating digestible writings, we've come full circle in terms of proving consciousness, and some are wondering... "Gee, this communicated the idea expertly, with nuance and clarity... but is the machine actually conscious? Does it think independently of the world, or is it merely a kaleidoscopic reflection of its inputs? Is consciousness real, or an illusion of complexity?"

Llama are not conscious
Not sure we have enough data for any pre-internet date.
That would be hysterical