Hacker News
by javajosh 475 days ago
That only captures your output, not your input. The best people to simulate in this world would be so-called terminally online people, virtually all of whose input is itself online. For those who've read a lot of paper books, done a lot of traveling, or had a lot of offline conversations or relationships, I think it would be difficult to truly simulate them.
2 comments

I think aggregate information across billions of humans can compensate. It would be like a human personality model that can impersonate anyone. How do you train such a model? Simple:

Collect texts with known author and date. They can be books, articles, papers, forum and social network comments, emails, open source PRs, etc. Then assign each author a random ID, and train the model with "[Author-ID, Date] Text", and also "Text [Author-ID, Date]". This means you have a model that can predict authors and impersonate them. You can simulate someone by filling in the missing pieces of knowledge from the personality model.
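To make the data format concrete, here is a rough sketch of how the two orderings could be generated. The names `Doc`, `assign_author_ids`, and `make_examples` are made up for illustration, and the `A000123`-style ID scheme is just one assumption about what a "random ID" could look like; only the "[Author-ID, Date] Text" and "Text [Author-ID, Date]" layouts come from the description above.

```python
import random
from dataclasses import dataclass


@dataclass
class Doc:
    author: str  # real author name; never written into the training text
    date: str    # date string, e.g. "2023-05-01"
    text: str


def assign_author_ids(docs, seed=0):
    """Map each distinct author to a random, opaque ID (hypothetical scheme)."""
    rng = random.Random(seed)
    authors = sorted({d.author for d in docs})
    # Sampling without replacement guarantees the IDs are distinct.
    numbers = rng.sample(range(10**6), len(authors))
    return {a: f"A{n:06d}" for a, n in zip(authors, numbers)}


def make_examples(docs, author_ids):
    """Emit both orderings of tag and text for every document."""
    examples = []
    for d in docs:
        tag = f"[{author_ids[d.author]}, {d.date}]"
        examples.append(f"{tag} {d.text}")  # tag first: condition on author, generate text
        examples.append(f"{d.text} {tag}")  # tag last: read text, predict author and date
    return examples
```

With the tag first, the model learns to write as a given author at a given date; with the tag last, it learns to guess who wrote a passage and when. Whether random IDs are enough to keep real identities out of the model is a separate question this sketch doesn't address.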

Currently LLMs don't learn to assign attribution or condition on author. A whole layer of insight is lost: how people compare against each other, and how they evolve over time. Training with author and date tags would allow more precise conditioning by personality profile.

While I agree somewhat with my sibling comment's assertion that "aggregate information across billions of humans can compensate", I'd like to offer that a lot of important output is non-digital as well!

For example, lately I've spent a lot of time with resin printers, laser cutters, vacuum chambers, and the meaningful positioning of physical models on large sheets of paper. It'll be a while yet before my haphazard, freewheeling R&D methods are replicable by robots. (Although it's tough to measure the economic value of these labors.)