by sphinxster 1283 days ago
Thank you for this interesting insight I haven't seen before.

Are there any datasets out there that provide the full edit stream of a human from idea to final refinement, that a model could be trained on?

1 comments

REPL transcripts (e.g. bash sessions, Python REPL sessions, etc.) tend to be pretty good demonstrations of "working up to a conclusion". And, not coincidentally, putting GPT in a REPL environment yields better results.

Other good examples are narratives that include a lot of internal monologue. Think of a book written in the form:

> The sphinx asked him, "A ham sandwich costs $1.10. The ham costs $1 more than the bread. How much does the bread cost?"

> He thought carefully. He knew the sphinx asked tricky problems. If the ham costs a dollar more than the bread, the bread couldn't possibly be more than 10 cents. But if the bread was 10 cents, the ham would be $1.10 and the total would be $1.20. That can't be. We need to lose 10 cents, and it has to be divided evenly between the ham and bread to maintain the dollar offset. So the ham must be $1.05 and the bread must be $0.05. He answered the sphinx confidently, "The bread is $0.05!"
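
The arithmetic in that monologue can be checked mechanically. Here is a minimal sketch in Python; the function name `solve_sandwich` and the variable names are made up for illustration and are not from any library. It solves the two constraints (ham + bread = total, ham = bread + offset) and also reproduces the tempting wrong guess that the monologue rules out.

```python
from fractions import Fraction


def solve_sandwich(total, offset):
    """Return (ham, bread) given ham + bread == total and ham == bread + offset."""
    # Substituting ham = bread + offset into the total gives:
    #   2 * bread + offset == total
    bread = (total - offset) / 2
    ham = bread + offset
    return ham, bread


# Exact fractions avoid floating-point noise in the cents.
total = Fraction("1.10")
offset = Fraction("1.00")

ham, bread = solve_sandwich(total, offset)

# The tempting wrong guess: bread = total - offset (10 cents).
naive_bread = total - offset
naive_total = (naive_bread + offset) + naive_bread

print(f"ham = ${float(ham):.2f}, bread = ${float(bread):.2f}")
print(f"total under the naive guess = ${float(naive_total):.2f}")
```

The naive guess overshoots the total by 10 cents, and splitting that excess evenly between the two items is the same step the character takes in the story.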