by ForTheKidz 462 days ago
If ChatGPT is scraping the web, why can't they link each token back to the source it came from? Being able to cite where they learned something would explode the value of their chatbot, by at least a couple of orders of magnitude. Without this, chatbots are mostly a coding-autocomplete tool for me. Lots of people have takes, but it's the tying into the internet that makes a take from an unknown entity really valuable.

Perplexity certainly already approximates this (not sure if it's at a token level, but it can cite sources; I just assumed they were using RAG).
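For what it's worth, here is a minimal sketch of why a RAG-style setup can cite sources at all: the citations come from the retrieval step, not from anything stored in the model's weights. This is not how Perplexity actually works (I don't know their internals); the corpus, the example.com URLs, and the function names are all made up for illustration, and the "generation" step just echoes the retrieved text where a real system would call an LLM.

```python
# Toy retrieval-augmented answer with citations.
# CORPUS, its URLs, and every function name here are hypothetical.
import math
from collections import Counter

CORPUS = {
    "https://example.com/chickens": "chickens are birds raised for meat and eggs",
    "https://example.com/llm": "language models predict the next token from context",
    "https://example.com/rag": "retrieval augmented generation fetches documents before answering",
}

def vectorize(text):
    # Bag-of-words term counts; a real system would use learned embeddings.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, k=1):
    # Rank documents by similarity to the query and keep the top k.
    q = vectorize(query)
    ranked = sorted(
        CORPUS.items(),
        key=lambda item: cosine(q, vectorize(item[1])),
        reverse=True,
    )
    return ranked[:k]

def answer_with_citations(query, k=1):
    hits = retrieve(query, k)
    context = " ".join(text for _, text in hits)
    # Assumption: a real system would pass `context` to an LLM here.
    # The sources are known only because we looked the documents up ourselves.
    return {"answer": context, "sources": [url for url, _ in hits]}

result = answer_with_citations("how do language models predict the next token")
print(result["sources"])
```

The point being that the cited URL is whatever the retriever fetched, which is a document-level citation and says nothing about where any individual token was "learned".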


That's asking for the life stories and photos and pedigrees and family histories of all the chickens that went into your McNuggets. It's just not the way LLMs work. It's an enormous vat of pink slime of unknown origins, blended and stirred extremely well.

https://en.wikipedia.org/wiki/Pink_slime

You can sort of do (a decent approximation of) this; it's just that even the approximate version is impractical for computational reasons.

https://www.anthropic.com/news/influence-functions
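
To make the cost concrete, here is a toy sketch of the classic influence-function formula, influence = -grad(test loss) * H^-1 * grad(train loss), on a one-parameter model y = w * x with squared loss. It is only meant to illustrate the idea, not the method in the linked post; the data and function names are made up. With one parameter the Hessian H is a single number. With billions of parameters, H is far too large to form or invert exactly, and you still need a gradient for every training example, which is where the computational pain comes from.

```python
# Toy influence scores for y = w * x with loss 0.5 * (w*x - y)^2.
# All names and data below are hypothetical, for illustration only.

def fit(train):
    # Closed-form least-squares solution for the single weight w.
    sxx = sum(x * x for x, _ in train)
    sxy = sum(x * y for x, y in train)
    return sxy / sxx

def grad(w, x, y):
    # Derivative of 0.5 * (w*x - y)^2 with respect to w.
    return (w * x - y) * x

def hessian(train):
    # Second derivative of the mean training loss: mean of x^2.
    return sum(x * x for x, _ in train) / len(train)

def influences(train, test_point):
    # Score for each training point: how the test loss would change
    # if that point were slightly upweighted during training.
    w = fit(train)
    h = hessian(train)
    g_test = grad(w, *test_point)
    return [-g_test * grad(w, x, y) / h for x, y in train]

train = [(1.0, 1.0), (2.0, 2.0), (3.0, 9.0)]  # last point is an outlier
scores = influences(train, test_point=(1.0, 1.0))
for point, score in zip(train, scores):
    print(point, round(score, 3))
```

A positive score means upweighting that training point would raise the loss on the test point, and a negative score means it would lower it. Here the outlier gets the largest score by magnitude, which is the kind of "this training example mattered most" signal you would want for attribution.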