| HN Mirror



	by mattnewton 1620 days ago
	That's really cool. But unless I'm misunderstanding this, it still puts the burden on the existing web, right? It just avoids having to retrain the model. If there is no economic market for humans to produce new content about a topic, how will the search engine find the "ground truth" content?

visarga 1620 days ago

You might want to use a limited subset of the web: a curated list of sources or feeds. Apparently 1 TB of text could be enough; you just need to collect it or download it from a trusted source.

mattnewton 1620 days ago

So, suppose there is a new kind of cocktail that is popular in bars near me, and nobody has written about it under its new trendy name.

How do I ask this system about the recipe, or the history of the cocktail? Someone has to write an article about it, right? How do they get paid if it gets scraped once and people go to the scraping model for the answer instead of visiting the original article's page?
