by fennecfoxy
347 days ago
With that opinion, are you also suggesting that we ban ad blockers? Because it's better that I not click and consume resources at all than click and not be served ads, which basically just costs the host money. It makes sense to allow RAG in the same way that search engines provide a snippet of an important chunk of the page. A blog author can hardly complain that their blog is getting RAGged when they're very likely searching Google/whatever all day and consuming others' content in exactly the same way that they're trying to disparage.

I get that everyone wants data, but presumably the big players have already scraped the web. Do they really need to do it again? Or is it bit players reproducing data that's likely already in the training set? Or is it really that valuable to have your own scraped copy of internet-scale data?
I feel like I'm missing something here. My expectation is that RAG traffic is going to be orders of magnitude higher than scraping for training. Not that it would be easy to measure from the outside.