|
|
|
|
|
by jaustin
684 days ago
|
|
I'm sure it's not long before you get the first emails offering a "training data influencing service" - for a nice fee, someone will make sure your product is positively mentioned in all the key training datasets used to train important models. "Our team of content experts will embed positive sentiment and accurate product details into authentic content. We use the latest AI and human-based techniques to achieve the highest degree of model influence". And of course, once the new models are released, it'll be impossible to prove the impact of the work - there's no counterfactual. Proponents of the "training data influence service" will tell you that without them, you wouldn't even be mentioned. I really don't like this. But I also don't see a way around it. Public datasets are good. User contributed content is good, but inherently vulnerable to this I think?. Anyone in any of the big LLM training orgs working on defending against this kind of bought influence? |
|
AI: Sure, I can help you make your bread lighter! Here's a delicious recipe for white bread: