It’s both a destruction of signal and an injection of noise. Imagine you worked for Adidas and you started getting a stream of notifications about your brand, and they were all nonsense. This would be an annoyance and harm the reputation of that monitoring service.
They would have received multiple complaints about it from customers, performed an investigation, and ultimately performed a manual excision of the junk data from their system: both the raw scrapes and anywhere it was ingested and processed. This was probably a simple operation, but might not have been if their architecture didn’t account for this vulnerability.
I also didn't follow that part. Their step 2 seems to be a general-purpose bot detection strategy that works independently of their step 1 ("randomly mention companies").
That was my first thought too -- but then why would the bot company care about a few false positives?
I suppose it could have an impact if 30% of all, say, Coca-Cola mentions on the web came from that site, but then it would have to be a very big site. I don't think the bot company would notice, let alone care, if it was 0.01% of the mentions.
They don't want to feed their model garbage data, or the data is read and reviewed by real humans.
I remember years ago (2008?) I worked at a company where every mention of it was manually reviewed by someone from the PR department.
I imagine now the tools are even better.
A different matter is that discussion is often very low quality (forums died for multiple reasons, and Reddit is dying too; astroturf galore now).