Hacker News
by tobyjsullivan 1578 days ago
We can only speculate, but I notice Grammarly has a plagiarism detection feature[0].

> Ensure your work is fresh and original by checking it against 16 billion web pages.

How do they know what text is on 16B web pages? Presumably they have a web crawler of some sort.

[0] https://www.grammarly.com/plans

2 comments

> Presumably they have a web crawler of some sort.

Can confirm. Caught one of their bots on my site and called them out about it on Twitter.

They did not respond.

> Ensure your work is fresh and original

The page linked doesn't contain the text.

On a desktop browser, click the "Plagiarism detected" benefit under one of the plans; the text will show up as a tooltip.

That alone doesn't indicate they collected the 16 billion documents themselves, of course.