I fear that scrapers will just run a Unicode-to-ASCII/cp1252 converter over the scraped text to clean it. Yes, it makes scraping one step more expensive, but on the other hand the Unicode injection gives legitimate use cases a hard time.
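To make that concrete, here is a rough sketch of the kind of cleanup step a scraper could bolt on. It uses only Python's standard `unicodedata` module. The function name `clean_scraped_text` is made up for illustration, and I'm assuming the injection relies on invisible format characters or look-alike compatibility characters; other tricks would need other handling.

```python
import unicodedata


def clean_scraped_text(text: str) -> str:
    """Hypothetical helper: reduce scraped text to plain ASCII."""
    # NFKD folds compatibility characters (fullwidth letters, ligatures)
    # into their plain forms and splits accents off their base letters.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop invisible format characters (category Cf, e.g. zero-width
    # spaces and joiners) and combining accent marks (category Mn).
    kept = "".join(
        ch for ch in decomposed
        if unicodedata.category(ch) not in ("Cf", "Mn")
    )
    # Anything still outside ASCII is silently discarded.
    return kept.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(clean_scraped_text("he\u200bllo w\u200dorld"))
```

Note that the last step is lossy: it would also throw away legitimate non-Latin text, so a real scraper would probably stop after removing the format characters.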
I was about to say: tricks like this work for a bit and then become useless pretty quickly. At the end of the day, they generally cause far more problems for the humans trying to access the system.
Though LLMs are the new hot thing, people tend to forget that we've had GANs for a long time, and fighting 'anti-LLM' behavior can be automated.