I wonder why the script doesn't just try to parse the metadata (and maybe H1) from the submitted site? That would impartially cover most cases, I think.
If a title is misleading or linkbait, we're meant to adjust it on submission. Automatic parsing would be a good fit if the rule were simply that you must use the original title, but it's a bit more nuanced than that and does seem to need some human oversight.
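
For what it's worth, the parsing idea could look roughly like the sketch below. It uses only Python's standard `html.parser`. The names `TitleCandidates` and `suggest_title` are made up for illustration, and the preference order (`og:title`, then `<title>`, then the first `<h1>`) is my own assumption, not anything the actual script does.

```python
from html.parser import HTMLParser


class TitleCandidates(HTMLParser):
    """Collect og:title, <title>, and the first <h1> from an HTML document."""

    def __init__(self):
        super().__init__()
        self.found = {}        # source name -> extracted text
        self._capture = None   # tag whose text is currently being collected
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:title":
            content = (attrs.get("content") or "").strip()
            if content:
                self.found.setdefault("og:title", content)
        elif tag in ("title", "h1") and tag not in self.found and self._capture is None:
            # Start collecting text for the first <title> or <h1> only.
            self._capture = tag
            self._buf = []

    def handle_data(self, data):
        if self._capture:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._capture:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buf).split())
            if text:
                self.found[tag] = text
            self._capture = None


def suggest_title(html):
    """Return a suggested title string, or None if nothing usable was found."""
    parser = TitleCandidates()
    parser.feed(html)
    parser.close()
    # Assumed preference order; a real script might weigh these differently.
    for source in ("og:title", "title", "h1"):
        if source in parser.found:
            return parser.found[source]
    return None
```

Even then, the result would only be a prefilled suggestion. Someone still has to decide whether the page's own title is misleading or linkbait and edit it.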