Why even use an LLM? A classifier is perfectly suited to this kind of thing, and classifiers aren't new. As far as I can tell, that's what is often used in the real world, and it's incredibly cheap compared to an LLM, so GitHub/$OTHER_PLATFORM could totally run one on everything posted. They could even use a classifier as a first filtering step, then run a smarter model only on the flagged comments. (Then let a human double-check, right? Right?)
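To make the "cheap filter first, smarter model second, human last" idea concrete, here's a rough sketch. Everything in it is made up for illustration: the keyword weights stand in for a real trained classifier, `fake_smart_model` stands in for whatever expensive model you'd actually call, and the 0.5 threshold is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical spam markers and weights, for illustration only.
# A real first stage would be a trained classifier, not a keyword list.
SPAM_WEIGHTS = {"free": 0.3, "crypto": 0.4, "click here": 0.5, "giveaway": 0.4}


def cheap_score(comment: str) -> float:
    """Stage 1: stand-in for a cheap classifier. Returns a score capped at 1."""
    text = comment.lower()
    score = sum(w for marker, w in SPAM_WEIGHTS.items() if marker in text)
    return min(score, 1.0)


@dataclass
class Verdict:
    stage1_score: float
    escalated: bool           # did we pay for the expensive model?
    needs_human_review: bool  # did the expensive model also flag it?


def triage(comment: str,
           smart_model: Callable[[str], bool],
           threshold: float = 0.5) -> Verdict:
    """Run the cheap stage on everything; escalate only flagged comments."""
    score = cheap_score(comment)
    if score < threshold:
        # Most comments should stop here, so the expensive model is rarely called.
        return Verdict(score, escalated=False, needs_human_review=False)
    flagged = smart_model(comment)
    # Nothing is removed automatically: a flag only queues the comment for a human.
    return Verdict(score, escalated=True, needs_human_review=flagged)


def fake_smart_model(comment: str) -> bool:
    """Placeholder for the expensive second stage (assumption, not a real model)."""
    return "http" in comment.lower()


if __name__ == "__main__":
    for c in ["Thanks for the fix, works on my machine",
              "FREE crypto giveaway, click here: http://example.test"]:
        print(c, "->", triage(c, fake_smart_model))
```

The point of the shape is cost: the second stage only runs on whatever the first stage flags, and the final call still goes to a person.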
With OpenAI's GPT-4, sure. On the other hand, running a small, fine-tuned LLM shouldn't be that resource-intensive.
It may not be any better than the current generation of spam detection, but it will not require rule updates, at least not that frequently.