I’m guessing reviewers have very limited time for each review. I don’t know what the internal process looks like, but I imagine it’s an automatically generated screen listing the app’s permission requests, the justification for each, and a few other details. I would be surprised if they had time to actually use any of the apps they review, or if they even have a phone to try apps on.
This seems like exactly the kind of result I would expect from LLM automation. I would never trust any system that used LLM output without human review. Actually, I don’t think I would trust any system that used LLM output even with human review.