What works better for us is to spin the application up, poke at it with HTTP requests, and look for non-200 response codes.
The tradeoff is it's not as comprehensive. But giving users something that is 95% working and making it easy for them to fix those issues appears to be the best user experience we've found so far.
It's a test in the sense that it's meant to validate functionality. You're correct there.
The endpoints we poke at are provided as context when creating the application.
Our approach evolved to be more liberal about what is required to pass. So instead of looking for an HTML element with id="foo", we accept a 200 HTTP response code. It's a subtle change, but it made a huge improvement to the end-user experience.
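To make that concrete, here's a rough sketch of what a check like this could look like in Python, using only the standard library. This is not our actual code: the function name `check_endpoints`, its parameters, and the failure format are made up for illustration. It assumes the app is already running and that you have the list of paths to hit.

```python
import urllib.error
import urllib.request


def check_endpoints(base_url, paths, timeout=5.0):
    """Return (path, problem) pairs for every endpoint that did not answer 200.

    Hypothetical helper: the name, arguments, and return shape are assumptions.
    """
    failures = []
    for path in paths:
        url = base_url.rstrip("/") + path
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                status = resp.status
        except urllib.error.HTTPError as exc:
            # urlopen raises on 4xx/5xx; the status code lives on the exception.
            status = exc.code
        except OSError as exc:
            # Connection refused, DNS failure, timeout: the app is unreachable.
            failures.append((path, f"request failed: {exc}"))
            continue
        # Deliberately liberal: only the status code is checked, not the body.
        if status != 200:
            failures.append((path, f"HTTP {status}"))
    return failures
```

You would call it with something like `check_endpoints("http://localhost:8000", ["/", "/login"])`, where the URL and paths are placeholders. An empty list means every endpoint returned 200. Anything else is a short list of things to hand back to the user to fix.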