The best definition of "undefined behavior" is: "if some other company created a clone of this console by following the spec to the letter, but otherwise making all their own decisions... would this game run?"
Of course, if you don't have cloners immediately nipping at your heels, you might end up releasing a smash hit that takes advantage of UB in a way that forces you to define the behavior as "it works however it needs to work to keep that game working", because eventually you yourself (the original console manufacturer) will tape out later revisions of the CPU, and you'll want to make sure they can run that game, given that it's "officially licensed" by you.
If you had the right static analysis tools in play, though, you could run them over the game, notice the use of UB, and fail the game at the QA stage, before things escalate to that point.
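To make the QA-stage idea concrete, here is a toy sketch of such a check. Everything in it is invented for illustration: the three-instruction ISA, the `DOCUMENTED` table, and `find_undocumented` are hypothetical and do not describe any real console or any real tool.

```python
# Toy ISA, invented for illustration: opcode -> total instruction length in bytes.
# Anything not in this table counts as "not defined by the spec".
DOCUMENTED = {
    0x01: 1,  # NOP
    0x02: 2,  # LOAD imm8
    0x03: 3,  # JUMP addr16
}


def find_undocumented(rom: bytes) -> list[tuple[int, int]]:
    """Return (offset, opcode) for every opcode the toy spec does not define.

    Assumes the ROM is straight-line code starting at offset 0. A real tool
    would have to follow control flow and tell code apart from data.
    """
    findings = []
    pc = 0
    while pc < len(rom):
        opcode = rom[pc]
        length = DOCUMENTED.get(opcode)
        if length is None:
            findings.append((pc, opcode))
            length = 1  # length unknown: resynchronise on the next byte
        pc += length
    return findings


# 0xFF at offset 1 is an operand of LOAD, so only 0xAB at offset 3 is flagged.
rom = bytes([0x02, 0xFF, 0x01, 0xAB, 0x03, 0x00, 0x80])
print(find_undocumented(rom))  # [(3, 171)]
```

A check like this only catches the easy case (undocumented opcodes). Timing-dependent tricks, which is what most of this thread is about, would need something closer to a cycle-accurate model of the documented behavior.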
I meant "best" as in "most practical for a third-party dev when deciding what behaviors to take advantage of." Defined behaviors are guaranteed to work in both 1. clone consoles that conform to the spec, and 2. future revisions of the official hardware. Undefined behaviors aren't.
I think you're confusing "revision" with "generation." Revisions are the difference between e.g. the original SNES, and the 1CHIP (SoC) SNES console, not between the original SNES and the SNES Classic.
Even though the 1CHIP SNES is a complete ground-up re-layout (has to be, turning a bunch of individual chips into one SoC), every (officially licensed) cartridge made for "the SNES" works on a 1CHIP SNES.
A hardware revision like this has to retain bug-for-bug UB compatibility with officially sanctioned, released game titles, because to do otherwise is to doom the console's support line to endless complaints from people whose games don't work. But a hardware revision is not concerned with keeping UB working the same way where no officially sanctioned, released title took advantage of it. For everything not constrained by bug-for-bug compat with an existing title, all that's required of the new revision is that it keeps to the spec.
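Put as a rule you could check, that looks roughly like the sketch below. The behavior names and the `revision_ok` helper are made up for the example; they are not taken from any real console documentation.

```python
def revision_ok(old, new, spec, used_by_licensed_titles):
    """Check a new hardware revision against the rule described above.

    old, new: dicts mapping a behavior name to what that revision does.
    spec: dict mapping a behavior name to the result the spec requires
          (defined behaviors only).
    used_by_licensed_titles: set of behaviors that shipped, licensed
          games rely on, whether or not the spec defines them.
    """
    # 1. Everything the spec defines must still match the spec.
    for behavior, required in spec.items():
        if new.get(behavior) != required:
            return False
    # 2. Anything a licensed title relies on must match the old hardware,
    #    bug for bug, even if the spec says nothing about it.
    for behavior in used_by_licensed_titles:
        if new.get(behavior) != old.get(behavior):
            return False
    # 3. Everything else is free to change.
    return True


# Hypothetical behaviors, for illustration only.
spec = {"sprite_dma": "copies 256 bytes"}
old = {
    "sprite_dma": "copies 256 bytes",
    "mid_frame_scroll_write": "splits the screen",  # UB, used by a hit game
    "open_bus_read": "returns last byte on bus",    # UB, used by nobody licensed
}
used = {"mid_frame_scroll_write"}

# Changing UB that no licensed title uses is allowed.
new_a = dict(old, open_bus_read="returns zero")
print(revision_ok(old, new_a, spec, used))  # True

# Changing UB that a licensed title relies on is not.
new_b = dict(old, mid_frame_scroll_write="ignored")
print(revision_ok(old, new_b, spec, used))  # False
```

An unlicensed title that depended on `open_bus_read` would break on `new_a`, and by this rule that is simply not the manufacturer's problem.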
If you were a third-party in the middle of developing a new title, and were relying on some new UB you found, when suddenly your console mfgr released a new revision, it could turn out that your clever UB hack won't work the same on the new revision. (And this happened; it was why unlicensed titles—your Game Genies, your Aladdin Deck Enhancers, etc.—frequently wouldn't work with later console revisions.)
There's a fascinating story of Pilotwings being (very gently) bitten by a very slightly different hardware revision, which causes the plane in the idle animation to just barely crash rather than land safely: http://www.nintendolife.com/news/2019/05/random_the_captivat...
The argument that GP seems to be making is that only relying on defined behavior ensures that your game will work on all NES systems made by Nintendo as well as any clone systems that follow the specifications that Nintendo issued for programmers on the NES.
Does anyone actually care? This was embedded software designed to run on exactly one system, not some libc designed to run on 20 architectures ranging from 8-bit microcontrollers to VLIW supercomputers.
What I find interesting is looking for creative ways to prioritize the game experience even when the 'official spec' didn't support it. They could have given up by accepting that the spec represented the limit of what could be done, but instead pushed to find a better way.
I hadn't put thought into how important that scroll effect is to the game, but if there was a clean wipe between scenes it would have been tremendously distracting. This technique really is essential to the feeling of immersion.
I don't agree with the suggestion that there is no difference (other than word semantics) between relying on empirical observations and relying on a privately communicated piece of spec.
The one playing semantics is you, since your apparent concern is the definition of the term "undefined behavior" and whether something falls under that definition.
My point is about how well an engineering decision is justified, not what term applies to it according to some document.
We have no idea if it was even undefined in the first place, since the PPU documentation is not public.
You are correct, though: guarantees about behavior can often be made after the fact. Generally what happens these days is someone determines they need or want to use a certain undefined behavior. The hardware people then go look at the RTL for the chip, confirm it behaves in a certain way, and then they UPDATE the documentation to explicitly document the newly guaranteed behavior.
Conversely, often something that is documented to work just doesn't, especially in the case of console developers, who tend to be using early steppings of custom silicon. In that case, when you go to the HW team, sometimes they fix it in the next stepping and sometimes they document it as errata and move on.