imo the more usual concern would be mistakenly removing code that was not dead.
Without a design document, we might conclude that there are no senders of #factorial, not realizing that the method was meant to be invoked from the command line.
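To make the #factorial point concrete, here is a rough sketch in Python rather than Smalltalk. Everything in it is made up for illustration: the functions `factorial`, `double`, and `run`, and the idea that the entry point looks the function up by the name given on the command line. It only shows that a naive "find senders" scan cannot tell a truly unused function from one that is reached by name at run time.

```python
import ast

# Hypothetical program under analysis. `run` is the entry point and receives
# the function name as text, the way a command-line invocation would.
SOURCE = """
def factorial(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

def double(n):
    return n * 2

def run(argv):
    return globals()[argv[0]](int(argv[1]))
"""

def statically_called_names(source):
    """Names that appear as the direct target of a call, e.g. f(x)."""
    tree = ast.parse(source)
    return {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }

def defined_functions(source):
    """Names of the top-level functions defined in the source."""
    tree = ast.parse(source)
    return {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}

def apparently_dead(source):
    """Functions with no direct caller anywhere in the source."""
    return defined_functions(source) - statically_called_names(source)

# Load the program and invoke `factorial` by name, as the command line would.
namespace = {}
exec(SOURCE, namespace)

print(sorted(apparently_dead(SOURCE)))
print(namespace["run"](["factorial", "5"]))
```

The scan reports `factorial` and `double` alike as having no callers (and the entry point `run` as well), yet `factorial` is clearly live. Deleting on the basis of such a report would remove code that was not dead.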
For some meaning of "contain things it shouldn't":
Given that the base image we started working with didn't "contain things it shouldn't", why could we not be completely sure that base image + source code fileIn didn't "contain things it shouldn't"?
First, like most other companies using ST, we were not working on the original image, but on a company-specific one that had been in use for a long time and contained a lot of company-specific stuff. On the other hand, there were also things in the original images of the commercial STs that one would not or should not ship with the product.
Not sure what you're getting at; there were things in the image for which nobody had the source code (or the current version) anymore, but even with the source code it was a nightmare. It would undoubtedly have been less bad if I had had tools like the ones I recently built for ST80, and the knowledge gained with them.
That was my personal impression as a trained engineer based on relevant experience, not a condemnation. I, on the other hand, find it annoying when some people glorify Smalltalk in retrospect and attribute all sorts of great qualities to it without evidence and against their better knowledge. Nevertheless, the technology is impressive in a historical context and worth studying (but I still wouldn't use it for industrial projects anymore, and would not want to tempt other people to either).
We tested until our ears fell off; in particular, we also had to test for all the things that the compiler (or linker) would otherwise have told us in a statically typed language. There was no other way, which might explain why I switched directly to Ada for some time after ST. But how do you test for dead code (i.e. things in the image that are not used but are still there for some reason)? It is not for nothing that people speak of a "Big Ball of Mud" in the context of image-based languages.
> … no algorithm that can identify all dead code reliably…
Why were you so concerned with the removal of all dead code?
Compared to "… nobody had the source code…" it seems like a minor issue.