Hacker News
by Sugimot0 1149 days ago
I know uutils/coreutils uses the old tests, which makes sense: that way you cover most of the intentional behavior. A more thorough method could be to have a capable LLM like GPT-4 generate a large set of random scripts, run each one in identical VMs that differ only in which of the 2 binaries is installed, and then log and diff each script's behavior.
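To make the log/diff step concrete, here is a rough sketch of what I mean, not a real harness. It assumes the two builds live in two separate directories and swaps which one comes first on PATH, instead of using full VMs, and it takes the script as a plain string however it was generated. The names run_script and diff_behavior are made up for illustration.

```python
import os
import subprocess
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    returncode: int
    stdout: bytes
    stderr: bytes


def run_script(script: str, bin_dir: str, timeout: float = 10.0) -> Outcome:
    """Run a shell script with bin_dir first on PATH, inside a fresh scratch directory."""
    env = dict(os.environ)
    # Assumption: bin_dir holds one of the two implementations under test.
    env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
    env["LC_ALL"] = "C"  # pin the locale so message and sort differences are not noise
    with tempfile.TemporaryDirectory() as workdir:
        proc = subprocess.run(
            ["sh", "-c", script],
            cwd=workdir,
            env=env,
            capture_output=True,
            timeout=timeout,
        )
    return Outcome(proc.returncode, proc.stdout, proc.stderr)


def diff_behavior(script: str, dir_a: str, dir_b: str) -> list[str]:
    """Return the names of the observable fields that differ between the two runs."""
    a = run_script(script, dir_a)
    b = run_script(script, dir_b)
    return [
        field
        for field in ("returncode", "stdout", "stderr")
        if getattr(a, field) != getattr(b, field)
    ]
```

An empty list means the two builds looked identical for that script. A PATH swap gives much weaker isolation than the identical VMs, so generated scripts should still be treated as untrusted, and stderr will probably need some normalization since error wording can legitimately differ between implementations.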