Neat! I came up with a similar statistic (3.65 average moves, hard mode, no failures) using a very simple heuristic.
1) pick a first guess, and partition the remaining words by the hints that they give
2) given any part of a partition, pick a word that minimizes the maximum-size part of the resulting sub-partition
3) repeat (2) until words are solved.
I brute-forced the first guess (that is, generated a tree rooted at each word), and the best one was 'bland'. That has one failure, getting stuck on the chain (hound, mound, pound, sound), which ended up being easy to fix manually.
My tree is quite different from yours, with guess distribution 1, 103, 861, 1114, 212, 24.
Yeah I'm on easy mode. With hard mode on I had some failures with words that only differ by 1 letter. I was happy with the result though and didn't care to try and optimize the hardmode.
I think your heuristic is pretty similar to mine, just accomplished in a different way.
My tree is quite different from yours, with guess distribution 1, 103, 861, 1114, 212, 24.