interesting - I was trying to think of a backtracking system as well, but wasn't able to create something efficient. Did it work well for large dense crosswords?
I had good success using a DAWG for pruning, and letter-wise (rather than word-wise) branching, picking the letter with fewest options. If you want to get fancy, you can use back-jumping instead of backtracking.