Hacker News
by bzbarsky 4541 days ago
It means to use, as the key for max, 3 * (number of things this pattern matches) - (number of characters in pattern). The basic idea is that patterns that match more things are good, while patterns that are long are bad; the pattern with the maximal score for that expression matches more things than the others, is shorter than the others, or both.

As for why the "3": good question. It would be interesting to see what happens with other relative weights for the number of matches and the pattern length.
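
For anyone who wants to try other weights, here is a minimal sketch of that scoring key in Python. The helper names (`matches`, `score`, `best_pattern`), the `weight` parameter, and the sample strings and candidate patterns are all made up for illustration; this is not Norvig's actual code, just the expression described above with the 3 pulled out into a parameter.

```python
import re

def matches(pattern, strings):
    """Return the set of strings in which the regex pattern matches somewhere."""
    return {s for s in strings if re.search(pattern, s)}

def score(pattern, winners, weight=3):
    """weight * (number of winners matched) - (number of characters in pattern)."""
    return weight * len(matches(pattern, winners)) - len(pattern)

def best_pattern(patterns, winners, weight=3):
    """Pick the candidate with the maximal score; max() calls the key once per candidate."""
    return max(patterns, key=lambda p: score(p, winners, weight))

# Hypothetical sample data, chosen only to show the tradeoff.
winners = {"adams", "grant", "hayes", "obama"}
candidates = ["ma", "s$", "ad|gr|ha"]

for w in (3, 7):
    print(w, best_pattern(candidates, winners, weight=w))
```

With this sample data, a weight of 3 favors the short `s$` (two matches, two characters), while a weight of 7 favors the longer `ad|gr|ha` (three matches, eight characters), so the weight really does decide how much length you are willing to pay per extra match.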

1 comment

Norvig did say why he chose 3: "I may have chosen a bad tradeoff. (I arbitrarily decided that matching a winner is 3 times more important than spending a character (because a disjunction seems to take about 3 characters on average).)"