| HN Mirror


by greenkey 2025 days ago

That’s awesome!

In comparison, here is a quote from the OP’s blog entry:

“Fast forward to today. A program to load /usr/share/dict/words into a hash table is 3-5 lines of Perl or Python, depending on how terse you mind being. Looking up a word in this hash table dictionary is a trivial expression, one built into the language. And that's it. Sure, you could come up with some ways to decrease the load time or reduce the memory footprint, but that's icing and likely won't be needed. The basic implementation is so mindlessly trivial that it could be an exercise for the reader in an early chapter of any Python tutorial.

That's progress.”
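For concreteness, the kind of program the quote describes might look like the sketch below. The names `load_words` and `is_known` and the lowercasing step are assumptions for illustration, not anything taken from the blog entry, and the path is simply the one the quote mentions (it may not exist on every system).

```python
from pathlib import Path

def load_words(path="/usr/share/dict/words"):
    """Read one word per line into a set, i.e. a hash table of strings."""
    with open(path, encoding="utf-8", errors="replace") as f:
        # Lowercasing is an assumption of this sketch, not part of the quote.
        return {line.strip().lower() for line in f if line.strip()}

def is_known(word, words):
    """Look a word up; the `in` operator is the built-in expression."""
    return word.lower() in words

if __name__ == "__main__":
    default_path = Path("/usr/share/dict/words")
    if default_path.exists():
        words = load_words(default_path)
        print(len(words), is_known("hello", words))
```

The loading part really is a handful of lines; everything else here is wrapping to make it reusable.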

But is a simpler, less efficient method progress? Sure, it allows more words to be added/removed with ease, and I don’t want to advocate over-optimization, but the solution you made for the Spectrum seems better because words don’t change much. Why don’t we use a similar specialized hash and compressed dictionary format to increase spellchecking speed and allow more words in less space? We could still produce that format from /usr/share/dict/words and similar sources.

2 comments

RobAley 2025 days ago

> Why don’t we

Because we don't need to and we have much more interesting problems to take up our time.


darkwater 2025 days ago

But GP already solved the problem (at least for English and other Latin script languages). Why throw away those findings?


devenblake 2025 days ago

Problems tend to have more than one solution. GP's solution should be documented, yes, but the alternate solution that won out was computers being capable of storing a million words or so in plaintext very easily, and doing the same using their compression scheme just isn't really worth the space saved nowadays.


ASalazarMX 2024 days ago

Also, compressing could actually be slower for modern computers. Remember when compressing your hard disk made your PC faster, up until disks became faster, then it actually made it slower?

Today's CPUs are very fast, so the trend could have flipped again. That would be an interesting benchmark.
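A rough way to try such a benchmark is sketched below. It is only a sketch under stated assumptions: the word list is synthetic (`make_word_list` is a made-up helper), zlib stands in for whatever compression scheme one would actually use, and the files are small enough to sit in the OS page cache, so this mostly measures CPU cost rather than real disk reads. Results will vary by machine.

```python
import os
import tempfile
import timeit
import zlib

def make_word_list(n=100_000):
    # Synthetic stand-in for a real dictionary file (an assumption of this sketch).
    return "\n".join(f"word{i}" for i in range(n)).encode("ascii")

def load_plain(path):
    # Read the plain file and split it into a set of words.
    with open(path, "rb") as f:
        return set(f.read().split(b"\n"))

def load_compressed(path):
    # Read the zlib-compressed file, decompress, then split into a set.
    with open(path, "rb") as f:
        return set(zlib.decompress(f.read()).split(b"\n"))

def run_benchmark(n=100_000, repeats=5):
    data = make_word_list(n)
    with tempfile.TemporaryDirectory() as d:
        plain = os.path.join(d, "words.txt")
        packed = os.path.join(d, "words.z")
        with open(plain, "wb") as f:
            f.write(data)
        with open(packed, "wb") as f:
            f.write(zlib.compress(data))
        # Both loaders must yield the same set of words.
        assert load_plain(plain) == load_compressed(packed)
        return {
            "plain_bytes": os.path.getsize(plain),
            "packed_bytes": os.path.getsize(packed),
            # Best of several single runs, to reduce timing noise.
            "plain_seconds": min(timeit.repeat(lambda: load_plain(plain), number=1, repeat=repeats)),
            "packed_seconds": min(timeit.repeat(lambda: load_compressed(packed), number=1, repeat=repeats)),
        }

if __name__ == "__main__":
    print(run_benchmark())
```

To say anything about slow storage you would have to run it against cold files on the actual device, which this does not attempt.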


RobAley 2025 days ago

Implementation takes time. Keep it simple and move on.


nenadst 2025 days ago

I always thought that we still use a trie or (to save memory) a ternary search tree for that.
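For anyone who hasn't seen one, a minimal trie might look like the sketch below. The class names are made up for illustration, and this is the plain dict-per-node variant, not a ternary search tree. Note that in Python a trie built this way is not guaranteed to be smaller than a plain set of strings; any memory saving depends on how the nodes are represented.

```python
class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if the path to this node spells a word

class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for w in words:
            self.insert(w)

    def insert(self, word):
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                # Shared prefixes reuse existing nodes; only new suffixes allocate.
                nxt = TrieNode()
                node.children[ch] = nxt
            node = nxt
        node.is_word = True

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        # A prefix of a stored word is not itself a word unless marked.
        return node.is_word
```

Lookup is then `"cart" in trie`, walking one node per character.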


greenkey 2025 days ago

How many operations and objects? The method he’s talking about seems more efficient for the purpose, since in the plain hash version all of the strings are still created even if they are never used.
