by achempion 1420 days ago
Great work

I'm working on a similar dictionary app and found Wiktionary insanely usable as a dictionary source.

Here is one more project that aims to make Wiktionary data usable as a JSON data structure: https://github.com/tatuylonen/wiktextract

It links to https://kaikki.org/, a site that hosts dictionary data dumps.

1 comment

Thanks! Yeah, I've seen a few similar projects (particularly ones written in Java, which I wasn't excited about). That looks like a nice project. It's written in Python, and it says processing can take anywhere from an hour to several days depending on the computer. They also don't recommend running it on a Mac.

I don't have up-to-date benchmarks, but my project is written in Go and designed to be as parallel as possible. It's broken into multiple pipeline steps (splitting the Wiktionary dump, lexing, parsing, resolving, etc.) with a strong emphasis on performance, so I'd assume it's faster, but I'd need to run a head-to-head test to be sure.
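
To give a rough idea of the pipeline shape, here is a minimal sketch, not the project's actual code. The `page` and `entry` types and the `split`, `lexAndParse`, and `runPipeline` functions are all hypothetical names made up for this example, and the "parsing" is just whitespace tokenizing standing in for real lexing and parsing. The resolving step is left out.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// page is a hypothetical unit of work: one entry split out of the dump.
type page struct {
	Title string
	Body  string
}

// entry is a hypothetical parsed result.
type entry struct {
	Title  string
	Tokens []string
}

// split feeds pages into a channel (a stand-in for splitting the dump).
func split(pages []page) <-chan page {
	out := make(chan page)
	go func() {
		defer close(out)
		for _, p := range pages {
			out <- p
		}
	}()
	return out
}

// lexAndParse fans the work out across a fixed number of workers.
// Tokenizing on whitespace is a placeholder for real lexing and parsing.
func lexAndParse(in <-chan page, workers int) <-chan entry {
	out := make(chan entry)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range in {
				out <- entry{Title: p.Title, Tokens: strings.Fields(p.Body)}
			}
		}()
	}
	// Close the output only after every worker has finished.
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// runPipeline wires the stages together and sorts results by title,
// because workers finish in a nondeterministic order.
func runPipeline(pages []page, workers int) []entry {
	var res []entry
	for e := range lexAndParse(split(pages), workers) {
		res = append(res, e)
	}
	sort.Slice(res, func(i, j int) bool { return res[i].Title < res[j].Title })
	return res
}

func main() {
	pages := []page{
		{"run", "verb to move quickly"},
		{"cat", "noun a small feline"},
	}
	for _, e := range runPipeline(pages, 4) {
		fmt.Println(e.Title, len(e.Tokens))
	}
}
```

Each stage only talks to its neighbors through channels, so stages run concurrently and the worker count can be tuned per stage without touching the rest.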