Hacker News
by akudha 1419 days ago
Dictionaries are one of those things that should be free, at least the digital version. This is 2022 - shouldn't governments pay the dictionary creators using taxpayer money and make the digital version available for free, to anyone for any use?

How much does it cost to maintain a dictionary anyway? A few million dollars at most? It is crazy that a ton of projects don't even get started because these APIs are so expensive and unfriendly.

4 comments

Dictionaries are more free today than ever before. It's actually surprising how free they are given that (good) dictionaries are actually very labour intensive to create.
Why should governments get to decide which words we use? Unless you are a plusgood citizen making agitprop for the proles.
It is the case in France. We have the Académie Française, an institution that edits an "official" dictionary. Unlike other dictionaries that describe the words that are in current usage, this one is supposed to act as a reference for "proper French".

The Académie Française is not a ruling body anymore; however, there is a committee that decides which words to use, following a very bureaucratic process. French people are still free to use French the way they want (thankfully!), but the committee's choices are mandatory for official government communication.

And in case you are wondering, the ones who decide are usually famous French writers who got an honorific position for their past work. They tend to be completely out of touch with the modern world, and with a bureaucratic process that doesn't help, the result is more silly than manipulative.

  > Unless you are a plusgood citizen making agitprop for the proles.
what?
1984 newspeak
Almost all Newspeak, but “agitprop” is Russian and predates the book.
That's a strange take.

They won't; they would fund some other org or dept to do the work. Same way you have a public education system, or public health care.

A dictionary is a creative, curated and editorial work. While I could see, in the US for example, the National Endowment for the Humanities supporting the work, it is very far removed from something like the National Institute of Standards and Technology.

It is not a given that such work is best handled through the public purse.

It’s also not clear it won’t work.

Governments fund the arts all the time. Why is there automatically an assumption that this will end with the government editing the dictionary to control speech?

I just made something like that: https://public.law/dictionary

Scraping government sites for glossaries, mashing up the definitions to create a free comparative international dictionary.

What’s wrong with Wiktionary, which the author used for this?
Sounds like it was a bit tedious (and expensive) to hammer the server for 30 hours... vs. a downloadable database (which, granted, could get out of date).
Yeah, I wonder why they didn’t just download it from https://dumps.wikimedia.org/enwiktionary/ per https://en.wiktionary.org/wiki/Help:FAQ.
The source they used (freeDictionary) does all of the work of parsing Wiktionary and giving you a simple JSON object for each word. (It actually started life as a way to get the definitions out of Google's "define:" operator, but it seems those days are over.)

It also turns requests for some words, like "rolling", into definitions for their root word, like "roll", even when Wiktionary has distinct and useful definitions for the word, which makes it less than ideal for me.

Turning "rolling" into "roll" is something a stemmer does. I'd imagine there are a number of JavaScript libraries readily available for this.
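For illustration, here is a minimal sketch of that idea in Python (not JavaScript, and not any particular library). `naive_stem` is a hypothetical toy suffix stripper, far cruder than a real stemmer such as Porter's, and `lookup` is an assumed helper showing one way to prefer a word's own entry and fall back to the stem only when no such entry exists. The `entries` dictionary is made-up sample data.

```python
from typing import Dict, Optional, Tuple


def naive_stem(word: str) -> str:
    """Strip one common English suffix. A toy stand-in for a real stemmer."""
    w = word.lower()
    for suffix in ("ing", "ed"):
        # Require at least three letters of stem so "sing" and "red" survive.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            stem = w[: -len(suffix)]
            # Undo consonant doubling ("runn" -> "run"), but keep
            # doubled vowels and ll/ss/zz ("roll", "free").
            if stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    # Crude plural handling: "rolls" -> "roll", but leave "glass" alone.
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]
    return w


def lookup(word: str, entries: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (headword, definition), preferring the exact word over its stem."""
    if word in entries:
        return word, entries[word]
    stem = naive_stem(word)
    if stem in entries:
        return stem, entries[stem]
    return None


if __name__ == "__main__":
    # Made-up sample data, only to show the fallback order.
    entries = {
        "roll": "to move by turning over",
        "rolling": "moving in undulations",
    }
    print(naive_stem("rolling"))       # roll
    print(lookup("rolling", entries))  # ('rolling', 'moving in undulations')
    print(lookup("rolled", entries))   # ('roll', 'to move by turning over')
```

Checking for the exact word first avoids the problem described above, where a word with its own useful entry gets silently replaced by its root.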
I wondered the same. The main downside is that you need to do some processing to extract the entries from the dump and get the plain text of the fields you want.
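To give a rough idea of that processing, here is a minimal Python sketch. It assumes the dump is the usual MediaWiki XML export (`<page>` elements containing `<title>` and `<text>`) and that definition lines in the wikitext start with `# `. The helper names `iter_pages` and `definitions` are hypothetical, and the markup stripping is deliberately crude: templates are simply dropped, which loses information a real parser would keep.

```python
import io
import re
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator, List, Tuple

LINK = re.compile(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]")  # [[target|shown]] -> shown
TEMPLATE = re.compile(r"\{\{[^{}]*\}\}")              # {{...}} -> dropped


def local_name(tag: str) -> str:
    """Drop the XML namespace, which differs between export schema versions."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream: BinaryIO) -> Iterator[Tuple[str, str]]:
    """Stream (title, wikitext) pairs without loading the whole dump."""
    title, text = None, None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            if title is not None and text is not None:
                yield title, text
            title, text = None, None
            elem.clear()  # free the finished page's subtree


def definitions(wikitext: str) -> List[str]:
    """Pull top-level definition lines and strip the most common markup."""
    found = []
    for line in wikitext.splitlines():
        if line.startswith("# "):  # skips "#:" examples and "#*" quotations
            plain = TEMPLATE.sub("", line[2:])
            plain = LINK.sub(r"\1", plain)
            plain = plain.replace("'''", "").replace("''", "")
            found.append(" ".join(plain.split()))
    return found


# Tiny hand-written stand-in for a dump, for demonstration only.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
<page><title>roll</title><revision><text># {{lb|en|intransitive}} To [[move]] by [[turn|turning]] over.
#: ''The ball rolled away.''</text></revision></page>
</mediawiki>"""

if __name__ == "__main__":
    for title, text in iter_pages(io.BytesIO(SAMPLE)):
        print(title, definitions(text))
```

For a real dump, the same `iter_pages` should work on a stream opened with `bz2.open(path, "rb")`, where `path` points at the downloaded pages-articles file. The exact file name is whatever the dumps page lists.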

I'm also a little surprised they didn't think Wiktionary was sufficient for languages apart from English. I could be wrong, but my impression is that it's pretty good for major languages[1].

1. https://meta.wikimedia.org/wiki/Wiktionary