Hacker News
by lupire 833 days ago
The selection is expressive.

https://www.copyright.gov/comp3/chap300/ch300-copyrightable-...

The Originality Requirement for Compilations

A compilation may contain several distinct forms of authorship:

• Selection authorship involved in choosing the material or data that will be included in the compilation;

• Coordination authorship involved in classifying, categorizing, ordering, or grouping the material or data; and/or

• Arrangement authorship involved in organizing or moving the order, position, or placement of material or data within the compilation as a whole


Indeed. The creator of Wordle started with a reasonably exhaustive list of ~13,000 five-letter words. That list won't be copyrightable.

But the first prototype wasn't so fun, because it would often pick a word the player didn't even know existed.

So the creator (and his partner) manually classified the entire list based on whether they knew each word, splitting the list into two groups: words that might appear as solutions, and words that won't appear as solutions but will still be accepted as guesses.

This manual classification, and the resulting split into two groups, makes a very good argument for the wordlists meeting the criteria for copyright.

Source: https://slate.com/culture/2022/01/wordle-game-creator-wardle...

What's your basis for that? I'm very skeptical. Intuitively, whether a word is within the working vocabulary of a sample of the population is an objective fact, not creative expression.

Do you know of any case law to the contrary?

And, as it turns out, it was the author's girlfriend who categorized each of the words. Not the author. If there is copyright in the selection (which I doubt), NYT doesn't appear to own it.

The case law is linked above: "Feist Publications, Inc., v. Rural Telephone Service Co.", the lawsuit that sets some minimum guidelines for what counts as a copyrightable arrangement of facts. And that standard is pretty low; it basically just requires some kind of authorship.

The courts care about the method used to create the collection. You are right that if the wordlist had been created by selecting the top ~2000 words from an objective vocab survey or a frequency-of-use list, then it's unlikely to be eligible.

But the Wordle wordlist wasn't created that way; it was "authored". They also filtered out offensive words.
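
For contrast, here is a minimal sketch (in Python) of the purely mechanical kind of selection described above: taking the top N five-letter words from a frequency table. The function name, the `frequencies` mapping, and the cutoff of 2000 are all hypothetical, chosen for illustration; nothing here reflects how the Wordle lists were actually built.

```python
def top_n_five_letter_words(frequencies: dict[str, int], n: int = 2000) -> list[str]:
    """Return the n most frequent five-letter words.

    `frequencies` is an assumed mapping of word -> usage count, e.g. from
    a corpus frequency list. Ties are broken alphabetically so the result
    is fully determined by the input data.
    """
    # Keep only purely alphabetic words of exactly five letters.
    candidates = [
        (word, count)
        for word, count in frequencies.items()
        if len(word) == 5 and word.isalpha()
    ]
    # Sort by descending count, then alphabetically for ties.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in candidates[:n]]


if __name__ == "__main__":
    # Toy data, made up for the example.
    sample = {"crane": 50, "slate": 70, "house": 70, "zzzzz": 1, "cat": 900}
    print(top_n_five_letter_words(sample, n=2))
```

Anyone running this on the same frequency table gets the same list, which is the sense in which such a result is arguably closer to a fact than to a selection someone authored.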

> as it turns out, it was the author's girlfriend who categorized each of the words. Not the author. If there is copyright in the selection (which I doubt), NYT doesn't appear to own it.

That's a bold claim. Nobody has seen the Wordle sale paperwork, but I'm willing to bet that the lawyers went out of their way to make sure the copyright of the wordlist was assigned to NYT and the creator's partner was fairly compensated.

One of the primary reasons for the sale was that the creator didn't want to deal with all the clones, so they would have brought in expensive intellectual property lawyers to make sure the sale was done right.

I am the one who cited Feist.

You have cited no case law to support your wild, speculative claim about how it applies in this case.

You have cited no factual source for your wild, speculative claims that Wardle's partner was deemed to have a copyright interest in the word list or transferred such interest to NYT.

Sigh.

If you refuse to actually read the findings of Feist (or at the very least the Wikipedia page [1], which does a good job of summarising the ruling and its implications), then I'm not really sure I can be bothered to repeat and expand upon the above explanations of how it applies to this case.

To quote Wikipedia:

The ruling has major implications for any project that serves as a collection of knowledge. Information (facts, discoveries, etc.) from any source is fair game, but cannot contain any of the "expressive" content added by the source author. That includes not only the author's own comments, but also their choice of which facts to cover, which links to make among the bits of information, the order of presentation (unless it is something obvious like alphabetical), evaluations of the quality of various pieces of information, or anything else that might be considered the author's "original creative work" rather than mere facts.

The key part for this case is "their choice of which facts to cover".

> You have cited no factual source for your wild, speculative claims that Wardle's partner was deemed to have a copyright interest in the word list or transferred such interest to NYT.

How can I? As I said, nobody has seen the paperwork, so there is no factual source that says either way. And it really doesn't matter. What does matter is the possibility that NYT do have the correct paperwork.

There is no way to be sure about the possibility that the wordlist might be copyrighted (or not) and who actually owns the copyright, short of a full court case on this exact issue.

I'm not a lawyer. But I suspect any intellectual property lawyer who was asked about this topic would advise their client against using the official Wordle wordlists. Not because they know for sure, but out of caution.

Besides, it's really not that hard for someone to derive their own wordlist from first principles (as you have pointed out above). We are only talking about a few days of effort if they take the same approach of manual classification, and the peace of mind from closing a possible legal vulnerability is (in my opinion) more than worth it.

[1] https://en.wikipedia.org/wiki/Feist_Publications,_Inc.,_v._R....

> Sigh. If you refuse to actually read the findings of Feist...

Well, since I've litigated this issue in federal court (with a major credit bureau as our client), I feel pretty confident I have read Feist in its entirety quite a few times. Perhaps you should reconsider your approach here.

> the peace of mind from closing a possible legal vulnerability

This is moving the goalposts. The advice I would give a client is a question of acceptable legal risk and cost-benefit analysis. By contrast, you claimed that there was "a very good argument for the wordlists meeting the criteria for copyright," which is a different question that sounds solely in legal analysis.

I have only done a cursory search, but I am not aware of any case law that establishes that a list of words based on whether the word is known, rather than on a creative editorial decision, is amenable to copyright. When asked, you became emotional and condescending, rather than providing any support for your position. As it stands, there appears to be no basis in law or fact to support your "very good argument."