by CodesInChaos 1910 days ago
1. Is a database like that even copyrightable, especially in the US?

> United States: Uncreative collections of facts are outside of Congressional authority under the Copyright Clause (Article I, § 8, cl. 8) of the United States Constitution, therefore no database right exists in the United States. Originality is the sine qua non of copyright in the United States (see Feist Publications v. Rural Telephone Service). https://en.wikipedia.org/wiki/Database_right#United_States

2. I'm skeptical that using a GPLed database makes this library a derivative work of that database, though the "distribute as a part of the whole" clause still applies:

> These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works

> But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.

2 comments

> 1. Is a database like that even copyrightable, especially in the US?

Yes, collections of data are very much copyrightable, especially in the US.

This is not just a list of mime-types. It is a list of mime-types and instructions on how to detect those mime-types.

I would have interpreted simple patterns (e.g. value x at offset y) as non copyrightable facts about the file format.

Complex patterns could be problematic though, since you could argue they are original programs.

See the Olson Timezone database[1] as another example of "simple patterns" that are very much copyrightable.

The act of curating a collection of what may be "simple facts" creates a copyrightable work.

A farmer's almanac of seasons and weather patterns is copyrightable, even though the bare facts that it tabulates are not.

[1]:(https://en.wikipedia.org/wiki/Tz_database#2011_lawsuit)

> Olson Timezone database

That lawsuit was dismissed; in fact, the article you linked says as much.

I stand corrected -- I was working from memory and didn't spot that development. Thanks!

I'll have to do a bit more digging to see if my original point still holds, even though the example I used to illustrate it doesn't.

E.g. see https://www.dmlp.org/legal-guide/works-not-covered-copyright -

> there may be situations in which a compilation of facts may be protected if the creator of the original publication selected, coordinated, or arranged the facts in an original way. For example, a sports almanac may arrange baseball scores in a creative way, a genealogy chart may arrange birth dates in an original way, or a cookbook may arrange ingredients in a creative and original way as part of its recipes. In each of those instances, the creator of the work would have a copyright in the creative arrangement of the facts, but not the facts themselves.

Though https://www.copyright.gov/circs/circ33.pdf says about recipes,

> the Office cannot register recipes consisting of a set of ingredients and a process for preparing a dish. In contrast, a recipe that creatively explains or depicts how or why to perform a particular activity may be copyrightable. A registration for a recipe may cover the written description or explanation of a process that appears in the work, as well as any photographs or illustrations that are owned by the applicant

So I'm not clear where the boundary actually is on this one.

Is that bit about recipes why so many recipes online come with a story?

I took a closer look at this database and library.

The actual patterns are very simple and standardized:

* The base case is checking whether a certain byte string can be found within a given offset range
* Patterns form a tree where all patterns from the root to one leaf need to match, which amounts to a restricted form of expressing "AND" and "OR" expressions
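To make the two points above concrete, here is a rough sketch in Python of how I read that matching model. The `Pattern` class, its field names, and the RIFF example are my own illustration, not the library's actual code or data format, and I'm assuming the offset range is inclusive.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pattern:
    """Hypothetical node: `value` must start at some offset in [start, end]."""
    value: bytes
    start: int
    end: int  # assumed inclusive
    children: List["Pattern"] = field(default_factory=list)

    def matches_here(self, data: bytes) -> bool:
        # Base case: try the byte string at every offset in the range.
        return any(
            data.startswith(self.value, offset)
            for offset in range(self.start, self.end + 1)
        )

    def matches(self, data: bytes) -> bool:
        # AND along a root-to-leaf path, OR across sibling subtrees.
        if not self.matches_here(data):
            return False
        if not self.children:
            return True
        return any(child.matches(data) for child in self.children)


# Illustrative tree: "RIFF" at offset 0 AND ("WAVE" OR "AVI " at offset 8).
riff = Pattern(b"RIFF", 0, 0, [Pattern(b"WAVE", 8, 8), Pattern(b"AVI ", 8, 8)])
is_riff_media = riff.matches(b"RIFF\x00\x00\x00\x00WAVEfmt ")
```

There isn't much room for choice here: once you fix "byte string, offset range, children", the matcher more or less writes itself.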

So it looks like there is very little space for originality in expressing these patterns.

* It doesn't appear to be a curated database, but rather aims for completeness (i.e. the selection or arrangement shouldn't be covered by copyright)
* Mime types and extensions are also very simple facts which can't be expressed in an original way
* The human friendly format allows a bit more freedom, but is still quite limited

IANAL, but I'd guess this database is not copyrightable in the US, but protected in the EU since it recognizes database rights.

https://en.wikipedia.org/wiki/Database_right

Anytime you publish something, it is copyrighted. The data within may not be, but my presentation of it in a certain database certainly is.

> When is my work protected?

> Your work is under copyright protection the moment it is created and fixed in a tangible form that it is perceptible either directly or with the aid of a machine or device.

Source: https://www.copyright.gov/help/faq/faq-general.html

The relevant question is "What does copyright protect?"

> Copyright, a form of intellectual property law, protects original works of authorship including literary, dramatic, musical, and artistic works, such as poetry, novels, movies, songs, computer software, and architecture. Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed. See Circular 1, Copyright Basics, section "What Works Are Protected."

You need to argue that such a database is an original work, and not merely an uncreative collection of facts. I would at least consider simple patterns uncreative facts, but complex patterns might be considered copyrightable original works.

Both of your claims are incorrect, at least in the US. A work must pass a certain level of creative expression to be eligible for copyright, and collections of plain facts, e.g. a phone book, are famously not copyrightable.