by theta_d 1913 days ago
> 1. Is a database like that even copyrightable, especially in the US?

Yes, collections of data are very much copyrightable, especially in the US.

This is not just a list of mime-types. It is a list of mime-types and instructions on how to detect those mime-types.

I would have interpreted simple patterns (e.g. value x at offset y) as non-copyrightable facts about the file format.

Complex patterns could be problematic though, since you could argue they are original programs.

See the Olson Timezone database[1] as another example of "simple patterns" that are very much copyrightable.

The act of curating a collection of what may be "simple facts" creates a copyrightable work.

A farmer's almanac of seasons and weather patterns is copyrightable, even though the bare facts that it tabulates are not.

[1]: https://en.wikipedia.org/wiki/Tz_database#2011_lawsuit

> Olson Timezone database

That lawsuit was dismissed; in fact, the article you linked says as much.

I stand corrected -- I was working from memory and didn't spot that development. Thanks!

I'll have to do a bit more digging to see if my original point still holds, even though the example I used to illustrate it doesn't.

E.g. see https://www.dmlp.org/legal-guide/works-not-covered-copyright -

> there may be situations in which a compilation of facts may be protected if the creator of the original publication selected, coordinated, or arranged the facts in an original way. For example, a sports almanac may arrange baseball scores in a creative way, a genealogy chart may arrange birth dates in an original way, or a cookbook may arrange ingredients in a creative and original way as part of its recipes. In each of those instances, the creator of the work would have a copyright in the creative arrangement of the facts, but not the facts themselves.

Though https://www.copyright.gov/circs/circ33.pdf says about recipes,

> the Office cannot register recipes consisting of a set of ingredients and a process for preparing a dish. In contrast, a recipe that creatively explains or depicts how or why to perform a particular activity may be copyrightable. A registration for a recipe may cover the written description or explanation of a process that appears in the work, as well as any photographs or illustrations that are owned by the applicant

So I'm not clear where the boundary actually is on this one.

Is that bit about recipes why so many recipes online come with a story?

I took a closer look at this database and library.

The actual patterns are very simple and standardized:

* The base case is checking whether a certain byte string can be found within a given offset range
* Patterns form a tree where all patterns from the root to one leaf need to match, which amounts to a restricted way of expressing "AND" and "OR" expressions

So it looks like there is very little space for originality in expressing these patterns.
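To make that concrete, here is a rough sketch of the matching scheme as I understand it from the description above. The `Pattern`, `Rule` and `detect` names and the three example rules are my own illustration, written from well-known file signatures; they are not taken from the database or the library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pattern:
    """Hypothetical node: a byte string that must begin within an offset range."""
    value: bytes
    start: int  # first offset at which `value` may begin
    end: int    # last offset at which `value` may begin (inclusive)
    children: List["Pattern"] = field(default_factory=list)

    def matches(self, data: bytes) -> bool:
        # Base case: look for the byte string beginning anywhere in [start, end].
        window = data[self.start : self.end + len(self.value)]
        if self.value not in window:
            return False
        # A leaf means the whole root-to-leaf path matched (AND along the path).
        # Otherwise any one child subtree is enough (OR among siblings).
        return not self.children or any(c.matches(data) for c in self.children)


@dataclass
class Rule:
    """Hypothetical pairing of a mime type with its top-level patterns."""
    mime_type: str
    roots: List[Pattern]


def detect(data: bytes, rules: List[Rule]) -> Optional[str]:
    """Return the mime type of the first rule with a matching pattern tree."""
    for rule in rules:
        if any(root.matches(data) for root in rule.roots):
            return rule.mime_type
    return None


# Illustrative rules only, written by hand for this comment.
RULES = [
    Rule("image/png", [Pattern(b"\x89PNG\r\n\x1a\n", 0, 0)]),
    Rule("application/zip", [Pattern(b"PK\x03\x04", 0, 0)]),
    # "RIFF" at offset 0 AND "WAVE" at offset 8.
    Rule("audio/x-wav", [Pattern(b"RIFF", 0, 0, [Pattern(b"WAVE", 8, 8)])]),
]
```

With these rules, `detect(b"RIFF\x00\x00\x00\x00WAVEfmt ", RULES)` walks the two-level tree, while a RIFF file of some other kind falls through because no root-to-leaf path matches. Each rule boils down to a few (bytes, offset range) tuples plus nesting.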

* It doesn't appear to be a curated database, but rather aims for completeness (i.e. the selection or arrangement shouldn't be covered by copyright)
* Mime types and extensions are also very simple facts which can't be expressed in an original way
* The human friendly format allows a bit more freedom, but is still quite limited

IANAL, but I'd guess this database is not copyrightable in the US, but is protected in the EU, which recognizes database rights.

https://en.wikipedia.org/wiki/Database_right