by jajko 719 days ago
It's 2024, we can trivially preserve 100% of the original form of any book digitally, with all original intentions, bugs in creation etc. The actual physical book itself becomes just a curiosity (or investment), nothing fundamentally necessary.

With such discussions we often move from facts to the emotions of specific individuals/groups. Some people feel strong emotions about absolute preservation of such items. Others, not so much; priorities in our lives lie elsewhere.

3 comments

"Trivially" might be overstating it. If you do the scan at 1000 dpi and get several people to check it was done properly then very probably you will never want to refer to the original again, because a scan is much easier to consult: just turning to a particular page is much faster, and you can zoom in without having to manipulate a physical magnifying glass. But I've worked with book scans done by ordinary people using ordinary equipment and I have on many occasions found that the scan wasn't quite good enough and I've taken the trouble to check with a paper copy whether a certain short word was printed in italics or whether the exact shape of the serif can help me disambiguate a letter which was badly printed in the original (the plate wasn't properly inked or whatever). I've never personally experienced a situation in which I've said to myself, "I wish I had another copy of this same edition to compare against", but I can imagine that might happen sometimes. So I think it is worth keeping multiple copies of the same edition, if it's an "important" book, or you have some reason for suspecting that it might be important: you don't know for sure what people in the future will be interested in.
Even 1000dpi might not cut it. Remember the stories about really old texts on parchment that they discovered had been re-written on reused parchment? At certain specific wavelengths they could "see" the original text. If you just have a plain old RGB scan, that's no longer there.

More for paintings, and less so for books (except very old handwritten ones), is that the ink or paint used has its own characteristics. It might look very different from different angles due to the light reflecting off the pigment in different ways. Doesn't matter how high a resolution scan you have, it's going to be hard to scan that, and even harder to display it on a screen.

Have you looked at old books in the Google Books archive? You quite frequently find pages where the page-turning robot has screwed up, with results like distortion, lack of focus, weird smearing effects as the page was still moving when the picture was taken, or plain things it can't deal with like fold-outs. I recently tried to find a 'plate' (illustration) in an engineering journal from ~1864; it was missing from both the Google and archive.org scans, probably due to being a fold-out. If all the archives throw their paper copies away, quite a lot of information will be lost.
Fold-outs are the worst. I'm building a magazine encyclopedia and magazines are chock-full of weird inserts and fold-outs that never appear in any scans. I am trying to build an archive of all the physical magazines so that at some point people can go through and check them against the scans to see what's missing.

Scanning systems are set up generally to only scan "regular" media without any surprises.

In 100 years when you can do even better scans, how will you make them without an original? You'll be stuck with whatever we were capable of in the moment when we decided to not give a shit about the original anymore.

It's not trivial to "preserve 100%" of a physical item at all. You can't hold a digitized copy or take samples. You can't get at anything that's beyond the resolution and nature of your scan.

The notion that digital copies simply contain all relevant information is on its face naive because it's inherently lossy to create a scan of a physical item.

> when we decided to not give a shit about the original anymore.

Poe wrote a Manuscript. That's the original. Do we have the manuscript? No? Then we do not have and have long since decided to "not give a shit about" the original.

At the time, the way to make your Manuscript (one document you wrote) accessible to a large number of potential readers was to publish it, which is why these books exist. But if Poe had instead been alive in 2024 and made a TikTok, that would be the same. Why don't you focus on preserving each hard disk used by TikTok? Does it feel irrelevant? But these are the vitally important originals, aren't they?

[Edited to fix spelling of TikTok, I'm sorry I'm old]

It depends on the book. In the case of Kafka's "The Trial", for example, the manuscript is "the original" because the author never finished the work and all editions were made after his death based on that manuscript. However, with most books what happened is that the author sent the manuscript to the printer/publisher and there was then a bit of back-and-forth during which changes suggested by proofreaders were accepted or rejected by the author, other last-minute changes were made by the author, and so on, so in that case the manuscript is best thought of as "a draft" rather than "the original".

Usually the last edition produced during the author's life and under the author's supervision gives you the most reliable indication of the author's intention, what they call the "copy-text" in textual criticism, I think, but there are some interesting exceptions. For example, many people think the first edition of Mary Shelley's "Frankenstein" is better than a later edition, in which the older author appears to have wanted to make certain passages more respectable, less shocking, and thus arguably spoilt the work she produced in her youth. But that's presumably a matter of opinion. For scientific textual criticism we need all the editions, all the manuscripts, any manually corrected proofs or published copies that the author has written on, ...

The existence of distinct variant works is mildly interesting, and so there's (diminishing with volume) value in preserving variation. Big Wave (神奈川沖浪裏)† is a woodblock print; like many modern low-scale printing processes for art, it produced definitely different items each time. If you've seen Big Wave in a local city gallery and can see another somewhere, they're certainly not just the same thing, the way every student's identical print of a famous band poster is.

But on the other hand, most of the value is in the core thing, not the variation. Modulo special cases like the "Wicked Bible" (which has a misprinting in the Ten Commandments) we care about that and not the variation.

It's easy to say that Blade Runner is an important movie. If we're hard up for preservation capacity and have to choose, obviously Blade Runner beats Ghostbusters 2. Easy. Ghostbusters 2 wasn't a terrible movie but it's no Blade Runner. OK, how about the "Director's Cut"? Sure OK, two copies of Blade Runner. The "Final Cut"? Ugh. Fine, OK, let's have three versions of Blade Runner rather than kick one out for a sub-par sequel to Ghostbusters. There are four more commercially released versions of the movie. That's too many. At some point we should cut our losses and say no, another Blade Runner is not worth it, we'll have Ghostbusters 2 instead, thanks.

† That's deliberately not an accurate translation, but we all know what I'm talking about.

Original printings and original manuscripts are different types of originals, and contain different types of data for scholars. They are not fungible.

Do you presume that nobody is bothering to try to preserve natively digital culture? Because I assure you people do. Do you presume nobody will care in the future? Why would that be the case?

What a strange, smugly uninformed angle.