| Until fairly recently, historically speaking, books were overwhelmingly scarce. A few datapoints:
- The total number of books -- not titles, but actual bound volumes -- in Europe as of 1500 CE was about 50,000. By 1800, the total was just under one billion.
- The library of the University of Paris circa 1000 CE comprised about 2,000 volumes. It was among the largest in Europe.
- The Library of Constantinople in the 5th century had 120,000 volumes, the largest in Europe at the time.
- A fair-sized city public library today has on the order of 300,000 volumes. A large university library generally holds a million or so. The Harvard Library contains 20 million volumes. The University of California collection, across all ten campuses, totals more than 34 million volumes.
- The total surviving corpus of Greek literature is a few hundred titles. I believe many of those were only preserved through Arabic scholars, some possibly in Arabic translation, not the original Greek.
- There's an online collection of cuneiform tablets. These generally correspond to a written page (or less) of text, with the largest collections numbering in the tens of thousands of items.
- As of about 1800, the library of the British Museum (now the British Library) had 50,000 volumes. Again, among the largest of its time.
- From about 1950 to 2000, roughly 300,000 titles were published annually in the United States and/or in English-language editions. R.R. Bowker issues ISBNs and tracks this. From ~2005 onward, "nontraditional" books (self- / vanity-published) have been about or above 1 million annually.
- The US Library of Congress, the largest contemporary library in the world, holds 24 million books in its main collection (another 16 million in large type), and has 126 million catalogued items in total (2015).
- At about 5 MB per book, in PDF form, total storage for the 38 million volumes of the Library of Congress would be slightly under 200 TB. At about $50/TB, that's $10,000 of raw disk storage. (Actual provisioning costs would be higher.) Costs are falling at 15%/year.
- Total data in the world comprises far more than books, and has been doubling about every 2 years. Or stated inversely: half of all the recorded information of humankind was created in the past two years.
Sources: Some of this is off the top of my head, but partial support for the facts comes from:
https://en.wikipedia.org/wiki/History_of_printing#/media/Fil...
https://en.wikipedia.org/wiki/History_of_libraries
http://www.bowker.com/tools-resources/Bowker-Data.html
https://www.loc.gov/item/prn-16-023/the-library-of-congress-...
https://en.wikipedia.org/wiki/Harvard_Library
https://en.wikipedia.org/wiki/University_of_California_Libra...
https://www.techpowerup.com/249972/ssds-are-cheaper-than-eve...
https://qz.com/472292/data-is-expected-to-double-every-two-y... |
> half of all the recorded information of humankind was created in the past two years
That is shocking to imagine, and the total is still growing exponentially.
It reminds me of Vannevar Bush's "As We May Think", which pointed out the emerging information overload in society. It certainly puts in perspective how we (humanity) have been making a conscious, collaborative effort to develop globally networked computers, one of whose important functions is to help us organize all that information, including books.
The conundrum, it seems, is that technology is also a massive multiplier of the amount of data, so that its capacity to help us organize would never catch up with what it's helping to produce.
> total storage for the 38 million volumes of the Library of Congress would be slightly under 200 TB
It probably goes without saying, but I'm sure that in the near future that will fit on a thumb drive!
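For what it's worth, the quoted storage estimate is easy to sanity-check. Here is a minimal Python sketch that takes the parent comment's rough figures at face value (38 million volumes, 5 MB per book, $50/TB, prices falling 15%/year). The function names are my own, decimal units are assumed, and the $100 target price is a hypothetical number picked only for illustration:

```python
import math

# Rough figures quoted in the parent comment; none of these are measured values.
VOLUMES = 38_000_000          # Library of Congress volumes
MB_PER_BOOK = 5               # assumed average PDF size per book
USD_PER_TB = 50               # assumed raw disk price
ANNUAL_PRICE_DECLINE = 0.15   # quoted yearly drop in storage cost


def total_storage_tb(volumes: int, mb_per_book: float) -> float:
    """Total size in terabytes, using decimal units (1 TB = 1,000,000 MB)."""
    return volumes * mb_per_book / 1_000_000


def raw_disk_cost(tb: float, usd_per_tb: float) -> float:
    """Raw disk cost only; provisioning, redundancy, and power are ignored."""
    return tb * usd_per_tb


def years_until_cost(current: float, target: float, decline: float) -> float:
    """Years for a cost to fall from `current` to `target`,
    assuming a constant fractional decline per year."""
    return math.log(target / current) / math.log(1 - decline)


size_tb = total_storage_tb(VOLUMES, MB_PER_BOOK)
cost = raw_disk_cost(size_tb, USD_PER_TB)
print(f"{size_tb:.0f} TB, about ${cost:,.0f}")  # prints "190 TB, about $9,500"

# Hypothetical target: when would the same capacity cost roughly $100?
years = years_until_cost(cost, 100, ANNUAL_PRICE_DECLINE)
print(f"about {years:.0f} years at a constant 15%/year decline")
```

So "slightly under 200 TB" and "$10,000" both check out as round numbers. How soon that fits on a thumb drive depends entirely on whether the 15%/year decline holds, which is an assumption rather than a forecast.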