Hacker News
by wand3r 891 days ago
This was a problem before "AI" and will definitely be a bigger problem going forward. That said, there are more great books than I could read in 100 lifetimes published before 2020, so I doubt this will be an issue for me personally in practice. For current events or evolving knowledge like new science or technology, I would probably have that summarized by an LLM anyway as a preference. The author is really saying "AI is hurting authors".
10 comments

The complaint that too many books are being written and that most of them are useless nonsense is indeed as old as time -- or at least, as old as books.

Montaigne (1533-1592) famously said il y devroit avoir quelque coerction des loix contre les escrivains ineptes et inutiles ("there should be laws against stupid and useless writers").

This was about just a hundred years after the invention of the printing press.

I remember wondering, as a curiosity, what books on the stock market would have been like before, say, the 50s. I've taken 2 CFA exams and have a Finance degree, and noticed that none of the major concepts introduced seem to go back further than the 70s, and the market was definitely older than that.

So I searched on Google Books and limited the timeline... What I found were the pretentious ramblings of an old man that thought way too highly of himself for getting lucky at a time that a little luck could make you a fortune to last a lifetime. He wrote a bunch of books on the market. It was like 1910s or something. They were so worthless, but I did find them hilarious to read.

Nah, those are different. Especially the second one. lmao. It also happened to have few sales till a bit later on. I think my goal back then was to see what someone would find if they were just trying to learn about it at the library. Reason why is because the ROI using the Fama-French data back then would suggest someone could have rapidly become a millionaire and billionaire off of a thousand dollars if they performed even the most basic value investing formula back then to bifurcate the market or, better yet, divide it into deciles based on (book equity)/(market equity). Only in the past 10 years, you'd actually underperform the market return.

So, considering a very simple formula could have been so effective, I wondered what they'd find at the library and if even that simple concept would have been available. What wouldn't be available is SEC data. But, as your second suggestion made clear, you could get the information from ratings agencies.
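
The decile sort described above (ranking stocks by (book equity)/(market equity) and splitting them into ten groups) could be sketched roughly as follows. This is only an illustration: the function name, the tickers, and the numbers are made up, and it is not the actual Fama-French methodology or data.

```python
from typing import Dict, List, Tuple


def book_to_market_deciles(stocks: Dict[str, Tuple[float, float]]) -> List[List[str]]:
    """Rank tickers by book equity / market equity and split them into 10 groups.

    `stocks` maps a ticker to (book_equity, market_equity). Index 0 of the
    result holds the lowest ratios, index 9 the highest ("value" stocks).
    """
    # Skip entries with non-positive market equity to avoid a bad ratio.
    ratios = {t: be / me for t, (be, me) in stocks.items() if me > 0}
    ranked = sorted(ratios, key=ratios.get)  # lowest book-to-market first
    n = len(ranked)
    deciles: List[List[str]] = [[] for _ in range(10)]
    for i, ticker in enumerate(ranked):
        deciles[i * 10 // n].append(ticker)  # map rank position to a decile index
    return deciles


# Hypothetical universe: 20 tickers with rising book equity, equal market equity.
stocks = {f"T{i:02d}": (float(i + 1), 100.0) for i in range(20)}
deciles = book_to_market_deciles(stocks)
value_portfolio = deciles[-1]  # the highest book-to-market group
print(value_portfolio)
```

A "basic value investing formula" in this sense would then just mean holding the top decile and re-sorting periodically; whether that beats the market depends entirely on the period, as noted above.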

I mean, comparatively speaking you were looking for programming books in an era where hardly anyone even knew what a computer was. The market changed so drastically at the turn of the century that it really only shaped into our current recognizable form in the 50s and 60s. There's many reasons for that.
To be clear, I expected NOT to find anything that would help accomplish that task. Because the vast opportunities would imply it didn't exist. So, yeah, what I expected to find is crap and what I found was hilarious crap. The dude did have a way with words, though. It's just hilarious how pretentious the books were considering they didn't really impart any useful knowledge.
GenAI is not making a qualitative change in spam & junk content, it's making a quantitative change. Previously, you had to wade through some noise to get to the signal, but with everyone making scammy content with genAI you'll have to wade through 100x noise to find the 1x signal.

Just like existence of email didn't create the concept of spam, it just made sending it much, much, cheaper.

Zoe Bee did a video essay[0] on her experience as a ghostwriter producing essentially the kind of junk that is now being automated with LLMs. In her case she was (somewhat unknowingly) paid to write unlicensed Minecraft stories. I think the video is worth a watch if you're not aware of the state of ebooks (especially on Amazon) prior to LLMs.

It's also worth mentioning that this problem isn't limited to ebooks. There's also a cottage industry of mass produced minimal-effort audiobooks on Audible as part of various "passive income" scams. Dan Olson made a pretty good video essay[1] about one such scam where he actually played along for most of it and also gave it a try as a ghostwriter.

[0]: https://www.youtube.com/watch?v=O1aqLLiIjgA

[1]: https://www.youtube.com/watch?v=biYciU1uiUw

EDIT: Considering the most cited examples are Amazon/Kindle and Audible, I think parallels can also be drawn to the proliferation of no-name brand Chinese whitelabel dropshipping products on regular Amazon. Everyone already knows not to trust Amazon reviews but at this point it's hard to find reputable brands for products you're not already familiar with (e.g. which of the one hundred brands featuring near identical products are actually brands you might find in a retail store rather than a random name slapped on the product in the same Chinese factory?). LLMs will definitely make reviews even more untrustworthy but they might also help generating even more plausible copycat product descriptions and designs.

> e.g. which of the one hundred brands featuring near identical products are actually brands you might find in a retail store rather than a random name slapped on the product in the same Chinese factory?

Ye this one is annoying. I often look for automotive tools and before I realized this I thought I was going insane. Different storefronts pretend they are making some tool, or have sourced a factory to build their design.

But they just order the tool with a sticker and paint job from the same supplier.

Sometimes nuts and bolts or like cover plates vary but it is the same base.

I was browsing for tool hooks for the garage the other day and it felt like there are two factories in China that make hooks and two factories that put rubber on them for 4 combinations and that is it. But 20 flavours of branding.

Edit: It would be really neat if the factory had to mark everything they made.

I recently came across the tongue-in-cheek conspiracy theory that consumerism has reached its endstate in that companies now intentionally design only good-enough-but-unsatisfying products because those make you more likely to continue shopping around for something different-but-equally-unsatisfying, so just like planned obsolescence but more based on dissatisfaction.

I think there is some truth to this in that it's likely profitable to flood the market with mediocre products (especially if you do it through a white label network of "competing" brands) as long as you can avoid refunds (and on Amazon certain sellers make returns intentionally difficult by providing Chinese addresses without a prepaid label, making you pay more for the return shipping than you likely paid for the product to begin with). This also drives down the overall expectation of quality and reduces market pressures for quality while also completely swamping any competitor trying to sell a genuinely high quality product by making reviews completely unreliable.

I don't think this is a coordinated strategy as with whitelabel dropshipping being sold as a get-rich-quick-scheme for years there's no real need for it. Plus as you mention in many cases this is simply a consequence of a race to the bottom earlier in the supply chain resulting in nearly identical mediocre products even when some parts differ.

Personally I've run into this when picking out interior doors for our new house: despite being in the mid-tier price range, various aspects ranging from the veneer to the locks are extremely underwhelming but apparently on par with other doors in that price range. There's literally no reason the locks should feel like cheap plastic toys but it's something that's not visible and can't usually be tested even in a show room so apparently that's where they decide to cut corners. And because even standard sizes are considered custom, there's no way to return them, let alone once installed.

Ye. It is hard to put a number on how significant the effect is, but it is interesting.

> I think there is some truth to this in that it's likely profitable to flood the market with mediocre products

I think a factor here could be that you don't want the customers to recognize bad products? Some appliance models rotate so fast there is hardly any (non-fake) review consensus on them. It is not just no-name Alibaba, but "proper" brands too, that seem to do that.

Contemporary literature is valuable for its insights in the present. I love classics, and much of what I read was published before this century, but I would be missing out if I stuck just with tried and proven titles and authors.
Can you give an example of a book published after, say, 2000 that is really valuable for insights we would not be able to get from either older books (if related to the human condition) or news articles (if more to do with some bare fact)?

I read mostly books written before the World Wars and I’m doubtful there’s much after that period of any real and lasting value. That doesn’t mean I don’t enjoy some of it, of course, mostly the fiction. The exception might be really niche works of local history, which is probably <0.001% of all such books, or a few really good scientific/mathematical compendiums.

It's not that any specific particular work of fiction will provide really valuable insights — what is insightful to one reader may be obvious to another — but that the whole field changes and absorbs societal change and extrapolates and interprets society from there.

It means books get written now which explore perspectives and voices unheard before, which in turn can help us readers expand our frame of reference.

The collective works of fiction written before 1900 tend to reflect the societal viewpoints of well-off white men (even when written by women or specifically dealing with societal ills). Go a few decades beyond that and you see authors from a working class background join the chorus, then more women, a broadening of sexual themes reflecting society's change (feminism, sexual liberation, homosexuality, etc.), more open criticism of religion too. Digital technology changed society significantly, and this is of course reflected in writing from the more recent decades, and coming up towards today you see more and more diversity amongst authors, adding — through the characters and narratives they create — yet more perspectives and insights. Sometimes pushing the envelope of a specific field, sometimes getting rid of tropes which no longer convince. Fiction changes constantly and will always be rooted in the year it was written.

You sell yourself short if you stop at 1940.

Unique perspectives are not made by sex or skin color, but by life experiences.

Two Americans today, whatever their sex and race, have virtually everything in common with each other compared to anybody from 1800, let alone 800. The extreme, excessive focus on race and sex in contemporary writing is exactly what makes it boring and irrelevant. This comment is a great example - you've been taught by contemporary writing that such a tendency existed, when it in fact excludes the objective reality of the tons of works that could not have been said to be "written" or "about" such people even by modern framing, but also the fact that "well-off white man" is a completely meaningless and inapplicable phrase if you go back more than a couple centuries.

The sort of work you're describing is the stuff we're taught to acculturate us to the world we already live in. There's no point browbeating me with even more material that I am already steeped in.

> Two Americans today, whatever their sex and race, have virtually everything in common with each other compared to anybody from 1800, let alone 800.

Define "virtually everything"?

The rich, white, and men have had massively different resources and societal privileges in each of those eras.

Even today it seems obvious to me that to be rich or white or male each brings benefits at every stage of life which can drastically change one's life experience: food security, personal safety, education, job prospects, and romantic opportunities.

I mean that if you put a random 2024 white man and black woman in a room with Charlemagne and the ability to communicate, the former two would have 99% overlap in their worldviews and perspectives while Charlemagne would be an alien to them both. “Privilege”, as the ancients noted, is fleeting and superficial.

“White” is a modern social abstraction that rapidly breaks down the further back you try to apply it. It also carries with it loaded assumptions that do not really hold even at a population level in different places. This is exactly what I mean: immersing yourself in modern culture blinds you to the reality people did not and could not apply such categories because they did not even exist. People often repeat that race is a social construct but rarely think about what that really means.

People in past cultures were totally alien. There’s some substrate of common humanity but trying to say, apply modern racial privilege politics to Xenophon’s encounter with the very pale Paphlagonians, or the widespread trafficking of European slaves even into modernity in Northern Africa and Anatolia, or even into parts of colonial America is just nonsensical.

I mean, take this quote from Benjamin Franklin for example, which would have been early in the conception of modern race:

> Which leads me to add one Remark: That the Number of purely white People in the World is proportionably very small. All Africa is black or tawny. Asia chiefly tawny. America (exclusive of the new Comers) wholly so. And in Europe, the Spaniards, Italians, French, Russians and Swedes, are generally of what we call a swarthy Complexion; as are the Germans also, the Saxons only excepted, who with the English, make the principal Body of White People on the Face of the Earth.

This is a completely bizarre Martian take to any of us alive today. The past was a foreign country. If you just stamp your feet and repeat “all old books is just rich white men” you’re not only completely wrong but missing genuinely different perspectives.

As an aside, what does it mean to not have a reasonably good understanding, as a man, of a woman’s perspective in 2024? Did you have mother, sisters, or cousins? No friends or girlfriend or wife? No empathy? As a last resort never read mumsnet or 2X? Ignoring the fact that once again highlighting the sex differences ignores the fact there are literally four billion women and they all have different perspectives, they’re not a hive mind, if you have to read a book to figure out the common differences in perspective…hasn’t something really gone wrong in society for that to even be possible? And could a book by some woman you don’t know and had some kind of connections or very unique experience - she got published after all - really tell you more than having long and deep empathetic conversations with people you actually know?

Books are great because they’re a window into what we can never see for ourselves. Literally 50% of the people on Earth right now are a different sex. You can find out what they think right now with no effort. We don’t need the nth book on “what it’s like to be an X in America”, we all know, everyone is shouting it from the rooftops. I’d trade it all for more Sappho, and I’m a member of a minority group that could write my own What It’s Like to Be An X in America book.

Sex and skin color most definitely influence unique experience (not exclusively of course). I guess you didn’t catch this in your readings of Shakespeare.
I don't think it's as bad as <0.001%. There's a larger volume of books published these days than ever before the 2nd world war, so there's a lot more crap to wade through, the nice thing about old books is that time has done the job of pruning out a lot of the crap.

I think people still write good books which can provide good insights into the human condition, it's just hard to find them in the present.

The proposition that no one has written a worthwhile book or had a new insight in the past 24 years is so laughably indefensible that it stuns me to see someone put it forth in earnest. To engage with it risks dignifying it beyond its stature. Rather than post my own list of books, I will link to this democratically determined list of "books that changed your thinking, changed your mind, enlightened you, helped you move ahead, helped you heal, in short, made a difference.":

https://www.goodreads.com/list/show/21995.Best_21st_Century_...

I don’t guess we’ll ever come to an understanding then, because I find that list totally laughable and a perfect example of what I’m talking about. I like Bill Bryson and I own the book but if his A Short History of Nearly Everything (#1 on the list) is the strongest example of what “changed your thinking, changed your mind, enlightened you, helped you move ahead” etc., well, things are very grim indeed. Scrolling down the list mostly gets much worse. The world would be no worse off if most of those books had never existed, and many of them don’t offer anything new at all. I scrolled down to 100, against my better instincts and encountering such luminaries as Tina Fey.

People have had interesting insights in that period - of course - but very few of them require anything longer than a sentence, and they are few and far between.

By the way, 2000 was chosen semi-arbitrarily, I would probably put the line much farther back.

Obviously it's a bit of a cop out but analysis of events. Books on the invasion of Iraq, or the gulf war. Books on the financial crisis. You might say this is a small portion of what's worth reading, but these are things with undoubtedly huge relevance to what is happening today in the short term future, as well as giving insight into the people and systems that govern us very specifically
I could be wrong, but I think almost everything in those examples exists in news articles, although to be fair, those articles are often long-form, and some are sourced with reporting from the books you mention. I've never read a book on either subject, but I maybe presumptuously think myself well-informed. ;)

I do think, as time goes on, and things are declassified, we'll learn more about e.g., the Iraq War than we know now. To a point, things become clearer with distance, which often renders books written on contemporary subjects outdated. But a book published today could be a decent summary of events for people who didn't live through them, or were too young to pay attention at the time. Unfortunately, one reason why I don't pay them too much mind is that most of the genre is hopelessly partisan, one way or another, and I don't know that there are any really great historians to write something timeless. But either way, I would say the insights you can glean from those books are still readily available if you're watching events critically today. Not much has changed, and if it has, it was usually for the worse. Maybe the most interesting part for many people would be the reminder that many of the same people directly responsible for those disasters are still considered respectable people today and help drive policy. E.g., Bill Kristol didn't go into some kind of exile - he's still considered a very serious person and has a lot of influence and is out there shilling the same type of thinking in a different context.

> I could be wrong, but I think almost everything in those examples exists in news articles, although to be fair, those articles are often long-form, and some are sourced with reporting from the books you mention. I've never read a book on either subject, but I maybe presumptuously think myself well-informed. ;)

Crashed, by Adam Tooze was a very very good retelling of the GFC, and I did read all the newspapers at the time and still found lots of interesting stuff.

More generally, I'd argue that modern historians probably have the highest chance of producing useful new insights recently.

If you accept fiction (and like fantasy fiction), I'd argue that the Malazan Book of the Fallen and/or the Nine Worlds series are serious works that are going to endure. Ask me again in a century, though ;)

I recently read a blog that argued exactly the opposite: that there was no point in reading old books because if they contained good ideas, somebody had written a newer book that better summarized those and combined them with new ideas, and that old books were liable to be wrong. Both arguments are wrong. The point is to read good books, not (usually) to just pick books before or after a certain date. I see elsewhere you argue that people at the time can’t evaluate the facts because they don’t have perspective. It is usually argued that people writing history lack understanding (and facts) about periods they didn’t live through. Short vs long form is also missing the point: short articles can be error ridden, misguided and even boring. At best they will waste less of your time, but there are great, long books that provide texture and insight that cannot be summarized.
I would say that contemporary literature can't provide any insights, by definition. Because it's a slave to its zeitgeist. Anything touching politics, history, economy and especially society will be tainted.
That isn’t an argument to stop producing them. It’s an argument for letting time pass to determine which remain relevant.
Wouldn’t this apply to all literature then? Because it was all a product of its time.
The difference is that the books we still read are the ones that have proven to be more universally true than the rest. We only know about old books that have remained relevant in one way or another. We've long forgotten the tripe.

With modern things, we don't yet know what will remain relevant and what will fade into obscurity or lose its relevance.

Plenty of still-popular ancient books are full of fiction, myth, and superstition. Perhaps one could say some have remained more useful.
Of course all literature is a product of its time, but when you are reading an old book, most of the time you are distanced from the events or then-new ideas presented and it's much easier to think objectively about them. As an example I'll give Marx's works on communism. If I were a 19th-century working man I would be ecstatic about the idea; with the hindsight of how communism actually works in practice I would have second thoughts. But the difference between the theory and the practice would be the actual insight. And in the case of contemporary events, the actual insight would be in how they were presented and perceived when they were happening compared to how they are seen now.
Reminds me of The Machine Stops:

    “Beware of first-hand ideas!” exclaimed one of the most advanced of them.
    “First-hand ideas do not really exist. They are but the physical impressions produced by love and fear, and on this gross foundation who could erect a philosophy? Let your ideas be second-hand, and if possible tenth-hand, for then they will be far removed from that disturbing element — direct observation. [...] And in time there will come a generation that had got beyond facts, beyond impressions, a generation absolutely colourless, a generation seraphically free from taint of personality.”
Marxism is very antiquated without modern insights applied to it.
> This was a problem before "AI"

Yes, I knew someone who was in "publish as remainder", that is publish at a high price then sell in bulk at a discount. Creating the books involved:

1. Think of a subject

2. Get some pictures

3. Find someone to write something which goes with the pictures

4. Repeat

There has always been a market for the undiscerning.

Your post makes me sad.

"The Value Of Owning More Books Than You’ll Ever Read" - https://clivethompson.medium.com/the-value-of-owning-more-bo...

"...The writer Umberto Eco belongs to that small class of scholars who are encyclopedic, insightful, and nondull. He is the owner of a large personal library (containing thirty thousand books), and separates visitors into two categories: those who react with “Wow! Signore professore dottore Eco, what a library you have! How many of these books have you read?”* and the others — a very small minority — who get the point that a private library is not an ego-boosting appendage but a research tool. Read books are far less valuable than unread ones. The library should contain as much of what you do not know as your financial means, mortgage rates, and the currently tight real-estate market allows you to put there..."
This was partially solved by the internet and Google Books although there's still a lot of room for improvement. We really need to get all of the books in the world digitized and behind an API that allows anyone to access all of the content with some sane pricing scheme rather than charging per book. I doubt this will ever happen so the best we can hope for are the various efforts to pirate as many books as possible and make them available for download. It would be a dream to have a really good search over all of the written works of mankind though.
Yes, but some books require extensive effort to write while others are produced factory like. How do we take that in account?
It shouldn't be, the effort with which something is produced has zero weight when judging its value. For books they should be rated based on their purpose. Non-fiction should be rated along dimensions of accuracy, new information, utility, clarity and absence of errors. Fiction should be rated on how entertaining and inventive it is among other things. If you want to subdivide further I'm sure you could come up with other meaningful criteria.
I wonder at what point people would consciously spend less time online and go back to physical books, especially those works that were published decades/centuries ago? If I am a normal, non-tech person reading something online, how am I supposed to know who wrote it (software or human), what is the agenda behind it etc?

Of course I understand physical books can also be written by AI (just last week, I saw a physical poetry book fully written by AI, except for the preface). But it costs much more to produce a physical book than to throw something up online.

> That said, there are more great books than I could read in 100 life times published before 2020, so I doubt this will be an issue for me personally in practice

I spam this fact whenever someone makes arguments against piracy. I am utterly fine if no new culture is produced, as there already exists more than I could ever enjoy. Besides, creatives will create (with lowered production values) regardless of money. It’s a primal drive for them.

> I am utterly fine if no new culture is produced

Although, I would be a little upset if they stopped making high-budget binge-worthy series.

Then again... What the hell do I care if it's made by AI? Especially if they use one of those preference-based RLHF and penalize the model against episodes where the series lost all its viewers.

The same can be said for most code. It isn’t improving quality of life for most people. Like I really need another todo app. Let all the programmers live on scraps. It’s a primal drive for them. /s
This but without any irony.
>That said, there are more great books than I could read in 100 life times published before 2020, so I doubt this will be an issue for me personally in practice.

I'm personally split on this viewpoint. Through a series of events this year (starting with the death of Charlie Munger), I've started reading the Harvard Classics. Doing the 15 minutes a day challenge, which started on New Year's Day. I bought the full original 1910 set on Ebay. They're pretty cheap because they're essentially just room decor and probably nobody reads them. People on Ebay will gladly sell cool looking leather cover books because it makes their offices look nice. This is an actual description many sellers use.

For me though, above all else it's been an incredibly humbling experience. Because it's been a very long time since I've read works from genuine professionals who spent a lifetime mastering their craft. I'm struggling with prose and vocabulary. I'm not an avid reader. I will at most read maybe 5-10 books a month. Some of which are technical or industry related. But I still like hardcopies so I tend to visit bookstores. And the crushing reality is that what's generally available on bookshelves (at least in the USA) is fucking garbage. Half-assed ghostwritten autobiographies of no-name celebrities or politicians, lifestyle in-your-face FEMALE EMPOWERMENT books that give questionable advice, pseudo-historical nonfiction books, bland and uninspired Star Wars science fiction rip offs (and their accompanying video game expanded universe novels). The list goes on.

Barnes and Noble realized that their audience doesn't really buy physical books anymore (or just plainly don't fucking read) so have moved in to fill the gap that Toys R' Us left behind. Half the floor space is a glorified toy store. Most of the people that come here either bring their kids or are kids themselves hanging out in the Manga section.

The cliche is obviously here and repeating itself: but with each consecutive generation we become less literate. AI has made it so the illiterate get nifty summaries of other less literate summaries of great works and ideas, which then get challenged by the pseudo-intellectual hack frauds extracting wealth from their fanbase and college students pumping out bullshit papers to meet graduation requirements. Just how fucking low exactly can we limbo before it's too late? In this context AI is basically Accelerationism

This is a general problem with art, music, movies, books, etc. There is just so much!

Performances and storytelling were pretty much ephemeral before printing presses, records, and now computers. Now it’s cumulative. All the writing and art of the past several hundred years is out there.

Kind of creates an overproduction / saturation problem.

Current LLMs cannot summarize books even if they are in their training set. It is only possible if the training set also includes human summaries.
"Even if" is an odd phrasing. Why would you expect that to be possible? LLMs don't have direct access to their training set. Having a book in the training set just means they emit text in the style of the book - it's not like a database they can query. But they can summarize very well, if the text is in front of them.
I mean, currently you have to wait for a human to summarize a book before you can read an LLM rephrasing of that summary. Or you have to buy the book and pass it to the LLM, employing some tricks to avoid running out of context length.
> Current LLMs can not summarize books even if they are in their training set

Why is this? Due to legal reasons?

The context windows aren’t large enough, as I understand it. It might be possible via a chain-of-summarization, though.
Most importantly due to the context length not being long enough. If the context length was long enough, it is possible that they could do it with clever training. I only trained much smaller language models though.
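
For what it's worth, the chain-of-summarization idea mentioned above could look something like this minimal sketch: split the book into pieces that fit the context limit, summarize each piece, then summarize the joined summaries until the whole thing fits. Everything here is an assumption for illustration: `chain_summarize`, `split_into_chunks`, and the character-based limit are hypothetical, and `toy_summarize` is a stand-in for a real model call (it just keeps the first five words).

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Greedily pack whole words into chunks of at most max_chars characters."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def chain_summarize(text: str, summarize: Callable[[str], str], max_chars: int) -> str:
    """Summarize chunk by chunk, then summarize the summaries, until it fits."""
    while len(text) > max_chars:
        chunks = split_into_chunks(text, max_chars)
        merged = " ".join(summarize(chunk) for chunk in chunks)
        if len(merged) >= len(text):
            break  # the summarizer is not shrinking the text; avoid looping forever
        text = merged
    return summarize(text)


def toy_summarize(chunk: str) -> str:
    """Placeholder for an LLM call (assumption): keeps only the first five words."""
    return " ".join(chunk.split()[:5])


# Hypothetical "book" that is far longer than the 200-character limit.
book = " ".join(f"word{i}" for i in range(1000))
result = chain_summarize(book, toy_summarize, max_chars=200)
print(result)
```

The obvious trade-off is that each pass loses detail, so the final summary is only as good as the weakest intermediate one.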