Hacker News
by Nullabillity 1049 days ago
Honestly, this is the really offensive part of the article. Who cares whether or not it's legal; the idea that it's, in any way, shape, or form, useful is bafflingly laughable.

Not everything can be meaningfully quantified. Not everything needs to be.

6 comments

Certainly something interesting is bound to come out of quantifying things? "Hm, this three-act structure thing seems to work, I wonder why." "Children don't seem to understand texts which include these words, I wonder why."

Patterns rarely show themselves before we investigate.

> Certainly something interesting is bound to come out of quantifying things?

In science they call this trap P-hacking. Even data "scientists" know to be wary of overfitting. We're really good at finding patterns, but few of them actually mean anything.

>> Certainly something interesting is bound to come out of quantifying things?

> In science they call this trap P-hacking. Even data "scientists" know to be wary of overfitting. We're really good at finding patterns, but few of them actually mean anything.

Quantifying things is not always p-hacking. When people do experiments on novel materials or structures, they quantify the data, take readings and record them, and then look for patterns. For example, measuring the electronic properties of a novel nanostructure or molecule.

When I think of p-hacking[1], I think of taking the same static data and running one analysis after another until something potentially interesting turns up, while ignoring the growing risk of false positives.

[1] https://en.wikipedia.org/wiki/Data_dredging
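To put a rough number on that false-positive risk, here is a minimal sketch. It assumes independent tests at a 0.05 threshold, and it assumes every "metric" is pure Gaussian noise compared across two groups with a simple z-test. The function names and parameters are made up for illustration and are not taken from the article or any particular tool.

```python
import random
from statistics import NormalDist, fmean


def family_wise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def p_value_for_noise(rng, sample_size):
    """Two-sided z-test p-value comparing two groups of pure N(0, 1) noise."""
    group_a = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    group_b = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    # Variance is known to be 1, so the difference of means has
    # standard deviation sqrt(2 / n).
    z = (fmean(group_a) - fmean(group_b)) / (2.0 / sample_size) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def dredge(num_metrics, trials, alpha=0.05, sample_size=30, seed=0):
    """Fraction of trials where at least one noise metric looks 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Keep testing metrics until one of them crosses the threshold.
        if any(p_value_for_noise(rng, sample_size) < alpha
               for _ in range(num_metrics)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"analytic:  {family_wise_error_rate(0.05, 20):.3f}")
    print(f"simulated: {dredge(num_metrics=20, trials=1000):.3f}")
```

Under these assumptions, trying 20 unrelated metrics on data with no real effect turns up at least one "significant" result in well over half of the runs. A single pre-planned measurement stays at the nominal 5%.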

> Not everything can be meaningfully quantified. Not everything needs to be.

Ok, so who decides what's OK to analyze or not? Is there some obvious moral line I fail to see, that everyone would immediately agree on?

It seems the project was about analyzing books, not about producing new books. How is that hurting the authors?

What will hurt artists is when, in 10 years, all publishers are demanding that the vividness score (TM) be at least 95% “because that’s what drives sales”.

Which is what will happen if the authors don’t proactively stop it from happening. Look at how the music industry has evolved over time.

How is this different from all the vampire novels that hit the shelves after the success of Twilight? Publishers always preferred the money makers; only the measure has changed.

Nowadays writers can at least publish their books without the need of publishers, and I think some like the help of the bad Silicon Valley stuff that made writing, publishing, and interacting with readers easier.

I'm on your side if it's about automatic content creation and style copying, but text analysis is not the real danger, especially when the usefulness of such statistics isn't even a given.

> publish their books without the need of publishers

Except those are very likely to be me-too vampire novels. And lately, LLM-generated ones.

I'd argue that, on the contrary, the role of the publisher as a curator will only become more important in the future.

But publishers will have to deal with a lot more content thanks to LLMs.

Or it could help me find terser books I like. People will still have preferences, and if the author tries to pander to only the largest market segment, I'd argue that's on them.

I think it’s much more likely you would get the book equivalent of crap SEO sites spammed out to satisfy numerical measures of quality.

How is this different to the current process, other than feedback is slower (if forthcoming at all) and less specific?

> How is this different to the current process, other than feedback is slower (if forthcoming at all) and less specific?

Let me rephrase your question: "how is it different to the current process, other than <the fact that it is different>?" :-). I would say that the answer lies in the question.

Sounds as though your view of the AI is purely positive, in that case. That's fair enough. The answer for other people may well not lie in the question (e.g. for all the people who don't like this development), but it did for you!

Sorry I did not understand that :-).

My point was that it is different: when humans read a book, they don't train a machine learning model. They can't read as many books as a machine, or at the same speed, and they can't remember nearly as much as a machine can.

Humans and computers are fundamentally different, and it matters. You can't conclude that because it works for one, it will work for the other.

the difference is that a machine analysis is necessarily limited and can't account for all the factors that make a text interesting. so it is possible that this analysis rejects texts that would not be rejected by a human.

it is objective but potentially biased. and it could even be discriminatory if the input for this tool isn't diverse enough. but these are things that can go wrong with any use of technology, and we have seen many examples of that happening. however i don't think that is problematic if writers use it to analyse their own texts in comparison. it is however a serious issue if publishers use it to decide what to accept.

Again, I don't particularly care about whether this is allowed to exist, I'm just here to laugh at the mindset that led to it being created. But sure, I can see this being used in harmful ways.

> It seems the project was about analyzing books, not about producing new books. How is that hurting the authors?

"Vivid books are really in this year, we're gonna have to ask that you aim for a Vividness(tm) of 85 or above."

"US books have 15% more adjectives, clearly this is proof of our superior detail-oriented work ethic!"

"What does the rise in Emotion(tm) have to say about the decline of society?"

So if I understand you correctly, you're saying that we should not create "metrics" for anything because said metrics could be misused by clueless people?

The analysis is cool. The problematic thing is what would have happened next, if this tool turned out to be any good.

Publishers rejecting manuscripts because "this year's trend shows customers are looking for vividness in the 70+ percentile, your book is only at 55". Everything becoming the same style. If you thought Hemingway, Joyce or Nabokov had it bad with rejections, there'd be zero chance for actually innovative writing to break through the walls of The Algorithm.

Joyce should have had more rejections, but that’s just my personal opinion.

> Not everything can be meaningfully quantified.

Sure, but written words _can_ be meaningfully quantified. We have been doing that for thousands of years: numerology and other mystical/religious practices, poetic meter, stylometry, cryptanalysis, and stroke counting, to name a few.
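As a small illustration of the stylometry end of that list, here is a rough sketch of two classic measures: type-token ratio (vocabulary variety) and mean sentence length. The tokenizer is a deliberately naive assumption (ASCII letters only, sentences split on `.`, `!` and `?`), and the function names are invented for this example.

```python
import re


def tokenize(text):
    """Lowercased word tokens; an apostrophe is kept inside a word."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_ratio(text):
    """Distinct words divided by total words (0.0 for empty text)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_sentence_length(text):
    """Average number of words per sentence (0.0 for empty text)."""
    # Naive split on terminal punctuation; drop fragments with no words.
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


if __name__ == "__main__":
    sample = "The cat sat. The cat ran fast!"
    print(type_token_ratio(sample), mean_sentence_length(sample))
```

Whether numbers like these say anything useful about a given book is a separate question; the sketch only shows that they are cheap to compute.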

> Not everything needs to be.

Why not?

> Honestly, this is the really offensive part of the article.

I would argue that "offensive" is either hyperbole or the wrong word.

> the idea that it's, in any way, shape, or form, useful is bafflingly laughable.

I don't know if it's useful because I never tried it. I might harbour my doubts but I'd like to find out. This is how I approach new things.

If you don't find it useful, don't use it. But why get outraged about something that others find useful? It's clearly a tool that other writers were positive and excited about. Why not let them have it? If you don't find those quantifications meaningful, so be it. You don't need to use it. Why force your opinion on others?

Simple. Just allow an opt-out for authors or publishers. Then only interested parties will make up the service and make use of it, like you want.