by simianwords 97 days ago
i don't buy this logic. if i have studied an author greatly i will be able to recognise patterns and be able to write like them.

ex: i read a lot of shakespeare, understand patterns, understand where he came from, his biography and i will be able to write like him. why is it different for an LLM?

i again don't get what the point is?

5 comments

You will produce output that emulates the patterns of Shakespeare's works, but you won't arrive at them by the same process Shakespeare did. You are subject to similar limitations as the LLM in this case, just to a lesser degree (you share some 'human experience' with the author, and might be able to reason about their thought process from biographies and such).

As another example, I can write a story about hobbits and elves in a LotR world with a style that approximates Tolkien. But it won't be colored by my first-hand WW1 experiences, and won't be written with the intention of creating a world that gives my conlangs cultural context, or the intention of making a bedtime story for my kids. I will never be able to write what Tolkien would have written because I'm not Tolkien, and do not see the world as Tolkien saw it. I don't even like designing languages

that's fair and you have highlighted a good limitation. but we do this all the time - we try to understand the author, learn from them and mimic them, and we succeed to a good extent.

that's why we have really good fake van goghs that a person can't tell apart from the real thing.

of course you can't do the same as the original person but you get close enough many times and as humans we do this frequently.

in the context of this post i think it is for sure possible to mimic a dead author and give steps to achieve writing that would sound like them using an LLM - just like a human.

You're still confusing "has a result that looks the same" and "uses the same process"; these are different things.
Why do you say it has a different process? When I ask it to do integrals it uses the same process as me
Not everything works like integrals. Some things don't have a standard process that everyone follows the same way.

Editing is one of these things. There can be lots of different processes, informed by lots of different things, and getting similar output is no guarantee of a similar process.

The process is irrelevant if the output is the same, because we never observe the process. I assume you are arguing that the outputs are not guaranteed to be the same unless you reproduce the process.

If we are talking about human artifacts, you never have reproducibility. The same person will behave differently from one moment to the next, one environment to another. But I assume you will call that natural variation. Can you say that models can't approximate the artifacts within that natural variation?

I don’t see why editing is any different. If a human can learn it, why not an LLM?
Even if the visualization of the integration process via steps typed out in the chat interface is the same as what you would have done on paper, the way the steps were obtained is likely very different for you and the LLM. You recognized the integral's type and applied the corresponding technique to solve it. The LLM found the most likely continuation of tokens after your input, given all the data it has been fed, and those tokens happen to be the typography for the integral steps. It is very unlikely that you are doing the same, i.e. calculating probabilities of all the words you know and then choosing the one with the highest probability of being correct.
> the way the steps were obtained is likely very different for you and LLM

this is not true, any examples?

You are not able to write like Shakespeare. Shakespeare isn't really even a great example of an "author" per se. Like anybody else you could get away with: "well I read a lot of Bukowski and can do a passable imitation" or "I'm a Steinbeck scholar and here's a description of his style." But not Shakespeare.

I get that you're into AI products and ok, fine. But no you have not "studied [Shakespeare] greatly" nor are you "able to write like [Shakespeare]." That's the one historical entity that you should not have chosen for this conversation.

This bot is likely just regurgitating bits from the non-fiction writing of authors like an animatronic robot in the Hall of Presidents. Literally nobody would know if the LLM was doing even a passable job of Truman Capote-ing its way through their half-written attempt at NaNoWriMo

>Literally nobody would know if the LLM was doing even a passable job of Truman Capote-ing its way through their half-written attempt at NaNoWriMo

As I look back on my day, I find myself quite pleased with this line.

You can understand his biography and analyses of how Shakespeare might have written. You can apply this knowledge to modify your writing process.

The LLM does not model text at this meta-level. It can only use those texts as examples; it cannot apply what is written there to its generation process.

no it does and what you said is easily falsifiable.

can you provide a _single_ example where an LLM might fail? let's test this now.

Yes, what I said should be falsifiable. The burden is on you to give me an example, but I can give you an idea.

You need to show me an LLM applying writing techniques that do not have examples in its corpus.

You would have to use some relatively unknown author; I can suggest Iida Turpeinen. There will be interviews of her describing her writing technique, but no examples that aren't from Elolliset (Beasts of the Sea).

Find an interview where Turpeinen describes her method for writing Beasts of the Sea, e.g.: https://suffolkcommunitylibraries.co.uk/meet-the-author-iida...

Now ask it to produce a short story about a topic unrelated to Beasts of the Sea, let's say a book about the moon landing.

A human doing this exercise will produce a text with the same feel as Beasts of the Sea, but an LLM-produced text will have nothing in common with it.

>You need to show me an LLM applying writing techniques do not have examples in its corpus.

why are you bringing this constraint?

Because the entire point is the LLM cannot understand text about text.

If someone has already done the work of giving an example of how to produce text according to a process, we have no way of knowing if the LLM has followed the process or copied the existing example.

And my point of course is that copying examples is the only way that LLMs can produce text. If you use an author who has been so analyzed to death that there are hundreds of examples of how to write like them, say, Hemingway, then that would not prove anything, because the LLM will just copy some existing "exercise in writing like Hemingway".

>Because the entire point is the LLM cannot understand text about text.

you have asked for an LLM to read a single interview and produce text that sounds similar to the author based on the techniques in that single interview.

https://claude.ai/share/cec7b1e5-0213-4548-887f-c31653a6ad67 here is the attempt. i don't think i could have done much better.

>> i again don't get what the point is?

The point is that you don't become Jimi Hendrix or Eric Clapton even if you spend 20 years playing in a cover band. You can play the style and sound like them, but you won't create their next album.

Not being Jimi Hendrix or Eric Clapton is the context you are missing. LLMs are Cover Bands...

This is the plot of a short story by Borges called “Pierre Menard, Author of the Quixote.”
There's a relatively common pattern of "new tech idea => Borges already explained why that approach is conceptually flawed".