I’ve also noticed that with longer-form text, the amount of meaningful information seems to plateau: it doesn’t scale proportionally with the character count.
It probably depends a lot on how the system is prompted. One of the interesting things about generative images is how easy it is to know what something looks like without being able to describe it.
Longform text is likely similar: there are a bunch of interactions and scene structures that humans notice when they are present but can't describe. The early seasons of Game of Thrones were a fascinating example of good writing because most of the terrible things that happened to people were a neat result of their own choices (it had a consistent flawed character -> bad choice -> terrible consequence pattern that repeated over and over), but I don't think most people would pick up on that without it being explicitly pointed out. And when that pattern started to go away, people could tell the writing was falling off but couldn't easily say why.
A hypothetical LLM could be prompted with something like that ("your writing is boring, please make consequences follow from choices") but it is less clear that the average prompter would be able to figure out that was what was missing. Like how image generators often needed to be prompted with "avoid making mistakes" to get a much higher quality of image; it took a bit to realise that was an option.
That's my experience as well. If you feed it your summary + outline + guidance and ask for a one-shot output, it'll rush through the whole thing. If you prompt it for more length, it'll pad the text out for little benefit. To get good output, you have to work in chunks, like a paragraph or a scene at a time, adjusting your prompt as you work through the outline.
That said, the resulting quality usually isn't so great that I want to put in the effort to do that, so I tend to interact with it in more of a choose-your-own-adventure way.
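For anyone who wants to try the chunked approach, here is a rough sketch of what I mean, not a definitive implementation. Everything in it is hypothetical: `generate` stands in for whatever model call you use (any function that takes a prompt string and returns text), and `build_scene_prompt`, `write_in_chunks`, `adjustments`, and `tail_chars` are names made up for illustration.

```python
from typing import Callable, Dict, List, Optional


def build_scene_prompt(
    summary: str,
    guidance: str,
    outline: List[str],
    index: int,
    previous_text: str = "",
    tail_chars: int = 500,
) -> str:
    """Build a prompt that asks for exactly one scene from the outline."""
    parts = [
        f"Story summary: {summary}",
        f"Guidance: {guidance}",
        f"Write only scene {index + 1} of {len(outline)}: {outline[index]}",
    ]
    if previous_text:
        # Pass along only the tail of the previous scene for continuity.
        parts.append(f"The previous scene ended with: {previous_text[-tail_chars:]}")
    return "\n".join(parts)


def write_in_chunks(
    summary: str,
    guidance: str,
    outline: List[str],
    generate: Callable[[str], str],  # hypothetical model call: prompt -> text
    adjustments: Optional[Dict[int, str]] = None,
) -> List[str]:
    """Generate one scene per outline entry instead of one-shotting the story."""
    adjustments = adjustments or {}
    scenes: List[str] = []
    for i in range(len(outline)):
        # Per-scene guidance tweaks stand in for adjusting the prompt by hand.
        scene_guidance = guidance
        if i in adjustments:
            scene_guidance = f"{guidance} {adjustments[i]}"
        previous = scenes[-1] if scenes else ""
        prompt = build_scene_prompt(summary, scene_guidance, outline, i, previous)
        scenes.append(generate(prompt))
    return scenes
```

In practice you'd read each scene before moving on and edit the guidance by hand; the `adjustments` dict is only a stand-in for that manual step.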