by capnrefsmmat
102 days ago
I work on research studying LLM writing styles, so I am going to have to steal this. I've seen plenty of lists of LLM style features, but this is the first one I noticed that mentions "tapestry", which we found is GPT-4o's second-most-overused word (after "camaraderie", for some reason).[1] We used a set of grammatical features in our initial style comparisons (like present participles, which GPT-4o loved so much that they were a pretty accurate classifier on their own), but it shouldn't be too hard to pattern-match some of these other features and quantify them.

If anyone who works on LLMs is reading, a question: When we've tried base models (no instruction tuning/RLHF, just text completion), they show far fewer stylistic anomalies like this. So it's not that the training data is weird. It's something in instruction-tuning that's doing it. Do you ask the human raters to evaluate style? Is there a rubric? Why is the instruction tuning pushing such a noticeable style shift?

[1] https://www.pnas.org/doi/10.1073/pnas.2422455122, preprint at https://arxiv.org/abs/2410.16107. Working on extending this to more recent models and other grammatical features now.
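For anyone who wants to try the "pattern-match and quantify" idea, here is a minimal sketch. It is not the method from the paper: the word list is a hypothetical starting point (only "tapestry" and "camaraderie" come from the comment), `style_rates` is a made-up helper name, and counting tokens that end in "ing" is a crude stand-in for real present-participle tagging, which would need a part-of-speech tagger.

```python
import re
from collections import Counter

# Hypothetical watch list; extend it with whatever features you want to track.
WATCH_WORDS = ("tapestry", "camaraderie")

def style_rates(text, watch_words=WATCH_WORDS):
    """Return occurrences per 1,000 tokens for each watched word
    and for tokens that end in 'ing'."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {}
    counts = Counter(tokens)
    scale = 1000 / len(tokens)
    rates = {word: counts[word] * scale for word in watch_words}
    # Crude proxy for present participles: any token of 5+ letters ending
    # in "ing". This also catches gerunds and nouns such as "string".
    ing_total = sum(c for tok, c in counts.items()
                    if len(tok) >= 5 and tok.endswith("ing"))
    rates["-ing tokens"] = ing_total * scale
    return rates

if __name__ == "__main__":
    sample = "Weaving a rich tapestry, the team kept building real camaraderie."
    for feature, rate in style_rates(sample).items():
        print(f"{feature}: {rate:.1f} per 1,000 tokens")
```

Running the same function over a human-written corpus and a model-written one, then comparing the rates, gives a rough first look at which features are overused.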
|
Mode collapse makes the models truncate entire token trajectories and repeat themselves, and indirectly it does something MUCH deeper: they converge on an almost 1:1 input-to-output concept mapping (instead of one-to-many, as in base models). The same lack of variety can be seen in diffusion models, GANs, VAEs, and any other type of model tuned on human preference.
Moreover, these patterns are generational. Old ones get replaced with new ones, and the list in the OP is going to be obsolete in a year. This has already happened to previous models several times, from what I can tell. Supposedly this is because the training data is scraped from a web polluted by the output of previous-generation models.
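One rough way to put a number on the lack of variety described above is to sample several completions for the same prompt and measure how many of their n-grams are unique. The sketch below uses a distinct-n ratio, which is a swapped-in proxy and not something the comment itself proposes; `distinct_n` is a hypothetical helper and whitespace splitting is a simplifying assumption.

```python
def distinct_n(samples, n=2):
    """Fraction of n-grams across all samples that are unique.

    1.0 means no n-gram repeats; values near 0 mean the samples
    keep reusing the same phrases.
    """
    ngrams = []
    for sample in samples:
        tokens = sample.lower().split()  # naive whitespace tokenization
        ngrams.extend(tuple(tokens[i:i + n])
                      for i in range(len(tokens) - n + 1))
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

if __name__ == "__main__":
    repetitive = ["the cat sat", "the cat sat"]
    varied = ["the cat sat", "a dog ran"]
    print(distinct_n(repetitive))
    print(distinct_n(varied))
```

A base model and its preference-tuned counterpart could be compared by feeding both the same prompts and checking whether the tuned model scores noticeably lower.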