by hnlmorg 601 days ago
Are you then plagiarising if the LLM is just regurgitating stuff you’d personally written?

The point of these detectors is to spot stuff the students didn’t research and write themselves. But if the corpus is your own written material then you’ve already done the work yourself.


Oh I agree, using LLMs to produce text that is expected to be written by a human is at least deceptive and probably plagiarism. It also skips some important work, if we're talking about someone trying to detect it at all, usually in an education context.

Students don't have to perform research or study for the given task; they only need to acquire a sample of text suitable for reproducing their style and text structure, to create the impression that it was written by hand, so the original task can be avoided. You have to have at least one corpus of your own work for this to work, or an adequate substitute. And you could still reject works on their content, but we are specifically talking about LLM smell.

I was talking about the task of detecting LLM-generated text, which is incredibly hard if any effort is made to disguise it, while some people have the impression that it's trivially easy. That leads to unfair outcomes while giving false confidence to e.g. teachers that LLMs are adequately accounted for.

An LLM is just regurgitating stuff as a matter of principle. You can request someone else's style. People who are easy to detect simply don't do that. But they will learn quickly.

I’ve found LLMs to be relatively poor at writing in someone else’s style beyond superficial / comical styles like “pirate” or “Shakespeare”.

To get an LLM to generate content in your own writing style, there’s going to be no substitute for training it on your own corpus. By which point you might as well do the work yourself.

The whole point of cheating is to avoid doing the work. Building your own corpus requires doing that work.

I meant you don't need to feed it your corpus if it's good enough at mimicking styles. Just ask it to mimic someone else. I don't mean novelty styles like pirate or Shakespeare. Mimic "a student with average ability". Then ask it to ramp up authenticity. Or even use some model or service with this built in so you don't even need to write any prompts. Zero effort.

You're saying it's not good enough at mimicking styles. Others are saying it's good enough. I think if it's not good enough today it'll be good enough tomorrow. Are you betting on it not becoming good enough?

I’m betting on it not becoming good enough at mimicking a specific student’s style without having access to their specific work.

Teachers will notice if a student’s writing style shifts in one piece compared to another.

Nobody disputes that you can get LLMs to mimic other people. However, an LLM cannot mimic a specific style it hasn’t been trained on. And very few people who are going to cheat are going to take the time to train an LLM on their writing style, since the entire point of plagiarism is to avoid doing work.

How would the teacher know what the student's style is if she always uses the LLM? Also do you expect that a student's style is fixed forever, or that teachers are all so invested that they can really tell when the student is trying something new vs using an LLM that was trained to output writing in the style of an average student?

Imagine the teacher saying "this is not your style, it's too good" to a student who legitimately tried, killing any motivation to do anything but cheat for the rest of their life.

> How would the teacher know what the student's style is if she always uses the LLM?

If the student always uses LLMs then it would be pretty obvious from the fact that they’re failing at the course in everything bar the written assessments (i.e. the stuff they can cheat on).

> Also do you expect that a student's style is fixed forever

Of course not. But people’s styles don’t change dramatically on one paper and reset back afterwards.

> teachers are all so invested that they can really tell when the student is trying something new vs using an LLM that was trained to output writing in the style of an average student?

Depends on the size of the classes. When I was at college, teachers did check for changes in writing style. I know this because one of the kids in my class was questioned about changes in his writing style.

With time, I’m sure anti-cheat software will also check against previous works by the students to look for changes in style.

However, this was never my point. My point was that cheaters wouldn’t bother training on their own corpus. You keep pushing the conversation away from that.

> Imagine the teacher saying "this is not your style, it's too good" to a student who legitimately tried, killing any motivation to do anything but cheat for the rest of their life.

That’s how literally no good teacher would ever approach the subject. Instead they’d talk about how good the paper was and ask about where the inspiration came from.

Yep, some with fun results. I occasionally amuse myself now by asking for X in the writing style of fictional figure Y. It does have its moments.