Hacker News
by lhnz 1179 days ago
I believe you meant to respond to: https://news.ycombinator.com/item?id=35430052

However, I won't respond to you here, for two reasons. (1) It should be quite clear from my prior comments that I think @slaren wasn't given enough recognition for their work, and that there was a more positive approach you could have taken to help give them this. (2) The rest of what you said about ethics is subjective and, I think, wrong in magnitude. For example, I'm not sure it's correct to call it "plagiarism" when @jart's PR mentioned the collaboration with @slaren, used co-authored commits and linked to their PR.


"my changes"

"Here's how folks in the community have been reacting to my work."

"I just wrote a change that's going to let your LLaMA models load instantly..."

https://archive.ph/PyPFZ

"I'm the author"

https://archive.ph/qFrcY

"Author here..."

"Tragedy of the commons...We're talking to a group of people who live inside scientific papers and jupyer notebooks."

"My change helps inference go faster."

"The point of my change..."

"I stated my change offered a 2x improvement in memory usage."

https://archive.ph/k34V2

"I can only take credit for a 2x recrease in RAM usage."

https://archive.ph/MBPN0

"I just wrote a change that's going to let your LLaMA models load instantly, thanks to custom malloc() and the power of mmap()"

https://archive.ph/yrMwh

jart was working on a malloc() approach that didn't work, and slaren wrote all the code that actually does the mmap. jart then rebased that code into a random new PR and changed it to add an unnecessary version change, magic numbers, a conversion tool, and WIN32 support that was already working in the draft PR. https://archive.ph/Uva8c

From what I can see, @jart had spent a considerable amount of time on this problem and had posted an interesting-but-not-production hack for it (https://github.com/ggerganov/llama.cpp/commit/5b8023d9354010...) on March 17th, which they had also excitedly posted about on Twitter.

This was 2 weeks prior to @slaren's contribution (https://github.com/slaren/llama.cpp/commit/fc685122f95f212d1...) on March 29th. So in a sense, it's quite possible that what you've just shown is that @slaren saw that @jart was working on mmap support, worked out a cleaner solution, and then wasn't happy with only being a co-author, believing that for their contribution they must be the only person mentioned on the PR. That would be weird, though, since I don't think they even have a public profile. So maybe the truth is instead that they weren't comfortable working with somebody who hypes up any changes they've worked on for popularity?

I don't think saying "my changes" on Twitter and other social media means what you suggest it does. It is just informal speech: people refer to things they've worked on with "my". Particularly when you see the times this was expanded (e.g. "yesterday my changes to the LLaMA C++ file format were approved"), it seems more reasonable than it does without this context.

If you read the rentry you'll see that both of them were working on an issue that l29ah raised, along with other users. jart's work was on something that didn't end up making it in, the malloc() approach. slaren is the one who wrote the code in the commits I linked to, and that's the code that was adopted. You can (and should) do a comparison of the mmap code and see. What I wrote about the version change, magic number, WIN32, etc., is all true too. As is the haste with which the new PR was made, leading to the recent pushes to revert due to swap thrashing and anger over false and rushed claims about "miracle RAM reduction" etc.

In fact, if you read the thread you linked to, you'll see this for yourself too, no rentry required. There's nothing actually objectionable or "repulsive," as jart put it, in that rentry, with the exception of the "r word" being applied to a proposed technical solution.

Your interpretation is incompatible with what we see and the clear timeline. The social media bragging, the second PR, etc., are further evidence. I hope whatever anger you had going into this has abated to the point where you can now actually judge the evidence.