| That's not the original PR. jart was working on a malloc() approach that didn't work and slaren wrote all the code actually doing mmap, which jart then rebased in a random new PR, changed to support an unnecessary version change, magic numbers, a conversion tool, and WIN32 support when that was already working in the draft PR. https://archive.ph/Uva8c This is the original PR: https://github.com/ggerganov/llama.cpp/pull/586. Jart's archived comments: "my changes" "Here's how folks in the community have been reacting to my work." "I just wrote a change that's going to let your LLaMA models load instantly..." https://archive.ph/PyPFZ "I'm the author" https://archive.ph/qFrcY "Author here..." "Tragedy of the commons...We're talking to a group of people who live inside scientific papers and jupyer notebooks." "My change helps inference go faster." "The point of my change..." "I stated my change offered a 2x improvement in memory usage." https://archive.ph/k34V2 "I can only take credit for a 2x recrease in RAM usage." https://archive.ph/MBPN0 "I just wrote a change that's going to let your LLaMA models load instantly, thanks to custom malloc() and the power of mmap()" https://archive.ph/yrMwh slaren replied to jart on HN asking her why she was doing and saying those things, and she didn't bother to reply to him, despite replying to others in that subthread within minutes. https://archive.ph/zCfiJ |
This is BillG-style product skill -- there is a ton of work that goes into representing a piece of software as something important and valuable that people should buy into.