Hacker News
by mk_stjames 1085 days ago
I'm always wondering about the, I don't even know what to call this, etiquette? of proposing PRs to projects like these that add a feature or a demo or whatnot to the main branch of a very focused project by adding something that is very different in interface, language, set and setting, etc.

So in this case, Tobi made this awesome little web interface that uses minimal HTML and JS so as to stay in line with llama.cpp's stripped-down-ness. But it is still a completely different mode of operation; it's a 'new venue', essentially.

What if GG didn't want such a thing? When is something like this better for a separately maintained repo and not a main merge? How do you know when it is OK to submit a PR to add something like this without overstepping (or is it always?)

I see this with a few projects on GitHub that really 'blow up' and that everyone starts working on. They get a million PRs from people hacking things onto it in their domain of knowledge, expanding the complexity (and potentially the difficulty of maintaining quality). Sometimes it feels weird, watching from the outside at least (I'm not a maintainer of any public FOSS).

Just curious what others think because those are my thoughts that came to mind when I saw this.

8 comments

My POV is that llama.cpp is primarily a playground for adding new features to the core ggml library and, in the long run, an interface for efficient LLM inference. The purpose of the examples in the repo is to demonstrate how to use the ggml library and the LLM interface. The examples are decoupled from the primary code - i.e. you can delete all of them and the project will continue to function and build properly. So we can afford to expand them more freely as long as people find them useful and there is enough help for maintaining them. Still, we try to keep the 3rd party dependencies to a minimum so that the build process is simple and accessible.

There was a similar "dilemma" about the GPU support - initially I didn't envision adding GPU support to the core library, as I thought that things would become very entangled and hard to maintain. But eventually, we found a way to extend the library with different GPU backends in a relatively well decoupled way. So now, we have various developers maintaining and contributing to the backends in a nice independent way. Each backend can be deleted and you will still be able to build the project and use it.
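To make the "each backend can be deleted" idea concrete, here is a toy sketch in Python. It is not ggml's actual design or API: `register_backend`, `unregister_backend`, `available_backends`, `add`, and `fake_gpu_add` are all hypothetical names made up for illustration. The only point is that the core keeps a reference CPU path and treats every other backend as optional.

```python
from typing import Callable, Dict, List

# Hypothetical types and names; nothing here comes from ggml itself.
VectorAdd = Callable[[List[float], List[float]], List[float]]

# Optional backends register themselves here; the core never imports them.
_BACKENDS: Dict[str, VectorAdd] = {}


def register_backend(name: str, add_fn: VectorAdd) -> None:
    """Make an optional backend available to the core."""
    _BACKENDS[name] = add_fn


def unregister_backend(name: str) -> None:
    """Simulate deleting a backend from the project."""
    _BACKENDS.pop(name, None)


def available_backends() -> List[str]:
    """The CPU path is always listed first, followed by optional backends."""
    return ["cpu"] + sorted(_BACKENDS)


def cpu_add(a: List[float], b: List[float]) -> List[float]:
    """Reference implementation that lives in the core and is always present."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)]


def add(a: List[float], b: List[float], prefer: str = "cpu") -> List[float]:
    """Use the preferred backend if it is registered, else fall back to CPU."""
    return _BACKENDS.get(prefer, cpu_add)(a, b)


def fake_gpu_add(a: List[float], b: List[float]) -> List[float]:
    """Stand-in for an accelerated backend; here it just reuses the CPU path."""
    return cpu_add(a, b)
```

In this sketch, calling `register_backend("fake_gpu", fake_gpu_add)` makes `add(..., prefer="fake_gpu")` go through the optional backend, and after `unregister_backend("fake_gpu")` the same call quietly falls back to `cpu_add`. The core never needs to know which backends exist, which is roughly what "easy to delete" would mean here.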

So I guess we are optimizing for how easy it is to delete things :)

Note that the project is still pretty much a "big hack" - it supports just LLaMA models and derivatives, therefore it is easy atm. The more "general purpose" it becomes, the more difficult things become to design and maintain. This is the main challenge I'm thinking about how to solve, but for sure keeping stuff minimalistic and small has been a great help so far.

> What if GG didn't want such a thing? When is something like this better for a separately maintained repo and not a main merge? How do you know when it is OK to submit a PR to add something like this without overstepping (or is it always?)

I try to explain my vision for the project in the issues and the discussions. I think most of the developers are very well aligned with it and can already tell what is a good addition and what is not.

Thank you for the ggml library, by the way. It let me play around with whisper in a sane manner. To run the CUDA torch versions, I needed to shut down X to free enough GPU memory for the medium model, and the small model might require me to quit Firefox. With ggml, I can use cuBLAS and run even the large model with a huge speedup compared to CPU-only torch.
Thanks for replying to me directly! I'm finding it fascinating to follow this project. Good luck with your company, Georgi.
I'm curious, what's the approach for keeping the various GPU backends maintainable and decoupled?
It was designed in #915 (read just the OP and the linked PRs at the end) and the implementation pretty much follows it closely, at least for the Metal backend. The CUDA and OpenCL backends are currently slightly coupled in ggml as they started developing before #915, but I think we'll resolve this eventually.

#915 - https://github.com/ggerganov/llama.cpp/discussions/915

interesting decoupling method, ty :)
I generally think it's fine to do this sort of thing for your own benefit and open a PR as long as you're really 100% fine with "no, I'm not interested in merging this" being the answer.

Where the problems tend to arise (in my experience, at least) is when people hack on something expecting that it will be merged, get invested in it, and then get upset when the maintainer(s) aren't interested.

Checking in before starting to work on something is important if your goal is to have it merged, not just to do the work. The problem is that a lot of people start in the first category, but then move into the second category as they get invested in their project.

Some tips/tricks/tidbits:

* Open a draft PR early in the process with a Request for Comment [RFC] tag. Explain your goal/approach in words, then follow up with code.

* Be succinct.

* Provide minimal viable examples and build more complex concepts from these.

* Accept feedback with grace, and execute promptly.

* Don't take personal offense if your work isn't merged, or even responded to.

* Single-maintainer open-source looks very different than consortium & working group FOSS.

A nice thing about llama.cpp is that it's well organized to accept a feature like this without really disturbing any other part or potentially stepping on someone's toes. There is the core repo, and then this lives in examples/server (as do various other "example" features). This organization feels like it would make it much easier to accept a PR like this than if doing the same thing required wider changes.
GG would just say no and that's that. No hard feelings, that's what makes open source so great.
You just saw it happen in OP. Just do as others do and don't sweat this stuff. The principle is code sharing and an offer of a thing you've done.

Don't overthink it.

This is fantastic. I love the way he handles his project. Just great for adoption and contribution.

My view is you get so many people blatantly ripping off your code to try and pass it off as their own that you actually appreciate when you come across someone doing something novel or interesting that users like. That was my view anyway.
Well that's why you open an issue first, an RFC. Get some discussion going, if everyone's happy, then you proceed.