| HN Mirror

by mk_stjames 1085 days ago

I'm always wondering about the (I don't even know what to call this) etiquette of proposing PRs to projects like these: PRs that add a feature, a demo, or whatnot to the main branch of a very focused project, where the addition is very different in interface, language, set and setting, etc.

So in this case, Tobi made this awesome little web interface that uses minimal HTML and JS so as to stay in line with llama.cpp's stripped-down-ness. But it is still a completely different mode of operation; it's a 'new venue', essentially.

What if GG didn't want such a thing? When is something like this better for a separately maintained repo and not a main merge? How do you know when it is OK to submit a PR to add something like this without overstepping (or is it always?)

I see this with a few projects on GitHub that really 'blow up' and that everyone starts working on. They get a million PRs from people hacking things onto them in their own domain of knowledge, expanding the complexity (and potentially making quality harder to maintain). Sometimes it feels weird to watch, from the outside at least (I'm not a maintainer on any public FOSS).

Just curious what others think because those are my thoughts that came to mind when I saw this.

8 comments

ggerganov 1085 days ago

My POV is that llama.cpp is primarily a playground for adding new features to the core ggml library and, in the long run, an interface for efficient LLM inference. The purpose of the examples in the repo is to demonstrate how to use the ggml library and the LLM interface. The examples are decoupled from the primary code, i.e. you can delete all of them and the project will continue to function and build properly. So we can afford to expand them more freely as long as people find them useful and there is enough help for maintaining them. Still, we try to keep the 3rd party dependencies to a minimum so that the build process is simple and accessible.

There was a similar "dilemma" about the GPU support - initially I didn't envision adding GPU support to the core library as I thought that things will become very entangled and hard to maintain. But eventually, we found a way to extend the library with different GPU backends in a relatively well decoupled way. So now, we have various developers maintaining and contributing to the backends in a nice independent way. Each backend can be deleted and you will still be able to build the project and use it.

So I guess we are optimizing for how easy it is to delete things :)
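To illustrate the "easy to delete" idea, here is a minimal sketch of one way a core can stay independent of its optional backends. It is not ggml's actual API: `Backend`, `registry`, `pick_backend`, `run_add`, and the `SKETCH_USE_FAKE_GPU` build flag are all hypothetical names made up for this example. The core only talks to a registry, so the optional block can be removed and everything else still builds and runs on the CPU path.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface: every name here is illustrative, not ggml's API.
struct Backend {
    const char* name;
    // Elementwise add; stands in for a real compute kernel.
    void (*add)(const float* a, const float* b, float* out, int n);
};

// Reference CPU implementation that is always available.
static void cpu_add(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

// The core only knows about the registry, never about a specific backend.
std::vector<Backend>& registry() {
    static std::vector<Backend> backends = {{"cpu", cpu_add}};
    return backends;
}

#ifdef SKETCH_USE_FAKE_GPU
// An optional backend sits behind a (hypothetical) build flag. Deleting this
// whole block leaves the rest of the file compiling and working unchanged.
static void fake_gpu_add(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; ++i) out[i] = a[i] + b[i];  // pretend this runs on a GPU
}
static const bool fake_gpu_registered =
    (registry().push_back({"fake-gpu", fake_gpu_add}), true);
#endif

// Pick the most recently registered backend; the CPU one is the fallback.
const Backend& pick_backend() { return registry().back(); }

// Core entry point: dispatches through the interface only.
std::vector<float> run_add(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    pick_backend().add(a.data(), b.data(), out.data(), static_cast<int>(a.size()));
    return out;
}
```

Built without the flag, `run_add` goes through the `cpu` entry; built with `-DSKETCH_USE_FAKE_GPU`, the extra backend registers itself and is picked instead, with no change to the core functions.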

Note that the project is still pretty much a "big hack": it supports just LLaMA models and derivatives, so it is easy atm. The more "general purpose" it becomes, the more difficult things become to design and maintain. This is the main challenge I'm trying to figure out how to solve, but keeping stuff minimalistic and small has certainly been a great help so far.

> What if GG didn't want such a thing? When is something like this better for a separately maintained repo and not a main merge? How do you know when it is OK to submit a PR to add something like this without overstepping (or is it always?)

I try to explain my vision for the project in the issues and the discussions. I think most of the developers are very well aligned with it and can already tell what is a good addition and what is not.

aidenn0 1085 days ago

Thank you for the ggml library, by the way. It let me play around with whisper in a sane manner. To run the CUDA torch versions, I needed to shut down X to free enough GPU memory for the medium model, and the small model might require me to quit Firefox. With ggml, I can use cuBLAS and run even the large model with a huge speedup compared to CPU-only torch.

mk_stjames 1085 days ago

Thanks for replying to me directly! I'm finding it fascinating to follow this project. Good luck with your company, Georgi.

vitaminka 1085 days ago

i’m curious, what’s the approach for keeping the various gpu backends maintainable and decoupled?

ggerganov 1085 days ago

It was designed in #915 (read just the OP and the linked PRs at the end) and the implementation follows it pretty closely, at least for the Metal backend. The CUDA and OpenCL backends are currently slightly coupled in ggml as their development started before #915, but I think we'll resolve this eventually.

#915 - https://github.com/ggerganov/llama.cpp/discussions/915

vitaminka 1085 days ago

interesting decoupling method, ty :)

LawnGnome 1085 days ago

I generally think it's fine to do this sort of thing for your own benefit and open a PR as long as you're really 100% fine with "no, I'm not interested in merging this" being the answer.

Where the problems tend to arise (in my experience, at least) is when people hack on something expecting that it will be merged, get invested in it, and then get upset when the maintainer(s) aren't interested.

Checking in before starting to work on something is important if your goal is to have it merged, not just to do the work. The problem is that a lot of people start in the first category, but then move into the second category as they get invested in their project.

grepLeigh 1085 days ago

Some tips/tricks/tidbits:

* Open a draft PR early in the process with a Request for Comment [RFC] tag. Explain your goal/approach in words, then follow up with code.

* Be succinct.

* Provide minimal viable examples and build more complex concepts from these.

* Accept feedback with grace, and execute promptly.

* Don't take personal offense if your work isn't merged, or even responded to.

* Single-maintainer open-source looks very different than consortium & working group FOSS.

version_five 1085 days ago

A nice thing about llama.cpp is that it's well organized to accept a feature like this without really disturbing any other part or potentially stepping on someone's toes. There is the core repo, and then this lives in examples/server (as do various other "example" features). This organization feels like it would make it much easier to accept a PR like this than if doing the same thing required wider changes.

xal 1085 days ago

GG would just say no and that's that. No hard feelings, that's what makes open source so great.

renewiltord 1085 days ago

You just saw it happen in OP. Just do as others do and don't sweat this stuff. The principle is code sharing and an offer of a thing you've done.

Don't overthink it.

This is fantastic. I love the way he handles his project. Just great for adoption and contribution.

bfuller 1084 days ago

My view is you get so many people blatantly ripping off your code to try and pass it off as their own that you actually appreciate when you come across someone doing something novel or interesting that users like. That was my view anyway.

kaliqt 1084 days ago

Well, that's why you open an issue first, as an RFC. Get some discussion going; if everyone's happy, then you proceed.
