Hacker News
Show HN: Quack Companion – VSCode extension for OSS contribution assistance (github.com)
49 points by fgfm 908 days ago
Hello there, I'm FG (short for François-Guillaume)! I’m building Quack Companion (https://github.com/quack-ai/companion), a tool designed to bridge the gap between project maintainers and contributors in open-source software (OSS).

If you've ever contributed to OSS, you're familiar with the challenge: diving into a new codebase, aligning with maintainers' expectations, and making meaningful contributions can be daunting. For maintainers, managing a flood of contributions and providing guidance while maintaining quality is overwhelming. As a contributor to PyTorch & an OSS author, I’ve been on both sides over the years. This challenged me to become a better engineer & team player, seeking ways to facilitate smooth collaborative software development.

This is where Quack steps in, as your AI companion for software team alignment. For developers, it's like having a seasoned mentor guiding you through the intricacies of a new codebase, offering live in-line hints based on the project’s guidelines to craft high-quality pull requests. For maintainers, it’s a practical toolset to identify and address workflow inefficiencies and align contributions effectively with project objectives.

As code generation gets commoditized, aligning these diverse efforts becomes critical if you wish to convert individual productivity boosts into team velocity. Quack AI is here to solve this alignment problem and help collaborative software development scale without losing efficiency.

We're committed to keeping the service accessible and free for OSS communities, and we plan to generate revenue from the enterprise suite. The platform (React, Next.js), the IDE extension (VSCode) and the backend API (Python, FastAPI) are licensed under Apache 2.0. You can find a short demo here [1] and the GitHub project here [2].

Our roadmap includes:

* offering autocompletion and code chat in the IDE, making the contribution process even more intuitive and seamless;

* finalizing the transition of the community version to hostable OSS models;

* identifying ambiguities or unspecified aspects of a given project’s guidelines;

* developing a Fitbit-like feature for your software development productivity to identify bottlenecks in your workflow.

These are still early days, but we've seen how sharing a "public design doc" with the community can significantly improve the outcome! What has your experience been of managing inbound contributions as a maintainer? What were your personal hacks to mitigate those challenges? We'd love to hear how it has impacted your developer life, or any feedback you have about the above.

Cheers!

[1] Demo video: https://dub.sh/quack-demo

[2] Open Source repo: https://github.com/quack-ai/companion

Our documentation: https://docs.quackai.com/

4 comments

OSS communities ... on GitHub only, seems to be the rest of that sentence: https://github.com/quack-ai/companion/blob/77281d882ff7b90fe...

there is also a tiny mention of it in the readme, but it's often hard to know how many times that's "shorthand" for "git based code forges" versus, as in this case, "we use the github api and others are on the backlog"

In case others went sniffing around looking for "does this use an OpenAI key?", the answer is yes, but it's on the Python side: https://github.com/quack-ai/contribution-api#configuration

I appreciate the feedback about clarity, thanks! Agreed, we'll update the documentation to reflect that more accurately.

For now, we've started with VSCode as an IDE and used GitHub for authentication. But actually, we're already working with GitLab to add support. For other VCS, the prioritization will be demand-based as we don't want to spread ourselves thin early on.

Regarding the OpenAI part, as stated in the post, we're currently migrating the community version to self-hosted OSS models. If you sniff around the backend API repo, you'll see there is already a third-party service registered for Ollama and a corresponding docker-compose (https://github.com/quack-ai/contribution-api/blob/main/docke...). Our next release was already planned to switch to Ollama (keeping OpenAI as an alternative as well), so I'm thrilled if that goes along with the community preference!

If a contributor comes to my project with a pull request that an AI wrote using copyrighted, unknown licensed-and-attributed code it sucked off of GitHub, that PR is rejected immediately and I would strongly consider banning the "contributor" as well. A lot of OSS maintainers are tuning in to this new reality so I would advise caution to contributors looking to have tools write their code for them.

Thanks for sharing, that's an interesting social component of the equation. From your comment, I assume you're referring to something I've also encountered as a maintainer: we filter out submissions where no effort was put in. If I get the feeling that a PR is perhaps a bit useful but that the author has simply committed an LLM-generated piece of code, I'll be on the fence. If I'm asked to review a PR with the bare minimum of added value, but the author has tried their best and is seeking help to get started with OSS contributions, I will help. Was that your experience as well?

In that regard, the proxy for "no effort" usually defaults to "it looks like the PR doesn't check any of the guidelines in the CONTRIBUTING.md or the PR template". Here we're trying to always bring that guideline context, make it requestable, and inject it into your coding workflow. In the process, we want to educate those developers about your specific engineering culture.

Besides, code generation is inevitably going to become a growing part of software engineering. Here we're making sure this transition isn't operated without proper alignment or context. It's already challenging to get everyone on the same page in code reviews, so team alignment isn't a trivial problem and it's not gonna improve with the extra thousands of LoC developers will be able to produce each day. Or do you foresee a significant proportion of OSS maintainers consistently rejecting automatically-generated code?

Why?

I imagine it's because including open-but-license-incompatible code (such as GPL3) would change the project's legal standing and potentially open you up to litigation.

Strictly speaking this is true for non-AI generated code too, such as a copy paste, but it's easier to tell when that happens. It's also true for closed source code but the fallout from that is going to take a few decades to manifest.

That last part feels very relatable to me: I've seen organizations who are mindful of the licenses of tools they use to avoid further problems, and others assuming that because it's closed source the problem won't ever arise.

License-wise, we're getting more and more transparency on the permissions that apply to the training sets of each OSS model. But I would argue that once we're past that, developers are gonna raise their expectations:

- control over dependency multiplicity ~= "rewrite this using only a single linear algebra library with an Apache 2.0 license" or even "rewrite this in pure Node.js"

- adding the corresponding reference/license notice when the model copies/adapts a section of a library that requires copyright notice reproduction.

- transparency on the similarity with the source material if it was copied/adapted from somewhere else (even if the license allows this, this enters the realm of social courtesy/community codes)

Do you have an estimate for typical token usage for a developer who'd use it as part of their workflow? I'd imagine the costs can rack up fairly quickly if you're not careful.

We'll do our best to consistently report it since this can indeed influence the financial decisions of developers, especially if they go through paid third-party LLM APIs. In our early experiments, we've seen about 200-250 tokens per request (~= autocompletion), of which about 40-50 tokens are generated.

Two things we're doing about this:

- right now our API response contains more than what's required for autocompletion, so there is room for improvement there. And since we focus on team alignment, the goal is to boost the suggestion acceptance rate compared to alternatives. So in the end, fewer calls and lower token consumption.

- since we're working on fully migrating to hostable OSS models of reasonable size, the financial aspect of token consumption should be mostly moved out of the picture to focus on latency.
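To put the figures above in perspective, here is a back-of-the-envelope sketch in Python. Only the per-request numbers (about 200-250 tokens per request, of which about 40-50 are generated) come from this thread. The requests per day, working days, and per-1k-token prices are made-up placeholders, and the function names are hypothetical, so treat the output as an illustration rather than a measurement.

```python
from typing import Tuple


def monthly_tokens(requests_per_day: int, working_days: int,
                   tokens_per_request: int,
                   generated_per_request: int) -> Tuple[int, int]:
    """Return (prompt_tokens, generated_tokens) for one developer-month."""
    total_requests = requests_per_day * working_days
    generated = total_requests * generated_per_request
    # Whatever is not generated is counted as prompt/context tokens.
    prompt = total_requests * (tokens_per_request - generated_per_request)
    return prompt, generated


def monthly_cost(prompt_tokens: int, generated_tokens: int,
                 prompt_price_per_1k: float,
                 generated_price_per_1k: float) -> float:
    """Cost for a paid API billed per 1k tokens (prices are placeholders)."""
    return (prompt_tokens / 1000 * prompt_price_per_1k
            + generated_tokens / 1000 * generated_price_per_1k)


if __name__ == "__main__":
    # Upper end of the figures quoted in the thread: 250 tokens, 50 generated.
    # ASSUMPTION: 500 requests/day over 20 working days, chosen arbitrarily.
    prompt, generated = monthly_tokens(500, 20, 250, 50)
    print("prompt tokens:", prompt)
    print("generated tokens:", generated)
    # ASSUMPTION: placeholder prices, not any real provider's pricing.
    print("estimated cost:", monthly_cost(prompt, generated, 0.001, 0.002))
```

Swapping in your own request volume and your provider's actual pricing gives a rough monthly figure. With a self-hosted model the price terms drop out and latency becomes the number to watch.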

> the function name should start with a verb

You lost me there.

Haha I don't know what your poison is, but the same goes for:

- using the syntax of Python 3.11 for asynchronous tasks;

- using Promises vs. Observables in JavaScript.

Was the demo example confusing, or not challenging enough perhaps? If you have tough coding guidelines you've been enforcing manually in code reviews up until now, please do share!
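For anyone wondering what a guideline like "the function name should start with a verb" looks like as a mechanical check, here is a minimal Python sketch using the standard `ast` module. It is not how Quack implements anything. The verb list and the function name `find_non_verb_functions` are hypothetical, and a real project would choose its own vocabulary.

```python
import ast
from typing import List

# ASSUMPTION: a deliberately tiny, hypothetical verb list for illustration.
ALLOWED_VERBS = {
    "get", "set", "load", "save", "build",
    "parse", "check", "run", "fetch", "compute",
}


def find_non_verb_functions(source: str) -> List[str]:
    """Return names of functions whose first word is not an allowed verb."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        name = node.name
        # Skip dunder methods such as __init__, which follow their own rules.
        if name.startswith("__") and name.endswith("__"):
            continue
        # Ignore leading underscores, then take the first snake_case word.
        first_word = name.lstrip("_").split("_")[0]
        if first_word not in ALLOWED_VERBS:
            offenders.append(name)
    return offenders


if __name__ == "__main__":
    sample = (
        "def load_config(): pass\n"
        "def config_loader(): pass\n"
        "async def fetch_user(): pass\n"
        "def _helper(): pass\n"
    )
    print(find_non_verb_functions(sample))
```

A fixed word list like this is brittle, which is part of why such guidelines usually end up enforced by hand in code review.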