by BrandonJung 1462 days ago
For this specific comparison, it’s essential to start from the technology, as many of the product differences stem from differences in approach, architecture, and technology choices. Microsoft and OpenAI view AI for software development almost as just another use case for GPT-3, the behemoth language model. Code is text, so they took their language model, fine-tuned it on code, and called the resulting gargantuan 12-billion-parameter AI model Codex.

Copilot’s architecture is monolithic: “one model to rule them all.” It is also completely centralized - only Microsoft can train the model, and only Microsoft can host the model due to the enormous amount of computing resources required for training and inference.

Tabnine, after comprehensively evaluating models of different sizes, favors individualized language models working in concert. Why? Because code prediction is, in fact, a set of distinct sub-problems that doesn't lend itself to the monolithic model approach. For instance: generating the full code of a function in Python based on its name and generating the suffix of a line of code in Rust are two problems Tabnine solves well, but the AI model that best fits each task is different. We found that a combination of specialized models dramatically increases the precision and length of suggestions for our 1M+ users.

A big advantage of Tabnine’s approach is that it can use the right tool for any code prediction task, and for most purposes, our smaller models give great predictions quickly and efficiently. Better yet, most of our models can be run with inexpensive hardware.
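To make the "specialized models working in concert" idea concrete, here is a minimal sketch of a dispatcher that picks a model per language and task. It is not Tabnine's actual architecture, which this comment does not describe in detail; `ModelRouter`, `CompletionRequest`, the task names, and the canned stand-in models are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class CompletionRequest:
    language: str  # e.g. "python" or "rust"
    task: str      # e.g. "function_body" or "line_suffix" (hypothetical labels)
    prefix: str    # code before the cursor


# A "model" is just a callable here; a real model would be far heavier.
Model = Callable[[CompletionRequest], str]


class ModelRouter:
    """Hypothetical dispatcher: one specialized model per (language, task)."""

    def __init__(self, fallback: Model) -> None:
        self._models: Dict[Tuple[str, str], Model] = {}
        self._fallback = fallback

    def register(self, language: str, task: str, model: Model) -> None:
        self._models[(language, task)] = model

    def complete(self, request: CompletionRequest) -> str:
        # Use the specialized model if one is registered, else the fallback.
        key = (request.language, request.task)
        return self._models.get(key, self._fallback)(request)


# Stand-in "models" that return canned text, purely for illustration.
def python_function_model(request: CompletionRequest) -> str:
    return "    return a + b"


def rust_suffix_model(request: CompletionRequest) -> str:
    return ".unwrap();"


def generic_model(request: CompletionRequest) -> str:
    return ""


router = ModelRouter(fallback=generic_model)
router.register("python", "function_body", python_function_model)
router.register("rust", "line_suffix", rust_suffix_model)

suggestion = router.complete(
    CompletionRequest("rust", "line_suffix", "let n = s.parse::<i32>()")
)
print(suggestion)
```

The trade-off in a design like this is that each model can stay small, at the cost of having to classify every request correctly before dispatching it.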

Now that we understand the principal difference between Microsoft’s huge monolith and Tabnine’s multitude of smaller models, we can explore the differences between the products:

First, the kind of code suggestions. Copilot queries the model relatively infrequently and suggests a snippet or a full line of code. Copilot does not suggest code in the middle of the line, as its AI model is not best suited for this purpose. Similarly, Tabnine Pro also suggests full snippets or lines of code, but since Tabnine also uses smaller and highly efficient AI models, it queries the model while you type. As a user, this means the AI flows with you, even when you deviate from the code it originally suggested.

Second, ability to train the model. Copilot uses one universal AI model, which means that every user is getting the same generic assistance based on an “average of GitHub”, regardless of the project they're working on. Tabnine can train a private AI model on the specific code from customers’ GitLab/GitHub/BitBucket repositories and thus adjust the suggestions to the project-specific code and infrastructure. Training on customer code is possible because Tabnine is modular, enabling the creation of private customized copies.

Third, code security and privacy. There are a few aspects to this. Users cannot train or run the Copilot model. The single model is always hosted by Microsoft. Every Copilot user is sending their code to Microsoft; not some of the code, and not obfuscated - all of it. With Tabnine, users can choose where to run the model: on the Tabnine cloud, locally on the developer machine, or on a self-hosted server. This is possible because Tabnine has AI models that can run efficiently with moderate hardware requirements.

In addition, Tabnine makes a firm and unambiguous commitment that no code the user writes is used to train our model. We don’t send to our servers any information about the code that the user writes and the suggestions they’re receiving or accepting.

Fourth, commercial terms. Microsoft currently offers Copilot only as a commercial product for developers, without a free plan (beyond a free trial) or organizational purchase. Tabnine has a great free plan and charges for premium features such as longer code completions and private models trained on customers’ code.

https://tabnine.com/tabnine-vs-github-copilot

6 comments

Might be worth disclosing you are the “VP (of) Ecosystem and Business Development” for Tabnine in any comments where you’re pitching Tabnine; while you’re at it, might not hurt to add that to your HN profile.
The other commenters have pointed out that this is basically just PR, but even if we ignore that entirely, it doesn't make a lot of sense. A few issues I pulled out somewhat at random:

> Copilot queries the model relatively infrequently

Huh? It queries whenever you stop typing.

> Copilot does not suggest code in the middle of the line, as its AI model is not best suited for this purpose

This is at best false, at worst disingenuous. Yes, it won't insert a completion in the middle of a line, but this is trivially solved by inserting a newline in the middle of the line and then triggering a completion at the end of the line you just created.

> In addition, Tabnine makes a firm and unambiguous commitment that no code the user writes is used to train our model.

Copilot has this option too...

> Copilot uses one universal AI model, which means that every user is getting the same generic assistance based on an “average of GitHub”, regardless of the project they're working on.

I mean, anyone who has used Copilot for more than 5 seconds would know that this isn't true. Copilot does a fantastic job at reading the current file and providing suggestions relevant to the file, rather than some hypothetical "average of GitHub"

> Huh? It queries whenever you stop typing.

That's relatively infrequently - TabNine offered me accurate completion while still writing my code, whereas with Copilot I not only have to wait for it to return the answer, I also have to hope it knows an answer. If it doesn't, too bad, lost time for nothing.
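The disagreement above is really about two trigger strategies: query on every keystroke, or query only after the user pauses. A minimal simulation, assuming made-up keystroke timestamps and a hypothetical 500 ms idle threshold (neither product's real trigger logic or timing is documented in this thread), shows how the query counts differ:

```python
from typing import List


def queries_per_keystroke(keystroke_ms: List[int]) -> List[int]:
    """Fire a query at every keystroke (the 'while typing' strategy)."""
    return list(keystroke_ms)


def queries_on_pause(keystroke_ms: List[int], idle_ms: int) -> List[int]:
    """Fire a query idle_ms after a keystroke, but only if no other
    keystroke arrives before then (the 'when you stop typing' strategy)."""
    fired = []
    for i, t in enumerate(keystroke_ms):
        is_last = i == len(keystroke_ms) - 1
        if is_last or keystroke_ms[i + 1] - t >= idle_ms:
            fired.append(t + idle_ms)
    return fired


# Hypothetical typing burst: three quick keys, a pause, two keys, a long pause, one key.
keystrokes = [0, 100, 200, 1000, 1100, 3000]

eager = queries_per_keystroke(keystrokes)
paused = queries_on_pause(keystrokes, idle_ms=500)

print(len(eager), "queries when firing on every keystroke")
print(len(paused), "queries when firing only after a pause:", paused)
```

Whether firing only on pauses counts as "relatively infrequently" is exactly the judgment call being argued here; the sketch only shows that the two strategies produce different query counts and timing for the same typing.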

Thank you for laying out the facts for us, at least from Tabnine's perspective.

I haven't used Tabnine, but anecdotally I've heard it is not as good at guessing whole methods as Copilot is. Now I understand why.

The only suggestion I would make is to maybe lower the price. MS is the bigger player and pretty much sets the standard now; if their price is lower, even symbolically, it would hurt you more.

I personally am happy with Copilot. I've had some really wonderful moments, and the time saved searching for proper method syntax is where I get a lot of benefit. I should spend time getting to know Tabnine as well, so as not to be ignorant of this new technology.

I noticed there is a free tier for individual users with limited capability. But I agree with the thrust of your argument here.
Good feedback, and thank you for taking the time to share.
There are some things in your comment that are wrong, or don't quite follow.

> Copilot’s architecture is monolithic: “one model to rule them all.” It is also completely centralized - only Microsoft can train the model

It's true that only Microsoft can train the model—for now. But GPT-3 is a monolithic model, and OpenAI allows fine-tuning. There's nothing about the architecture that prohibits Microsoft from offering customers the ability to fine-tune the model.

> Copilot does not suggest code in the middle of the line, as its AI model is not best suited for this purpose.

Here's a screenshot of GitHub Copilot suggesting code in the middle of a line:

https://imgur.com/gallery/Jp2qg7X

> Copilot uses one universal AI model, which means that every user is getting the same generic assistance based on an “average of GitHub”, regardless of the project they're working on.

If the model is big enough that seems to be an advantage, not a disadvantage. At least, when I use GPT-3 to create a 4chan greentext, or an X-Files script, or a presidential speech, or an article by Hunter S. Thompson (or a presidential speech by Hunter S. Thompson) it doesn't particularly feel like it's averaging things out.

I fail to see how in the screenshot you provided Copilot suggests code _in the middle_ of the line.

What I see is Copilot suggesting an entire line. Or does the suggestion start on the Err line with the "{" ?

On the highlighted line I typed `panic!("Error while ` and Copilot suggested finishing the line I started with `converting GeoJSON geometry to geometry: {}", e);`
Complete marketing PR. You should probably declare where you work when posting PR comments.
To be fair, Tabnine has an actual free plan for personal use. The GitHub Copilot one is only for open source maintainers and educational institutions.

Although in the past I found the suggestions offered with the Tabnine free plan too limited to be of much help.