Hacker News
by phil917 1106 days ago
I signed up for a Copilot trial a few weeks ago and so far my feelings on the tool are mixed.

When a code suggestion is correct it’s nice but it honestly feels like a very rare occurrence so far.

What often ends up happening is I will get suggested a snippet that is close to, but not exactly what I need. So if I accept the suggestion, I end up doing a fair amount of editing anyways to correct the suggestion.

Even worse, occasionally I will accept a suggestion and only later notice a subtle mistake.

The place I’ve seen the biggest benefit is with writing comments or user facing text copy.

I see people on Twitter and Hacker News talking about how much their productivity is being boosted by Copilot and ChatGPT…

But I’m just left scratching my head because I’m barely certain I’m seeing ANY boost from either of these tools so far. I feel like I’m missing something?

10 comments

> Even worse, occasionally I will accept a suggestion and only later notice a subtle “mistake”.

Same, I got burned one time by accepting a suggestion I thought I understood.

I think the general problem is that Copilot shifts you from writing code to reading code, and sometimes reading is harder. You can't really take a "yeah that seems right" attitude because it's just throwing guesses at you and seeing what sticks. The safe way to use it is as a jumping off point for writing your own code.

Do you use it for writing tests? It’s really helpful there. And in my experience it’s bad with non-standard algo heavy code. What language and domain are you using it for?

IMO it’s a nice 10-15% speed boost for web dev. And much more than that for unit tests. I expect you can measure a massive jump in test coverage by engineers that use copilot because it takes work that most people hate and speeds it up significantly.

I’ve heard many people say that but I have yet to try it with that particular task. I will give it a shot.
This has been my experience as well. I cannot understand how people are saying it has given them order of magnitude speed improvements in shipping.

Most code on github is by definition average, which is usually a few levels below the quality of an expert programmer. So it's not surprising the output of code LLMs is poor to average quality. For any larger block of code it outputs, a non-trivial amount of editing is required if a higher bar of quality is wanted.

Around half of developers write below average code, by definition. So it isn't surprising that raising the quality to average _for them_, and doing it faster, would be a productivity boon.

I'm a crappy marketing copy writer. A professional writer could do much better. But with chatGPT I too can write hack-quality marketing copy that roughly conveys a message and give it to others to put in various marketing outlets.

> Around half of developers write below average code, by definition.

By definition, half the code out there is of below median quality. Whether or not half the code out there is of quality below the arithmetic mean depends on assumptions about the distribution of code quality. I would suspect that much more than half the code out there is "below average", so to speak.
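To make the mean vs. median point concrete, here is a small sketch. The quality scores are made up purely for illustration (nobody has a real 0-100 scale for code quality); the only thing it shows is that when a few items score far above the rest, more than half of them land below the arithmetic mean while exactly half sit below the median.

```python
import statistics

# Hypothetical quality scores for ten pieces of code: mostly mediocre,
# with two excellent outliers. These numbers are invented for illustration.
scores = [20, 25, 30, 30, 35, 40, 45, 50, 95, 100]


def share_below(values, threshold):
    """Return the fraction of values strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)


median = statistics.median(scores)  # midpoint of the sorted scores
mean = statistics.mean(scores)      # arithmetic mean, pulled up by outliers

print("median:", median, "share below:", share_below(scores, median))
print("mean:  ", mean, "share below:", share_below(scores, mean))
```

With these invented scores, half of the items fall below the median and seven out of ten fall below the mean. A distribution skewed the other way would flip that, which is why the claim depends on what you assume about the distribution.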

Yep, there are assumptions all around. On the developer side, I also suspect there is a huge group of casual "low-code" type programmers out there that could be assisted by copilot type tools. My main point was that the low quality of the code produced by LLM type systems is not really a barrier to adoption, and such tools _might_ actually _raise_ the average quality of systems built in the wild. I'm less concerned with the actual distributions and more noting that the different relative value-point exists.
All pedantry aside, even if I'm skeptical and concerned at the moment, your hypothesis could well end up being true in the mid to long term.
> Even worse, occasionally I will accept a suggestion and only later notice a subtle mistake.

This is what hits me hard, as when I write it myself these things are more clear.

> I see people on Twitter and Hacker News talking about how much their productivity is being boosted by Copilot and ChatGPT…

I use ChatGPT quite a lot, but not really for coding. It does increase my productivity, but only because it better acts as a fuzzy search/query machine than Google. Often I use the two together (which leads to using Bard). I can remember the concept, but not the name, so GPT/Bard tells me the name, and then I can google from there. With this type of searching the hallucinations are not a big issue as they are only minor interruptions and are far outweighed just by the amount of searching I'd need to do to even find the thing I need in the first place. I should mention that I'm a researcher so I'm often looking for new tools, ways things have been done before, and frequently probing for things that are just outside my area of knowledge. But I should also say that I won't trust them to teach me a concept because when I've tested it on things I know, there are serious mistakes. Realistically the accuracy depends on how common/frequent the information is. If it is something that many people write about in a general field, it'll be accurate. If it is a topic, even popular, of a niche subfield, good luck.

Exactly the same for me. I will often get variable names autocompleted, but 90% of the time they're wrong, even when the right variables are declared sometimes in the same function or file I'm working on.

More than once has it shoved Python code into my Ruby file. Bash or YML files will often get comments added, but in the wrong style for the language (which, since I can never freakin' remember the style for every language, actually makes me less productive since I have to run the code, have it crash, and still go look up the language spec).

It keeps trying to autocomplete large hash manipulations, and it looks right but is subtly wrong in the syntax (missing commas or something), and then it's the same compile, crash, then stare at the code and have to make the same damn change for 50 lines of a hash when I should have just copied and pasted in the first place.

So far: net negative on my productivity. Plus, it sucks on shitty wifi like a plane or most hotels.

From what I understand (and I might be totally wrong), Copilot is useless for dealing with codebases where you often need to call internal functions, which makes it pretty much useless for anything but simple projects. Also, I just checked the FAQ for Copilot, and it says that users only accept 26% of suggestions, so 3/4 suggestions are garbage.
> Also, I just checked the FAQ for Copilot, and it says that users only accept 26% of suggestions, so 3/4 suggestions are garbage.

Without knowing the specific statistic being reported I’m not sure you can reach that conclusion. Copilot’s plug-ins can be configured to suggest as you type just like any other auto complete. It may just be that the user has paused and doesn’t need the suggestion being offered.

And it’s not like people use 100% of the suggestions their auto complete tools provide, but that doesn’t make those suggestions garbage.

Also 25% seems pretty good! If I could cut out writing 25% of the code I write per day, I'd consider that a good value, even if it cost me reviewing the other 3/4 of code snippets that were stinkers. Some of the latter might also help form in my head how the manually written code should be done anyway.

I haven't tried copilot yet, but chatgpt code generation is probably useful to me 1/3 of the time as a stack-overflow/reference-doc replacement. "I need a thing foo to perform bar in context baz" which I then will write myself using the pattern. But the lookup was way faster than googling tons of results.

    From what I understand (and I might be totally wrong), 
    Copilot is useless for dealing with codebases where 
    you often need to call internal functions
Perhaps this is the next big frontier or opportunity.

LLMs have finally gotten to the point where they are sometimes useful. And local project-level code intelligence (intellisense, static analysis, etc) has been pretty good for a while.

The first team to marry these two things is going to really have something great. A true force multiplier.

It certainly doesn't seem like it will be easy, but it also certainly doesn't seem to be impossible.

It definitely calls other functions in the same file. So some internal functions.

It's quite excellent even for large projects. It still saves a lot of mundane coding

I personally don’t use copilot. Sometimes I ask chatGPT for help on small, atomic functions. It has saved me lots of time, and the fact that I have to explicitly copy+paste makes me more aware of the code I’m potentially copying over
A good analogy is gps navigation. Imagine if instead of telling it where you want to go and having it generate a route with turn by turn directions, it reacted to every press of the accelerator or brake pedal and every turn of the steering wheel to guess what you wanted to do based on what others who braked or accelerated or changed lanes at the same point in the road did. It might be useful if you were completely new to the city and trying to go to the airport. It would guess that you're on this stretch of freeway to go to the airport and would JIT plot that you take exit 7 and get into the airport traffic. But if you're actually trying to go to a different location and just driving past the airport you'll have to reject that suggestion and then it would say, oh, you must be going to the arena, take exit 8!! And you'll have to reject that too. It's annoying as shit.

Oh, but it will learn that you're driving to work and will be able to prioritize that above the airport and the arena? Oh yay, so when I drive to someplace I drive to 5 times a week it can give me directions I don't need, but when I drive somewhere I haven't been before it can only tell me how to go to a bunch of places other people went on the way.

It's practically useless if your knowledge of your codebase and language is better than 0.

It's probably useful for languages and crap you don't use a lot or where you'd end up copypasting 90% of it anyway, like crappy webdev. (Good webdev is a different story)

I'd rather use it to explain what the fuck I was thinking when I wrote this shit 6 months ago, or what Nate was thinking when he seemingly fucked it all up (or maybe he fixed it). And maybe, maybe, playing the analogy back in reverse, it will get to where I tell it I'm trying to go somewhere specific and it gives me turn by turn directions, instead of arbitrarily suggesting shit that I drive past. Like I could say I'm trying to extend this to accept triangles and not just squares and it would say, oh use the visitor pattern and here are the places to change it, or something. But right now it would just flip its shit and start suggesting triangles, polygons, stars, circles or something like that. Do you want to reverse a linked list of triangles? No, clippy. I know better than you how to change this shitty code. Thanks anyway.
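For anyone who hasn't run into it, the "extend squares to triangles with the visitor pattern" change could look roughly like the sketch below. Everything in it (Square, Triangle, AreaVisitor, total_area) is a made-up name for illustration, not anything from a real codebase or from Copilot's output. The idea is that each shape hands itself to a visitor, so supporting a new shape means adding one class and one visit method rather than editing every caller.

```python
from dataclasses import dataclass


@dataclass
class Square:
    side: float

    def accept(self, visitor):
        # Double dispatch: the shape picks the matching visitor method.
        return visitor.visit_square(self)


@dataclass
class Triangle:
    base: float
    height: float

    def accept(self, visitor):
        return visitor.visit_triangle(self)


class AreaVisitor:
    """One operation (area) implemented for every supported shape."""

    def visit_square(self, square):
        return square.side ** 2

    def visit_triangle(self, triangle):
        return 0.5 * triangle.base * triangle.height


def total_area(shapes):
    """Sum the areas of a mixed list of shapes using a single visitor."""
    visitor = AreaVisitor()
    return sum(shape.accept(visitor) for shape in shapes)


print(total_area([Square(2.0), Triangle(3.0, 4.0)]))
```

The "places to change" in this sketch are the new Triangle class and the new visit_triangle method; total_area stays as it was.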

Same feeling. I use GPT sometimes to get a reminder of a usage of a certain command or part of what I am doing, but for anything beyond that it usually wastes my time.
We are still in an exploration phase where the use cases are in flux. If you find a set of use cases that work for you, then it will be a boon, but if not it is worthless. These tools aren't magic black boxes that do everything. On the one hand, that can be discouraging when you don't find immediate value. But those who keep trying different things will probably start to see some value.

Some things I've done this week:

- I was asked to write a white paper on a topic I had previously built a slide presentation and given a talk on. I exported the outline, grabbed a few links to blog posts on the broader topic, and asked ChatGPT (with link-crawling) to write the whitepaper. I didn't like the output at all. So I asked chatgpt for several options on the outline for the whitepaper, and several options for narrative. I then selected an overall narrative arc for the paper and the outline (with a few manual tweaks), and asked it to rewrite the paper with a revised prompt. It was pretty good and then I asked it to change the style and tone. I then asked it to critique itself and suggest several improvements, several of which I then asked it to take into account when revising the paper. Then I asked for several variations of the document for different audiences. I picked out a few different parts I liked from each variant and ran another pass asking for improvements, including suggestions for visual aids. This all only took about 10 minutes, after which I shared this first draft with a colleague for review. That would have taken me a whole afternoon (and likely longer since I procrastinate when starting a blank page).

- I wanted a chrome plugin to use the archive.ph api for gated papers without needing to trust that sketchy extensions might steal my web history. I asked chatgpt to generate the code. Upon review I made a couple tweaks and had a private extension in about 3 minutes.

- I wanted to produce a menu for guests staying with us this weekend. There were several dietary restrictions and preferences that needed to be accounted for. I entered these in, requesting a variety of options. I selected from those options, then asked for recipes and a suggested shopping list. I then asked for wine and beer pairings and got some good generic responses of varieties - but the brand-specific suggestions were bad. Time - 2 minutes

- I wanted some pro/con brainstorms on a new technology decision - I asked it to search for expert blogs and summarize their arguments. I also got it to generate hello-world level examples for each to compare and contrast. I then had it generate code for a more complex use case using the tool for each framework. I was then able to make a much more informed decision and start evaluating a smaller set of options in more depth (2 of the dozen+ that existed). Time - 20 minutes

All of these have a pattern of iteratively working with a tool to semi-automate what I want. For now it sort of hits a sweet spot of either summarizing research I would normally google or low-key generation of a document or code that doesn't need a ton of context outside what I researched or have on hand and can be self-contained.

Yeah that makes sense and it's why I'm forcing myself to try and use these tools. I feel like it can be a boon if I find the right use case in my workflow.