by arbol 1233 days ago
I use GitHub Copilot to create simple functions by typing the function name and letting Copilot do the rest. This works reasonably well for basic stuff. It becomes unhelpful and starts creating incorrect paths to library functions as soon as you add your work context.
3 comments

This doesn't address the parent's excellent question: how do these models continually get trained and updated if they put their key sources of training data out of business?

Related: I'd be quite worried if I was a Q/A site like StackOverflow or Quora.

> This doesn't address the parent's excellent question

That's a large part of HN in a nutshell.

IMHO, Quora users don't use it for finding answers but rather for reading personal experiences. I used to write on Quora fairly often many years ago. But something happened, and in my subjective experience the platform became far less interesting and useful. A significant share of questions turned into barely hidden shills for businesses or products, right answers seldom become visible, and the overall quality of content went down drastically. Maybe I'm not representative, but from my point of view the readers and writers on Quora in 2023 won't have any different experience when there are smart machines that give the right answers.
It's not excellent. It's a red herring. Google is already showing responses on the SERP without you having to go to SO, without AI, and the hypothetical scenario hasn't happened.
That is in fact the elephant in the room: you will most likely never be able to actually use it for complex stuff because of the token limits required for AI.

Imagine analyzing a massive code base: sure, it can tell you how you were solving function X by translating it to natural language, but it still does not understand any of it.

As far as I know, training it on your dataset will not improve this.

Increasing the token limit is a solvable problem
Sure, we just need next-level supercomputers for these large models and the patience to wait multiple days for output.
Not necessarily - you just need hierarchical abstraction memory. I reckon my "token" limit when analysing code is around 7.
Increasing the token limit without needing more resources to run the network is a solvable problem
But surely you do see the problem with a codebase, right?
The current token limit comes from an O(N^2) memory requirement for N tokens. There is research trying to reduce this towards O(N), for example as the (downvoted) sibling comment suggests. This is not exactly straightforward, but it is not impossible either. It's not a fundamental limitation of language models going forward.
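To put rough numbers on the O(N^2) vs O(N) point, here is a back-of-the-envelope sketch. It assumes one dense N x N attention score matrix per head per layer, stored at 2 bytes per value, and compares it with a hypothetical linear-memory scheme that keeps a fixed-width state per token. The function names and the width of 64 are made up for illustration and don't describe any particular model.

```python
def attention_matrix_bytes(n_tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes for one dense N x N attention score matrix (one head, one layer)."""
    return n_tokens * n_tokens * bytes_per_value


def linear_state_bytes(n_tokens: int, width: int = 64, bytes_per_value: int = 2) -> int:
    """Bytes for a hypothetical O(N) scheme: a fixed-width state per token."""
    return n_tokens * width * bytes_per_value


if __name__ == "__main__":
    # Doubling N quadruples the quadratic cost but only doubles the linear one.
    for n in (2_048, 4_096, 8_192, 16_384):
        quad_mib = attention_matrix_bytes(n) / 2**20
        lin_mib = linear_state_bytes(n) / 2**20
        print(f"N={n:>6}: quadratic {quad_mib:8.1f} MiB, linear {lin_mib:6.2f} MiB")
```

Real implementations differ a lot in how much of that matrix they actually materialize, so treat this as an illustration of the scaling only, not as a memory estimate for any actual system.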
That's not my experience at all. I find Copilot the best at understanding and making sense of my work context, spanning many files. I find it less useful for creating generic functions.