| HN Mirror

by bretpiatt 767 days ago

Did you try to use GPT 3.5? In our testing it isn't great. With GPT 4, or some of the specialized trained versions of GPT 4 (there's one with good reviews called Lisp Engineer), our experience has been different.

It is not replacing engineers, and it isn't something where you give it a broad set of requirements and it just goes and builds. It is helping increase productivity and getting folks through areas where they need to bounce ideas off of someone.

We're coding mostly in Python, C++, and .NET Core where I do expect it'll have a much deeper set of training data than it will for Lisp (and even for those languages we're getting marginally better performance from specialized engines than we are from general GPT 4).

The other, non-OpenAI coding AIs so far are all performing worse for us than GPT 4. We've done testing against LeetCode challenges and a bunch of other things.

2 comments

neonsunset 767 days ago

If only these LLMs were decent at C#. Unfortunately, they lean heavily towards very old data, call obsolete APIs, and write in a style that goes against what is generally considered idiomatic and terse.

For example, I once asked Claude 3 to port some simple XML parsing code from Go (because 10s to ask is faster than 60s to type by hand haha) and it produced this https://pastebin.com/3823LBiA while the correct answer is just this https://gist.github.com/neon-sunset/6ba67f23e58afdb80f6be868...

Functionally identical, but such cruft accumulates with every piece of functionality you ask it to implement. And this example is one where the output was at least coherent and did its job; many others are worse.

chrisjj 767 days ago

> Did you try to use GPT 3.5?

Yes.

And succeeded! :)

> We're coding mostly in Python, C++, and .NET Core where I do expect it'll have a much deeper set of training data than it will for Lisp

I can't imagine how malformed bracketing is due to insufficient training set.

And nor, it seems, can ChatGPT: "I've encountered numerous examples of Lisp code during my training, covering different applications and techniques within the language. Whether it's simple examples or more complex implementations, I've seen quite a bit of Lisp code."

lolinder 767 days ago

You tried the very first iteration of an LLM-based chat assistant, were unsatisfied with it because it couldn't match Lisp parentheses, and went on to form an opinion about the value of these tools and implicitly the intelligence of the people who use them. That speaks more to your preconceptions than it does to the state of better tools like Copilot or GPT-4.

You didn't label it (which, btw, is a faux pas), but it's obvious from your replies that this wasn't an Ask HN, it was a Tell HN. You have absolutely no interest in what the rest of us have to say.

Nevertheless, I'll try once more for luck: Basing your opinions about LLMs on your experience with GPT-3.5 is a mistake. If you don't want to use LLMs at all because you have preconceptions, that's fine, but don't pretend that you've sampled LLMs and found them lacking for professional coding when you haven't tried the professional tools.

chrisjj 766 days ago

> You tried the very first iteration of an LLM-based chat assistant

Er, V3.5 is "the very first iteration"?

> don't pretend that you've sampled LLMs and found them lacking for professional coding when you haven't tried the professional tools.

I think you misread my post. I didn't mention professional.

And my post wasn't about a "sample of LLMs". It was about this one in particular.

lolinder 766 days ago

> Er, V3.5 is "the very first iteration"?

Yes. ChatGPT-3.5 was the very first LLM-based chat assistant, announced on Nov 30, 2022 [0]. It hasn't gotten better since then, just more censored and faster.

It followed GPT-1 (which was only interesting to people who were already in the know), GPT-2 (which was neat but widely recognized as pretty useless and again, not something normal people noticed) and GPT-3 (which was cool, but didn't provide a chat interface, it could only complete texts, so it made a decent base for the early versions of Copilot).

[0] "ChatGPT is fine-tuned from a model in the GPT-3.5 series, which finished training in early 2022." https://openai.com/index/chatgpt/

fragmede 766 days ago

damn, I wish I could give more than one upvote
