| HN Mirror


by soulofmischief 650 days ago

> Not on HN. The custom is to use > paragraph quotes like you did. However, I will keep that in mind.

Hacker News is not some strange place where the normal rules of discourse don't apply. I assume you are familiar with the function of quotation marks.

> If we're both grading a single student (LLM) in the same field (programming), and you find it great and I find it disappointing, it means one of us is scoring it wrong.

No, it means we have different criteria and general capability for evaluating the LLM. There are plenty of standard criteria which LLMs are pitted against, and we have seen continued improvement since their inception.

> It can't consistently write good doc comments. It does not understand the code or its purpose, but roughly guesses the shape.

Writing good documentation is certainly a challenging task. Experience has led me to understand where current LLMs typically do and don't succeed with writing tests and documentation. Generally, the more organized and straightforward the code, the better. The smaller each module is, the higher the likelihood of a good first pass. And then you can fix deficiencies in a second, manual pass. If done right, it's generally faster than not making use of LLMs for typical workflows. Accuracy also goes down for more niche subject material. All tools have limitations, and understanding them is crucial to using them effectively.

> It can't read and understand specifications, or even generate something as simple as a useful API for them.

Actually, I do this all the time and it works great. Keep practicing!

In general, the stochastic parrot argument is oft-repeated but fails to recognize the general capabilities of machine learning. We're not talking about basic Markov chains, here. There are literally academic benchmarks against which transformers have blown away all initial expectations, and they continue to incrementally improve. Getting caught up criticizing the crudeness of a new, revolutionary tool is definitely my idea of unimaginative.

1 comments

Ygg2 650 days ago

> Hacker News is not some strange place where the normal rules of discourse don't apply. I assume you are familiar with the function of quotation marks.

Language is all about context. I wasn't trying to be deceitful. And on HN I've never seen anyone using quotation marks to quote people.

> Writing good documentation is certainly a challenging task.

Doctests aren't the same as writing documentation. Doctests are the simplest form of documentation: given a function named so-and-so, write an API doc plus an example. It could not even write an example that passed a syntax check.

> Actually, I do this all the time and it works great. Keep practicing!

Then you haven't given it interesting/complex enough problems.

Also this isn't about practice. It's about its capabilities.

> In general, the stochastic parrot argument is oft-repeated but fails to recognize the general capabilities of machine learning.

I asked it to write a YAML parser given the yaml.org spec, and it wrote the following enum:

   enum Yaml {
      Scalar(String),
      List(Vec<Box<Yaml>>),
      Map(HashMap<String, Box<Yaml>>),
   }

This is the stochastic parrot in action. Why? Because it tried to pass off a JSON-like structure as YAML.
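For readers unfamiliar with the distinction: the YAML spec allows things JSON does not, such as mapping keys that are themselves sequences or mappings, anchors and aliases (`&a` / `*a`), and explicit tags (`!!str`). Below is a rough sketch of a node type with room for those. Every name in it (`Node`, `Alias`, `Tagged`, `complex_key_example`) is hypothetical and chosen for illustration; it is not the output of any model or the API of any YAML crate, and it leaves out floats, comments, and multi-document streams.

```rust
use std::collections::BTreeMap;

/// Hypothetical YAML node type, for illustration only.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Node {
    Null,
    Bool(bool),
    Int(i64),
    Str(String),
    Seq(Vec<Node>),
    /// Keys are full nodes, not just strings (YAML allows complex keys).
    Map(BTreeMap<Node, Node>),
    /// Reference to an anchored node, e.g. `*base`.
    Alias(String),
    /// Explicitly tagged node, e.g. `!!str 123`.
    Tagged(String, Box<Node>),
}

/// Builds the in-memory form of the YAML document
///
///     ? [a, b]
///     : 1
///
/// whose single key is a sequence, something a string-keyed map cannot hold.
fn complex_key_example() -> Node {
    let key = Node::Seq(vec![Node::Str("a".to_string()), Node::Str("b".to_string())]);
    let mut map = BTreeMap::new();
    map.insert(key, Node::Int(1));
    Node::Map(map)
}
```

The `HashMap<String, Box<Yaml>>` variant quoted above has nowhere to put the sequence key, an alias, or a tag, which is the gap being complained about.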

Whatever LLMs are, they aren't intelligent. Or they have the attention span of a fruit fly and can't figure out basic differences.


williamcotton 650 days ago

That’s not a good prompt, my friend!


soulofmischief 650 days ago

> Language is all about context. I wasn't trying to be deceitful. And on HN I've never seen anyone using quotation marks to quote people.

It's still unclear how this apparent lack of knowledge of basic writing mechanics would justify your use of quotation marks to attempt a straw man argument wherein you deliberately attempted to convince me that OP said something completely different.

> Doctests aren't the same as writing documentation. Doctests are the simplest form of documentation: given a function named so-and-so, write an API doc plus an example. It could not even write an example that passed a syntax check.

That truly sounds like a skill issue. This no-true-Scotsman angle is silly. I said documentation and tests; I don't know how you got "doctests" out of that. I said "documentation", and "tests". I didn't say "the simplest form of documentation"; that is another straw man on your part.

> Then you haven't given it interesting/complex enough problems.

Wow, the arrogance. There is absolutely nothing to justify this assumption. It's exceedingly likely that you yourself aren't capable of interacting meaningfully with LLMs for one reason or another, not that I haven't considered interesting or complex problems. I bring some extraordinarily difficult cross-domain problems to these tools and end up satisfied with the results far more often than not.

My argument is literally that cutting-edge LLMs excel with complex problems, and they do in many cases in the right hands. It's unfortunate if you can't find these problems "interesting" enough, but that hasn't stopped me from getting good enough results to justify using an LLM during research and development.

> Also this isn't about practice. It's about its capabilities.

Unfortunately, this discourse has made it clear that you do need considerable practice, because you seem to get bad results, and you're more interested in defending those bad results even if it means insulting others, instead of just considering that you might not quite be skilled enough.

> This is the stochastic parrot in action. Why? Because it tried to pass off a JSON-like structure as YAML.

That proves its stochasticity, but it doesn't prove it is a "stochastic parrot". As long as you lack the capability to realistically assess these models, it's no wonder that you've had such bad experiences. You didn't even bother clarifying which LLM you used, nor did you mention any parameters of your experiment or even if you attempted multiple trials with different LLMs or prompts. You failed to follow the scientific method and so it's no surprise that you got subpar results.

> Whatever LLMs are, they aren't intelligent.

You have demonstrated throughout this discussion that you aren't capable of assessing machine intelligence. If you learned how to be more open-minded and took the time to learn more about these new technologies, instead of complaining about contemporary shortcomings and bashing those who do benefit from the technologies, it would likely open many doors for you.


Ygg2 650 days ago

> That truly sounds like a skill issue. This no-true-Scotsman angle is silly. I said documentation and tests; I don't know how you got "doctests" out of that. I said "documentation", and "tests". I didn't say "the simplest form of documentation"; that is another straw man on your part.

What are you on about? A doctest is the simplest form of documentation and test. I.e. you don't have to write an in-depth test, you just need to understand what the function does. I expect even juniors can write a doctest that passes the compiler check. Not a good doctest, not even a passing one: a COMPILING one. Its rate of writing a passing one was even worse.
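For anyone unfamiliar with the term: assuming Rust (which the enum quoted upthread suggests), a doctest is a code example inside a doc comment that `cargo test` compiles and runs for a library crate. A minimal sketch follows; `add` and the crate name `demo` are made up for illustration, and the example uses `~~~` fences where most Rust code uses backticks, purely so it can be shown inside a fenced block here.

```rust
/// Adds two integers.
///
/// # Examples
///
/// ~~~
/// use demo::add; // `demo` is a placeholder crate name
/// assert_eq!(add(2, 3), 5);
/// ~~~
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
```

"Compiling" here means the three lines inside the doc comment type-check against the real signature of `add`; "passing" additionally means the assertion holds at run time.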

> Wow, the arrogance. There is absolutely nothing to justify this assumption.

Ok, then. Prove it: what exactly were the hard problems you gave it?

I gave my examples. I noticed it fails at complex tasks like writing a YAML parser in an unknown language.

I noticed when confronted with anything harder than writing pure boilerplate, it fails. E.g. it would fail 10% of the time.

> Unfortunately, this discourse has made it clear that you do need considerable practice

You can practice with a stochastic parrot all you want, it won't make it an Einstein. Programming is all about converting requirements to math, and LLMs aren't good at it. Do I need to link examples, like failing at basic calculation or counting the 'r's in the word 'strawberries'?

The best you can do is halve the error rate, but that follows a power law. You need to double the energy to halve the error rate. So unless you intend to boil the surface of the Earth to get it to be decent at programming, I don't think it's going to change anytime soon.
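To put numbers on that claim: if error falls as a power law in energy, error ∝ energy^(-alpha), then cutting the error by a factor k takes k^(1/alpha) times the energy. Doubling the energy to halve the error is the special case alpha = 1. The thread gives no measured exponent, so the values used here are assumptions for illustration, and `energy_multiplier` is a made-up helper.

```rust
/// Energy multiplier needed to cut the error rate by `factor`,
/// assuming error is proportional to energy^(-alpha).
/// Both the model and the parameter names are illustrative assumptions.
fn energy_multiplier(factor: f64, alpha: f64) -> f64 {
    factor.powf(1.0 / alpha)
}
```

Under this model, `energy_multiplier(2.0, 1.0)` is about 2 (the "double the energy" case), while a flatter curve with alpha = 0.5 would need about 4 times the energy for the same halving.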

> You have demonstrated throughout this discussion that you aren't capable of assessing machine intelligence.

Pure ad hominem. You've demonstrated nothing outside your ""Trust me bro, it's not a bubble"" and ""You're wrong"". I'm using double double quotes so you don't assume I'm quoting you.

> You didn't even bother clarifying which LLM you used.

For the YAML parser, I used ChatGPT-4o at my friend's place. For the rest of the tasks I used JetBrains AI Assistant, which is a mix of GPT-4, GPT-4o and GPT-3.
