by Vanclief 1187 days ago
> LLMs aren’t just the biggest change since social, mobile, or cloud–they’re the biggest thing since the World Wide Web. And on the coding front, they’re the biggest thing since IDEs and Stack Overflow, and may well eclipse them both.

I personally feel the technology is over-hyped. Sure, the ability of LLMs to generate "decent" code from a prompt is pretty impressive, but I don't think they are bigger than Stack Overflow or IDEs.

So far my experience is that ChatGPT is great for generating code in languages I am not proficient in, or when I don't remember how to do something and need a quick fix. So in a way it feels like a better "Google", but I would still rank it as inferior to Stack Overflow.

I am also hesitant about the statement that it makes us 5 times as productive because we only need to "check the code is good" for two main reasons:

1. It is my belief that if you are proficient enough in the task at hand, it is actually a distraction to be checking "someone else's code" over just writing it yourself. When I write the code, I know it by heart and I know what it does (or is supposed to do). At least for me, having to create prompts and then review the code it generates is slower and takes me out of the flow. It is also more exhausting than just writing the thing myself.

2. I am only able to check the correctness of the code if I am proficient enough as a programmer (and possibly in the language as well). To become proficient I need to write a lot of code, but the more I use LLMs, the fewer repetitions I get in. So in a way it feels like LLMs are going to make you a "worse" programmer by doing the work for you.

Does anyone feel that way? Maybe I am wrong and the technology hasn't really clicked for me yet.

31 comments

I don't quite know how to put it, what follows is a rough draft of an idea, maybe someone can help me to reword it, or perhaps it's trash.

Since its inception, computer science has had two "camps": those who believe CS is engineering, and those who believe CS is mathematics. The reason why we are seeing all of this fuss around LLMs is that they are a new front of this feud. This "extends" the usual debate on emerging technologies between Thymoetes and Laocoon.

Something that works 99 times out of 100 is 99% correct from the first perspective and 100% wrong from the second.

LLMs are therefore a step forward if you take the first view, a step back if you take the second.

If you accept this interpretation, an interesting consequence of it is that your outlook on LLMs is entirely dependent on what amounts to your aesthetic judgement.

And it's very hard not to have rather strong aesthetic judgements on what we do 40 hours a week.

Camp 3: Those of us who have viewed coding as a craft.

Math - the study of well defined concepts and their relationships. Solving problems with proofs.

Engineering - solving well characterized problems based on math and physics (which can include materials with known properties, chemistry, approximations, models, …), and well defined areas of composability (circuits, chemical processes, structural design, …)

Craft - solving incompletely characterized problems with math, physics, engineering and enormous amounts of experience, intuition, heuristics, wisdom, patterns, guesses, poorly understood third party modules, partial solutions pulled from random web sites …

Art - Solving subjective problems by any means necessary.

> Engineering - solving well characterized problems based on math and physics (which can include materials with known properties, chemistry, approximations, models, …), and well defined areas of composability (circuits, chemical processes, structural design, …)

Eh, I think you’re overselling how precise and well defined engineering is in other fields. Engineering in other fields is just as much dealing with poorly characterised problems as it is when writing code (it takes quite a lot of characterisation to go from “we want a bridge here”, to an actual damn bridge, and that’s all an engineer’s work).

Really the core of engineering is just a very broad set of practices and principles that allows people to solve poorly characterised problems using maths, physics, enormous amounts of experience, intuition, heuristics, wisdom, patterns, educated guess etc in a reasonably consistent and repeatable manner. Doesn’t matter if you’re building a web browser, a motherboard, or a bridge. You don’t get a good result without a healthy dollop of wisdom, experience, educated guesses, and a handful of fuckups (which hopefully you notice before you let people use the thing).

Engineering in other disciplines is no less messy, haphazard, and experimental than it is in software. It just isn’t as publicly documented as it is software, probably because it’s hard to build an open source bridge.

Well, most engineering fields have well defined applied math that spans from the problem domain to the solution domain.

Logic in digital circuits.

Algebra and calculus for analog circuits, most physical objects, properties and processes.

Differential equations for dynamical systems and dynamical behaviors.

Sure there is a lot of creativity in engineering, but there is usually a whole area of math known to be suitable for expressing solutions clearly, given the area of engineering.

Contrast with the utter lack of standard notation across software tools and implementations, for describing all the trade offs, gotchas, glue, historical drift & complexity, theories of memory, caching, user affordances, potential overflows, races, etc. that is implied by a program’s code.

Sometimes a language provides islands of engineered code, like message passing in Erlang, or memory management in Rust, or a precise mathematical library like BLAS.

But most aspects in most software programs are created ad hoc, or inherited from someone else’s rats nest of an implementation, and never formalized completely, if at all!

Any clarity in representation quickly leaves planet applied math.

> Sure there is a lot of creativity in engineering, but there is usually a whole area of math known to be suitable for expressing solutions clearly, given the area of engineering.

Take it from someone who’s studied the maths you’ve described and applied it in a professional capacity. Just knowing the maths isn’t anywhere near enough. It’s like knowing how quick-sort works, interesting and useful, not even close to enough to actually build anything.

> Contrast with the utter lack of standard notation across software tools and implementations, for describing all the trade offs, gotchas, glue, historical drift & complexity, theories of memory, caching, user affordances, potential overflows, races, etc. that is implied by a program’s code.

The entire field of computer science is dedicated to developing and using this type of notation to describe and understand the basic principles that underpin every programming language, database and algorithm you’ve ever touched. The notation exists, you just don’t use it. In the same way an engineer in another field doesn’t bother doing structural analysis, or circuit analysis from first principles, they just grab a pre-finished tool and apply it to their problem. Normally by just passing the problem to a computer that does all the heavy number crunching, and checking the outputs make sense.

> But most aspects in most software programs are created ad hoc, or inherited from someone else’s rats nest of an implementation, and never formalized completely, if at all!
>
> Any clarity in representation quickly leaves planet applied math.

Again, I think you’re vastly overestimating the precision of other fields of engineering. There might be more rigour in design in some places, but only because “just making a thing and see if it works” is expensive, but other engineers absolutely spend huge amounts of time just making things to see if stuff works.

There’s a reason why “safety factors” are a thing, and a reason why they’re usually 10x or greater. That safety factor is also the “we’re not sure how well this works, so we made it ten times stronger than we think we need, just in case” factor. Engineering in other fields isn’t people doing maths all day long; it’s mostly reading data sheets, assembling Lego brick components, and hoping to hell the manufacturer didn’t lie too much on their data sheet. Plus some design and simulation on a computer for good measure.

You wanna see “ad-hoc or inherited from someone else’s rats nest of an implementation” in a different field of engineering? Then go look at any electronics catalog, or look up YouTube videos of people testing parts against the spec sheet and discovering how wildly different they can sometimes be.

Dodgy, badly implemented, never formalised engineering exists everywhere. That’s why bridges collapse when they shouldn’t (Genoa Bridge), why planes crash when they shouldn’t (Boeing 737 Max), why cars emit more emissions than they should (VW), why buildings get emergency modifications after being built to prevent them from being blown over (601 Lexington Avenue). Software engineering does not have a monopoly on botches, last minute hacks, and dodgy workarounds. Engineers in other fields were merrily employing all of them to great effect for centuries before software turned up.

All four are prevalent in our industry.

I recall a self taught dev (or maybe from a bootcamp) coming up with a cascade of nested if-else, nested 8 deep. Someone with a background in CS asked him what he was trying to do and basically concluded that what he was trying to do could be expressed as a state machine. To which the initial dev replied that it was "way too fancy" and that he didn't need the code to be fancy, just work.
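For anyone who hasn't seen one, here is a minimal sketch of what "expressed as a state machine" could look like. The order workflow, the state names, and the events are all hypothetical (the anecdote doesn't say what the code did); the only point is that a transition table can stand in for deeply nested branching.

```python
# Hypothetical workflow: every state and event name here is invented.
TRANSITIONS = {
    ("new", "pay"): "paid",
    ("new", "cancel"): "cancelled",
    ("paid", "ship"): "shipped",
    ("paid", "refund"): "cancelled",
    ("shipped", "deliver"): "delivered",
}


def next_state(state: str, event: str) -> str:
    """Look up the next state; reject transitions the table does not list."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state!r}"
        ) from None


def run(events, state="new"):
    """Feed a sequence of events through the machine; return the final state."""
    for event in events:
        state = next_state(state, event)
    return state
```

Adding a new rule means adding one row to the table rather than another level of `if`/`else`, which is roughly the maintainability argument made further down the thread.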

+9000 for your post.

The ugly part: The nested-8-deep solution was faster to market and costs less. And, it will be thrown away in 6 months during the "big rewrite after we scale". So the perfect-is-the-enemy-of-good state machine solution written by an expensive engineer has less value. Oompf.

Possibly. Or not. It could be that the elegant expression can be written in less time by the expensive engineer, doesn't need to be rewritten in 6 months, and it may be that both of these people are getting paid the same anyway. So you can make up any "just so" story that you like about a made-up anecdote. If programs A and B implement the same function and both have adequate performance, then the differences between the two artifacts come down to style.
> If programs A and B implement the same function and both have adequate performance, then the differences between the two artifacts come down to style.

What about maintainability? Extensibility and ease of debugging?

I've seen chunks of projects re-written, just because it was simply impossible to extend them without significant efforts!

I have a camp 4: viewing coding as a roadblock. A necessary obstacle to achieving some result. This is how companies view programming. They don't know about it, don't care about it, they just know they have to do a lot of it to produce their next product.
But what you are saying is that all economic activity is a roadblock.

The division of labor into software developers and non software developers isn't any different than farmers vs non farmers or any other profession.

Tackling roadblocks as they come up in service, creating a better product … that sounds a lot to me like engineering except with some words twisted around?
Coding is clearly all those things.
I cringe every time I see someone use the term "craft" in the context of computer programming. It feels like a desperate attempt to seek adoration as a "sub-genius" in a poorly understood field. Computer programming is hard because the platform is either poorly documented or changes very quickly. Too much of my professional work is about finding a "super hero" (their term, not mine) solution to a poorly documented problem. It is tiring, and not-at-all heroic. Coding bootcamps have taught us that when you tear down the gatekeeper walls, more people can write CRUD apps (that the world actually needs) than ever thought before.

As a counterpoint: Look at the history of the libxcb: https://en.wikipedia.org/wiki/XCB

    [Bart] Massey and others have worked to prove key portions of XCB formally correct using Z notation.
That sounds like math to me. Or is it "craft"?
> poorly documented or changes quickly

[otherwise a (code) monkey could do it] is missing the point of programming.

I know managers who code (occasionally) who think similarly. I thought so personally before I had actual prolonged experience with professional programming.

It is hard to express it concisely: why it is fundamentally wrong (category error: like thinking that perl regexes can be reduced to DFA--it is impossible in the general case even if DFAs can [sometimes even should] be used in many cases instead).
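To make the regex aside concrete, here is a small sketch using Python's `re` module (an assumption on my part: it is not Perl, but it supports the same backreference feature). The pattern below accepts strings of the form a^n b a^n, which is not a regular language, so no DFA can recognise it in general. The function name is made up for illustration.

```python
import re

# \1 must repeat exactly what group 1 matched, so the pattern accepts
# a^n b a^n for n >= 1. That language is not regular, hence no DFA exists.
SAME_RUN = re.compile(r"(a+)b\1")


def same_run_both_sides(s: str) -> bool:
    """True if s is a run of a's, one 'b', then the same number of a's."""
    return SAME_RUN.fullmatch(s) is not None
```

A pattern like `a+ba+` (no backreference) is regular and could be compiled to a DFA, which is the "can sometimes be used instead" half of the point.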

It is the same reason why waterfall programming fails most of the time. It is the same reason why generating code from UML diagrams produced by analysts is also a failure in the general case. It is the same reason why log normal distribution can be a good model for software estimation https://news.ycombinator.com/item?id=26393332
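A rough sketch of why a log-normal model matters for estimates, assuming task duration really is log-normally distributed (that is the modelling assumption, not an established fact about any project). The parameter values are invented. The median is exp(mu) and the mean is exp(mu + sigma^2 / 2), so the average outcome is always worse than the "typical" one.

```python
import math


def lognormal_median(mu: float, sigma: float) -> float:
    """Median of a log-normal distribution: half of all tasks finish sooner."""
    return math.exp(mu)


def lognormal_mean(mu: float, sigma: float) -> float:
    """Mean of a log-normal distribution: pulled upward by the long right tail."""
    return math.exp(mu + sigma ** 2 / 2)


# Invented numbers: a task whose "typical" (median) duration is 10 days.
mu, sigma = math.log(10), 1.0
typical = lognormal_median(mu, sigma)
average = lognormal_mean(mu, sigma)
overrun_ratio = average / typical  # equals exp(sigma**2 / 2), independent of mu
```

If you estimate by asking "how long does this usually take?", you are estimating the median, and under this model the long tail makes the average noticeably larger.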

And no, you can't replace all programmers with a LLM prompt for the same reason (at least until [if ever] it reaches AGI and then humanity would have much bigger problems).

"agile" became a noun but if you look at the origins, you might get why "craft" may be applied to programming. Try "The Pragmatic Programmer" book.

We can agree that coding can involve math, engineering and craft. (And art!)

Mathy projects, formally driven: matrix multiply libraries, symbolic computation, constraint resolution, ...

Engineered projects, formally (or close to it) verifiable: 3D rendering pipeline, distributed database management, garbage collection process, ...

And craft. Which, based on the internet I have experienced, many apps unjustly inflicted upon me, and some memorable restarts between game saves, is most code.

Craft code necessarily involves amateur code, code which isn't economically worth engineering (when you can just throw unit tests at it. Or just wait for user reports!), code referencing weakly characterized libraries or interfaces, and code involving features that have become complex enough that the best reference model for its expected and unexpected behaviors is now itself.

Bzilion's Law of Coding Formalism Levels: "Any ambitious enough software project will descend into an exercise of pure desperate craft. Just before it becomes gambling."

I think this comment may miss the computer forest for the computer science trees. For a large portion of the world, computers aren’t engineering or math, computers are a tool to get something else done.

For those people, unless something fit within an existing (but large!) range of use cases, they were out of luck without having an engineer or mathematician figure it out for them. Suddenly, there is a glimmer on the horizon that all of that possibility the computer science people see every day could be unlocked for the users, and even if it only works 5% of the time, that is enough to get them excited in ways that are hard to describe to the computer science people.

This is a fantastic point, and it's what most software businesses have at their core. They just provide the tools to get something else done. A lot of these smaller places are going to be devastated when people become far more self reliant (or I should say reliant on the AI providers) than them.
My analysis is limited to tech people.

For the rest of the world, while some might be excited by what you describe (and if that works for them, that's great!), I believe in general the interpretation is far simpler: me like shiny.

This is some very CS high-horse thinking. I work with people who are already using it in ways that meaningfully improve their existing workflows. It isn’t doing anything special to someone who makes a living on computers, but it is doing things they couldn’t do without those people.
Genuine question, are you seeing non-coders using it to do any useful coding? As a developer, I find that it is wrong more often than it is right and were it not for my domain knowledge, I would have no idea why (or sometimes when.)
/r/iamverysmart
A “tool to get things done” doesn’t seem to contradict the math or engineering point of view. Which is to say, a screwdriver is an engineered device that is also a tool (it also has a mathematical description, just a fairly boring one from a pure math point of view, I guess).
Sure, but the relationship is different.

Imagine going to school, a boot camp, or being self taught in everything about screwdrivers and screws. You can discuss at length the advantages and disadvantages of different shapes (Robertson bits > all), materials, screw threads, etc. You can custom design a screwdriver and screw for a specific application, taking into account all of the relevant constraints.

Now imagine the guy who needs to tighten a loose cabinet door.

Screwdrivers don’t have nearly the complexity or ability to generate work leverage that computers do, moving even a few percent of those capabilities from the first group to the second is huge. It is, at minimum, Excel huge.

That's a great analogy which I will steal.
This nails it on the head pretty much for me. I'm personally hugely interested in the potential of LLMs to enable me, a non coder, to create programs that might only have marginal utility to others, so are likely not going to get built by anyone who actually knows how to do this stuff, and aren't exactly important enough for me to actually learn how to code (I don't really have the right type of brain for it anyway) but are interesting / useful enough to me to figure out how to get LLMs to make them for me, as I don't really care how they work as long as they do.
A personalized sociopathic fabulist for all!
> the usual debate on emerging technologies between Thymoetes and Laocoon.

Could you expand on this?

> Something that works 99 times out of 100 is 99% correct from the first perspective and 100% wrong from the second

Interesting. From a _manufacturing_ perspective, you can't achieve 100%, you can only get asymptotically closer to it with statistical process control. And of course there are limits to the perfectibility of humans.

This suggests that the big deployment of AI will be in areas where there is no clear boundary between right and wrong answer.

>> the usual debate on emerging technologies between Thymoetes and Laocoon.

> Could you expand on this?

Not important, it's just a rhetorical flourish. In the second book of the Aeneid, Thymoetes is the guy who says (paraphrasing) "let's bring the horse inside" and Laocoon is the guy who says (literally) "beware of Greeks bearing gifts".

> This suggests that the big deployment of AI will be in areas where there is no clear boundary between right and wrong answer.

"AI" is an umbrella term at this point. If by AI we mean LLMs or similar technology, then my hunch is to agree with the statement. I don't think this is particularly controversial though, IIRC Yann LeCun said something similar.

"Timeo Danaos et Dona Ferentes"

Roughly, "I fear Greeks even when they come bearing gifts".

There's never been a camp of computer science that said anything but the truth, which is that CS is applied mathematics.

However, there is a pragmatic school of hacking, which says that results are all that matters. If you're in a startup, you should be pragmatic, and worse is better.

Nobody truly believes that CS is engineering.

Software Engineering is engineering
Software engineering is actually not engineering either.
TIL I need to tell my school to change my degree on diploma

What reasoning are you using to come to the conclusion that software is not engineering?

>The creative application of scientific principles to design or develop structures, machines, apparatus, or manufacturing processes, or works utilizing them singly or in combination; or to construct or operate the same with full cognizance of their design; or to forecast their behavior under specific operating conditions; all as respects an intended function, economics of operation and safety to life and property

It is purely software engineering.

Because engineering is a specific discipline that balances physical force, the nature of materials, and costs, to produce a physical thing (a building, a bridge, a sewer system, a reservoir...). "Software engineering" is a metaphor for the body of knowledge and ability to design and construct software systems. Unlike engineering, there aren't that many right answers.

Whenever you have to qualify a noun with something else, the result is something narrower than the original noun, and often completely different:

- Software Engineering is not Engineering

- Street Justice is not Justice

- Covert Intelligence is not Intelligence

You can just read what Dijkstra said about it already 30 years ago, not much has changed, and I would have little to add.
If there isn't a possibility of someone dying if you make a mistake it isn't engineering.
The software people who work on embedded systems like medical devices, speakers, sensors, electronics etc. are doing engineering all the time. Although I guess it's mostly EEs or computer engineers who do that kind of work, but many concepts of CS are still relevant there.
Engineering is just applied math too.
That is true of real engineering; unfortunately "software engineering" has been defined differently by some authors, with quite the opposite meaning.
I find your observation about the two camps in computer science quite compelling, and it got me thinking about another analogy that might further illuminate the LLM debate: the evolution of cities.

Urban development can be seen as a balance between careful planning (akin to the mathematics camp) and organic growth (resembling the engineering camp). A city designed with a focus on aesthetics and theoretical frameworks might be visually appealing, but it could lack adaptability. On the other hand, a city that grows organically may not be as cohesive, but it's more practical and responsive to its inhabitants' needs.

This parallel can help us better understand the emergent properties of LLMs, which arise from their complex interactions. By appreciating both the engineering and mathematics perspectives, we can gain a more comprehensive understanding of these properties.

Moreover, the balance between early adoption and risks, as seen in urban development, can also apply to LLMs. Early adopters of LLMs can tap into their potential, but they must also be aware of potential risks, such as biases and ethical concerns.

Oh yeah ChatGPT wrote this answer.

Good reminder of how vacuous even thoughtful-sounding writing can be.
Well, the interesting question is whether LLMs will enable programming to go back to being interesting.

So much of programming is rote boilerplate garbage simply linking things together and so little of it is actual creative thought. If LLMs could actually generate the rote boilerplate, programming would be soooo much better.

Alas, my optimism isn't that high.

The issue I see with attempting to claim that these are merely differences of opinion is that it only takes a single bug in your code for someone well-versed in exploitation to not just steal all your data but often replace your entire program with their own evil one. I spend quite a lot of my outreach efforts essentially having to explain to the people who think software development is somehow unrelated to math that once you accept a bug into your codebase the effects tend to be as non-local as accepting "1+1=3" into a math proof, resulting in lost privacy, lost money, or even lost lives.
Computer Science is, by definition, math.

Coding and software construction is engineering or craft, and is not Computer Science.

LLMs are neither. They are power tools for concept realization.

It's the difference between stone chisels and a suite of shop tools. We had pen and paper, or small steps up from those, and now we have LLMs.

FTFY: 20 hours a week !
I think this is mistaking the current .01 iteration with what the technology will be able to achieve. All sorts of groundbreaking technology looks like a minor improvement over the previously refined version until it gets implemented in a way that takes advantage of its strengths, as opposed to just being plugged into old workflows.

LLMs cannot be judged by their first few incarnations. What can be trained into them currently exceeds imagination. Imagination is our limiting factor.

And I don’t say that from the context of “I jumped on the hype train at the end of last year”. I remember reading the 2017 Google transformer paper and thinking “whoa, this is really happening.” The fact it happened in only 5 years is pretty impressive. I’m not sure many papers or innovations got my mind spinning quite like that one.

But there is an unanswered question of how far this technology can go based on its fundamentals. Coding is much like driving, you can't do 80% and let the human do the final 20%, because that final 20% requires reasoning about a well understood design that was implemented throughout the first 80%.

If your fancy AI coder thingy can't really reason about the end task that the code is solving - and there is little to indicate that it does, or that, any moment now, technology will advance to the point that it will - then the 80% will be crap and there exists no human that can finish the last 20%, not even if they put in 200% of the effort required. We still don't have a working AI solution for driving, a well understood and very limited problem domain, never mind the infinite domain of all problems that can be explained in natural language and solved with software.

What you end up with is a fancier autocomplete, not an AI coder. Boilerplate and coder output might simply increase to take advantage of the new more productive way of generating source code, just like they did for the last decades whenever there was a "revolutionary" new tech, like high level languages, source control, IDEs and debuggers, component distribution etc. etc.

You’re already limiting your imagination to “coding.”

These are data transformers that can transform raw data without coding at all. At what point does a model itself replace code?

It’s sort of like a CPU, right. You can have hardware that is specialized, or general purpose hardware that can do anything once instructed. LLMs have the ability to be general purpose data manipulators without first having to be designed (or coded) to perform a task.

> data transformers that can transform raw data without coding at all

How do you know this is 100% reliable, per upthread discussion?

We've already had this problem with Excel in various sciences, which while deterministic has all sorts of surprising behaviors. Genes had to be renamed in order to stop Excel from mangling them: https://www.progress.org.uk/human-genes-renamed-as-microsoft...

AI promises "easier than Excel, but not deterministic". So more people are going to use it to get less reliable results.

Weird argument. Excel is one of the most popular and profitable programs of all time. If your argument is that LLMs are like Excel, the logical conclusion would be that they would be wildly successful.
Quite possibly. But not 100% reliable.
Isn’t it deterministic with the temperature turned down? You can control when it gives a precise vs fuzzy answer.
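For context on the temperature question, here is a toy sketch of temperature sampling over a handful of made-up scores (logits). It is not any vendor's implementation, and the function name is hypothetical. It only shows the textbook idea: dividing logits by a small temperature sharpens the distribution, and the limit at zero is a plain argmax. It says nothing about whether a given hosted model is reproducible in practice, or about whether the deterministic answer is the correct one.

```python
import math
import random


def sample_index(logits, temperature, rng):
    """Pick an index from logits; temperature <= 0 is treated as greedy argmax."""
    if temperature <= 0:
        # Greedy decoding: always the highest-scoring option, no randomness.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exp() for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Made-up scores for three candidate tokens.
toy_logits = [1.0, 3.0, 2.0]
greedy_picks = {sample_index(toy_logits, 0, random.Random(seed)) for seed in range(20)}
warm_picks = {sample_index(toy_logits, 1.0, random.Random(0)) for _ in range(1)}
```

With temperature 0 every seed picks the same index; with temperature 1.0 the choice depends on the random draw.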
I didn’t say “LLMs solve all problems” or “there will be no place anywhere for code anymore.”
okay - how do you distinguish between scenarios where it's appropriate and where it's dangerous?
Ok, but then you completely give up the ability for human actors to understand and fine-tune the process. It would necessarily be a stochastic product: we don't know exactly how it works, it seems to output correct results in our testing but we can't guarantee it won't cook your dog in the microwave.
I completely agree that groundbreaking technologies come from an iterative process. However, in the case of LLMs I believe we are already at a point where we can judge where the technology is going, as it's not the first iteration. Sure, it will keep getting better, and I think that it's already a very useful tool.

My problem with it is that they are over hyping its capabilities and trying to market it as "it makes developers 55% faster" because it writes the code for them. I think it would be a better approach to market it as a great tool for automating repetitive tasks and a better way to consume documentation.

How would you respond to the central premise of the article? Which I understood as:

* There may not be a lot of differentiation between different LLMs in the long run

* Where there is differentiation, is in data (both the data used to train it and the data provided within its context window for a given query)

* Ergo marrying search to the LLM, while currently in its infancy, will be a big deal and a big differentiator -- because if you can quickly find the right data to pack into the context window, you will get much better results than what we're seeing today.

The technology hadn't clicked for me either. Today I had to write a script that would have taken me maybe 30 minutes or so on my own. I asked ChatGPT (GPT-4) to write it for me, and it got it right on the first try. I just spent a few minutes checking over the code.

It truly is magical when the code just runs. Later I asked it to make several non-trivial changes to the code based on more requirements I thought of, and it aced those on the first go as well. Again, I checked the code for a negligible amount of time - compared to how much it would have taken me to write the code on my own.

I do think humans will slowly get worse at lower-layers of the computer stack. But I don't think there's anything inherently bad with it. Compilers are also doing the work for you, and they are making you bad at writing assembly code - but would you rather live in a world where everyone has to hand-write tedious assembly-code?

Maybe, in the future, writing Python would be like what writing assembly is today. We might go down the layer-cake once in a while to work with Python code. That does not mean we give up on the gains we get from whatever layers are going to be put on top of Python.

The compiler is a deterministic tool (even undefined behaviour is documented). So you can spend some time understanding the abstractions provided to you by your compiler and then you know exactly what it is going to do with your code.

What is the equivalent of this for LLMs? Is there any way generative models can give a guarantee that this prompt will 100% translate to this assembly? As far as I understand, no. And the way autoregressive models are built I don't think this is possible.

I agree that they are useful for one-offs like you said, and their ability to tailor the solution for your problem (as opposed to reading multiple answers on stackoverflow and then piecing it together yourself) is quite deadly, but for anything that is even slightly consequential, you are going to have to read everything it generates. I just can't figure out how it integrates into my workflow.

This is nice, but if you actually like writing code, rather than instructing someone in natural language what you want to have written, then this is not an attractive prospect.

It’s like telling a novelist that they can produce novels much faster now because they only have to think of the rough outline and then do some minor editing on the result. For most, this is antithetical to why they became a novelist in the first place.

You're talking about the distinction between doing something because you love it and doing something as a means to an end.

It's a funny distinction! Knowing something can be automated can take some of the fun out of it, but there are plenty of people who still do stuff for fun when they could buy the end result more cheaply.

For employers, though, it's all a means to an end. Go write for the love of it on your own time.

Except that many people don’t get into their profession as a mere means to an end. They chose the profession because they like it, and they want to spend their lives doing stuff they enjoy. Being employed just as a means to an end is not worth the large amounts of time you spend doing it, if you can help it in any way. Let’s not normalize a dystopia here.
And else thread from a couple days ago... https://news.ycombinator.com/item?id=35235534

> I was recently laid off, and I know a few other people laid off. I have years of doing projects and contributing to OSS and being a technically curious learner. I found a new job much faster than my peers who admittedly joined tech for the money and don’t care to learn or grow beyond their next pay raise.

There is a fairly consistent chorus of people getting into software development - not because they enjoy the intellectual challenge that it presents but rather because of the potential for the pay.

As someone who does enjoy software development (I chose this path well before the dot com boom), I believe that we over-estimate the number of people who enjoy it compared to just grinding through writing some code and if something else paid as well, they'd jump in a heartbeat.

The dystopia is already normal.

The firm can't really afford to care too much about why its workers entered their professions. The firm has to care about the cost of its inputs and margin lest it be devoured by a competitor or private equity.

This subthread started by welcoming that you can be more efficient by spending less time writing code and more time prompting an AI and double-checking what it produces. My point is that’s not an attractive outlook for many software developers, and as one of them I certainly don’t welcome it. From that perspective, the progress in AI may turn out not to be a benefit for those software developers, in terms of job satisfaction.

The fact that companies may see that differently is beside the point, and I don’t particularly expect them to care for my preferences. I will however certainly continue to choose employers that happen to accommodate my preferences.

The article compares to Stack Overflow, but this comment makes it look more like a comparison to compilers which is a much bigger deal than some website, and actually worth paying attention to.

Anyway, people still write assembly kernels, so it is just that they only do it for cases that really matter. And there are a lot more coders than there were back when every program was assembly. So, it seems like great news.

Your reply might get me to pay OpenAI to use GPT4 lol
In my experience programmers hate to read each other's code. That's why rewrites are so popular. Do they really want to read an AI's? I bet the AI writes even worse comments than your predecessor.

One of the more toilsome bits of coding I do personally is rebasing. I have a patch to add application-time temporal tables to the Postgres project, and I've been rebasing it for several years now. It's a pretty big patch (actually a series of four patches), so there are almost always non-trivial conflicts to deal with. If ChatGPT could do that for me it would be awesome.

But it's probably the hardest thing for an LLM to do. It's not a routine program that has been written thousands of times across Github projects and StackOverflow posts. Every rebase is completely new.

OTOH it would be awesome if git had just a bit more intelligence around merge conflicts. . . .

What I would like to see is an AI which actively or passively assists you, like an improved Intellisense. Something which looks over your shoulder, figures out what you're trying to achieve and points at errors in your reasoning or stuff you did not consider.

It can summarize the thing you're looking at, tell you how to improve it regarding readability and performance.

At the press of a button you can zoom out of the code into a UML-like overview and it will tell you what's going on and how it is connected. If you don't get it, it knows how to make you understand.

Then you can tell it in a few words what you want to achieve and it will assist you in finding a solid solution which matches the coding style of the rest of your project. And while you're coding and lose sight, it will help you achieve the goal.

The current state is sub-par in my opinion. I can write good code and don't need an AI to write it for me. But what I want is something which assists me with understanding code, improving code or extending code without taking the steering wheel away from me.

Your first point I agree with, I've already encountered chunks of AI generated code and I don't want to read them.

Second point about the comments, actually I'm seeing the AI write much better comments (i.e. some) than most devs (none).

> actually I'm seeing the AI write much better comments (i.e. some) than most devs (none).

Some comments are far worse than no comments at all. I would agree that even semi-decent comments are far better than nothing. However, "no-new-information" comments are just noise, and misleading comments have a huge negative effect. I would not be surprised if an AI produced a large number of the former, and perhaps some of the latter.

In my experience, people hate to read each other’s unnecessarily complex code. Nobody ever complained about a well-written (i.e. easily understandable) codebase. This is, among other reasons, why reviews exist. Comments are only useful if there is some magic for performance reasons. In any other case, if a comment seems necessary, the code should be refactored instead. It’s a code smell. Abstract classes, indirect loops in the runtime call stack without IoC, and templates are also dangerous animals. Nobody has complained so far about a codebase where these were minimised, especially if a microservice architecture was introduced, which also inherently tames DRY on some level.
> comments are code smell [paraphrasing]

Good news: there are kinds of extremely useful comments that do not repeat the code (your comments should not repeat the code). Comments are to express context/intent behind the code: the "why", the high level "what", and almost never the exact "how" (read code for that).

It looks like you only ever encountered the "how" comments. No amount of code refactoring would get you the "why" (context) comments.
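A small made-up illustration of the difference. The function, the three-attempt cap and the timeout detail are all hypothetical, chosen only to contrast the two kinds of comment:

```python
def retry_delays(attempts):
    """Seconds to wait before each retry (hypothetical example)."""
    # "How" comment (noise, it repeats the code): raise 2 to the power i.
    # "Why" comment (context the code cannot express): the upstream API
    # rate-limits bursts, so we back off exponentially, and we cap at three
    # retries because the caller gives up after roughly ten seconds.
    capped = min(attempts, 3)
    return [2 ** i for i in range(capped)]


print(retry_delays(5))
```

No amount of renaming or refactoring would tell a reader about the rate limit or the caller's timeout; only the second comment does.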

GIT tells the why.
Yes, commit message is also a good place to provide the context for the commit.
I am working on some JS that needs to be obfuscated. After mangling the whole thing, suddenly I was gripped with horror: I remembered that GPT is pretty good at deobfuscation.

I put my mangled and minified code in.

What it emits is not only perfectly readable and 95% accurate (the 5% was due to missing context, max input limit)—it was significantly better than my code.

Of course the structure was the same, but ChatGPT chose much more sensible variable names in almost every case. I found it much easier to understand its version of my code than my own.

I guess I accidentally discovered a refactoring technique?

At least the AI won't complain about the refactor haha
It was trained on human data about the same subject so we have every reason to expect it'd complain
"Does anyone feel that way? Maybe I am wrong and the technology hasn't really clicked for me yet."

No, I'm trying mightily to do what Yegge is talking about in the context of the programming work I do everyday. First v3 then v4. I've given up until maybe v7 or something.

The problem is it doesn't have experience with my code-base. Sure, tell it to open a file and return a stream, it'll do that (after I fix the using statements), but for what I'm doing every day it doesn't even begin to know what to do.

And because I'm careful about KISS and SOLID I don't really need a lot of simple code generation. I don't see 5x productivity. I actually don't see much advantage over the built in tools in VS.

Maybe I'm doing it wrong, or maybe this make sense for people who write a lot of boilerplate, but that's not a lot of what I do.

You might look at Github Copilot then. It actually looks at my project and helps me write within the context of all of the other code that I've written.
It can appear reasonably smart on the surface, but all it is is a stochastic parrot. It cannot reason with you about the code.

To best illustrate what I mean, watch this chess match[0] it's quite riveting.

Since it read millions of matches, it can predict a legal move most of the time, and even some good moves some of the time, but it cannot "understand" the rules of chess, and makes some hilariously illegal moves, especially if the match lasts longer.

[0] https://www.reddit.com/r/AnarchyChess/comments/10ydnbb/i_pla...

In a similar vein, I tried to get chatgpt to play wordle. The result looked something like:

Me: crane

GPT: _ _ _ _ e

Me: moist

GPT: _ _ _ r _

Me: glyph

GPT: you guessed it, the word was glyph

Now, maybe GPT 4 or other future developments will give better results, but to me this highlights exactly what you're saying. LLMs do not have an internal structure in their 'minds' that they're pondering about. It's a very impressive engine for guessing the next character to produce into a stream.

There's definitely usages for this, but not what a lot of people are saying.

> you guessed it, the word was glyph

My pet conspiracy theory is that it is wired to please the user, to get better coverage from the media and social media.

I think there's something to this. I have a theory that LLMs are implicitly trained primarily to impress people, since that's what motivates those who work on them, excites the general public, and convinces conferences to publish papers.
In a sense, this is exactly what RLHF is, right?
I'm thinking of something at a larger scale. In some sense models that "wow" society get more interest and funding.
I don’t think so. In Wordle you have to guess the word in six attempts. It’s a fun game and often simple.

So it could be that ChatGPT picked up on a pattern in the training data where after a couple of guesses, a lot of the time people pick the right word.

So statistically it might go like. Guess a word. Probably not the right one. Guess a couple more and suddenly it’s statistically likely to be the correct word, and because of that the LLM ends up outputting the congrats and so on

As a slight correction, it isn't next character but rather next token.

https://help.openai.com/en/articles/4936856-what-are-tokens-...

> Tokens can be thought of as pieces of words. Before the API processes the prompts, the input is broken down into tokens. These tokens are not cut up exactly where the words start or end - tokens can include trailing spaces and even sub-words. Here are some helpful rules of thumb for understanding tokens in terms of lengths:

> ...

> Wayne Gretzky’s quote "You miss 100% of the shots you don't take" contains 11 tokens.

https://platform.openai.com/tokenizer

It isn't going character by character, but rather token by token - both for input and for output.

This also helps explain why it has trouble with breaking a word apart (as in the case of wordle) because it doesn't "think" of glyph as 5 letters but rather two tokens that happen to be 'gly' and 'ph' with the ids of [10853, 746].
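A toy sketch of why letter-level tasks are awkward. This is a greedy longest-match splitter over a tiny hypothetical vocabulary, not the real BPE tokenizer, and the two ids are simply the ones quoted above:

```python
# Hypothetical mini-vocabulary; the two ids are the ones quoted above.
VOCAB = {"gly": 10853, "ph": 746}


def toy_tokenize(word, vocab):
    """Greedy longest-match split of `word` into pieces found in `vocab`."""
    pieces = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return pieces


pieces = toy_tokenize("glyph", VOCAB)
ids = [VOCAB[p] for p in pieces]
print(pieces, ids)  # the model is handed 2 ids, not 5 letters
```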

This is an interesting example because, due to the API being stateless, it cannot have any solution “in mind.” It’s going to reply as if someone else started the game and it’s taking their place for the current turn.

Even if it reasoned ideally, it has to check if your answer is consistent, and if it is, decide whether to let you win or to find another solution. If your answer is inconsistent, it effectively has to play wordle by thinking of another answer.

It doesn’t seem all that surprising that it didn’t figure it out.

I tried to get it to remember its word by asking it to give it to me in b64: (GPT-4)

> We will play the word game wordle. Start by picking a 5 letter word. Display it in b64 so I can check later that you kept the same word all along. I will submit my guesses and you will show the letters that are correct, we will keep going until I have found the word

> Great, I've picked a 5-letter word and encoded it in base64. Here's the encoded word: Q2l0eWw=. You can now start submitting your guesses!

But after a few tries it was clear it was not working so I gave up:

> I give up, what's the word?

> Before I reveal the word, let me first decode the base64 string I provided earlier to ensure that I'm giving you the correct word. The base64 string was: Q2l0eWw=. After decoding the base64 string, the word I chose is "citly". However, I apologize for the confusion caused during the game, as "citly" is not a valid English word. I made an error while choosing the word. Please accept my apologies for any inconvenience caused during the game.
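For what it's worth, the commitment is easy to check outside the chat with Python's standard `base64` module. The string from the transcript decodes to "Cityl", which is neither an English word nor the "citly" the model reported:

```python
import base64

committed = "Q2l0eWw="  # the string GPT-4 gave in the transcript above
decoded = base64.b64decode(committed).decode("ascii")
print(decoded)  # Cityl
```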

However asking it for a python program to play the game worked perfectly the first time, try it here: https://replit.com/@CorentinSmith1/gpt-wordle#main.py

[As an aside, is the GPT feedback functionality correct? For example, if the secret word is "peals" and the user's guess is "apple" the feedback is going to be "a?p?p?l+e?". However, I would expect the feedback to be "a?p?_l+e?". Because the first 'p' exists in the secret but is in the wrong place and the second 'p' does not exist in the secret word.]
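A minimal sketch of the duplicate-aware scoring the aside describes, assuming the usual two-pass rule: exact matches first, then misplaced letters limited by how many unmatched copies remain in the secret. `score_guess` and the mark characters are my own naming, not taken from the linked program:

```python
from collections import Counter


def score_guess(secret, guess):
    """Return one mark per guess letter: '+' right place, '?' wrong place, '_' absent."""
    marks = ["_"] * len(guess)
    # Pass 1: mark exact matches and count the secret letters left unmatched.
    remaining = Counter()
    for i, (s, g) in enumerate(zip(secret, guess)):
        if s == g:
            marks[i] = "+"
        else:
            remaining[s] += 1
    # Pass 2: a letter counts as misplaced only while unmatched copies remain.
    for i, g in enumerate(guess):
        if marks[i] == "_" and remaining[g] > 0:
            marks[i] = "?"
            remaining[g] -= 1
    return "".join(marks)


print(score_guess("peals", "apple"))  # ??_+?
```

With secret "peals" and guess "apple", the first 'p' uses up the only 'p' in the secret, so the second 'p' is marked absent.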
One thing someone could try is giving it a few examples of how to score a wordle guess, to see if it figures out the pattern.
Exactly, and I personally think that will always be the largest limiter to how good the technology can get. No matter how good the stochastic parrot gets, it's still a parrot.
It is humbling, if not humiliating, that a stochastic parrot can reproduce such a significant chunk of human intelligence. The association elevates stochastic parrots more than it denigrates LLMs.
(1) GPT-4 already had a large improvement over ChatGPT

(2) Changing the prompting reduced the illegal moves to almost 0

(3) There have been experiments that show GPT has an internal "state" of the world and can do simple reasoning puzzles. This model of the world evolves with each generation.

I understand the skepticism, but don't let that blind you to the reality of the technology. I'm a skeptic at heart, and I could immediately tell GPT was a game-changer. It can already replace half of the ML models that are used at my job and do it better (if it was economical enough).

> To become proficient I need to write a lot of code, but the more I use LLMs, the less repetitions I get in. So in a way it feels like LLMs are going to make you a "worse" programmer by doing the work for you.

I've been experiencing this myself recently. I've been using co-pilot in some side projects. I've noticed myself getting more 'lazy' as I use it more.

Recently I used it when doing some old (2015) advent of code puzzles I hadn't done before. I would read the puzzle prompt and have a pretty good idea of what I wanted to do. I wrote out some comments for functions and co-pilot was able to write what I needed with minimal changes.

Even though I read through co-pilot's code and understood what it was doing I don't feel like I really retained anything from the time spent. If anything, I feel like co-pilot stunts my learning.

The keyword is "hype". It seems like any new "thing", no matter how useless, will get its hype cycle rolling.

Crypto, NFT, Blockchain, AR, Metaverse, -- from the top of my head -- now AI. The point of hype is to attract investment. Big Corps must be driven by the fear of missing out on yet another world changing shiny new thing.

Is your assertion that because there have been other hyped things in the past, that nothing which is spoken of positively will ever actually be useful? Because you're gonna miss some pretty big stuff with those sort of glasses on. You know what else was hyped? Most everything you use today. Sometimes people use something and are absolutely blown away by it and are excited to talk about it. Not everything is 100% fake yet, I promise.
I think there is a degree of fatigue from the stream of breathless "this is going to change the world, if you disagree you're wrong or don't understand, and if you don't participate you'll wind up poor" takes. We're barely one year out from nearly identical language around NFTs and "web3".

IMO these AI technologies have obviously more tangible utility than some of the other hyped things on the list, however a lot remains to be seen about where they go.

I've found it to be extremely useful when _either_ you know the language really well but you're kind of exploring some new domain, or when you know the domain really well and you're new to the language.

When you know both it's just really good autocomplete, which is great but not a huge game changer. If you know neither then you're not in a position to assess the output. But when you're still learning either the tool or the space I've found GPT to be a good tool for leveraging one expertise to create the other.

There is a massive gulf between "correct code" and "correct implementation" in many real-world scenarios.

Business logic and baking in domain expertise into your data model is most of the work. Making the code work efficiently doesn't matter if your code doesn't even do what it's supposed to.

Normally this is an argument in favor of human-in-the-loop LLM-based development -- "the human just needs to curate and verify!" However it seems all too easy to me (especially having witnessed it more than a few times) that subtle discrepancies emerge between the stakeholders' desires for the function of the code and the developers' understanding of those requests. Hopefully we reach a best-case scenario where that's all developers need to focus on, but more likely we'll see some pretty egregious things slip through the cracks (the wave will likely start with security/privacy issues before the phenomenon is recognized) as this technology matures into the common workplaces.

> only need to "check the code is good"

... because we all know proving correctness is the easy part of writing software!

I can't wait to read about software engineers finding out some MBA had a huge codebase written by a language model and a few offshored contractors, only to realize it's incredibly buggy, and then being hired to "just go and find the mistakes the AI made, should be easy, all the code is written".

Nobody can deny the fact that ChatGPT can easily generate solutions for bazillions of relatively simple problems in various programming languages. What bothers me is how often it is completely wrong and how confident it is about its solution.

A sample example. I asked it to generate Terraform code for registering an organizational unit in AWS Control Tower. This is impossible because the API of Control Tower is very limited. But ChatGPT was very happy to generate a solution pretending to use the official AWS module with a made up resource. Of course, the "solution" was not working at all. But if I ask it to do a trivial task, such as attaching an OU to an organization using AWS Organizations, it can do it perfectly well. And this, for me, is the difference between a human programmer and a machine that is good at certain tasks.

Imagine a narcissistic human programmer who is a compulsive liar and won't admit to 1) being wrong 2) something being impossible or 3) not knowing something, and instead just making up plausible sounding business synergy bullshit to please the PHB.

That's pretty much every ChatGPT-programming sample I've read so far.

This one thinks character `i` in elisp regexps is matched with `\i`.

I agree - it’s hard to enter a flow state while reviewing someone else’s - or some AI’s - code. That’s a major reason why I haven’t started using LLMs for code, personally.

I am glad someone else feels this way. Maybe it’s not going to be as big a paradigm shift as I originally expected.

It will probably raise the floor a lot. The least competent (not meaning in a bad way! I was one of those) coders will be a lot more competent all of a sudden.
I'm skeptical. It's easy to make something sound correct at a first glance but that has subtle fundamental flaws that invalidate it.

Knowing humans and the LGTM phenomenon, these kinds of issues will slip by quite readily.

The current ChatGPT is just a preview of what's possible. 2 years from now it will be able to create a DB, a set of microservices and web and mobile frontends, deploy these on a cloud platform and app stores and test them, all from a 30 min chat with a person, going over a business idea on very high level.

Think about, for example, how Windows 1.0 looked. For an experienced DOS user it was offering very little. Experienced DOS users were saying GUIs are over-hyped. Today there are probably a few dozen people worldwide who use a computer without a GUI (or a voice interface).

ChatGPT&Co will obviously make 90% of the software developers out there obsolete in just a few years. An industrial revolution is happening in the software industry.

When you need to rename a button you’ll spend another 30 minutes talking about your app because ChatGPT does not “understand” code.
I didn't get five times as productive yet. It's something closer to a few percent or less, which makes LLMs about as useful as syntax highlighting. It's nice to have, but not essential.

We will see in a few years.

I think there are ways in which LLMs will be very important, especially if we are able to get access to raw models /embeddings. That will let the models be extended to create new models and use cases. For example, personally I want to search Google and not ask a chatbot questions. LLMs could still be useful for identifying SEO spam and removing it from search results. Thus LLMs improve search but aren't giving me a watered down summary of everything I'm looking for.
> To become proficient I need to write a lot of code, but the more I use LLMs, the less repetitions I get in. So in a way it feels like LLMs are going to make you a "worse" programmer by doing the work for you.

You will definitely learn from LLM suggestions. The mantra „Read other people’s code“ is accurate IMO - as long as the code is at least ok-ish. I‘ve learned a ton from code that ChatGPT generated for me already.

I think you make fantastic points (Sourcegraph CTO, here). This is one of the reasons why we focused on code understanding rather than code generation for the initial version of Cody (in contrast to, say, GH Copilot).

For code understanding tasks, the issue with standalone LLMs is that they have a certain amount of "memory" which is limited to their training data (SO and OSS), and even that can be unreliable.

A big "a-ha" moment for us was the realization that LLMs get much more helpful and reliable when coupled with a competent context fetching mechanism that can surface relevant code snippets from your own codebase. This makes Q&A much more factually accurate (and enables code generation that learns from the patterns in your codebase). We don't think LLMs will ever replace human coders, but we think they can be super helpful in eliminating a lot of the tedious, boring, duplicative writing and reading of code that devs do every day. The entirety of Sourcegraph (not just the LLM part) is focused on eliminating these pain points.
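To illustrate the general idea of context fetching: the sketch below is a naive keyword-overlap ranking used purely for illustration, not a description of how Cody or Sourcegraph actually works. The snippets and function names are hypothetical.

```python
import re


def words(text):
    """Lowercased alphabetic words in a piece of text (underscores split names)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def top_snippets(question, snippets, k=2):
    """Rank snippets by how many words they share with the question."""
    q = words(question)
    ranked = sorted(snippets, key=lambda s: len(q & words(s)), reverse=True)
    return ranked[:k]


def build_prompt(question, snippets):
    """Prepend the best-matching snippets so the model can answer from real code."""
    context = "\n\n".join(top_snippets(question, snippets))
    return f"Relevant code:\n{context}\n\nQuestion: {question}"


# Hypothetical codebase fragments.
SNIPPETS = [
    "def open_cart(user_id): return load_cart(user_id)",
    "def send_email(to, body): smtp.send(to, body)",
    "def add_to_cart(cart, item): cart.items.append(item)",
]
print(build_prompt("How do I add an item to the cart?", SNIPPETS))
```

A real system would presumably use something stronger than word overlap (code search, embeddings), but the shape is the same: fetch first, then ask.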

Code requires too much precision and is entangled with legal hurdles

The value here is that the LLM can act as a knowledge graph where common sense is preloaded on almost every topic, so that the user can add nodes and edges to the graph in natural language and perform extraction in natural language

And you don't need fine tuning as long as you can fit the topic in their token space, and with gpt4 reaching 32k tokens you can load a huge amount of text and perform queries on it.

That's what makes the tax return example so interesting. The model has already learned a lot of common and uncommon sense so it will not need the instruction on how to process the text or parse the query.

Forget coding, but it is great for everything else.

These are good points.

I think though that LLM-based tools will eventually formalize to achieve a greater precision at what's required. I suspect that they could be a base for a new crop of different, much-higher-level programming languages.

Programming languages went a long way; somebody from 1960 would have hard time putting things like Haskell or even SQL into the same conceptual bin as the original Fortran. We routinely see them as programming languages though. I don't see why this trend can't continue upwards, relegating even more legwork onto the machine while talking to it in reasonably precise, standardized, domain-specific terms.

They aren't for you in this context then. The value of an LLM that can write passable code is not to take an experienced developer and make them better, it's to take someone who can't code at all and allow them to generate code. Whereas it might make you 10% better (or whatever your estimate is), it makes them infinity times better as it allows them to do it at all, even if it's not very good.

Think of it like an accessibility aid. It doesn't help people who don't need them, but for those who do it's life changing.

I'm quite proficient in Python and Django (main tools I use daily).

Yet I find myself asking ChatGPT every now and then "hey how do I do <foo>", where <foo> is something I last needed to do a year or more ago. I can recognize the correct answer but don't need to search docs/net for it.

The reason this is faster (for me) than Googling or using Dash/Zeal is that the answer is already in the context of what I'm trying to do, whereas if I'm only looking at the docs, I will probably need to go through several pages to get a complete picture.

It is overhyped (thanks to every rando broadcasting how amazed they are). There is no causal learning happening. The randos will take their own sweet time to work it out.
> 1. It is my belief that if you are proficient enough in the task at hand, it is actually a distraction to be checking "someone else code" over just writing it yourself. When I wrote the code, I know it by heart and I know what it does (or is supposed to do). At least for me, having to be creating prompts and then reviewing the code that generates is slower and takes me out of the flow. It is also more exhausting than just writing the thing myself.

I'm sure there were programmers who said the same thing in regards to high-level programming languages.

> 2. I am only able to check the correctness of the code, if am am proficient enough as a programmer (and possibly in the language as well). To become proficient I need to write a lot of code, but the more I use LLMs, the less repetitions I get in. So in a way it feels like LLMs are going to make you a "worse" programmer by doing the work for you.

Maybe that becomes irrelevant the more that the skill of the programmer shifts from handwriting "correct" code to supervising code generators while proofreading their work, and of course providing effective acceptance criteria. There's also a massive bias towards failed predictions of the past that serves to discredit predictions that may see a greater degree of manifestation. For every time someone says "but people predicted this before and it didn't pan out", I can point to technology that did fundamentally change how an industry works and even make jobs obsolete.

Seems to me a lot of programmers on HN are refusing to believe that their ability to be proficient with code may be either outdated or supplanted by the efficiency of a system that writes code that is not necessarily "elegant" in human terms.

> So in a way it feels like LLMs are going to make you a "worse" programmer by doing the work for you.

Most programmers aren't great at what they do to start with, whereas LLMs can only get better from here on.

To me it's so funny when people say ChatGPT will make developers 5x more productive, because those people are basically just admitting they're not good at their jobs and assume the same holds true for everyone.
not really, this is more like having a strangely knowledgable yet naive junior employee - I can tell gpt-4 to put something together that gets me 90% of what I need faster than I could possibly even type it, it’s reducing my known tasks
look at what the web was 20 years ago and then look at what it is now. I dont get why people in the tech field where there advances every year look at gpt and say oh it does not do this or that like wtf is the tech stagnant will it not improve. You guys should be the ones that say if it can do this today how will it improve what it will be able to do tomorrow. Most advances come when there is war/competition in the next decade 100s of billions will be spend on this do you really think their will be no improvement?
Im sure you have a good point, but its difficult to grasp with such hasty writing :/
I've stopped using Stack Overflow almost completely (vs 10 times a day) and I don't miss it.
Hard disagree. I actually think the 80/20 mentioned in this article is low if done correctly. Let's think about the software engineering process for a minute and work out which bits to automate. Here's GPT3.5 on the process (PROMPT: List the steps going from a goal or dream to a software product in software engineering)

1. Idea Generation: The first step in creating a software product is to come up with an idea or goal that the software will achieve.

2. Research: Once you have an idea, it is important to conduct research to determine the feasibility of the idea and identify any potential challenges.

3. Planning: After research, planning is necessary to determine the scope of the project, the timeline, and the resources required.

4. Design: The design phase involves creating a detailed plan for the software, including the user interface, functionality, and architecture.

5. Development: In the development phase, the software is created by writing code, testing, and debugging.

6. Testing: After development, the software must undergo rigorous testing to identify and fix any issues.

7. Deployment: Once the software is tested and ready, it is deployed to the target audience.

8. Maintenance: Finally, the software must be maintained to ensure that it continues to function properly and meets the needs of the users.

Each of those steps has a back and forth with a LLM that can enhance and speed up things. You're talking about 4 as being problematic, but right now there's a lot of "human in the loop" type issues that people are encountering.

Imagine having the following loop:

1. LLM has generated a list of features to implement. AI: "Does this user story look good?" Human: "Y"

2. For each feature, generate an short English explanation of the feature and steps to implement it. Your job as a human is just to confirm that the features match what you want. "Should the shopping cart

3. For each step, LLM generates tests and code to implement the feature. AI: "Shall I implement the enter address feature by doing ..." Human "Y"

4. Automatically compile the code and run the tests until all tests implemented and feature is complete according to spec.

5. Automatically document the code / feature. Generate release notes / automated demo of feature. Confirm feature looks right. AI: "Here's what I implemented... Here's how this works... Does this look good?"

6. Lint / simplify / examine code coverage / examine security issues in the code. Automatically fix the issues.
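Step 4 of the loop above can be sketched roughly like this. `generate_code` and `run_tests` are hypothetical callables standing in for an LLM call and a test runner; nothing here is a real API, and the fake stand-ins exist only so the sketch runs.

```python
def build_feature(spec, generate_code, run_tests, max_attempts=5):
    """Regenerate code until the tests pass or the attempt budget runs out."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        code = generate_code(spec, feedback)  # hypothetical LLM call
        passed, feedback = run_tests(code)    # hypothetical test runner
        if passed:
            return code, attempt
    raise RuntimeError(f"tests still failing after {max_attempts} attempts: {feedback}")


# Stand-ins so the sketch runs: the fake "model" only gets it right
# once it has been shown the failure feedback.
def fake_generate(spec, feedback):
    return "def add(a, b): return a + b" if feedback else "def add(a, b): return a - b"


def fake_run_tests(code):
    namespace = {}
    exec(code, namespace)
    ok = namespace["add"](2, 3) == 5
    return ok, "" if ok else "add(2, 3) should be 5"


code, attempts = build_feature("add two numbers", fake_generate, fake_run_tests)
print(attempts, code)
```

The attempt budget matters: without it, a spec the generator can never satisfy would loop forever, and someone still has to decide whether the tests themselves are right.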

I think you also miss that the LLM can be prompted to ask you for more details. e.g. PROMPT: "I'm building a shopping cart. Ask me some questions about the implementation."

1. What programming language are you using for the implementation of the shopping cart?

2. Are you using a specific framework for the shopping cart or are you building it from scratch?

3. How are you storing the products and their information in the shopping cart?

4. How are you handling the calculation of taxes, shipping costs, and discounts in the shopping cart?

5. What payment gateway(s) are you integrating with the shopping cart?

Which can then be fed back to the LLM to make choices on the features or just plain enter the answer. PROMPT: "For each question give me 3 options and note the most popular choice.", and then your answers are fed back in too. At each point you're just a Y/N/Option 1,2,3 monkey.
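The question-and-options flow could be sketched roughly like this. Again, `llm` and `choose` are hypothetical stand-ins (no real library implied), and I'm assuming the model returns one question or option per line, which a real model won't always do.

```python
from typing import Callable, Dict, List


def ask_clarifying_questions(
    goal: str,
    llm: Callable[[str], str],                # hypothetical: prompt in, text out
    choose: Callable[[str, List[str]], str],  # hypothetical: human picks an option
) -> Dict[str, str]:
    """Have the model ask questions, then offer options for each one."""
    questions_text = llm(
        f"I'm building {goal}. Ask me some questions about the implementation."
    )
    # Assumption: one question per non-empty line.
    questions = [q.strip() for q in questions_text.splitlines() if q.strip()]

    answers: Dict[str, str] = {}
    for question in questions:
        options_text = llm(
            "For this question give me 3 options and note the most popular "
            f"choice: {question}"
        )
        options = [o.strip() for o in options_text.splitlines() if o.strip()]
        # The human is just the "Option 1, 2, 3" step here.
        answers[question] = choose(question, options)
    return answers
```

The returned answers would then be fed back into the next prompt as context.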

More succinctly, in each step of the software game, it's possible to codify practices that result in good working software. Effectively LLMs allow us to build out 5GL approaches[1] + processes. And in fact, I'd bet that there's a meta task that would end up with creating the product that does this using the same methodology manually. e.g. PROMPT: "Given what we've discussed so far, what is the next prompt that would drive the solution to the product that utilizes LLMs to automatically create software products towards completion" ;)

[1]: https://en.wikipedia.org/wiki/Fifth-generation_programming_l...

(Feel free to reply and vocalize your response rather than just slapping a downvote)
Did you try GPT-4 yet? It's a huge increment over 3.5/ChatGPT
I’ve tried 4 and I really can’t say the results are qualitatively better than 3.5 for the tasks I’ve been trying (mostly getting it to generate documentation for my project).

In fact, I find 3.5 turbo the best overall model as a tool, because the quality of responses really depends on the quality of prompts, and the quality of prompts is improved by reacting to responses, which come more quickly in 3.5-turbo. So while ChatGPT-4 is still writing the first not-good response, ChatGPT-3.5-Turbo will be on the 2nd or 3rd and it will be much more cogent.

It's way better. But equally slower!
Yep, it's actually able to create ideas that have never been done before.
Do share!
Just imagine LLM output as input to any other device.
Yes to 1. and 2.