>Notably, chain of thought (CoT) prompting, a recent technique for eliciting complex multi-step reasoning through step-by-step answer examples, achieved the state-of-the-art performances in arithmetics and symbolic reasoning, difficult system-2 tasks that do not follow the standard scaling laws for LLMs https://arxiv.org/abs/2205.11916
It's just software. It takes skill and labor to achieve the outputs you want. Is someone who uses Photoshop all day not an artist? Or someone who writes text for a compiler not a programmer?
You've got a huge blind spot if you think prompt engineer isn't already a thing.
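For anyone who hasn't seen what the chain-of-thought prompting in that quote looks like in practice, here's a minimal sketch. The helper name `build_cot_prompt` and the example questions are my own illustrations, not something taken from the paper, and as far as I know the linked paper is mostly about the zero-shot variant (appending "Let's think step by step"), so treat this as the few-shot flavor the quote describes.

```python
# Hypothetical helper: assemble a few-shot chain-of-thought prompt as plain text.
# Nothing here calls a model; it only builds the string you would send to one.
def build_cot_prompt(examples, question):
    """examples: list of (question, step_by_step_reasoning, answer) tuples."""
    parts = []
    for q, reasoning, answer in examples:
        # Each worked example shows the reasoning before the final answer.
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    # The new question ends with an open "A:" for the model to complete.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


examples = [
    (
        "Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
        "How many balls does he have now?",
        "Roger started with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11.",
        11,
    )
]
prompt = build_cot_prompt(
    examples,
    "A juggler has 16 balls. Half of them are golf balls. "
    "How many golf balls are there?",
)
print(prompt)
```

The whole "technique" is that the worked example spells out its intermediate steps, so the completion for the new question tends to do the same.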
>You've got a huge blind spot if you think prompt engineer isn't already a thing.
It may be a "thing", because generating BS is a viable business model and ChatGPT makes it more efficient.
...but I submit, as a working hypothesis, that it is completely impossible to gain knowledge you do not already possess from a language model, no matter how clever your prompting.
I'm very interested in counter-examples, but I have seen a few that turn out to be fake already.
I don't really understand what you're saying. Clearly the results one produces with the software have value, but you say it's all "BS." Does that just mean you don't like it, or what? If someone asks me to produce some concept art for a project they've just started, and I write some prompts to produce that concept art, what is "BS" about the art I produced? What does gaining knowledge have to do with it?
I don’t know how to homebrew. If I ask ChatGPT to help me get started homebrewing, it lists helpful steps to start homebrewing. I can ask it to expand on any of those steps until the breakdown is actionable.
When I check some of the facts it gives me against other sites, it’s all correct, but better organized and more accessible. There’s your counter-example. This works for basically any well-documented process.
That's so obviously false that I literally can't imagine how you could believe it. GPT-3 certainly isn't 100% accurate, but neither is it so perfectly unreliable that no one could ever get it to produce a relevant fact not in the prompt. And even if it were, it would probably still be potentially useful for learning languages.
> GPT-3 certainly isn't 100% accurate, but neither is it so perfectly unreliable that no one could ever get it to produce a relevant fact not in the prompt.
I think I understand the sense in which you claim it produces relevant facts not in the prompt.
It's not that we differ on easily observable behavior of the system.
It's that I question if GPT-3 is "producing" these identifiable facts, and if the user is "producing" them instead, whether they can possibly be "relevant".
>It's that I question if GPT-3 is "producing" these identifiable facts, and if the user is "producing" them instead, whether they can possibly be "relevant".
I'm not sure what you're trying to say. That GPT-3 is just vomiting stuff up out of its training set and not producing any new knowledge? But that's totally irrelevant to the issue of whether it can transmit knowledge to a user, who presumably hasn't memorized the entire training set.
>That GPT-3 is just vomiting stuff up out of its training set and not producing any new knowledge?
Hmm. Seems obvious to me that it's producing new output, but that output isn't knowledge and it can't be.
Sometimes ChatGPT tells me something that turns out to be correct and relevant. And I get excited, and then I Google it and what it told me is the first hit on Stack Overflow.
There's a subtle point here, that other people might say "well, ChatGPT is ok, but no better than Google" or something like that. But I differ on that. The key is that I don't know it's Stack Overflow until I check independently. So it's giving it too much credit to say it's as good as Google, and the amount of information it can output is not lower bounded by its training set, but is actually zero due to being adjacent to an infinite amount of BS that by its nature always requires external mechanisms to separate out.
New knowledge synthesized from existing knowledge. For example, you might know of A and B and maybe think that A -> B or B -> A based on their co-occurrence, but an AI might make you realize that C -> A as well as C -> B.
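To make the A/B/C point concrete, here's a toy sketch with a made-up population (the probabilities are assumptions I picked so the arithmetic comes out exact). C drives both A and B; inside each C group A and B are independent, yet pooled together they co-occur.

```python
from fractions import Fraction
from itertools import product


def covariance(rows):
    """Exact covariance of A and B over a list of (a, b) rows."""
    n = len(rows)
    mean_a = Fraction(sum(a for a, _ in rows), n)
    mean_b = Fraction(sum(b for _, b in rows), n)
    mean_ab = Fraction(sum(a * b for a, b in rows), n)
    return mean_ab - mean_a * mean_b


# Assumed toy setup: when C=0, A and B are each 1 a quarter of the time;
# when C=1, each is 1 three quarters of the time. Taking the full product
# of values inside a group makes A and B independent within that group.
levels = {0: [1, 0, 0, 0], 1: [1, 1, 1, 0]}
strata = {c: list(product(vals, vals)) for c, vals in levels.items()}
pooled = strata[0] + strata[1]

print("pooled cov(A, B):", covariance(pooled))
print("cov(A, B) given C=0:", covariance(strata[0]))
print("cov(A, B) given C=1:", covariance(strata[1]))
```

The pooled covariance is positive while both within-group covariances are exactly zero, which is the "C -> A as well as C -> B" picture: the co-occurrence of A and B says nothing about A -> B or B -> A here.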
Ok, that's straightforward, I just don't care for the idea that AI can do it or even help.
You might synthesize new knowledge.
When ChatGPT produces new output, it's not synthesizing new knowledge. It can't even output the knowledge it was trained with, as long as it lacks the ability to tag it in a trustworthy way.
It's not that it's always BS, it's that it's almost always BS and if you don't know the answer in advance or independently, you can't distinguish it from anything within the model.
For a side project, it not only gave me the migrations, it also suggested all the column names/datatypes. Basically I just said: create me a Laravel migration for an organization; this is a multi-tenant SaaS app, where an organization is basically a team, or tenant. You can think of these as companies as well. Now make a migration that has columns that might generally be included in a company or organization.
It spit out not only the model but also the casts/fillable attributes on the model. It even helped me work through an idea whose name I didn't know: I was thinking it was EAV, but it's actually metaform/metafields, basically something like how WordPress can dynamically create content 'types' (Django/Wagtail can do this too). With ChatGPT I think I've nailed down how to do this using polymorphism with the least amount of headache.
I want to create a CRM/CMS/ERP solution that can be very 'moldable' to different use cases, and this looks to be a good use. Either way, just being able to discuss my 'options' with the AI was like a major brain dump and increased the power of my flow.
YMMV, but if you can't get it to work like this, that doesn't mean it doesn't work, it just means it doesn't for you. And while I can save 2-3 hours for every hour previously worked, that's valuable to me, esp. as a freelancer who charges per project, not hourly.
There are lots of people with misconceptions of LLMs. It will take time to adjust.
I reached the same conclusion as yourself, but do see a totally different path to take regarding information propagation (how GPT works). For example, cells merge information monotonically. This is how neural networks balance too, but could be applied in new/undiscovered ways.
Welcome to the last decade of title inflation. Everyone is a "manager" now. A "marketing manager", "product manager", "account manager". No more secretary, it's "executive assistant". It's a perk a company can offer, conferring higher status, at no expense to themselves. So the equilibrium is for other companies to do the same, otherwise the company that gives this cost-free perk outcompetes for talent.
People are graduating with watered-down educations, earning inflated cash, with inflated titles. It all helps people believe they're higher status: they have a university degree and are a manager earning $80k, so surely they're getting close to the top of the totem pole now. But they have a worse standard of living and an education equivalent to high school in the '60s.
>This morning, I brushed and flossed, which makes me a Plaque Removal Engineer. I then used my skills as a Room Tidyness Engineer to make the bed. After that, I engineered the harness onto my dog and took her on a walk: Canine Fitness Engineer. I engineered the water to a higher temperature using the kettle, and poured it over some coffee grounds to create a chemical reaction in my morning brew: Yeah, that's right, Caffeine Engineer. After this incredibly productive morning, I got in the car and drove to my job as a computer programmer.
At this point the word "engineer" has lost its original meaning. Until there's a formal theory of how we can interact with LLMs and you make use of that in a systematic fashion, "prompt engineering" is really closer to "prompt artist."
> At this point the word "engineer" has lost its original meaning. Until there's a formal theory of how we can interact with LLMs and you make use of that in a systematic fashion, "prompt engineering" is really closer to "prompt artist."
Interesting angle. Are you saying there are rarely any "software engineers" out there, that they are all merely "software artists"? Because none of them uses a formal theory for their craft. If they did, then all those highly opinionated discussions of whether to use goto in C or what the greatest flaws of node.js are would just not exist.
Correct, in my eyes "software engineering" in the sense of "person who glues libraries together to build systems" should not yet be called an engineering discipline because there isn't yet a rigorous enough theory on why something should be designed one way as opposed to another. We are still in the stage of figuring out best practices and making things more rigorous (maybe the functional programming folks will end up contributing something here, but I don't know enough about category theory).
There are other narrower senses of "software engineer" such as "person who optimizes code" and to me those more qualify as engineering because we not only have a decent enough theoretical background (see Agner Fog's work) but also can experimentally verify things. On the other hand it's a lot harder to quantitatively say if one design is better than another.
I think there's also some work on rigorously modeling concurrent/distributed systems (Lamport's TLA+ work), which I'd like to see more of.
The naming might be flawed but it's like a barista, I guess. You can spend a lot of money on the machine, but without an expert to use it you will not get the best out of it.
Complex SQL requires knowledge about relational algebra (Cartesian products, set theory, domain relational calculus yada yada) and understanding of how RDBMS and their query planners work. At least, if performance is important to you.
I don't see this in prompt engineering. In my limited experience (I played a few hours with Stable Diffusion and more hours with the OAI davinci-003 model), you can get good at it within a few days.
At this point, we're very much at an exploratory stage of LLM queries - you could of course be an ML/DL researcher or engineer who's intimately familiar with the current architectures, but still - the models are very large and very complex due to the sheer parameter count, so you'd still have to map out what inputs will predictably give what outputs on a finished model.
I'd imagine that being a "prompt engineer" entails finding out and mapping the structures that give you the desired result. Think of it as a novice user of search engines vs. an expert user of search engines.
I spent a few hours learning SQL. I can get “good” at writing SQL queries within a few days. Do you want to hire me as someone whose primary role is to write SQL queries?
I still remember that when Google appeared there were people offering search services for a fee (maybe they were called Search Engineers, but I don't recall seeing that).
You tell me how you'll generate better photos, improve dialogue coherence across multiple speakers, and control camera direction and movement (something we're using LLMs for too as we experiment with special-purpose models).
All of this is not known a priori, by the way. And I won't accept building a database or lookup table as an answer.
I also want to know how you'll test, benchmark, and refine.
You also need to budget for inference complexity.
I'm waiting :)
I can do this myself, but it is a full time job. I am so busy with all other aspects of my business I'm looking for people to bring on board.
I normally don't work for free, but just as a sample, here's one I've been crafting for a few weeks...
Epic 4k HD photo, high res and epic, cool extra awesome photorealistic 5k or 6k, realistic, in the style of a really good photographer.
Nah I'm just playing, your company looks pretty cool, I just think a dedicated job for coming up with prompts (which is only going to become easier anyway with better ways to control output) is silly.
That's not a real prompt. A real prompt would look more like this:
At the top left of a 4k picture, place a dot with the RGB value of (0.5, 0.8,
0.2), where color components are expressed on a scale of 0-1 inclusive.
Then, on the top line, second position from the left, place a
dot with the RGB value of (0.7, 0.3, 0.4).
(approx. 8 million more to follow...)
I'm not kidding either. If there really is a real "prompt engineer" job, I am sure it's going to be like this, with a fig leaf of some sort. We saw how this worked during the brief period when everybody was doing a blockchain project. Oracle added blockchain features to their database. Now I'm sure they all have amnesia, but there are remnants.
Lawyers have a bad reputation, sure, but there’s a lot of education about the interpretation of our law, and the absurdly large corpus of legal documentation that must be read in order to even become a lawyer is far beyond anything you describe.
So you are presenting the fact that lawyers must digest an absurdly large corpus of legal documentation, but also maintaining that this is something where a human lawyer has some sort of advantage over an LLM?
>So you are presenting the fact that lawyers must digest an absurdly large corpus of legal documentation, but also maintaining that this is something where a human lawyer has some sort of advantage over an LLM?
No.
An LLM has more opportunity itself to replace a lawyer; the person typing the prompt is not necessarily required to be as educated. Though a case can be made that you need to validate the information.
As it happens we have an opportunity to tell how this works. Software engineering has seen many abstractions of which each comes with its own complexity in verification.
What tends to happen is that people don't really do a lot of verification, we are just "mostly right" very fast and leave an immense amount of inefficiency and indirection behind us.
>the person typing the prompt is not necessarily required to be as educated.
If I need someone to help me interact with a legal LLM I will want to (and probably be able to, for 300k) hire someone with a law degree. In fact I anticipate many lawyers in the future will effectively become “prompt engineers” for legal LLMs.
An LLM is a generator of misinformation that is maximally difficult to distinguish from real information.
How do you use this as a lawyer?
I mean, as a stereotypical evil lawyer in a world of naïve people who don't learn from experience, you could maybe use it to win cases until you destroy the justice system.
Speaking as somebody who spent thousands of dollars on a lawyer for a matter, with basically no results, and who also used ChatGPT, personal research, and common sense to solve this same matter for free, I can only say that if an LLM is "a generator of misinformation that is maximally difficult to distinguish from real information", then a lawyer is simply "a human who has been trained to maximally drain your wallet without regard for any other matter". Of course, neither is true, and there is far more nuance to both.
Sure, there are matters I would only trust a lawyer to handle, but there are a great many I wouldn't.
Further, the average quality of a human lawyer will likely remain the same tomorrow as it is today, while AI will only get better. LLM today, perhaps some hybrid stack tomorrow, it's only a matter of time before an AI lawyer is the way to go for just about any legal matter. And let me be clear, that time might be 10 years, or may be 100+, but it is coming.
This is a strange statement. No one is training LLMs to generate “misinformation”. It’s the opposite - it’s trained to generate the most likely next word, given the preceding 2000 words - using billions of examples from a real world training corpus. So it will try to generate as much information as what’s present in the corpus. Maybe even more, but that’s debatable.
>No one is training LLMs to generate “misinformation”.
That is phrased like it is stating a fact about the training process, but it is a statement about the intent of the training, isn't it? So I don't see it as rebutting my comment.
>It’s the opposite - it’s trained to generate the most likely next word
Sure, of course, what else? But if you take any correct statement about something and modify it slightly, it's not very likely it will still be correct.
It seems intuitive to me that there are going to be a million billion (understatement) wrong things next to anything correct in the inputs. As a sort of combinatorial, mathematical thing. You just (in principle) count all the ways to be wrong that are similar to being right.
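Here's a toy version of that counting argument, as a sketch under my own assumptions: statements are single-digit sums like "2+3=5", and "next to" means changing exactly one digit. It says nothing about how a real model spreads probability over those neighbors; it only counts them.

```python
DIGITS = "0123456789"


def is_true(statement):
    """Check a statement of the form 'a+b=c' with single-digit a, b, c."""
    left, right = statement.split("=")
    a, b = left.split("+")
    return int(a) + int(b) == int(right)


def one_digit_neighbors(statement):
    """All statements that differ from `statement` in exactly one digit."""
    out = []
    for i, ch in enumerate(statement):
        if ch in DIGITS:
            for d in DIGITS:
                if d != ch:
                    out.append(statement[:i] + d + statement[i + 1:])
    return out


neighbors = one_digit_neighbors("2+3=5")
still_true = [s for s in neighbors if is_true(s)]
print(len(neighbors), "neighbors,", len(still_true), "still true")
```

For "2+3=5" that gives 27 one-digit neighbors and none of them are true, and the count of wrong neighbors only grows as statements get longer and the vocabulary gets bigger than ten digits.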
Nobody trained it to get anything right! It doesn't matter what people expect if they don't have a procedure to do it.
If a statement is adjacent to things that are also "correct", that almost implies a lack of information in the original statement. It seems borne out in the impressive BS'ing - the key to BS'ing is saying things that can't really be wrong.
To be an effective prompt engineer you need to have expertise in two different domains - large generative ML models, and in whatever it is people want to generate (e.g. art).
>https://sites.google.com/view/automatic-prompt-engineer
Not exactly a "toaster go brrrr" job, but it could be obsolete one day
WaPo does need to chill though. There are barely any Prompt Engineer jobs
Edit: If anyone's curious, I've been following this for prompt stuff: https://github.com/dair-ai/Prompt-Engineering-Guide