by lolinder 698 days ago
> Then there was AWS re:Inforce – the annual security conference – which was themed “Security in the era of generative AI”.

This tagline is representative of every part of the hype around GenAI. It makes it sound like security has fundamentally changed and we all need to re-learn what we know. Everything to do with GenAI is treated like this: we need new security plans, we need AI Engineers as a new job title, we need to completely reevaluate our corporate strategies.

Security in the world of generative AI is not substantially different from what infosec has been for a while now: User prompts are untrusted input. Model outputs are untrusted input. Treat untrusted input appropriately, and you'll be fine.
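
To make the "model outputs are untrusted input" point concrete, here is a minimal sketch, assuming a hypothetical setup where a model is asked to reply with a JSON action. The action names, `ALLOWED_ACTIONS`, `MAX_QUERY_LEN` and `parse_model_action` are all made up for illustration and are not any real product's API. The output is validated the same way a form submission would be:

```python
import json

# Hypothetical allowlist: the only actions this application will perform.
ALLOWED_ACTIONS = {"search", "summarize"}
MAX_QUERY_LEN = 200


def parse_model_action(raw_output: str) -> dict:
    """Validate model output like any other untrusted input.

    Raises ValueError if the output is malformed or not allowed.
    """
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    action = data.get("action")
    query = data.get("query")

    # Check the type first so odd values (lists, numbers) are rejected cleanly.
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"action not allowed: {action!r}")
    if not isinstance(query, str) or not 0 < len(query) <= MAX_QUERY_LEN:
        raise ValueError("query must be a non-empty string of bounded length")

    # Return only the validated fields; drop anything else the model added.
    return {"action": action, "query": query}
```

Nothing in it is specific to GenAI: parse, check against an allowlist, bound the sizes, and pass along only what was checked.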

The same goes for "AI engineers", who are in the business of wiring up APIs to each other like any other backend engineer. We take data from one black box and transfer it to another black box. Sometimes a black box takes a very long time to respond. It's what we've always done with many different kinds of black boxes, and the engineering challenges are mostly solved problems. The only thing that's really new is that the API of these new black boxes is a prompt instead of a deterministic interface.

Don't get me wrong, there will be things that will be different in the post-LLM world. But my goodness do the current crop of companies overestimate how large that difference will be.


As a person replying to your comment in the era of generative AI, I'm inclined to agree the hype is a bit much, even considering how impressive the technology can (sometimes) be.

Another big area of hype is "prompt engineering." That one seems to have calmed down slightly, but for a while, there were large swaths of the Internet who were amazed that the set intersection of "talk like a decent human being" and "be precise in your communication" could generally lead to good results.

In many ways, "AI" right now is magic marketing sprinkles that you can put on anything to make it more delicious. (Or, if you're inside a big company, it's magic prioritization sprinkles.)

Maybe the prompt engineering should have caught on more. I'm convinced that the large swaths of people commenting here and elsewhere "I don't get AI, it's just a parrot and it's always wrong and hallucinates, it's not useful" just don't understand that the prompt matters and the idea isn't to one shot everything. It writes good code for me every day, so I can only assume they're asking "Write me an OS from scratch" and then throwing their hands up when it obviously fails.
I think that calling it "prompt engineering" is what made it fail to catch on. We didn't call it "Google engineering" back in the day when you could actually craft a Google search to turn up useful results, we called it "Google-fu" [0].

"Google-fu" sounds like a fun skill to learn and acquire, where "prompt engineering" sounds either like something well out of reach or like pretentious nonsense depending on the audience.

[0] https://blog.codinghorror.com/google-fu/

>"Google-fu" sounds like a fun skill to learn and acquire, where "prompt engineering" sounds either like something well out of reach or like pretentious nonsense depending on the audience.

More likely, "prompt engineering" is a marketing term made up by AI marketroids (cf. androids) hoping to make developers feel better about their reduced roles in this "grand new AI age".

Prompt-fu sounds cool
Arguably you could charge more money for courses on prompt engineering than on prompt-fu.
> I think that calling it "prompt engineering" is what made it fail to catch on.

I don't think so.

I mean, clearly calling it "engineering" threw some people off, in the same vein as some gatekeepers cringing at calling train drivers "railroad engineers". But that's puerile gatekeeping that misses the whole reason why there is a vast need to know how to "engineer prompts".

The truth of the matter is that the focus of "prompt engineering" is being able to put together inputs that solve business problems in professional settings. You need to have full control over the generative process to integrate its output in a business setting. That requires specialized knowledge way beyond naive requests expressed in natural language.

Complaining about "prompt engineering" because that only focuses on specifying queries and operating a specific service makes as much sense as complaining about SQL/database/postgres engineering because that only focuses on specifying queries and operating a specific service.

Before trying to dismiss "prompt engineering" through gatekeeping logic, first you need to justify why there is no need to know what you're doing to get outputs by feeding the right inputs. Even subreddits dedicated to using generative AI to create images and videos have started to outright ban posts where the contents are posted without the prompts used to create them.

To me it’s more like, if I have to carefully craft English language prompts in a conversational back-and-forth to get things done, then I am not really interested in doing that job, which sounds like being a manager or a teacher, and in practice just makes me feel totally dead, sad, and quite frankly bored.

That’s just not an interesting or rewarding way to interact with a computer, and the last thing I want to do is add long wait times and nickel-and-dime cost to the process. Layer on using different LLMs for different tasks or trying them out against each other and cross-checking output and it’s a mind-numbingly indirect way to get anything accomplished that in the end teaches me nothing and develops no useful skill that I enjoy practicing.

If it works for you, great, but even the most honest and genuine fans make it sound like a nightmare to me.

Could not agree more. If the utility of this thing is based entirely on the right sequence of magic words why aren't we calling it "prompt wizardry" or something that better encapsulates the nature of it.
"That's Fred, a prompt magician of the third rank!" ;)
If the utility of all computers is based on the right sequence of magic words why do we call them software engineers instead of something better like "code wizards" that encapsulates the nature of it?
I guess the difference is code is (almost always) totally deterministic. Or at the very least, it's designed so that this is a mostly safe assumption.

It doesn't seem likely an LLM will ever do that. Maybe at a certain point of sophistication? But if the model is regularly changing - which they almost all will be, if they're expected to be up-to-date - there is a strong chance they'll be different every time they're used.

(I've been getting different behaviour in even relatively narrow ML-based systems for years. Google Assistant is my prime example - I regularly use the phrase "add to my calendar on the 20th of September at 5pm, go to the park". Almost all the time, it works perfectly. But a couple times a year at least, it won't process this as an action - it just does a Google web search for this string.)
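
The deterministic-versus-sampled distinction can be sketched with a toy example. This is not how any particular LLM or Google Assistant is implemented; the `NEXT_ACTION` table and its probabilities are invented for illustration. Greedy selection always returns the same answer, while sampling from the same distribution occasionally returns the unlikely one, much like the calendar request that now and then becomes a web search:

```python
import random

# Made-up distribution over how a system might interpret one fixed request.
NEXT_ACTION = {
    "create_calendar_event": 0.97,
    "web_search": 0.03,
}


def greedy(dist: dict[str, float]) -> str:
    """Deterministic: always pick the most probable option."""
    return max(dist, key=dist.get)


def sample(dist: dict[str, float], rng: random.Random) -> str:
    """Non-deterministic: draw an option in proportion to its probability."""
    options = list(dist)
    weights = [dist[option] for option in options]
    return rng.choices(options, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so the counts differ between runs
    runs = [sample(NEXT_ACTION, rng) for _ in range(1000)]
    print("greedy :", greedy(NEXT_ACTION))
    print("sampled:", {action: runs.count(action) for action in NEXT_ACTION})
```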

To be fair to what OP is saying, it's not so much that you have to carefully craft the prompts every single time, it's that there is a linguistic register you have to adopt in order to get results out of an LLM. The initial learning process for that register can be hard, but once you've learned it it comes naturally.

I think of it as similar to Googling in the early days. What started as a skill I had to pick up became second nature and I could find things faster than my family without even really thinking about what I was doing. It just became natural.

to be fair, expecting most software engineers, who typically have a bachelor's degree, to be able to communicate well in English is not asking for a lot. Via a textual medium no less! But apparently it is…

Most of my colleagues communicate with chatgpt in broken english, or they ask a question while leaving out crucial details about their problem. They’re always surprised when i am able to get a useful response from chatgpt when they couldn’t. it’s comical sometimes.

I 100% hear you on the “not a fun way to interact” though. To each their own. I personally enjoy it, it’s like a rubber duck that can actually talk back. :) not for everyone though.

It's not so much that communicating in English is the problem, at least for me. I'm a native English speaker and have a reasonably strong command of the English language. I'm able to craft my words to convey specific tone or meaning.

The problem is that GenAI is a complete black box with nondeterministic outputs. I can write code and I know with a very high degree of confidence what I expect it to do. Asking an LLM or a generative image program for something, I have no idea what it'll give me. It gives no feedback other than results, which may or may not be what I want. If not, I have to reverse engineer what I think it might want me to say in order to get desired results. And the same query placed another time might give a completely different answer. I don't deny that it can do some impressive things given the correct inputs, but I am not inclined to spend my time searching for the magic words.

> To me it’s more like, if I have to carefully craft English language prompts in a conversational back-and-forth to get things done, then I am not really interested in doing that job, which sounds like being a manager or a teacher, and in practice just makes me feel totally dead, sad, and quite frankly bored.

You're showing a fundamental misunderstanding (or ignorance) of the whole problem domain.

For starters, you place an awful lot of emphasis on what you think is "carefully craft English language prompts". That makes as much sense as characterizing the job of a database engineer as "carefully crafting quasi-English language prompts". The language used is completely irrelevant, and being able to use in some circumstances something resembling natural language to build up context does not take away from it.

Any remotely honest and objective analysis of the topic would start from similar activities and, to begin with, from the areas of work where LLMs are being used. For image/video generation you need to look at graphic design, video editing, video production, illustration, etc. These activities, by their very nature, are iterative and exploratory. Then for text you have the work of copywriters and editors, and even writers and essayists. The work is fundamentally iterative and exploratory. Then you have work like exploratory data analysis/statistics/data mining. Every aspect of that work is iterative, even the reporting part.

to me it sounds like a job that would be similar to a search engine optimization engineer - studying the output of a third-party program when providing that program with different sets of keywords.
If it only solves the problems I already find trivial then it is a parrot. Nowadays we all have a calculator with us but if you emphasize that fact and choose to not practice and excel at basic arithmetic then you will be unable to perform higher mathematics that require it at every step. Of course if your problem is always already a solved problem then sure a parrot can be convinced to spit it out.

So yes, the actual question for software engineering would be how to get AI to produce and iterate on an OS. The hallucinations aren't the only problem then, the lack of predictability in the answers is the biggest issue.

Been quietly wondering something similar to you for a year: I've ended up 95% confident that phenomenon is due to people evaluating it in terms of "does it replace me?"

Cosign prompt engineering. My startup is tl;dr "what if i made an on-every-platform app that can sync and let you choose whatever ai provider, and you pay at cost. and then give you a simple UI for piecing together steps like dialogue / ai chat / search / retrieve / use files"

Seems to me the bigs are completely off the mark. Let's cede the idea that there's an omniscient AI available. Literally right now.

Cool.

It still has no idea how you work.

you could see 42, in The Hitchhiker's Guide to the Galaxy, as a deep parody of this category error

I appreciate this perspective on prompt engineering. I’d love to think that one of the great outcomes of LLMs is people returning to more decent and precise forms of communicating. Imagine the progress if we could get that to transfer to human-human communication as well.
> Another big area of hype is "prompt engineering." That one seems to have calmed down slightly, but for a while, there were large swaths of the Internet who were amazed that the set intersection of "talk like a decent human being" and "be precise in your communication" could generally lead to good results.

I think your comment conveys your obliviousness of the problem domain.

The main driving need for prompt engineering is not an inability to "talk like a decent human being". That's just your personal need to insult and demean people who are interested in a problem domain you know nothing about.

The main driving need for prompt engineering is aspects like not being able to control how context is formed and persisted in a particular model, and how to form the necessary and sufficient context to get a model to output anything interesting. Some applications require computationally expensive and time-consuming runs, and knowing what inputs to provide to a system which by its very nature is open-ended is a critical skill to adequately use the system in professional settings.

Let's put it like this: GitHub Copilot is an LLM service which is extremely narrow in its applications and use cases. Yet, you can't even get it to add unit tests to a function following a specific style without putting in the effort to build up the context it needs to output what you expect.

> Model outputs are untrusted input.

I think the problem is they're trying to introduce nuance and a narrow path to allow this. They want an acceptable level of risk to using untrusted model output for the efficiency/productivity gains it will bring, notwithstanding hallucinations.

Generative AI would not have flown in the security theater of Yesteryear, but CTOs see productivity multipliers.

Right, but that's not a new problem either. We want to allow people to send emails with some acceptably-low level of risk that spam will get through. We want an acceptably-low risk that our image upload feature will end up hosting CSAM. And we want it while still getting the benefits of allowing our real customers to pay us for the services we offer. Businesses have been figuring out the balance of risk:reward for as long as infosec has been a concept.
> CTOs see productivity multipliers

The CTOs are hallucinating as much as the LLMs are.

The GP didn't state the multiplier's value. Those things absolutely are productivity multipliers...
While that was indeed billed as the theme of re:Inforce, there were plenty of sessions and workshops that did not involve GenAI at all. There was a great chalk talk about the underpinnings of how the AWS IAM service works across services and regions, for example.
I’d be interested in knowing why it takes +/- 10 seconds after I create/update a role before I can actually use it.
IAM is eventually consistent. And they do a lot of derivations of hashing off an original signature and distributing individual, bespoke versions to services in different regions to limit the blast radius of a compromised credential.

If you go to an AWS event in the future, the name of the chalk talk was "The Life of an IAM Policy"
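
On the practical side of the "10 seconds before I can use the role" question, the usual way to live with eventual consistency is to retry with backoff. Below is a minimal, generic sketch; it uses no AWS SDK calls, and `retry_until_consistent`, `is_not_ready` and the use of `PermissionError` as a stand-in are assumptions for illustration. Which exception a real client raises while a role is still propagating depends on the SDK and the service.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_until_consistent(
    operation: Callable[[], T],
    is_not_ready: Callable[[Exception], bool],
    attempts: int = 6,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying with exponential backoff while the
    failure looks like "the change has not propagated yet"."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            last_try = attempt == attempts - 1
            # Give up on the final attempt or on errors that are not
            # propagation-related; re-raise the original exception.
            if last_try or not is_not_ready(exc):
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError("attempts must be at least 1")
```

A caller would pass something like `lambda: use_new_role()` (hypothetical) as `operation` and a predicate that recognises the "not visible yet" error. With the defaults the helper waits up to 1 + 2 + 4 + 8 + 16 = 31 seconds in total before giving up.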

If you have navigated far enough to create/update a role you are already aware of the bloat and mess that ties all their services together.
My favorite was when everyone was looking for prompt engineers.

I was trying to understand what prompt engineering was, because I thought there is no way this is a discipline for how to ask ChatGPT questions... And then I realized it was...

Sure, I get that there is much to learn regarding formulating effective prompts, but a new career path?

hype is the deeply engrained norm in our industry, bro. just sit it out. as the famous saying goes, "this too shall pass."

until the next one, of course. ;)

for example, in rough order, some past hype trends: 3GLs, structured programming, initial AI (then AI winter), expert systems, CASE tools, 4GLs, OOP/OOAD, UML and round trip engineering, design patterns, dot com boom (and bust), agile, functional programming, Web 2.0, SaaS, crypto, Web 3.0, big data, data science, ML/AI.

most of them had or have some actual benefits, but nothing like the hype parroted by those with and without vested interests.

been there, seen them, from the third or fourth one onwards.

also, see this cperciva comment, and Google who he is before replying:

https://news.ycombinator.com/item?id=40957064

somewhat corroborates what I said above.

just had a small insight and did some quick mental arithmetic. hold on to your seat:

i counted, it's about 20 hype trends that i listed above (and don't forget that I may have missed some).

it is roughly 6 decades since the computer industry started, taking a start year of 1960.

so, 20 / 6 gives us an average rate of over 3 hype trends per decade !!!

about one every 3 years.

I myself would have thought it would be less often.

even if you make it 7 decades, 20 / 7 is nearly 3, so is still in the same ball park.
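
The mental arithmetic is easy to check in a few lines. The count of 20 is the rough figure from the list above, and `hype_rate` is just an illustrative helper name. Note the two different units: trends per decade versus years per trend.

```python
def hype_rate(trends: int, decades: int) -> tuple[float, float]:
    """Return (trends per decade, years per trend)."""
    per_decade = trends / decades
    years_per_trend = decades * 10 / trends
    return per_decade, years_per_trend


if __name__ == "__main__":
    for decades in (6, 7):
        per_decade, years_per_trend = hype_rate(20, decades)
        print(f"{decades} decades: {per_decade:.2f} trends per decade, "
              f"one every {years_per_trend:.1f} years")
```

So over 6 decades it is about 3.3 trends per decade, which is one every 3 years; over 7 decades it is about 2.9 per decade, or one every 3.5 years.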

phew.

Way more if you go full international ...

Remember Fifth Generation Computer Systems ?

https://en.wikipedia.org/wiki/Fifth_Generation_Computer_Syst...

oh yeah, the big Japanese attempt with prolog. I read about it at the time.
For a long time, security has meant log parsing and auditing for compliance (PDF reports) plus some tool-generated posture report!! It used to take a security team a while to do this tedious task; now an AI can do it at 1/10 the price!! And it can make big companies rich!! Do it once and give it to all!!