	by alach11 720 days ago
	There's no doubt that LLMs massively expand the ability of agencies like the NSA to perform large-scale surveillance at a higher quality. I wonder if Anthropic (or other LLM providers) ever push back or restrict these kinds of use cases? Or is that too risky for them?

5 comments

dinglestepup 720 days ago

That ship has probably sailed. If Llama3 is performing on par with GPT-3.5, then there is no real benefit for companies to restrict access to slightly better proprietary models.

hmottestad 720 days ago

GPT-4 is “holy shit, this actually works, could be better but it’s so good I almost can’t believe it” while GPT-3.5 is “when it works it’s pretty great, just a pity it almost never does”.

So I would assume that three letter agencies would love to take something like GPT-4 and fine tune it based on all the data they have about existing terrorists.

brink 720 days ago

I'm still dealing with hallucinations nearly every time I use it.

p1esk 720 days ago

I get maybe one hallucination per twenty chats with gpt4.

thfuran 720 days ago

I haven't tried more than a handful of queries, but I think I've gotten a 100% rate of hallucinations or generic, useless responses to specific questions.

p1esk 720 days ago

Can I try your question? Just curious.

jakderrida 720 days ago

Do you mean 3.5? While I still face issues with GPT 4, I can't even remember the last time it hallucinated. I'm not saying it can't. But, yeah, that's crazy that they're specifically targeting your IP address like that.

baq 720 days ago

NSA should be training their own GPT-4 or better model as we speak and should have been doing it for a long while now. Anything else is borderline incompetence.

kridsdale3 720 days ago

NSA can't hire the right talent capable of producing that product for the same reason they have trouble finding white-hat security people to hire: You can't work for the government and do drugs in your personal time. Enough of the pie of elite researchers are into wacky mind-bending that it's a real recruitment problem.

jakderrida 720 days ago

Also, imagine the public shitstorm when people see headlines that NSA has overturned their policy against microdosing. They're not gonna understand what tf is going on and trying to explain it away sure af isn't gonna happen because they'll always believe that all drugs are bad and that defending zero tolerance policies is the hallmark of being one of "the good guys".

alach11 719 days ago

You don't think compensation is the bigger issue?

causal 720 days ago

And given the volume of data they likely sift through, I'd also expect them to want very small, high-throughput models for identifying targets for larger models to examine.

On the flip side, LLMs must give the NSA a new challenge: a flood of garbage text generated by no-one in particular. Perhaps there will be more effort to put surveillance directly on-device as tapping networks yields more noise.
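The "small, high-throughput models feeding larger models" idea above is usually called a cascade. A minimal sketch, assuming a cheap scorer and a costly checker are supplied as plain callables: every name here (`cascade`, `cheap_score`, `expensive_check`, the toy functions and the 0.3 threshold) is hypothetical and only illustrates the shape of the pipeline, not any agency's or vendor's system.

```python
from typing import Callable, Iterable, List, Tuple


def cascade(
    docs: Iterable[str],
    cheap_score: Callable[[str], float],
    expensive_check: Callable[[str], bool],
    threshold: float = 0.5,
) -> Tuple[List[str], int]:
    """Score every document cheaply; run the costly check only on high scorers.

    Returns the documents the costly check confirmed, plus how many times
    the costly check was actually invoked.
    """
    flagged: List[str] = []
    expensive_calls = 0
    for doc in docs:
        if cheap_score(doc) < threshold:
            continue  # rejected by the cheap stage; the costly stage never sees it
        expensive_calls += 1
        if expensive_check(doc):
            flagged.append(doc)
    return flagged, expensive_calls


# Toy stand-ins (assumptions): a keyword ratio plays the "small model",
# a stricter rule plays the "large model".
def toy_cheap_score(doc: str) -> float:
    words = doc.lower().split()
    hits = sum(word in {"pressure", "leak", "kick"} for word in words)
    return hits / max(len(words), 1)


def toy_expensive_check(doc: str) -> bool:
    return "leak" in doc.lower()


if __name__ == "__main__":
    reports = ["routine shift report", "gas leak pressure", "pressure kick kick"]
    print(cascade(reports, toy_cheap_score, toy_expensive_check, threshold=0.3))
```

The trade-off sits in `threshold`: lowering it sends more documents to the costly stage, raising it risks the cheap stage discarding something the costly stage would have caught.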

ttyprintk 720 days ago

I’d expect they’re using huge models to train many small ones, one for each threat actor. Those small models could decide whether their actor is detected, or it’s time to slot in a different one.

TeMPOraL 720 days ago

On the grasping side, they are probably in the best position to train a GPT-5, given the amount and type of data they're presumed to have.

wkat4242 720 days ago

Will it really though? So far I've seen most of the "revolutionise" claims to be mainly hot air and marketing.

It's possible that LLMs will suddenly make a leap in reliability and usability (e.g. much higher context window without corresponding massive increases in memory usage). But I have yet to see it.

So far it's great at some specific usecases. Interacting with humans, rewriting or making up text. Summarising. A hit & miss at everything else.

Don't get me wrong, I love AI tech and I'm heavily experimenting with it (both at work and at home with local models). But as with most hyped technologies I find the benefits far overblown in marketing stories.

Our leadership jumped on Microsoft Copilot (the one for Office 365 because they have tens of different copilots :) ) like a pack of hungry wolves afraid to miss the boat. And the result was.... kinda meh. It's kinda promising and impresses with simple play school stuff ("make me a presentation about home safety") and totally and utterly fails when you try to do anything serious work related. Sooo many times I get "Sorry I can't do this right now", "Sorry I need more training for this", "I can't do this for you but this is how you can do it yourself!" or it does something but like totally wrong.

Meanwhile we have a bunch of MS training people running around evangelising and telling us how great everything is and making excuses for everything that goes wrong :) You can almost see them breathe a sigh of relief every time something works as it should. That's not what we were promised.

Maybe it will get there, but I don't see it happening tomorrow to be honest. LLMs were an impressive leap but their Achilles' heels have become clear and it's proving difficult to overcome them.

I'm really enjoying surfing the knife's edge of technology (as I was and still am with metaverse) but I don't yet see this as a game changer except in a few specific industries. People editing text for a living certainly have a need to worry.

I also wonder what will happen with future AI training. Now that more and more websites are filled with AI-generated content that is often at best "mediocre", and considering future AI models will be trained on that, will they be able to improve their accuracy or struggle to maintain it?

alach11 720 days ago

I use LLMs extensively in my field to automate all sorts of tasks. Need to classify a million PDF documents for cheap? Write a prompt and submit a batch job. Need to read 30,000 drilling reports to automatically scan for hazards? Done in 60 minutes.

These are tasks that would have taken months of development or millions of dollars in manual effort before. It's not just hype.

JBorrow 720 days ago

Boy, I can’t wait for the foundation of my house to disappear because the LLM mis-classified a drilling report as non-hazardous.

What’s the deal here with liability and accountability? That’s a serious problem when considering using these for anything other than toy problems.

bongodongobob 720 days ago

You don't actually think the LLM is reviewing those 30k documents do you? You tell it to write a program (which is easy to audit) to pull the info from the PDFs or whatever. I don't get why this crowd is so goddamn unimaginative with LLMs.

svieira 720 days ago

> You tell it to write a program (which is easy to audit) to pull the info from the PDFs

Wherein you discover that unless you ask it to consider the fact that PDFs are ... very hard to parse [1] [2] you get something that misses whole blocks of text or turns them into something they aren't and the rest of the program misses chunks of the document.

[1]: https://news.ycombinator.com/item?id=22473263
[2]: https://web.archive.org/web/20200303102734/https://www.filin...

bongodongobob 719 days ago

Why are you expecting they are all very different? They're all likely very similar.

goatlover 720 days ago

Because I've heard of enough lazy uses of LLMs to be suspicious. Auditing the program means being sure that the info pulled from those documents is reviewed properly. Also, a complete lack of regard for other people's privacy.

bongodongobob 719 days ago

No idea where privacy enters in here.

jakderrida 720 days ago

>Boy, I can’t wait for the foundation of my house to disappear because the LLM mis-classified a drilling report as non-hazardous

LMAO! It's so hilarious that people like you forget that the alternative is relying on bureaucracies managed by people that get things wrong more often and are both too lazy and too stubborn to process your application to review your drilling report again.

If using both human-level and AI-level analysis is cheaper and much more accurate (but still imperfect), I'm willing to settle for a better system than oppose all change and die holding out for a perfect system.

JBorrow 720 days ago

What are 'people like me'? It's not like I know nothing about large language models, I just think using them for civil engineering is a bad idea...

AlotOfReading 720 days ago

One thing I've struggled with while applying LLMs to business problems is how others have dealt with identifying and managing system failures.

Let's say some of your drilling reports contain a pattern that indicates balrog activity, which the LLM misses. The legal or insurance context requires you to monitor and address potential balrog activity. How do you plan for these failures?

In almost every case I've seen, the plan is to not have a plan, which is another way of saying that the data doesn't matter so long as no one complains about the results.

thomashop 720 days ago

Same way you manage human failures?

AlotOfReading 720 days ago

The way we manage human failures is with rules, checklists, and accountability. LLMs struggle with all of these, and I get the sense that spending 6mos to develop long lists of rules isn't what the parent comment has in mind with "just write a prompt".

noodlesUK 720 days ago

I think that for low-risk classification tasks and similar, something like an LLM is a great tool, and I can absolutely see it being extremely useful for intelligence work where sifting through stuff is very hard. However, I would not at all trust AI to make actually important decisions independently.

ablation 720 days ago

A genuine question and not meant as a snipe: as hallucinations are an inherent “feature” of LLMs, how can you be sure of the accuracy of the model’s interpretation of those 30,000 drilling report hazards? Or what is the acceptable level of risk?
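One common way to put a number on that question is a random audit: have a human re-check a random sample of the model's labels and bound the error rate from what the sample shows. A minimal sketch, assuming labels can be audited independently and that a Wilson score interval is an acceptable bound. The function names, the seed, and the sample size of 300 are illustrative assumptions, not anything described in this thread.

```python
import math
import random
from typing import List, Sequence


def draw_audit_sample(doc_ids: Sequence[str], sample_size: int, seed: int = 0) -> List[str]:
    """Pick documents uniformly at random (without replacement) for human re-review."""
    rng = random.Random(seed)  # fixed seed so the audit set can be reproduced
    return rng.sample(list(doc_ids), sample_size)


def wilson_upper_bound(errors: int, sample_size: int, z: float = 1.96) -> float:
    """Upper end of the two-sided Wilson score interval for the true error rate.

    `errors` is how many audited labels the human reviewer disagreed with;
    z = 1.96 corresponds to a 95% two-sided interval.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= errors <= sample_size:
        raise ValueError("errors must be between 0 and sample_size")
    p = errors / sample_size
    denom = 1 + z * z / sample_size
    centre = (p + z * z / (2 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)
    ) / denom
    return min(1.0, centre + half_width)


if __name__ == "__main__":
    ids = [f"report-{i}" for i in range(30_000)]
    audit_set = draw_audit_sample(ids, 300)
    # Suppose the reviewer found no disagreements in the 300 audited reports.
    print(f"{wilson_upper_bound(0, len(audit_set)):.4f}")
```

Even a clean audit of 300 reports only bounds the error rate at about 1.3%, which across 30,000 reports still allows for a few hundred mistakes. Whether that is acceptable is the risk question being asked, and the audit only makes it explicit.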

bongodongobob 720 days ago

You have it write a program to analyze it. I think a lot of people fail to understand that you don't always need the LLM to do the thing; you can have it write a program to do the thing for you.

xorcist 720 days ago

That's not very likely to succeed, is it? LLMs can do a lot of things, but writing software that not only parses semi-proprietary file formats but also analyzes unstructured data sounds more than a little bit far-fetched. I'd be impressed if just the first, and by far the easiest, part of that can be accomplished.

bongodongobob 719 days ago

It's extremely likely to succeed because there is a documented format. I can't believe how pessimistic this site is about this stuff. Yeah, you're not going to one shot it with a prompt. If that's your expectation, you're confused.

sobellian 720 days ago

Okay, but you still need to debug the program. If your program must give correct results you still need to check the program output against every case. There's no free lunch there.

energy123 720 days ago

Speaking generally: The program doesn't always have to give correct results. The program just needs to reduce 30k documents down to 200 documents for human review.

You're comparing LLMs to a hypothetical alternative where a human reviews all 30k documents in detail. But the real alternative is often just a worse quality sieve where more errors blunder their way through the existing flawed processes. LLMs can improve on that.
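The sieve described above can be sketched as "rank by a risk score, keep the top N for humans", with quality judged by how many known hazards survive the cut. This is a minimal sketch under stated assumptions: `risk_score` stands in for whatever model or program produces a score, and `shortlist_for_review` and `recall_at_budget` are hypothetical helper names.

```python
import heapq
from typing import Callable, Iterable, List, Sequence


def shortlist_for_review(
    docs: Sequence[str],
    risk_score: Callable[[str], float],
    budget: int,
) -> List[str]:
    """Return the `budget` highest-scoring documents, highest score first."""
    return heapq.nlargest(budget, docs, key=risk_score)


def recall_at_budget(shortlist: Iterable[str], known_hazards: Iterable[str]) -> float:
    """Fraction of documents already known to be hazardous that made the shortlist."""
    known = set(known_hazards)
    if not known:
        return 1.0  # nothing to miss
    return len(known & set(shortlist)) / len(known)


if __name__ == "__main__":
    # Toy data (assumption): document length stands in for a real risk score.
    docs = ["a", "bb", "cccc", "ddd"]
    picked = shortlist_for_review(docs, risk_score=len, budget=2)
    print(picked, recall_at_budget(picked, known_hazards=["cccc", "a"]))
```

Measuring recall on a labelled subset is what lets the sieve be compared with the existing process instead of with a perfect reviewer.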

bongodongobob 720 days ago

You're right. That's why to be sure I don't use software. All paper and pencil. So I can be sure. I have no idea what your point is.

thomashop 720 days ago

How can you be sure with humans doing the work?

wkat4242 720 days ago

That's where the law comes in. You can prosecute a human for negligence. What about an AI?

criddell 720 days ago

Would you trust your LLM to file your taxes for you?

thomashop 720 days ago

Yes because without an LLM I don't do it.

goatlover 720 days ago

How did you do your taxes a couple years ago?

thomashop 720 days ago

I never did. Had to pay fines.

goatlover 720 days ago

I hope with all the time and money being saved, you're having humans check the results.

wkat4242 720 days ago

Yes but that is one of those niche tasks I meant.

Once again they are selling it like something that's for everyone right now. This is the problem. The same with the metaverse. It has some really great usecases, but they made it out like next year we would all ditch our phones and work exclusively in a VR headset. Obviously that didn't happen, as the tech was nowhere near that and probably people don't want it either.

Also, if you really need to be sure that those 30,000 drilling reports really didn't contain any hazards, you still have to go through it all yourself. Don't forget LLMs aren't reproducible.

But no, my point was exactly that it's not just hype. There are genuine useful usecases, I totally agree.

As there were for metaverse, and probably even for blockchain (NFT not so sure tho :) I always thought they were really a solution looking for a problem). The key thing about a hype is that they overblow the potential benefits way too much though. I see this happening here once again.

nuz 720 days ago

They're pretty clear about being pro-safety to the extreme, and mass surveillance to protect American interests and abuse of LLM tech (e.g. open source misuses) are probably within the umbrella of the ends-justify-the-means logic Anthropic employs.

jsheard 720 days ago

When you see the kinds of things that are developed in the name of "defense" it's easy to see how AI "safety" could become a similar sort of doublespeak.

CuriouslyC 720 days ago

AI safety already is double speak. The primary meaning is "safety" for investors who don't want to be associated with something distasteful. The other meaning is basically a thin cover.

jsheard 720 days ago

Well you can look forward to worse, give it another decade and Lockheed Martin will be extolling their commitment to AI safety while announcing their new generation of fully autonomous kill drones. For defense, of course.

jp42 720 days ago

Dumb question: I can understand that LLMs can be used for disinformation, as they can generate text/images at scale. Can you explain how they can do large-scale surveillance?

causal 719 days ago

LLMs can be fed a conversation and understand the intent of its participants, even if no particular keywords are used. Before this, surveillance was limited by how many human agents you could have sifting through recorded data.

Put another way: most people only get charged with a crime if it's worth a law-enforcement officer's time to catch them, and many small violations are ignored in favor of higher priorities. We may have to contemplate a future where AI is clever enough to notice everything that can be construed as a violation of some law and put it on a prosecutor's backlog.

Schneier talks about this as well: https://www.schneier.com/blog/archives/2023/12/ai-and-mass-s...

spidersouris 720 days ago

I wouldn't say that they can be used to do large-scale surveillance, but they can definitely facilitate it, especially with CV integration. I think one can easily imagine the following scenario: you feed an LLM photos of people (taken from a public camera for instance), it finds the closest matches (via a web search for instance, as Gemini does). From there, you can easily gather the most essential information: first and last name, age, usernames... And then use this information to structure even more precise prompts and find even more potentially interesting data: posts on forums, relatives... And with this data, you can create an exhaustive database with a plethora of information and data about these people.

That's what any good stalker or person experienced with social engineering is able to do right now, but it takes a lot of time and energy. Resorting to LLMs would considerably decrease both. And it gets easier the more people you have information about.

ttyprintk 720 days ago

Specifically, vision transformers (ViT) outperforming established CNNs.
