by sudb 42 days ago
So interestingly, I know of at least one application in a charity that deals with trafficking where grok was happy to do one-shot classification tasks where all other models refused to cooperate.

I think there's a surprising number of actually useful applications in this sort of grey area for a slightly-less guardrailed, near-frontier model (also the grok-fast models are cheap!).

4 comments

I am a software dev and I was doing a security check on my own application (for work) that I was running on localhost, and I gave the model access to the code.

Every single model other than Grok refused to even attempt to run any sort of test to check if it was an issue.

You couldn't even ask Claude how CopyFail worked. Even more general questions around it kept getting rejected.
A couple of days ago, using codex at work, all of a sudden it said my session had been flagged for security reasons. I wasn’t doing anything cybersecurity related, nor testing any vulnerabilities or anything like that, just trying to build a pretty simple web app.
It seems really dumb for the models to not do security-related things. What if I want it to do a security audit of my own software that I'm building?
codex will actually help you look but it will refuse to actually try and exploit it.

It won't, for example, create the PoC Python script that you would normally use to prove the issue.

Gemini especially has a habit of blocking my pretty mundane requests, claiming they’re attempts to jailbreak or create malicious code.

Grok also does quite well at code reviews in my experience because it’s not so aggressively "aligned".

I couldn't get either Gemini or ChatGPT to do OCR of children's books (I literally own the books, so there's no copyright issue - all just fair use!).

The OCR was complex enough (bad quality photos) that "simple" OCR models couldn't do it.

Fortunately, Claude obliged (and Mistral OCR was helpful as well!).

There are lots of uncensored models out there. I don't think Grok is leading on that front. They kind of pick and choose which things they want to support based on Elon's worldview. Elon used to hang out with sex traffickers, so of course Grok is fine talking about it. It probably even offers strategies for them, does free accounting, has money laundering strategies, etc.
What are the leading uncensored models? How well do they perform for you?
I don't use any, but they do exist and there are scientific papers discussing them. I heard about them through r/localllama.
>There are lots of uncensored models out there.

Like what?

Something just as easy, where normal people can log in to a website or app and just use it?

I don't think companies are hosting them, because imagine the liability. Could be wrong though. Again, I don't know much about these things, I just know they exist.
Yes that is my point.

It is the dropbox comment all over again.

"Well you can just self-host to get uncensored same as Grok without NAZI!! Elon Musk!!"

Just like you can spin up an FTP to get your own Dropbox.

Well... very few people are going to actually do that.

I've been working on my own misaligned model, and Grok with a system prompt is definitely different enough from all the other frontier models that I've considered using it to generate synthetic training data. However, it leans really heavily into LLMisms, which makes it not really worth it. Tangentially, I also really like the idea of LLMs as librarians that they are trying out with Grokipedia.
Depends what you call easy, but LMStudio is a drag-and-drop installation and can run thousands of different models.
Deepseek is fairly uncensored. I tried pushing it and reached my limits before it did.
Is this satire? Ask it about June 4 1989, Taiwan independence, or Winnie the Pooh.
Not that you're wrong, but I think they were talking about it from a technical POV. I use Deepseek to write exploits and red-team ("malicious") code. Its alignment is based on different values, so it's nice to be able to at least swap between models for different uses.