by Someone1234
26 days ago
> discovered something that's rather alarming
Can you clarify why? You decided to install Anthropic's software (the Claude Code extension and/or CLI), and then utilize their service, which you're paying them money for (and have a contractual relationship with). The software itself manages tool-usage safety/sandboxing, so you're kind of trusting Anthropic a LOT already.
Why does moving the system prompt from within their proprietary software, to their proprietary backend, matter at all for Claude Code users?
It doesn't feel like "hack the Claude Code binary to alter how it works" is a common and/or supported use-case. Most people pay Anthropic so that Anthropic takes care of that stuff, and lets them get on with their work.
I'm also not sure if this meets the common definition of "prompt injection." The vendor you're connected to is sending a system prompt to work with their own model/service. Where the system prompt is stored is immaterial.
PS - My gut tells me there is something else going on, leading people to hack the Claude Code prompt/binary. And that the "something else" isn't supported by Anthropic.
|
Mitigated. I took the time to thoroughly firejail Claude Code when I first ran it on my machine. Now I only ever run Claude Code inside virtual machines. It's as isolated as it can possibly be.
> Why does moving the system prompt from within their proprietary software, to their proprietary backend, matter at all for Claude Code users?
Because I don't want to allow any way for them to inject stupidity-inducing "lol don't think so much" instructions into Claude's system prompt. I went out of my way to patch the ELF itself because the prompts are hard-coded. This prompt injection mechanism bypasses my patcher.
> It doesn't feel like "hack the Claude Code binary to alter how it works" is a common and/or supported use-case.
Supported or not, tools like tweakcc have lots of users.
> I'm also not sure if this meets the common definition of "prompt injection."
They're literally injecting strings from the network into the system prompt. If it's not prompt injection, then I have no idea what it is.
> My gut tells me there is something else going on, leading people to hack the Claude Code prompt/binary. And that the "something else" isn't supported by Anthropic.
No idea what others are doing. I can only tell you what I'm doing. Here you go:
https://github.com/matheusmoreira/.files/blob/master/%7E/.lo...