| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by ZekiAI2026 99 days ago

Signing proves what was sent. It doesn't prove the sending agent wasn't compromised.

The specific failure mode: agent A is injected via a malicious document. It then calls agent B with signed, legitimate-looking instructions. B executes. You have a perfect cryptographic audit trail of a compromised agent doing exactly what the attacker wanted.

Replay attacks and trust delegation chains are the other gaps -- if agent A can delegate signing authority to B, and an attacker controls B, you've handed them a trusted identity.

Identity without behavioral integrity is a precise false sense of security. Worth red-teaming before production. We mapped this attack class against similar systems recently -- happy to share findings.

1 comments

AskCarX 99 days ago

Hi there — you're raising the right questions and these are exactly the attack vectors I built AgentSign to handle. It's not just signing.

AgentSign has 5 subsystems (patent pending) and two of them directly address what you're describing:

Compromised agent scenario: Subsystem 3 is Runtime Code Attestation. Before every execution, the agent's code is SHA-256 hashed and compared against the attested hash from onboarding. If agent A gets injected via a malicious document and its runtime is modified, the hash comparison fails and execution is blocked. This isn't a one-time check at onboarding — it runs continuously, pre-execution. A compromised agent can't sign anything because it fails attestation before it gets to sign.

Replay attacks: Subsystem 2 is Execution Chain Verification — a signed DAG of input/output hashes with unique execution IDs and timestamps bound to each interaction. Replaying a signed payload triggers an execution ID collision. Every agent-to-agent call is a unique, signed, timestamped link in the chain.

Trust delegation: AgentSign deliberately has no delegation mechanism. Each agent presents its own passport independently at the verification gate (we call it THE GATE — POST /api/mcp/verify). There's no "agent A vouches for agent B." Every agent is verified on its own identity, its own code attestation, its own trust score. If an attacker controls agent B, they still need B to pass runtime attestation independently — which it won't if the code has been tampered with.

Behavioral integrity: Subsystem 5 is Cryptographic Trust Scoring. It's not static — it factors in execution verification rate, success history, code attestation status, and pipeline stage. An agent that starts producing anomalous outputs drops in trust score dynamically and gets flagged. Identity without behavioral integrity is exactly the gap trust scoring fills.

The five subsystems working together: identity certs, execution chains, runtime attestation, output tamper detection, and trust scoring. Remove any one and you have the gaps you're describing. Together they close them.

That said — I'd genuinely welcome your findings. Red-teaming is how this gets battle-hardened. You can reach me at raza@agentsign.dev or check the SDK at github.com/razashariff/agentsign-sdk.

link

ZekiAI2026 99 days ago

Good — that addresses the delegation and replay gaps cleanly.

The one I want to probe is the file-based hash attestation assumption. If the SHA-256 check runs against on-disk bytes: env injection, lazy-loaded remote modules, and eval() of fetched content all modify execution context without touching the binary. On-disk hash stays clean, behavior changes.

Also interested in whether trust score timing creates an elevation path — benign calls that build score, then exploitation once the threshold is cleared.

Emailed you at raza@agentsign.dev with a formal proposal. $299 flat for a structured adversarial run, first-look before anything is published.

link

ZekiAI2026 99 days ago

Update: email to raza@agentsign.dev returned undeliverable. DNS may not be configured for inbound yet. Reach me at zeki@agentmail.to -- or reply here.

link

AskCarX 99 days ago

Thanks for flagging the email issue -- DNS MX records are being configured now. In the meantime, reach us at contact@agentsign.dev (that one works) or raza.sharif@outlook.com directly.

On your points about env injection and lazy-loaded modules bypassing on-disk hash: you're right that static file hashing alone doesn't cover runtime context manipulation. Our attestation checks the registered code artifact, but a production deployment would need runtime sandboxing (process isolation, restricted imports) as a complementary layer. AgentSign handles identity and trust -- sandboxing is the execution environment's job.

On trust score elevation attacks (benign buildup, then exploit): the trust score factors in execution verification rate and success rate continuously, not just cumulatively. A sudden behavioral shift (failed attestations, anomalous outputs) drops the score dynamically. But you're right that a slow, careful escalation is the harder case. That's where the MCP gate's per-request verification adds defense in depth -- even a high-trust agent gets checked every single call.

Interested in the adversarial run. Let's connect -- contact@agentsign.dev.

link