Hacker News new | ask | show | jobs
by aytuakarlar 127 days ago
Great questions! Thats exact trade-offs we're navigating. Re: Intent-aware exceptions: CSL uses hierarchical policy composition for this. Example from our banking case study: CSL:

DOMAIN BankingGuard {

  VARIABLES {
    action: {"TRANSFER", "WITHDRAW", "DEPOSIT"}
    amount: 0..100000
    country: {"TR", "US", "EU", "NK"}
    is_vip: {"TRUE", "FALSE"}
    kyc_level: 0..5
    risk_score: 0..1
    device_trust: 0..1
  }

  // Hard boundary: never flexible
  STATE_CONSTRAINT no_sanctioned_country {
    WHEN country == country
    THEN country MUST NOT BE "NK"
  }
  
  // Soft boundaries: context-dependent
  STATE_CONSTRAINT transfer_limit_non_vip {
    WHEN action == "TRANSFER" AND is_vip == "FALSE"
    THEN amount <= 1000
  }
  
  STATE_CONSTRAINT transfer_limit_vip {
    WHEN action == "TRANSFER" AND is_vip == "TRUE"
    THEN amount <= 10000
  }
  
  // Multi-dimensional guards (amount + device trust)
  STATE_CONSTRAINT device_trust_for_medium_transfer {
    WHEN action == "TRANSFER" AND amount > 300
    THEN device_trust >= 0.7
  }
}

Variables like is_vip, risk_score, device_trust are injected at runtime by your application logic, not inferred by the LLM. The LangChain integration looks like:

safe_tools = guard_tools( tools=[transfer_tool], guard=guard, inject={ "is_vip": current_user.tier == "VIP", # From auth "risk_score": fraud_model.score(context), # From ML model "device_trust": session.device_score, # From fingerprinting "country": geoip.lookup(ip) } )

So the agent can't "decide" it's VIP or that the device is trusted. Those come from external systems. The policy just enforces the combinations. Your database/logging example: You'd add a purpose variable and carve out:

STATE_CONSTRAINT no_user_table_writes { WHEN action == "WRITE" AND table == "users" THEN purpose MUST BE "AUDIT_LOG" }

If you don't inject enough context, rules become binary (allow/deny). If you inject too much, the policy becomes a replica of your business logic. We're finding the sweet spot is 6-10 context variables that encode the "security-critical dimensions" (user tier, risk, trust, geography).

Re: Z3 performance: Z3 runs at compile-time, not runtime. The workflow is: Policy compilation (once): Z3 proves logical consistency → generates pure Python functors Runtime (per request): Functor evaluation only, no symbolic solver Typical policy (<20 constraints): <1ms per evaluation. We haven't stress-tested 100+ concurrent yet (Alpha), but since runtime is stateless Python, it should scale horizontally. The bottleneck would be the LangChain overhead, not CSL. Your concern about permissiveness is spot-on. We're addressing this in Phase 2 (TLA+) by adding temporal logic: instead of "block all DB writes," you can express "allow writes if preceded by read within 5 actions." This gives you state-aware permissions without making rules combinatorially complex. The current Z3 engine is intentionally conservative. TLA+ will add the flexibility production systems need. Appreciate the pushback—this is exactly the feedback we need at Alpha stage. If you have a specific use case in mind, I'd love to test CSL against it.