Hi HN, I’m working on Veritensor, an open-source tool:
https://github.com/arsbr/Veritensor

The goal is to help teams secure the AI/ML supply chain as models, datasets, and tooling increasingly come from third parties.

What it currently does:
-Detects malicious code, hidden payloads, and unsafe operations inside ML models (e.g. Pickle, PyTorch, Keras) using static analysis and a custom execution engine
-Verifies model integrity and detects tampering or supply-chain risks
-Scans datasets for data poisoning, anomalies, and potential PII leaks
-Analyzes documents used in RAG pipelines (PDF, DOCX, PPTX) for prompt injection and embedded threats
-Inspects Jupyter notebooks for unsafe code, secrets, and risky patterns
-Signs container images using Sigstore Cosign
-Integrates into CI/CD pipelines and ML validation workflows

This is an early-stage project and very much a work in progress. It does not aim to replace runtime sandboxing, isolation, or human review, and it's not intended to be a silver bullet.

I’m interested in feedback from people running ML systems in production:
-What parts of the AI supply chain are you most concerned about today?
-Are there checks or threat models you feel are missing here?
-Which parts of this approach seem flawed, incomplete, or unlikely to work in production?
-Would a tool like this be useful in your production workflows, or would it be hard to adopt in practice?
-Any suggestions on how to improve the project or make it more practical for real-world use would be much appreciated.

Thanks for your time!
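
P.S. For context on the pickle check in the feature list, here is a minimal sketch of what static opcode scanning can look like. It is not Veritensor's actual implementation: `find_suspicious_imports` and `SUSPICIOUS_MODULES` are hypothetical names made up for illustration, and the denylist is deliberately tiny. It relies only on the standard library's `pickletools.genops`, so the payload is never unpickled.

```python
import os
import pickle
import pickletools

# Hypothetical denylist, for illustration only; a real policy would be much broader.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "sys", "socket"}


def find_suspicious_imports(data: bytes) -> list:
    """Return 'module.name' for each import from a denylisted module.

    The bytes are only disassembled, never loaded, so nothing in them runs.
    """
    findings = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" joined by a space.
            module, _, name = arg.partition(" ")
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are pushed as strings beforehand.
            # Heuristic: assume they are the two most recent string arguments.
            module, name = strings[-2], strings[-1]
        else:
            if isinstance(arg, str):
                strings.append(arg)
            continue
        if module.split(".")[0] in SUSPICIOUS_MODULES:
            findings.append(f"{module}.{name}")
    return findings


class Evil:
    """Classic malicious pickle: __reduce__ asks the loader to call os.system."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


# Building the payload is safe; only pickle.loads() would run the command.
payload = pickle.dumps(Evil())
print(find_suspicious_imports(payload))
```

A denylist like this is easy to bypass (for example, module and name strings fetched from the memo instead of pushed directly), which is one reason the post above says static checks do not replace sandboxing or review.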