Promptware OS takes the security of its cognitive kernels seriously. We strive for high-fidelity reasoning, which necessarily involves mitigating prompt injection, data leakage, and harmful output generation.
We ask that security researchers and contributors adhere to the following responsible disclosure guidelines:
- Do Not Publicly Disclose: Please refrain from creating public GitHub Issues, posting on social media, or discussing the vulnerability with others until it has been resolved.
- Private Report: Report all vulnerabilities directly to the Lead Architect (Dr. Aneesh Joseph) via encrypted email (or another secure channel, TBD).
- Report Content: Your report should detail:
  - The specific Kernel File and Version affected (e.g., ARK_v12.1.md).
  - The LLM used during the exploit (e.g., Gemini 1.5 Pro).
  - The Exact Payload (the prompt string) used for the attack.
  - The observed Violated Invariant (e.g., "The Law of Safety Inheritance was broken").
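
The four report items above can be captured in a small structure before sending, which makes it easy to spot an omitted field. The following is a minimal sketch, not an official submission format: the class name `VulnerabilityReport`, its field names, and the `missing_fields` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VulnerabilityReport:
    """Hypothetical container for the four items the policy asks for."""

    kernel_file: str         # kernel file and version, e.g. "ARK_v12.1.md"
    llm: str                 # model used during the exploit, e.g. "Gemini 1.5 Pro"
    payload: str             # exact prompt string used for the attack
    violated_invariant: str  # e.g. "The Law of Safety Inheritance was broken"

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Example values are taken from the policy text; the payload is a placeholder.
report = VulnerabilityReport(
    kernel_file="ARK_v12.1.md",
    llm="Gemini 1.5 Pro",
    payload="<exact prompt string goes here>",
    violated_invariant="The Law of Safety Inheritance was broken",
)
assert report.missing_fields() == []
```

A report with a blank payload or invariant would list those field names in `missing_fields()`, which is a cue to complete it before sending it through the private channel.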
The ARK (Antifragile Resilience Kernel) and SLEDGEHAMMER (Extreme Verification Kernel) are considered the gold standards for defense.
- Patch Goal: Any successful attack on any kernel must result in a security patch to ARK or SLEDGEHAMMER within 7 days, if feasible.
- Vulnerability Class: We prioritize fixes for exploits that compromise:
  - Safety Invariants (Priority Level 1)
  - The Law of Falsifiability (Hallucination exploits)
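
As a rough illustration of the 7-day patch goal, the target date can be computed from a starting date. This is only a sketch: the policy does not say when the window begins, so the code assumes it starts on the day the report is received, and the names `PATCH_WINDOW_DAYS` and `patch_target` are hypothetical.

```python
from datetime import date, timedelta

PATCH_WINDOW_DAYS = 7  # from the policy: patch "within 7 days, if feasible"


def patch_target(received: date) -> date:
    """Target patch date, ASSUMING the window starts when the report is received."""
    return received + timedelta(days=PATCH_WINDOW_DAYS)


print(patch_target(date(2025, 1, 1)))  # 2025-01-08
```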
We will publicly acknowledge all researchers who follow this policy in the repository's CHANGELOG.md upon confirmation and resolution of the reported issue.