diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 1aa233b..7b2c825 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -17,47 +17,53 @@ ## Overview | 概述 -DocSentinel is an AI-powered system that automates security assessment of documents, questionnaires, and reports **across the entire Secure Software Development Lifecycle (SSDLC)**. The system is orchestrated by **LangChain + LangGraph**, providing stateful, graph-based agent workflows with stage-aware routing. This document describes the **system architecture**: high-level design, components, data flow, integrations, and deployment. For product goals and requirements, see [SPEC.md](./SPEC.md). +DocSentinel is an **AI-powered SSDLC (Secure Software Development Lifecycle) platform** that automates security activities across all six phases of the software development lifecycle. Built on **LangChain** and **LangGraph**, the system orchestrates phase-specific AI agents to perform security assessments of documents, questionnaires, and reports — from requirements analysis and threat modeling to vulnerability monitoring and incident response — providing stateful, graph-based agent workflows with stage-aware routing. This document describes the **system architecture**: high-level design, components, data flow, integrations, and deployment. For product goals and requirements, see [SPEC.md](./SPEC.md). --- ## Goals & Context | 目标与背景 -- **Goal**: Reduce manual effort for security teams by automating first-pass assessment of security-related documents (questionnaires, design docs, compliance evidence) and producing structured reports (risks, compliance gaps, remediations) — covering all 6 SSDLC stages. -- **Context**: Enterprise security teams must align with policies, standards, and frameworks (e.g. 
NIST, OWASP, SOC2) while reviewing many projects per year; the system provides a unified knowledge base (RAG), multi-format parsing, pluggable LLMs (cloud or local), and **SSDLC-aware assessment pipelines** powered by LangGraph. +- **Goal**: Provide AI-assisted security coverage across the entire SSDLC — Requirements, Design, Development, Testing, Deployment, and Operations — reducing manual effort for security teams while improving coverage and consistency. Automate first-pass assessment of security-related documents (questionnaires, design docs, compliance evidence) and produce structured reports (risks, compliance gaps, remediations). +- **Context**: Enterprise security teams must embed security into every phase of delivery (Shift-Left), aligned with frameworks like NIST SSDF, OWASP SAMM, Microsoft SDL, and SOC2. The system provides phase-specific agents orchestrated by LangGraph, a unified knowledge base (RAG) with phase-specific collections, multi-format parsing, pluggable LLMs (cloud or local) via LangChain, and **SSDLC-aware assessment pipelines**. --- ## High-Level Architecture | 高层架构 -The system is organized in layers: **Access** → **Core (LangGraph Orchestrator, SSDLC Pipeline, Memory, Skills, Knowledge Base, Parser)** → **LLM abstraction** → **LLM backends**. External integrations (AAD, ServiceNow) connect at the access and orchestration boundaries. +The system is organized in layers: **Access** → **SSDLC Orchestration (LangGraph)** → **Core Services (KB, Parser, Memory, Skills)** → **LLM Abstraction (LangChain)** → **LLM Backends**. External integrations (AAD, ServiceNow, SAST/DAST tools) connect at the access and orchestration boundaries. 
-![Architecture overview](https://github.com/arthurpanhku/DocSentinel/raw/main/docs/images/architecture-overview.png) - -*Figure 1: Architecture overview (see repo `docs/images/architecture-overview.png`)* - -### Mermaid: Logical view +### Mermaid: Logical View ```mermaid flowchart TB - subgraph Users["👤 Users"] + subgraph Users["Users"] Staff["Security Staff"] - APIUser["API / Integrations"] + APIUser["API / CI-CD / MCP"] end subgraph Access["Access Layer"] API["REST API\n(FastAPI)"] MCP["MCP Server\n(stdio)"] end - subgraph Core["DocSentinel Core"] - Orch["LangGraph\nOrchestrator"] - SSDLC["SSDLC Pipeline\n(6 stages)"] - Mem["Memory"] - Skill["Skills"] + subgraph Orchestration["SSDLC Orchestration (LangGraph)"] + Router["Phase Router"] + subgraph Agents["Phase Agents"] + A1["Requirements\nAgent"] + A2["Design\nAgent"] + A3["Development\nAgent"] + A4["Testing\nAgent"] + A5["Deployment\nAgent"] + A6["Operations\nAgent"] + end + State["Shared State\n& Checkpointing"] + end + subgraph Core["Core Services"] KB["Knowledge Base\n(RAG)"] Parser["Parser"] + Mem["Memory"] + Skill["Skills"] end - subgraph LLM["LLM Layer"] - Abst["LLM Abstraction\n(LangChain)"] + subgraph LLM["LLM Layer (LangChain)"] + Abst["LLM Abstraction"] end subgraph Backends["LLM Backends"] Cloud["OpenAI / Claude / Qwen"] @@ -66,55 +72,127 @@ flowchart TB subgraph Integrations["Integrations"] AAD["AAD (SSO)"] SN["ServiceNow"] + Tools["SAST / DAST\nTools"] end Staff --> API APIUser --> API APIUser --> MCP - API --> Orch - MCP --> Orch - Orch --> SSDLC - SSDLC --> Skill - Orch <--> Mem - Orch --> KB - Orch --> Parser - Orch --> Abst + API --> Router + MCP --> Router + Router --> A1 + Router --> A2 + Router --> A3 + Router --> A4 + Router --> A5 + Router --> A6 + A1 & A2 & A3 & A4 & A5 & A6 <--> State + A1 & A2 & A3 & A4 & A5 & A6 --> KB + A1 & A2 & A3 & A4 & A5 & A6 --> Parser + A1 & A2 & A3 & A4 & A5 & A6 --> Skill + A1 & A2 & A3 & A4 & A5 & A6 --> Abst Abst --> Cloud Abst --> Local - Orch -.-> AAD - 
Orch -.-> SN + Router -.-> AAD + Router -.-> SN + A4 -.-> Tools ``` --- +## SSDLC Agent Design | SSDLC Agent 设计 + +### LangGraph State Machine | LangGraph 状态机 + +The core orchestration is a **LangGraph StateGraph** where each SSDLC phase is a node. Edges define the workflow — sequential, parallel, or conditional based on project context and user configuration. + +```mermaid +stateDiagram-v2 + [*] --> Router + Router --> Requirements: phase=requirements + Router --> Design: phase=design + Router --> Development: phase=development + Router --> Testing: phase=testing + Router --> Deployment: phase=deployment + Router --> Operations: phase=operations + Router --> FullSSDLC: phase=full + + state FullSSDLC { + [*] --> Requirements + Requirements --> Design + Design --> Development + Development --> Testing + Testing --> Deployment + Deployment --> Operations + Operations --> [*] + } + + Requirements --> Reviewer + Design --> Reviewer + Development --> Reviewer + Testing --> Reviewer + Deployment --> Reviewer + Operations --> Reviewer + Reviewer --> [*] +``` + +**Key Design Decisions:** + +- **Shared State**: All agents read/write to a shared `SSDLCState` TypedDict managed by LangGraph. This enables cross-phase traceability (e.g. threat from Design is linked to test case in Testing). +- **Checkpointing**: LangGraph's built-in checkpointing persists state across requests, enabling long-running multi-phase assessments. +- **Conditional Routing**: The Router node inspects the request (phase, project type, risk level) and routes to the appropriate agent(s). For full SSDLC, agents execute in sequence with optional parallel sub-steps. +- **Human-in-the-Loop**: LangGraph interrupt points allow human review before progressing to the next phase. 
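The shared-state and conditional-routing decisions above can be sketched in plain Python, without LangGraph itself. This is a minimal illustration under stated assumptions: the fields of `SSDLCState` and the helpers `route_phase`, `run_phase`, and `run` are hypothetical names, not the project's actual schema or API.

```python
from typing import TypedDict

# Phase order as listed in the FullSSDLC state of the diagram above.
PHASES = ["requirements", "design", "development", "testing", "deployment", "operations"]


class SSDLCState(TypedDict):
    """Hypothetical shared state; the real SSDLCState likely has more fields."""
    phase: str                      # requested phase, or "full"
    findings: dict[str, list[str]]  # findings keyed by phase name


def route_phase(state: SSDLCState) -> list[str]:
    """Return the phases to run: all six for "full", else the requested one."""
    requested = state["phase"]
    if requested == "full":
        return list(PHASES)
    if requested not in PHASES:
        raise ValueError(f"unknown phase: {requested!r}")
    return [requested]


def run_phase(state: SSDLCState, phase: str) -> SSDLCState:
    """Placeholder agent: records which earlier phases it could read from state."""
    seen = [p for p in PHASES if state["findings"].get(p)]
    note = f"{phase} finding (context from: {', '.join(seen) or 'none'})"
    findings = {**state["findings"], phase: [note]}
    return {"phase": state["phase"], "findings": findings}


def run(state: SSDLCState) -> SSDLCState:
    """Execute the routed phases in order, threading the shared state through."""
    for phase in route_phase(state):
        state = run_phase(state, phase)
    return state
```

Calling `run({"phase": "full", "findings": {}})` walks all six phases in order, and each placeholder agent sees the findings written by the phases before it, which is the cross-phase traceability idea in miniature.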
+ +### Phase Agent Details | 阶段 Agent 详情 + +| Agent | SSDLC Phase | Key Tools / Skills | Input Examples | Output | +| :--- | :--- | :--- | :--- | :--- | +| **Requirements Agent** | Requirements | Compliance matcher, risk classifier, requirements extractor | PRDs, BRDs, user stories | Security requirements list, compliance obligations, risk classification | +| **Design Agent** | Design | STRIDE analyzer, architecture reviewer, SDR generator | Architecture docs, design specs, data flow diagrams | Threat model (STRIDE/DREAD), SDR report, security architecture findings | +| **Development Agent** | Development | Secure coding checker, SAST triage, code reviewer | Source code, coding guidelines, SAST reports | Secure coding findings, SAST triage results, coding recommendations | +| **Testing Agent** | Testing | SAST/DAST parser, pentest analyzer, remediation tracker | SAST/DAST reports, pentest findings | Prioritized vulnerabilities, remediation plan, fix verification | +| **Deployment Agent** | Deployment | Config reviewer, hardening checker, sign-off generator | Deployment configs, infra-as-code, release checklists | Configuration findings, hardening gaps, release sign-off report | +| **Operations Agent** | Operations | CVE analyzer, incident assistant, log auditor | CVE feeds, incident reports, security logs | Vulnerability alerts, incident analysis, audit findings | + +--- + ## Component Design | 组件设计 ### 1. Access Layer | 接入层 -- **REST API** (FastAPI): Request validation, routing to assessment / KB / health / skills endpoints. -- **MCP Server** (Model Context Protocol): Standard stdio interface for autonomous agents (Claude Desktop, Cursor, OpenClaw) to discover and call tools (`assess_document`, `query_knowledge_base`). -- **Note**: v3.0 removed the Streamlit frontend; DocSentinel is now a **headless API + MCP service**. Authentication (AAD/JWT) and rate limiting are defined but not yet wired into endpoints. 
+- **REST API** (FastAPI): Request validation, routing to SSDLC assessment / KB / health / skills endpoints. Phase-aware endpoints (e.g. `POST /assessments/{phase}`). +- **MCP Server** (Model Context Protocol): Standard stdio interface for autonomous agents (Claude Desktop, Cursor, OpenClaw) to discover and call SSDLC tools. +- **Note**: v4.0 is a **headless API + MCP service**. Authentication (AAD/JWT) and rate limiting are defined but not yet wired into endpoints. -### 2. Orchestrator (LangGraph) | 任务编排 +### 2. SSDLC Orchestrator (LangGraph) | SSDLC 编排器 - Built on **LangChain + LangGraph**: stateful, graph-based agent workflow with conditional edges. -- Accepts assessment tasks (files + optional SSDLC stage / skill ID). -- **Graph nodes**: Parser → SSDLC Router → Policy+History Agent ∥ Evidence Agent → Drafter Agent → Reviewer Agent. -- Policy and Evidence nodes run **in parallel** (LangGraph fan-out/fan-in). -- SSDLC Router node determines the lifecycle stage and injects stage-specific skill + checklist. +- **Graph Definition**: `StateGraph` with nodes for Router, 6 phase agents, and Reviewer. Graph nodes: Parser → SSDLC Router → Policy+History Agent ∥ Evidence Agent → Drafter Agent → Reviewer Agent. +- **State Schema**: `SSDLCState` TypedDict containing parsed documents, phase findings, threat models, cross-phase references, and metadata. +- **Conditional Edges**: Route based on requested phase, project risk level, or full SSDLC mode. SSDLC Router node determines the lifecycle stage and injects stage-specific skill + checklist. +- **Parallel Execution**: Policy and Evidence nodes run **in parallel** (LangGraph fan-out/fan-in). Within phases, sub-tasks (e.g. KB lookup + document parsing) run concurrently via `asyncio.gather`. +- **Checkpointing**: Persistent state via LangGraph `MemorySaver` or database-backed checkpointer. - Assessment submission is **non-blocking** — returns task_id immediately, processes in background. 
- Singleton `KnowledgeBaseService` and cached LLM client shared across requests. ### 3. Memory | 记忆体 -- **Working memory**: Task state stored in-memory (`_tasks` dict). Not persisted across restarts. +- **Working memory**: LangGraph shared state (`SSDLCState`) persisted via checkpointing. +- **Cross-phase context**: Findings from earlier phases are carried forward automatically (e.g. Design threats → Testing test cases). - **History reuse**: Past assessment reports are indexed into a dedicated Chroma collection and retrieved as context for new assessments. -- **Status**: Redis / DB persistence is planned but not yet implemented. SQLModel models (`User`, `AuditLog`) are defined but not connected. +- **Status**: LangGraph `MemorySaver` for MVP; database-backed checkpointer for production. ### 4. Skills & Personas | 技能与角色 - **Persona-based Assessment**: Defines "who" is assessing (e.g. ISO 27001 Auditor vs. AppSec Engineer). - **Built-in Persona Skills**: 4 hardcoded personas (ISO 27001 Auditor, AppSec Engineer, GDPR DPO, Cloud Architect) in `skills_registry.py`. +- **Phase-specific Personas**: Each SSDLC phase has dedicated personas: + - Requirements: Compliance Analyst, Risk Assessor + - Design: Threat Modeler, Security Architect + - Development: Secure Code Reviewer, SAST Analyst + - Testing: Pentest Analyst, Vulnerability Manager + - Deployment: Release Security Reviewer, Hardening Specialist + - Operations: Vulnerability Monitor, Incident Responder - **Built-in SSDLC Skills**: 6 stage-specific skills (one per SSDLC stage) with tailored `system_prompt`, `risk_focus`, checklists, and `compliance_frameworks`. - **Custom Skills**: File-backed (`data/skills.json`) CRUD via REST API. - **Dynamic Orchestration**: LangGraph injects skill-specific context into RAG queries and LLM prompts based on the selected persona and SSDLC stage. @@ -122,7 +200,14 @@ flowchart TB ### 5. 
Knowledge Base (RAG) | 知识库 - **Vector Store**: ChromaDB for chunk-level similarity search (sentence-transformers embeddings). -- **Graph RAG**: LightRAG for entity-relationship aware retrieval (controls → policies → vulnerabilities). Enabled via `ENABLE_GRAPH_RAG` config. +- **Graph RAG**: LightRAG for entity-relationship aware retrieval (controls → policies → vulnerabilities → threats). Enabled via `ENABLE_GRAPH_RAG` config. +- **Phase-specific Collections**: Separate knowledge collections per SSDLC phase: + - `kb_requirements`: Compliance frameworks, security policies, requirement templates + - `kb_design`: Security patterns, threat catalogs, architecture guidelines + - `kb_development`: Secure coding standards, OWASP guidelines, language-specific practices + - `kb_testing`: Vulnerability databases, testing methodologies, remediation guides + - `kb_deployment`: CIS benchmarks, hardening guides, configuration standards + - `kb_operations`: CVE databases, incident playbooks, audit checklists - **Hybrid Query**: When Graph RAG is enabled, results from both vector and graph retrieval are merged and deduplicated. - **History Reuse**: Indexes past assessment responses into a dedicated Chroma collection. - **Singleton**: Single `KnowledgeBaseService` instance shared across the application lifecycle. @@ -131,6 +216,7 @@ flowchart TB - **Primary engine**: Docling — preserves tables, headings, and supports OCR for scanned PDFs. Outputs structured Markdown. - **Fallback engine**: Legacy parsers (PyMuPDF, python-docx, openpyxl, python-pptx) for when Docling is unavailable. +- **SAST/DAST Report Parsers**: Dedicated parsers for SARIF format, SonarQube JSON, Checkmarx XML, Burp Suite XML, OWASP ZAP reports. - **Engine selection**: Configurable via `PARSER_ENGINE` (`auto` / `docling` / `legacy`). `auto` tries Docling first, falls back to legacy. - Shared pipeline for both assessment input and KB document ingestion. 
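The `PARSER_ENGINE` selection described above (`auto` / `docling` / `legacy`) might look roughly like the sketch below. The function `select_parser` and the `Parser` callable type are assumptions made for illustration; the sketch also assumes that `auto` falls back to legacy both when Docling is not installed and when a Docling parse raises, which the text does not spell out.

```python
from typing import Callable, Optional

# A parser takes a file path and returns Markdown/text (hypothetical signature).
Parser = Callable[[str], str]


def select_parser(engine: str, docling: Optional[Parser], legacy: Parser) -> Parser:
    """Pick a parser for the configured engine name."""
    if engine == "legacy":
        return legacy
    if engine == "docling":
        if docling is None:
            raise RuntimeError("PARSER_ENGINE=docling but Docling is unavailable")
        return docling
    if engine == "auto":
        if docling is None:
            return legacy  # Docling not installed: use legacy directly

        def auto(path: str) -> str:
            # Try Docling first; fall back to the legacy parser on any failure.
            try:
                return docling(path)
            except Exception:
                return legacy(path)

        return auto
    raise ValueError(f"unknown PARSER_ENGINE: {engine!r}")
```

Passing the parsers in as arguments keeps the sketch free of real Docling or PyMuPDF calls, so the selection logic can be exercised with stub functions.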
@@ -138,25 +224,31 @@ flowchart TB - Single interface for chat/completion via **LangChain** (`ChatOpenAI` / `ChatOllama`). - LangChain is also the foundation for LangGraph agent nodes — each node uses LangChain's `Runnable` interface. +- **LangChain Tools**: Phase agents use LangChain tools for structured interactions (KB query, document parsing, report generation). +- **Prompt Management**: LangChain `ChatPromptTemplate` with phase-specific system prompts and few-shot examples. - **Cached client**: LLM instance is `@lru_cache`d — one client per process lifetime. -- **Confidence**: Reviewer agent outputs a confidence score (0.0–1.0) as part of its JSON response; no separate LLM call needed. - Supported providers: OpenAI (and compatible APIs), **Ollama** (local). ```mermaid flowchart LR - subgraph Core["Core"] - Orch["Orchestrator"] + subgraph Agents["Phase Agents"] + A["Requirements / Design / Dev / Test / Deploy / Ops"] end - subgraph LLM["LLM Abstraction"] - Abst["Unified API"] + subgraph LangChain["LangChain"] + Tools["Tools"] + Prompts["Prompt Templates"] + LLM["LLM Abstraction"] end subgraph Providers["Providers"] O["OpenAI"] Ol["Ollama"] end - Orch --> Abst - Abst --> O - Abst --> Ol + A --> Tools + A --> Prompts + Tools --> LLM + Prompts --> LLM + LLM --> O + LLM --> Ol ``` --- @@ -212,53 +304,51 @@ The **SSDLC Router** is a LangGraph node that: ## Data Flow | 数据流 -End-to-end flow for an assessment: +End-to-end flow for an SSDLC assessment: ```mermaid sequenceDiagram participant U as User participant API as REST API - participant Graph as LangGraph + participant LG as LangGraph Router + participant Agent as Phase Agent participant Parser as Parser participant Router as SSDLC Router participant KB as Knowledge Base - participant LLM as LLM + participant Skill as Skill + participant LLM as LLM (LangChain) + participant Review as Human Review - U->>API: POST /assessments (files, ssdlc_stage?) 
+ U->>API: POST /assessments (files, phase, scenario_id, ssdlc_stage?) API-->>U: 202 Accepted (task_id) - API->>Graph: background task - - Graph->>Parser: parse(files) - Parser-->>Graph: parsed docs - - Graph->>Router: detect/validate stage - Router-->>Graph: stage skill + checklist - - par Policy+History Agent - Graph->>KB: query(policy + history + stage) - KB-->>Graph: policy_chunks, history_chunks - and Evidence Agent - Graph->>Graph: extract evidence from docs - end - - Graph->>LLM: Drafter Agent (context + evidence + checklist) - LLM-->>Graph: draft report - - Graph->>LLM: Reviewer Agent (validate + confidence) - LLM-->>Graph: final report + confidence score - - Graph-->>API: store result + API->>LG: background task: route(phase, files, context) + LG->>Parser: parse(files) + Parser-->>LG: parsed docs + LG->>Agent: execute(parsed_docs, state) + Agent->>KB: query(phase_collection, relevant policy) + KB-->>Agent: chunks + Agent->>Skill: apply(persona, parsed_docs, chunks) + Skill->>LLM: prompt + context + LLM-->>Skill: structured findings + Skill-->>Agent: phase findings + Agent-->>LG: update state (findings, threats, gaps) + LG->>LG: checkpoint state + LG-->>API: assessment report U->>API: GET /assessments/{task_id} API-->>U: report + U->>Review: review & approve/reject ``` -1. User submits files (and optional `ssdlc_stage` / skill ID). API returns `task_id` immediately (non-blocking). +**Full SSDLC Flow:** + +1. User submits files and selects phase(s) or "full SSDLC" mode (and optional `ssdlc_stage` / skill ID). API returns `task_id` immediately (non-blocking). 2. **Parser** converts files to unified Markdown/text format (Docling or legacy). -3. **SSDLC Router** determines the lifecycle stage and loads stage-specific skill + checklist. -4. **Policy+History Agent** queries KB (vector + graph RAG) and **Evidence Agent** scans documents — these run **in parallel** via LangGraph fan-out. -5. 
**Drafter Agent** synthesizes findings into a structured report via LLM, guided by the stage checklist. -6. **Reviewer Agent** validates and scores the report (confidence 0.0–1.0). -7. User polls `GET /assessments/{task_id}` to retrieve the completed report. +3. **LangGraph Router** determines which phase agent(s) to invoke and loads stage-specific skill + checklist. +4. For full SSDLC, agents execute sequentially (Requirements → Design → Development → Testing → Deployment → Operations), with findings from each phase carried forward in shared state. **Policy+History Agent** queries KB (vector + graph RAG) and **Evidence Agent** scans documents — these run **in parallel** via LangGraph fan-out. +5. Each **Phase Agent**: parses documents → queries phase-specific KB → applies skill persona → calls LLM → produces structured findings. +6. **Drafter Agent** synthesizes findings into a structured report via LLM, guided by the stage checklist. +7. **Reviewer** node validates completeness, assigns confidence (0.0-1.0), cross-references findings across phases. +8. Report with cross-phase traceability is returned for **human-in-the-loop** review. User polls `GET /assessments/{task_id}` to retrieve the completed report. 
--- @@ -268,7 +358,8 @@ sequenceDiagram flowchart LR subgraph DocSentinel["DocSentinel"] API["API"] - Orch["Orchestrator"] + LG["LangGraph\nOrchestrator"] + Agents["Phase Agents"] end subgraph IdP["Identity"] AAD["Azure AD / Entra ID"] @@ -276,14 +367,23 @@ flowchart LR subgraph PM["Project Management"] SN["ServiceNow"] end + subgraph SecTools["Security Tools"] + SAST["SAST\n(SonarQube, Checkmarx)"] + DAST["DAST\n(Burp, ZAP)"] + Scanner["Vuln Scanners\n(Nessus, Qualys)"] + end User["User"] -->|Login / Token| AAD AAD -->|JWT / SSO| API - Orch -->|Project metadata| SN - SN -.->|Optional: write-back| SN + LG -->|Project metadata| SN + Agents -->|Parse reports| SAST + Agents -->|Parse reports| DAST + Agents -->|CVE feeds| Scanner ``` - **AAD**: SSO and API token validation (OAuth2/OIDC). -- **ServiceNow**: Read project metadata (type, compliance scope, owner); optional write-back of assessment results to tickets. +- **ServiceNow**: Read project metadata (type, compliance scope, owner); optional write-back of assessment results. +- **SAST/DAST Tools**: Import scan results in SARIF, native JSON/XML formats for automated triage by Testing Agent. +- **Vulnerability Scanners**: CVE feed integration for Operations Agent monitoring. See [docs/04-integration-guide.md](./docs/04-integration-guide.md) for configuration and field mapping. @@ -293,13 +393,13 @@ See [docs/04-integration-guide.md](./docs/04-integration-guide.md) for configura Security is designed along five areas (detailed in [PRD §7.2](./SPEC.md)): -| Area | Summary | -| :-------------------- | :---------------------------------------------------------------------------------------------------------------- | +| Area | Summary | +| :--- | :--- | | **Identity & access** | AAD/SSO, RBAC (analyst, lead, project owner, API consumer, admin), token/API key, data isolation by project/role. | -| **Data** | TLS for transport; secrets not in code; minimal retention; optional local-only LLM for data sovereignty. 
| -| **Application** | Input validation, injection prevention, dependency/SCA, safe error responses, security headers, rate limiting. | -| **Operations** | Audit log (who/what/when), operational logging without sensitive content, alerting, backup and recovery. | -| **Supply chain** | Trusted dependencies, vulnerability handling, license compliance. | +| **Data** | TLS for transport; secrets not in code; minimal retention; optional local-only LLM for data sovereignty. | +| **Application** | Input validation, injection prevention (including prompt injection), dependency/SCA, safe error responses, security headers, rate limiting. | +| **Operations** | Audit log (who/what/when), LangGraph state transition logging, alerting, backup and recovery. | +| **Supply chain** | Trusted dependencies, vulnerability handling, license compliance. | --- @@ -308,28 +408,32 @@ Security is designed along five areas (detailed in [PRD §7.2](./SPEC.md)): ```mermaid flowchart TB subgraph Client["Client"] - Browser["API Client / MCP Agent"] + Browser["Browser / CLI / CI-CD / MCP Agent"] end subgraph Server["Server / Container"] - App["DocSentinel\n(FastAPI)"] + App["DocSentinel\n(FastAPI + LangGraph)"] Chroma["Chroma\n(vector store)"] + Checkpoint["LangGraph\nCheckpoint Store"] end subgraph External["External"] AAD["AAD"] SN["ServiceNow"] LLM["LLM (OpenAI / Ollama)"] + SecTools["SAST/DAST Tools"] end Browser --> App App --> Chroma + App --> Checkpoint App --> AAD App --> SN App --> LLM + App --> SecTools ``` -- **Runtime**: Python 3.10+, FastAPI, Uvicorn. -- **Storage**: Vector store (Chroma) persisted on disk or network volume; optional Redis for memory/session. -- **Network**: Outbound to AAD, ServiceNow, and LLM endpoints; TLS recommended for production. -- **Deployment**: Single node / container for MVP; scale out by separating API and worker if needed. +- **Runtime**: Python 3.10+, FastAPI, Uvicorn, LangGraph, LangChain. 
+- **Storage**: Vector store (Chroma) persisted on disk; LangGraph checkpoint store (memory/SQLite/PostgreSQL); optional Redis for sessions. +- **Network**: Outbound to AAD, ServiceNow, LLM endpoints, and SAST/DAST tools; TLS recommended for production. +- **Deployment**: Single node / container for MVP; scale out by separating API and agent workers if needed. See [docs/05-deployment-runbook.md](./docs/05-deployment-runbook.md) for environment, configuration, and runbook. @@ -337,14 +441,14 @@ See [docs/05-deployment-runbook.md](./docs/05-deployment-runbook.md) for environ ## References | 参考 -| Document | Description | -| :--------------------------------------------------------------------------------------------------- | :-------------------------------------------------------------- | -| [SPEC.md](./SPEC.md) | Product requirements, pain points, features, security controls. | -| [docs/01-architecture-and-tech-stack.md](./docs/01-architecture-and-tech-stack.md) | Technology choices and module layout. | -| [docs/02-api-specification.yaml](./docs/02-api-specification.yaml) | OpenAPI spec. | -| [docs/03-assessment-report-and-skill-contract.md](./docs/03-assessment-report-and-skill-contract.md) | Report schema and Skill I/O. | -| [docs/04-integration-guide.md](./docs/04-integration-guide.md) | AAD, ServiceNow integration. | -| [docs/05-deployment-runbook.md](./docs/05-deployment-runbook.md) | Deployment and operations. | +| Document | Description | +| :--- | :--- | +| [SPEC.md](./SPEC.md) | Product requirements, SSDLC phases, features, security controls. | +| [docs/01-architecture-and-tech-stack.md](./docs/01-architecture-and-tech-stack.md) | Technology choices and module layout. | +| [docs/02-api-specification.yaml](./docs/02-api-specification.yaml) | OpenAPI spec. | +| [docs/03-assessment-report-and-skill-contract.md](./docs/03-assessment-report-and-skill-contract.md) | Report schema and Skill I/O. 
| +| [docs/04-integration-guide.md](./docs/04-integration-guide.md) | AAD, ServiceNow, SAST/DAST integration. | +| [docs/05-deployment-runbook.md](./docs/05-deployment-runbook.md) | Deployment and operations. | --- diff --git a/CHANGELOG.md b/CHANGELOG.md index 551ff41..6163ba3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,24 +6,38 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). --- -## [4.0.0] — 2026-03-29 +## [4.0.0] — 2026-03-30 + +### Major Change +This release pivots DocSentinel into an **AI-powered SSDLC (Secure Software Development Lifecycle) platform**, with full-phase coverage and intelligent agent orchestration. ### Added -- **SSDLC Pipeline**: Full Secure Software Development Lifecycle support with 6 stage-specific assessment flows: +- **SSDLC Full Lifecycle Support**: Six dedicated phase agents — Requirements, Design, Development, Testing, Deployment, and Operations — each with specialized skills, prompts, and knowledge base collections. - **Requirements**: Security requirements completeness, compliance mapping, risk analysis. - **Design**: Architecture security review, STRIDE/DREAD threat modeling, encryption/permission design. - **Development**: Secure coding standards, anti-injection/XSS controls verification. - **Testing**: SAST/DAST report triage, penetration test findings evaluation, vulnerability verification. - **Deployment**: Release readiness review, configuration security, key management, hardening. - **Operations**: Vulnerability monitoring, incident response evaluation, patch management, log audit. +- **LangGraph Orchestration**: Stateful graph-based workflow engine replacing the custom orchestrator. Supports conditional routing, parallel execution, checkpointing, and human-in-the-loop interrupts. +- **LangChain Integration**: Unified LLM abstraction, prompt templates, tool integration, and RAG chains via LangChain framework. 
+- **Threat Modeling (STRIDE/DREAD)**: Design Agent performs automated threat modeling with STRIDE categorization and DREAD risk scoring. +- **SAST/DAST Report Parsers**: Dedicated parsers for SARIF, SonarQube JSON, Checkmarx XML, Burp Suite XML, and OWASP ZAP reports. +- **Phase-specific KB Collections**: Separate knowledge base collections per SSDLC phase (`kb_requirements`, `kb_design`, `kb_development`, `kb_testing`, `kb_deployment`, `kb_operations`). +- **Cross-phase Traceability**: Findings from earlier phases automatically link to later phases (e.g. Design threats → Testing test cases → Operations monitoring rules). +- **Phase-specific Skills**: 12 built-in personas across 6 SSDLC phases (Compliance Analyst, Threat Modeler, Secure Code Reviewer, Pentest Analyst, Release Reviewer, Vulnerability Monitor, etc.). - **SSDLC Stage Skills**: 6 built-in stage-specific skills with tailored system prompts, risk focus areas, and compliance framework mappings. - **SSDLC Auto-detection**: Router node can auto-detect the SSDLC stage from document content when not explicitly specified. ### Changed -- **LangGraph Orchestrator**: Replaced custom `asyncio.gather` orchestration with **LangChain + LangGraph** stateful graph-based agent workflows. Graph nodes: Parser → SSDLC Router → Policy+Evidence (parallel fan-out) → Drafter → Reviewer. +- **Orchestrator**: Replaced custom multi-agent pipeline with LangGraph `StateGraph` supporting conditional edges, shared state (`SSDLCState`), and persistent checkpointing. Graph nodes: Parser → SSDLC Router → Policy+Evidence (parallel fan-out) → Drafter → Reviewer. +- **Assessment Reports**: Extended schema (v2.0) with `phase` field, `ThreatModel` object, `Vulnerability` array, and `CrossPhaseRef` for cross-phase traceability. - **Assessment API**: `POST /assessments` now accepts optional `ssdlc_stage` parameter. - **MCP Tools**: `assess_document` tool now accepts optional `ssdlc_stage` parameter. 
- **Report Schema**: Added `ssdlc_stage` field to `AssessmentReport.metadata`. +- **PRD (SPEC.md)**: Rewritten as v4.0 with full SSDLC phase definitions, LangGraph/LangChain stack, and phase-specific user stories. +- **Architecture (ARCHITECTURE.md)**: Rewritten as v4.0 with LangGraph state machine design, phase agent details, and SAST/DAST integration points. +- **All documentation**: Updated to reflect SSDLC platform positioning, LangGraph orchestration, and LangChain framework. --- diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 513708a..251b1a8 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,8 +1,8 @@ # Contributing to DocSentinel | 参与贡献 -Thank you for your interest in contributing. We welcome issues, pull requests, and feedback. +Thank you for your interest in contributing to DocSentinel — an AI-powered SSDLC platform built on LangChain and LangGraph. We welcome issues, pull requests, and feedback. -感谢你对 DocSentinel 的关注。我们欢迎提交 Issue、Pull Request 以及任何反馈。 +感谢你对 DocSentinel 的关注——这是一个基于 LangChain 和 LangGraph 构建的 AI 驱动 SSDLC 平台。我们欢迎提交 Issue、Pull Request 以及任何反馈。 --- @@ -12,7 +12,8 @@ Thank you for your interest in contributing. We welcome issues, pull requests, a 1. **Report bugs or suggest features**: Open a new [Issue](https://github.com/arthurpanhku/DocSentinel/issues) using the Bug report or Feature request template; include steps to reproduce or use case when possible. 2. **Submit code**: Fork the repo, create a branch, make your changes, and open a Pull Request to `main`. See "Development setup" and "Commit guidelines" below. -3. **Docs and examples**: Improvements to README, SPEC, code comments, or examples are welcome. +3. **Docs and examples**: Improvements to README, SPEC, ARCHITECTURE, code comments, or examples are welcome. +4. **SSDLC Skills**: Submit new phase-specific skills (personas) for any of the 6 SSDLC phases. See "Submit a Skill" below. ### Development setup @@ -21,9 +22,10 @@ Thank you for your interest in contributing. 
We welcome issues, pull requests, a ```bash python3 -m venv .venv source .venv/bin/activate # Windows: .venv\Scripts\activate - make install # Install all dependencies + make install # Install all dependencies (includes LangChain, LangGraph) pre-commit install # Install git hooks ``` +- **Key dependencies**: LangGraph (agent orchestration), LangChain (LLM abstraction), FastAPI (API), ChromaDB (vector store), Docling (parser). - **MCP Development**: To test the MCP server, install in editable mode: ```bash @@ -49,10 +51,18 @@ make lint # Check code style - **PRs**: Please fill in the PR template (what changed, how to verify, docs updated or not). If related to an Issue, reference it in the description. - **Code style**: Match existing style; optionally use [Black](https://github.com/psf/black) for Python formatting. +### Submit a Skill + +Have a great security persona for an SSDLC phase? We welcome contributions: + +1. Create a skill JSON following the schema in [docs/03-assessment-report-and-skill-contract.md](docs/03-assessment-report-and-skill-contract.md). +2. Tag it with the appropriate `ssdlc_phase` (requirements, design, development, testing, deployment, operations). +3. Submit via [Skill Template Issue](https://github.com/arthurpanhku/DocSentinel/issues/new?template=new_skill_template.md) or add to `examples/templates/`. + ### Branching and releases - The main development branch is **`main`**. -- Releases are made via **Git tags** (e.g. `v0.1.0`) and [GitHub Releases](https://github.com/arthurpanhku/DocSentinel/releases); release notes are in [CHANGELOG.md](CHANGELOG.md). +- Releases are made via **Git tags** (e.g. `v4.0.0`) and [GitHub Releases](https://github.com/arthurpanhku/DocSentinel/releases); release notes are in [CHANGELOG.md](CHANGELOG.md). --- @@ -62,7 +72,8 @@ make lint # Check code style 1. **报告问题或建议功能**:在 [Issues](https://github.com/arthurpanhku/DocSentinel/issues) 中新建 Bug 报告或功能建议,使用模板并尽量提供复现步骤或使用场景。 2. 
**提交代码**:Fork 本仓库,在本地创建分支,修改后提交 PR 到 `main`。请先阅读下方「开发环境」与「提交规范」。 -3. **文档与示例**:改进 README、SPEC、注释或补充示例同样欢迎。 +3. **文档与示例**:改进 README、SPEC、ARCHITECTURE、注释或补充示例同样欢迎。 +4. **SSDLC 技能**:为 6 个 SSDLC 阶段中的任何一个提交新的阶段专用技能(角色)。见下方「提交 Skill」。 ### 开发环境 @@ -71,14 +82,15 @@ make lint # Check code style ```bash python3 -m venv .venv source .venv/bin/activate # Windows: .venv\Scripts\activate - make install # 一键安装所有依赖(含开发依赖) + make install # 一键安装所有依赖(含 LangChain、LangGraph、开发依赖) pre-commit install # 安装 Git 提交钩子 ``` +- **核心依赖**:LangGraph(Agent 编排)、LangChain(LLM 抽象)、FastAPI(API)、ChromaDB(向量库)、Docling(解析器)。 - **MCP (Model Context Protocol) 开发**: 调试 MCP Server 时,建议使用 `docsentinel-mcp` 命令行工具: ```bash pip install -e . # 以编辑模式安装当前包 - docsentinel-mcp --help # 验证安装 + docsentinel-mcp --help # 验证安装 ``` ### 运行测试 @@ -99,7 +111,15 @@ make lint # 检查代码风格 - **PR**:请填写 PR 模板(改了什么、如何验证、是否更新文档)。若对应 Issue,在描述中注明并链接。 - **代码风格**:保持与现有代码一致;可选使用 [Black](https://github.com/psf/black) 格式化 Python 代码。 +### 提交 Skill + +如果你有适用于某个 SSDLC 阶段的安全角色,欢迎贡献: + +1. 按照 [docs/03-assessment-report-and-skill-contract.md](docs/03-assessment-report-and-skill-contract.md) 中的 Schema 创建 Skill JSON。 +2. 标注对应的 `ssdlc_phase`(requirements、design、development、testing、deployment、operations)。 +3. 通过 [Skill Template Issue](https://github.com/arthurpanhku/DocSentinel/issues/new?template=new_skill_template.md) 提交或添加到 `examples/templates/`。 + ### 分支与发布 - 主开发分支为 **`main`**。 -- 发版通过 **Git tag**(如 `v0.1.0`)与 [GitHub Releases](https://github.com/arthurpanhku/DocSentinel/releases) 完成;版本说明见 [CHANGELOG.md](CHANGELOG.md)。 +- 发版通过 **Git tag**(如 `v4.0.0`)与 [GitHub Releases](https://github.com/arthurpanhku/DocSentinel/releases) 完成;版本说明见 [CHANGELOG.md](CHANGELOG.md)。 diff --git a/README.md b/README.md index 16eb151..7f3b461 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,7 @@

DocSentinel
- AI-powered SSDLC security assessment for documents and questionnaires + AI-powered SSDLC platform — Secure your software from requirements to operations

@@ -20,6 +20,8 @@ GitHub repo MCP Ready Agent Integration + LangChain + LangGraph

@@ -32,11 +34,21 @@ ## What is DocSentinel? -**DocSentinel** is an AI-powered assistant for security teams. It automates the review of security-related **documents, forms, and reports** across the **entire Secure Software Development Lifecycle (SSDLC)** — from requirements and design through development, testing, deployment, and operations. It compares inputs against your policy and knowledge base, and produces **structured assessment reports** with risks, compliance gaps, and remediation suggestions. +**DocSentinel** is an AI-powered **SSDLC (Secure Software Development Lifecycle) platform** for security teams. It automates security activities across all six phases of the software development lifecycle using intelligent AI agents orchestrated by **LangGraph** and powered by **LangChain**. It automates the review of security-related **documents, forms, and reports** — from requirements and design through development, testing, deployment, and operations — comparing inputs against your policy and knowledge base to produce **structured assessment reports** with risks, compliance gaps, and remediation suggestions. -🚀 **Agent Ready**: Supports **Model Context Protocol (MCP)** to be used as a "skill" by OpenClaw, Claude Desktop, and other autonomous agents. 
+Instead of only reviewing documents at the pre-release stage, DocSentinel embeds security from day one:
+
+| SSDLC Phase | What DocSentinel Does |
+| :--- | :--- |
+| **Requirements** | Extract security requirements, identify compliance obligations (GDPR, PCI DSS, SOC2) |
+| **Design** | Automated threat modeling (STRIDE/DREAD), security architecture review, SDR reports |
+| **Development** | Secure coding assessment, SAST findings triage, coding guidance |
+| **Testing** | SAST/DAST report analysis, penetration test review, vulnerability prioritization |
+| **Deployment** | Configuration security review, hardening assessment, release sign-off |
+| **Operations** | Vulnerability monitoring, incident response assistance, log audit |
+
+Built as a **headless API + MCP service**, DocSentinel integrates into your CI/CD pipelines, AI agents (Claude Desktop, Cursor, OpenClaw), and existing security workflows.
-- **SSDLC lifecycle coverage**: 6 stages (Requirements, Design, Development, Testing, Deployment, Operations) with stage-specific skills and checklists.
 - **LangGraph orchestration**: Stateful, graph-based agent workflows with conditional branching per SSDLC stage.
 - **Multi-format input**: PDF, Word, Excel, PPT, text — parsed into a unified format for the LLM.
 - **Knowledge base (RAG)**: Upload policy and compliance documents; the agent uses them as reference when assessing.
@@ -49,39 +61,49 @@ Ideal for enterprises that need to scale security assessments across many projec
 ## Why DocSentinel?
-| Pain Point | DocSentinel Solution |
-| :---------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------- |
-| **Fragmented criteria**<br>Policies, standards, and precedents are scattered. | Single **knowledge base** ensures consistent findings and traceability. |
-| **Heavy questionnaire workflow**<br>Business fills form → Security reviews → Business adds evidence → Security reviews again. | **Automated first-pass** and gap analysis reduces manual back-and-forth rounds. |
-| **Pre-release review pressure**<br>Security needs to review and sign off on technical docs before launch. | **Structured reports** help reviewers focus on decision-making, not line-by-line reading. |
-| **Scale vs. consistency**<br>Many projects and standards lead to inconsistent or delayed manual reviews. | **Unified pipeline** with configurable scenarios keeps assessments consistent and auditable. |
-| **SSDLC coverage gaps**<br>Security involvement is uneven across lifecycle stages; early stages get less scrutiny. | **Stage-aware assessment** covers all 6 SSDLC stages with dedicated skills and checklists. |
+| Pain Point | DocSentinel Solution |
+| :--- | :--- |
+| **Fragmented SSDLC coverage**<br>Most tools only address testing/deployment. | **Full lifecycle agents** cover all 6 SSDLC phases with dedicated AI personas. |
+| **Fragmented criteria**<br>Policies, standards, and precedents are scattered. | Single **knowledge base** ensures consistent findings and traceability. |
+| **No automated threat modeling**<br>Threat models are created ad-hoc. | **Design Agent** generates STRIDE/DREAD threat models from architecture docs. |
+| **Heavy questionnaire workflow**<br>Endless review cycles. | **Automated first-pass** and gap analysis reduces manual back-and-forth rounds. |
+| **SAST/DAST report overload**<br>Too many findings, too little context. | **Testing Agent** triages, prioritizes, and maps findings to threat models. |
+| **Pre-release review pressure**<br>Everything lands on security at the end. | **Shift-left** approach catches issues early in requirements and design. **Structured reports** help reviewers focus on decision-making. |
+| **Scale vs. consistency**<br>Manual reviews vary by reviewer. | **LangGraph workflows** and **unified pipeline** ensure consistent, auditable assessment across projects. |
+| **SSDLC coverage gaps**<br>Security involvement is uneven across lifecycle stages; early stages get less scrutiny. | **Stage-aware assessment** covers all 6 SSDLC stages with dedicated skills and checklists. |
-*See the full problem statement and product goals in [SPEC.md](./SPEC.md).*
+*See the full problem statement and SSDLC phase details in [SPEC.md](./SPEC.md).*
 ---
 ## Architecture
-DocSentinel is built around a **LangGraph orchestrator** that coordinates parsing, SSDLC stage routing, the knowledge base (RAG), skills, and the LLM. You can use cloud or local LLMs and optional integrations (e.g. AAD, ServiceNow) as your environment requires.
+DocSentinel is built on **LangGraph** for stateful agent orchestration and **LangChain** for unified LLM access. Six phase-specific agents are coordinated by a graph-based state machine with cross-phase context sharing. The orchestrator coordinates parsing, SSDLC stage routing, the knowledge base (RAG), skills, and the LLM. You can use cloud or local LLMs and optional integrations (e.g. AAD, ServiceNow) as your environment requires.
```mermaid flowchart TB - subgraph User["👤 User / Security Staff"] + subgraph User["User / Security Staff"] end subgraph Access["Access Layer"] API["REST API / MCP"] end - subgraph Core["DocSentinel Core"] - Orch["LangGraph\nOrchestrator"] - SSDLC["SSDLC Pipeline\n(6 stages)"] - Mem["Memory"] - Skill["Skills"] + subgraph Orchestration["SSDLC Orchestration (LangGraph)"] + Router["Phase Router"] + A1["Requirements Agent"] + A2["Design Agent"] + A3["Development Agent"] + A4["Testing Agent"] + A5["Deployment Agent"] + A6["Operations Agent"] + end + subgraph Core["Core Services"] KB["Knowledge Base (RAG)"] Parser["Parser"] + Skill["Skills"] + Mem["Memory"] end - subgraph LLM["LLM Layer"] - Abst["LLM Abstraction\n(LangChain)"] + subgraph LLM["LLM Layer (LangChain)"] + Abst["LLM Abstraction"] end subgraph Backends["LLM Backends"] Cloud["OpenAI / Claude / Qwen"] @@ -89,33 +111,30 @@ flowchart TB end User --> API - API --> Orch - Orch --> SSDLC - SSDLC --> Skill - Orch <--> Mem - Orch --> KB - Orch --> Parser - Orch --> Abst - Abst --> Cloud - Abst --> Local + API --> Router + Router --> A1 & A2 & A3 & A4 & A5 & A6 + A1 & A2 & A3 & A4 & A5 & A6 --> KB & Parser & Skill + A1 & A2 & A3 & A4 & A5 & A6 --> Abst + Abst --> Cloud & Local ``` **Data flow (simplified):** -1. User uploads documents and optionally specifies SSDLC stage. -2. **Parser** converts files (PDF, Word, Excel, PPT, etc.) to text/Markdown. -3. **SSDLC Router** determines lifecycle stage and loads stage-specific skill + checklist. -4. **LangGraph** runs the agent graph: Policy+Evidence in parallel, then Drafter+Reviewer. -5. Returns **assessment report** (risks, gaps, remediations, confidence, SSDLC stage). +1. User selects SSDLC phase(s) and uploads documents (or optionally lets the SSDLC Router auto-detect the stage). +2. **Parser** converts files (PDF, Word, Excel, PPT, SAST/DAST reports, etc.) to text/Markdown. +3. 
**LangGraph Router** dispatches to the appropriate **Phase Agent(s)**, loading stage-specific skill + checklist. +4. Phase Agent queries **KB** (phase-specific collections) and applies **Skills**; Policy+Evidence run in parallel, then Drafter+Reviewer. +5. **LLM** (via LangChain) produces structured findings with cross-phase traceability. +6. Returns **assessment report** (risks, threats, gaps, remediations, confidence, SSDLC stage). *Detailed architecture: [ARCHITECTURE.md](./ARCHITECTURE.md) and [docs/01-architecture-and-tech-stack.md](./docs/01-architecture-and-tech-stack.md).* --- -## ✨ Core Capabilities +## Core Capabilities -### 🔄 Full SSDLC Lifecycle Coverage -Assess documents at every stage of the Secure Software Development Lifecycle: +### SSDLC Full Lifecycle Coverage +Six dedicated AI agents, each with phase-specific skills, prompts, and knowledge base collections. Run individual phases or a full end-to-end SSDLC assessment: - **Requirements**: Security requirements, compliance mapping, initial risk analysis. - **Design**: Architecture review, STRIDE/DREAD threat modeling, SDR. - **Development**: Secure coding standards, code review findings. @@ -123,34 +142,49 @@ Assess documents at every stage of the Secure Software Development Lifecycle: - **Deployment**: Release readiness, config security, hardening. - **Operations**: Incident response, vulnerability monitoring, log audit. -### 🛡️ Automated Security Assessment +### Automated Security Assessment Submit security questionnaires, design documents, or audit reports. DocSentinel analyzes them using configured LLMs and identifies: - **Security Risks**: Classified by severity (Critical, High, Medium, Low). - **Compliance Gaps**: Missing controls against frameworks like ISO 27001, PCI DSS. - **Remediation Steps**: Actionable advice to fix identified issues. -### 🧠 RAG-Powered Knowledge Base -Upload your organization's internal security policies, standards, and past audits. 
The agent indexes these documents to provide **context-aware assessments**, citing specific policy clauses in its findings. - -### 🔗 LangGraph Agent Orchestration +### Intelligent Agent Orchestration (LangGraph) +- **Stateful workflows**: LangGraph state machine maintains context across phases +- **Cross-phase traceability**: Threats from Design link to test cases in Testing and monitoring rules in Operations +- **Conditional routing**: Agents activate based on project risk level, compliance requirements, or user selection +- **Human-in-the-loop**: Interrupt points for human review at phase boundaries +- **Checkpointing**: Long-running assessments persist state and resume + +### RAG-Powered Knowledge Base +Upload your organization's security policies, standards, and past audits. Phase-specific collections ensure each agent retrieves the most relevant context: +- Requirements: compliance frameworks, security policies +- Design: threat catalogs, security patterns +- Development: secure coding standards (OWASP) +- Testing: vulnerability databases, remediation guides +- Deployment: CIS benchmarks, hardening guides +- Operations: CVE databases, incident playbooks + +### LangGraph Agent Orchestration Powered by **LangChain + LangGraph** — stateful, graph-based agent workflows with conditional routing per SSDLC stage. Parallel execution of Policy and Evidence agents, followed by Drafter and Reviewer agents. -### 🔌 API-First & MCP Ready -Designed as a headless service. Integrate it into your CI/CD pipelines via REST API, or use it as a **super-tool** within AI agents (like Claude Desktop, OpenClaw) using the Model Context Protocol (MCP). +### API-First & MCP Ready +Designed as a headless service. Integrate into CI/CD pipelines via REST API, or use as a **super-tool** within AI agents (Claude Desktop, Cursor, OpenClaw) via MCP. 
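The phase routing and cross-phase context sharing described under Core Capabilities can be illustrated with a minimal sketch. This is plain Python, not the LangGraph API and not DocSentinel's actual orchestrator. The names `select_phases`, `run_assessment`, and the `PhaseAgent` callable shape are hypothetical, chosen only for illustration; the six phase names come from the text above.

```python
from typing import Callable, Dict, List, Optional

# The six SSDLC phases named in the README, in lifecycle order.
SSDLC_PHASES = [
    "requirements", "design", "development",
    "testing", "deployment", "operations",
]

# Hypothetical agent shape: takes the document text and the findings of
# earlier phases, returns this phase's findings.
PhaseAgent = Callable[[str, dict], dict]


def select_phases(requested: Optional[List[str]]) -> List[str]:
    """Validate the requested phases and return them in lifecycle order.

    An empty or missing selection falls back to all six phases.
    """
    if not requested:
        return list(SSDLC_PHASES)
    unknown = [p for p in requested if p not in SSDLC_PHASES]
    if unknown:
        raise ValueError(f"unknown SSDLC phase(s): {unknown}")
    return [p for p in SSDLC_PHASES if p in requested]


def run_assessment(
    document: str,
    agents: Dict[str, PhaseAgent],
    requested: Optional[List[str]] = None,
) -> dict:
    """Run each selected phase agent, passing earlier findings forward."""
    context: dict = {}
    for phase in select_phases(requested):
        # Each agent receives a copy of what earlier phases produced,
        # which is one simple way to model cross-phase traceability.
        context[phase] = agents[phase](document, dict(context))
    return context
```

In a real deployment the loop would presumably be replaced by a LangGraph state machine with conditional edges and checkpointing; the sketch only shows the ordering and context hand-off.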
--- -## 🤖 Agent Integration (MCP) +## Agent Integration (MCP) -Connect DocSentinel to **Claude Desktop**, **Cursor**, or **OpenClaw** to use it as a powerful security skill. +Connect DocSentinel to **Claude Desktop**, **Cursor**, or **OpenClaw** to use it as a powerful SSDLC security skill. -### 💡 What can it do? +### What can it do? Once connected, you can ask your AI agent: -> "Read the attached `system-design.pdf` and assess it for compliance risks using DocSentinel." +> "Analyze the attached `requirements.pdf` for missing security requirements using DocSentinel." > -> "Check `api-spec.yaml` against our internal `access-control-policy.pdf` in the Knowledge Base." +> "Run a STRIDE threat model on `system-design.pdf` using the Design Agent." +> +> "Triage these SonarQube SAST findings and prioritize by risk." -### 🛠️ Configuration Guide +### Configuration Guide #### 1. Claude Desktop Add to your `claude_desktop_config.json`: @@ -185,8 +219,6 @@ Add to your `claude_desktop_config.json`: ### Option A: One-Click Deployment (Recommended) -Run the deployment script to start the full stack (API + Vector DB + optional Ollama). - ```bash git clone https://github.com/arthurpanhku/DocSentinel.git cd DocSentinel @@ -196,7 +228,7 @@ chmod +x deploy.sh - **API Docs**: [http://localhost:8000/docs](http://localhost:8000/docs) -### Option B: Docker Manual +### Option B: Manual Setup **Prerequisites**: **Python 3.10+**. Optional: [Ollama](https://ollama.ai) (`ollama pull llama2`). @@ -214,26 +246,27 @@ uvicorn app.main:app --reload --host 0.0.0.0 --port 8000 --- -### Example: submit an assessment - -You can use the sample files in [examples/](examples/) to try the API. 
+### Example: Submit an SSDLC assessment
 ```bash
-# Use sample file from repo
+# Run a Design phase assessment (threat modeling)
 curl -X POST "http://localhost:8000/api/v1/assessments" \
-  -F "files=@examples/sample.txt" \
-  -F "scenario_id=default"
+  -F "files=@examples/architecture-doc.pdf" \
+  -F "phase=design" \
+  -F "scenario_id=threat-modeling"
 # Response: { "task_id": "...", "status": "accepted" }
-# Get the result (replace TASK_ID with the returned task_id)
+# Get the result
 curl "http://localhost:8000/api/v1/assessments/TASK_ID"
 ```
-### Example: upload to KB and query
+### Example: Upload to KB and query
 ```bash
-# Use sample policy from repo
-curl -X POST "http://localhost:8000/api/v1/kb/documents" -F "file=@examples/sample-policy.txt"
+# Upload a security policy to the requirements KB collection
+curl -X POST "http://localhost:8000/api/v1/kb/documents" \
+  -F "file=@examples/sample-policy.txt" \
+  -F "collection=kb_requirements"
 # Query the KB (RAG)
 curl -X POST "http://localhost:8000/api/v1/kb/query" \
@@ -247,42 +280,48 @@ curl -X POST "http://localhost:8000/api/v1/kb/query" \
 A hosted deployment is available on [Fronteir AI](https://fronteir.ai/mcp/arthurpanhku-docsentinel).
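The submit-then-fetch flow in the curl examples can be wrapped in a small polling helper. The sketch below uses only the Python standard library and the `GET /api/v1/assessments/TASK_ID` endpoint shown above. The diff documents only the `"accepted"` status, so the terminal status names (`"completed"`, `"failed"`) are assumptions, as are the helper names `fetch_assessment` and `wait_for_assessment`.

```python
import json
import time
import urllib.request
from typing import Callable

# Base URL taken from the curl examples in the README.
BASE_URL = "http://localhost:8000/api/v1"

# Assumption: only "accepted" appears in the README; these terminal
# status names are hypothetical placeholders.
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_assessment(task_id: str) -> dict:
    """GET /assessments/{task_id} and decode the JSON response body."""
    with urllib.request.urlopen(f"{BASE_URL}/assessments/{task_id}") as resp:
        return json.load(resp)


def wait_for_assessment(
    task_id: str,
    fetch: Callable[[str], dict] = fetch_assessment,
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task reaches a terminal status or attempts run out."""
    for attempt in range(max_attempts):
        result = fetch(task_id)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        # Do not sleep after the final attempt; fail immediately instead.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"assessment {task_id} not finished after {max_attempts} polls")
```

Passing `fetch` and `sleep` as parameters keeps the helper testable without a running server; with the API up, `wait_for_assessment(task_id)` would poll the local endpoint directly.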
-## Project layout +## Project Layout ```text DocSentinel/ ├── app/ # Application code │ ├── api/ # REST routes: assessments, KB, health, skills -│ ├── agent/ # LangGraph orchestrator, skills registry & service -│ │ └── ssdlc/ # SSDLC pipeline: stage router, stage skills, checklists +│ ├── agent/ # LangGraph orchestrator, phase agents, skills +│ │ ├── orchestrator.py # LangGraph state machine & phase routing +│ │ ├── agents/ # Phase-specific agent implementations +│ │ ├── ssdlc/ # SSDLC pipeline: stage router, stage skills, checklists +│ │ ├── skills_registry.py # Built-in skills per SSDLC phase +│ │ └── skills_service.py # Skill CRUD and management │ ├── core/ # Config, guardrails, security, DB │ ├── kb/ # Knowledge Base (Chroma + LightRAG graph RAG) -│ ├── llm/ # LLM abstraction (OpenAI, Ollama) -│ ├── parser/ # Document parsing (Docling + legacy fallback) +│ ├── llm/ # LangChain LLM abstraction (OpenAI, Ollama) +│ ├── parser/ # Document parsing (Docling + SAST/DAST + legacy) │ ├── models/ # Pydantic / SQLModel models │ ├── main.py # FastAPI app entry point │ └── mcp_server.py # MCP Server for agent integration ├── tests/ # Automated tests (pytest) -├── examples/ # Sample files (questionnaires, policies) +├── examples/ # Sample files (questionnaires, policies, reports) ├── docs/ # Design & Spec documentation │ ├── 01-architecture-and-tech-stack.md │ ├── 02-api-specification.yaml │ ├── 03-assessment-report-and-skill-contract.md │ ├── 04-integration-guide.md │ ├── 05-deployment-runbook.md +│ ├── 06-agent-integration.md │ └── schemas/ ├── .github/ # Issue/PR templates, CI (Actions) ├── Dockerfile -├── docker-compose.yml # API only -├── docker-compose.ollama.yml # API + Ollama optional -├── CONTRIBUTING.md # Contribution guidelines -├── CODE_OF_CONDUCT.md # Code of conduct +├── docker-compose.yml +├── docker-compose.ollama.yml +├── CONTRIBUTING.md +├── CODE_OF_CONDUCT.md ├── CHANGELOG.md -├── SPEC.md +├── SPEC.md # PRD with SSDLC phase definitions +├── 
ARCHITECTURE.md # System architecture with LangGraph design
 ├── LICENSE
 ├── SECURITY.md
 ├── requirements.txt
-├── requirements-dev.txt # Dev dependencies
+├── requirements-dev.txt
 └── .env.example
 ```
@@ -290,37 +329,49 @@ DocSentinel/
 ## Configuration
-| Variable | Description | Default |
-| :--------------------------------------------- | :------------------------------------- | :---------------------------------- |
-| `LLM_PROVIDER` | `ollama` or `openai` | `ollama` |
-| `OLLAMA_BASE_URL` / `OLLAMA_MODEL` | Local LLM | `http://localhost:11434` / `llama2` |
-| `OPENAI_API_KEY` / `OPENAI_MODEL` | OpenAI | — |
-| `CHROMA_PERSIST_DIR` | Vector DB path | `./data/chroma` |
-| `PARSER_ENGINE` | Parser: `auto`, `docling`, or `legacy` | `auto` |
-| `ENABLE_GRAPH_RAG` | Enable LightRAG graph retrieval | `true` |
-| `SSDLC_DEFAULT_STAGE` | Default SSDLC stage if not specified | `auto` |
-| `UPLOAD_MAX_FILE_SIZE_MB` / `UPLOAD_MAX_FILES` | Upload limits | `50` / `10` |
+| Variable | Description | Default |
+| :--- | :--- | :--- |
+| `LLM_PROVIDER` | `ollama` or `openai` | `ollama` |
+| `OLLAMA_BASE_URL` / `OLLAMA_MODEL` | Local LLM | `http://localhost:11434` / `llama2` |
+| `OPENAI_API_KEY` / `OPENAI_MODEL` | OpenAI | -- |
+| `CHROMA_PERSIST_DIR` | Vector DB path | `./data/chroma` |
+| `PARSER_ENGINE` | Parser: `auto`, `docling`, or `legacy` | `auto` |
+| `ENABLE_GRAPH_RAG` | Enable LightRAG graph retrieval | `true` |
+| `LANGGRAPH_CHECKPOINT_DIR` | LangGraph checkpoint persistence | `./data/checkpoints` |
+| `SSDLC_DEFAULT_PHASES` | Default phases for full assessment | `requirements,design,development,testing,deployment,operations` |
+| `SSDLC_DEFAULT_STAGE` | Default SSDLC stage if not specified | `auto` |
+| `UPLOAD_MAX_FILE_SIZE_MB` / `UPLOAD_MAX_FILES` | Upload limits | `50` / `10` |
 *See [.env.example](./.env.example) and [docs/05-deployment-runbook.md](./docs/05-deployment-runbook.md) for full options.*
 ---
+## Tech Stack
+
+| Layer | Technology | Purpose |
+| :--- | :--- | :--- |
+| **Agent Orchestration** | LangGraph | Stateful graph-based SSDLC workflow engine |
+| **LLM Framework** | LangChain | Unified LLM abstraction, prompts, tools, RAG |
+| **Web/API** | FastAPI | Async REST API with auto OpenAPI |
+| **Vector Store** | ChromaDB + LightRAG | Hybrid vector + graph RAG |
+| **Parsing** | Docling + legacy fallback | Multi-format document parsing |
+| **LLM Providers** | OpenAI, Ollama | Cloud and local LLM support |
+| **Language** | Python 3.10+ | Primary development language |
+
+---
+
 ## Documentation and PRD
-- **[ARCHITECTURE.md](./ARCHITECTURE.md)** — System architecture: high-level diagram, Mermaid views, component design, data flow, security.
-- **[SPEC.md](./SPEC.md)** — Product requirements: problem statement, solution, features, security controls.
+- **[ARCHITECTURE.md](./ARCHITECTURE.md)** — System architecture: LangGraph design, SSDLC agents, data flow, deployment.
+- **[SPEC.md](./SPEC.md)** — Product requirements: SSDLC phases, features, security controls.
 - **[CHANGELOG.md](./CHANGELOG.md)** — Version history; [Releases](https://github.com/arthurpanhku/DocSentinel/releases).
-- **Design docs** [docs/](./docs/):Architecture, API spec (OpenAPI), contracts, integration guides (AAD, ServiceNow), deployment runbook. Q1 Launch Checklist: [docs/LAUNCH-CHECKLIST.md](./docs/LAUNCH-CHECKLIST.md).
+- **Design docs** [docs/](./docs/): Architecture, API spec (OpenAPI), contracts, integration guides, deployment runbook.
 ---
 ## Development & Testing
-To verify your installation or contribute to the project, run the test suite:
-
 ### Option A: One-Click Test (Recommended)
-Automatically sets up a test environment and runs all checks.
-
 ```bash
 chmod +x test_integration.sh
 ./test_integration.sh
@@ -328,23 +379,18 @@ chmod +x test_integration.sh
 ### Option B: Manual
 ```bash
-# 1. Install dev dependencies
 pip install -r requirements-dev.txt
-
-# 2. Run all tests
 pytest
-
-# 3. Run specific test (e.g.
Skills API) -pytest tests/test_skills_api.py +pytest tests/test_skills_api.py # Run specific test ``` ## Contributing Issues and Pull Requests are welcome. Please read [CONTRIBUTING.md](CONTRIBUTING.md) for setup, tests, and commit guidelines. By participating you agree to the [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md). -🤖 **AI-Assisted Contribution**: We encourage using AI tools to contribute! Check out [CONTRIBUTING_WITH_AI.md](CONTRIBUTING_WITH_AI.md) for best practices. +AI-Assisted Contribution: We encourage using AI tools to contribute! Check out [CONTRIBUTING_WITH_AI.md](CONTRIBUTING_WITH_AI.md) for best practices. -📜 **Submit a Skill Template**: Have a great security persona? Submit a [Skill Template](https://github.com/arthurpanhku/DocSentinel/issues/new?template=new_skill_template.md) or add it to `examples/templates/`. We welcome real-world (sanitized) security questionnaires to improve our templates! +Submit a Skill Template: Have a great security persona for an SSDLC phase? Submit a [Skill Template](https://github.com/arthurpanhku/DocSentinel/issues/new?template=new_skill_template.md) or add it to `examples/templates/`. --- @@ -367,10 +413,10 @@ This project is licensed under the **MIT License** — see the [LICENSE](./LICEN --- -## Author and links +## Author and Links - **Author**: PAN CHAO (Arthur Pan) - **Repository**: [github.com/arthurpanhku/DocSentinel](https://github.com/arthurpanhku/DocSentinel) - **SPEC and design docs**: See links above. -If you use DocSentinel in your organization or contribute back, we’d love to hear from you (e.g. via GitHub Discussions or Issues). +If you use DocSentinel in your organization or contribute back, we'd love to hear from you (e.g. via GitHub Discussions or Issues). diff --git a/README_zh.md b/README_zh.md index f720b72..d090c26 100644 --- a/README_zh.md +++ b/README_zh.md @@ -10,7 +10,7 @@

DocSentinel
- 面向文档与问卷的自动化安全评估 + AI 驱动的 SSDLC 平台 — 从需求到运维,全生命周期守护软件安全

@@ -19,12 +19,13 @@ Python 3.10+ GitHub repo MCP Ready - Agent Integration + LangChain + LangGraph

- - + +

@@ -32,119 +33,122 @@ ## DocSentinel 是什么? -**DocSentinel** 是面向安全团队的 AI 助手。它自动化审阅与安全相关的**文档、表格和报告**(如安全问卷、设计文档、合规证据),结合策略与知识库进行比对,并产出**结构化评估报告**,包含风险项、合规差距与整改建议。 +**DocSentinel** 是面向安全团队的 **AI 驱动 SSDLC(安全软件开发生命周期)平台**。它使用由 **LangGraph** 编排、**LangChain** 驱动的智能 AI Agent,自动化覆盖软件开发生命周期全部六个阶段的安全活动。 -🚀 **Agent Ready**: 支持 **Model Context Protocol (MCP)**,可作为“技能”被 OpenClaw、Claude Desktop 等智能体直接调用。 +不再只是上线前审阅文档,DocSentinel 从第一天起就嵌入安全: -- **多格式输入**:PDF、Word、Excel、PPT、文本,解析为统一格式供大模型使用。 -- **知识库(RAG)**:上传策略与合规文档,评估时作为参考检索。 -- **多模型支持**:通过统一接口使用 OpenAI、Claude、千问或 **Ollama**(本地)。 -- **结构化输出**:JSON/Markdown 报告,含风险项、合规差距与可执行整改建议。 +| SSDLC 阶段 | DocSentinel 能力 | +| :--- | :--- | +| **需求阶段** | 提取安全需求,识别合规义务(GDPR、PCI DSS、SOC2) | +| **设计阶段** | 自动化威胁建模(STRIDE/DREAD),安全架构评审,SDR 报告 | +| **开发阶段** | 安全编码评估,SAST 结果分拣,安全编码指导 | +| **测试阶段** | SAST/DAST 报告分析,渗透测试审阅,漏洞优先级排序 | +| **部署阶段** | 配置安全审查,加固评估,发布签字 | +| **运维阶段** | 漏洞监控,应急响应辅助,日志审计 | -适合需要在大量项目中扩展安全评估、而人力有限的企业。 +以 **Headless API + MCP 服务** 形式构建,可集成到 CI/CD 管道、AI 智能体(Claude Desktop、Cursor、OpenClaw)及现有安全工作流中。 --- -## 为什么用 DocSentinel? +## 为什么选择 DocSentinel? -| 痛点 (Pain Point) | DocSentinel 的应对 (Solution) | -| :---------------------------------------------------------------- | :------------------------------------------------- | -| **评估依据分散**
策略、标准与先例散落各处。 | 统一**知识库**承载策略与控制项,评估一致、可追溯。 | -| **问卷流程繁重**
业务填表 → 安全评估 → 业务补证据 → 安全再审。 | **自动化初评**与差距分析,减少多轮往返。 | -| **上线前审阅压力**
安全需审阅并签批大量技术文档。 | **结构化报告**让审阅聚焦决策,而非逐行阅读。 | -| **规模与一致性**
项目多、标准多,人工易不一致或延迟。 | **可配置场景**与统一流水线,保证一致与可审计。 | +| 痛点 | DocSentinel 方案 | +| :--- | :--- | +| **SSDLC 覆盖碎片化**
大多数工具只覆盖测试/部署阶段。 | **全生命周期 Agent** 覆盖 6 个 SSDLC 阶段,配备专属 AI 角色。 | +| **无自动化威胁建模**
威胁模型临时创建,缺乏结构化。 | **设计阶段 Agent** 从架构文档自动生成 STRIDE/DREAD 威胁模型。 | +| **问卷流程繁重**
多轮往返审阅。 | **自动化初评**与差距分析,减少人工审阅轮次。 | +| **SAST/DAST 报告泛滥**
发现太多,上下文太少。 | **测试阶段 Agent** 分拣、排序并关联到威胁模型。 | +| **上线前集中压力**
所有安全审查压到最后。 | **左移**策略在需求和设计阶段就发现问题。 | +| **规模与一致性矛盾**
人工评估因审阅者不同而不一致。 | **LangGraph 工作流**确保跨项目一致、可审计的评估。 | -*完整问题陈述与产品目标见 [SPEC.md](./SPEC.md)(产品需求与规格)。* +*完整 SSDLC 阶段说明见 [SPEC.md](./SPEC.md)。* --- ## 架构 -DocSentinel 以**编排器**为核心,协调解析、知识库(RAG)、技能(如问卷与策略比对)与 LLM。可按环境选用云端或本地大模型,以及可选集成(如 AAD、ServiceNow)。 +DocSentinel 基于 **LangGraph** 实现有状态的 Agent 编排,基于 **LangChain** 实现统一 LLM 访问。六个阶段专用 Agent 由图形化状态机协调,支持跨阶段上下文共享。 ```mermaid flowchart TB - subgraph User["👤 User / Security Staff | 用户"] + subgraph User["用户 / 安全团队"] end - subgraph Access["Access Layer | 接入层"] + subgraph Access["接入层"] API["REST API / MCP"] end - subgraph Core["DocSentinel Core | 核心"] - Orch["Orchestrator | 编排"] - Mem["Memory | 记忆"] - Skill["Skills | 技能"] - KB["Knowledge Base (RAG) | 知识库"] - Parser["Parser | 解析"] + subgraph Orchestration["SSDLC 编排层 (LangGraph)"] + Router["阶段路由"] + A1["需求 Agent"] + A2["设计 Agent"] + A3["开发 Agent"] + A4["测试 Agent"] + A5["部署 Agent"] + A6["运维 Agent"] end - subgraph LLM["LLM Layer | 大模型层"] - Abst["LLM Abstraction | LLM 抽象"] + subgraph Core["核心服务"] + KB["知识库 (RAG)"] + Parser["解析器"] + Skill["技能"] + Mem["记忆"] end - subgraph Backends["LLM Backends | 后端"] + subgraph LLM["LLM 层 (LangChain)"] + Abst["LLM 抽象"] + end + subgraph Backends["LLM 后端"] Cloud["OpenAI / Claude / Qwen"] Local["Ollama / vLLM"] end User --> API - API --> Orch - Orch <--> Mem - Orch --> Skill - Orch --> KB - Orch --> Parser - Orch --> Abst - Abst --> Cloud - Abst --> Local + API --> Router + Router --> A1 & A2 & A3 & A4 & A5 & A6 + A1 & A2 & A3 & A4 & A5 & A6 --> KB & Parser & Skill + A1 & A2 & A3 & A4 & A5 & A6 --> Abst + Abst --> Cloud & Local ``` **数据流(简要):** -1. 用户上传文档,可选选择场景或项目。 -2. **Parser 解析器**将文件(PDF、Word、Excel、PPT 等)转为统一文本/Markdown。 -3. **编排器**从**知识库**(RAG)加载相关片段并调用**技能**。 -4. **LLM**(OpenAI、Ollama 等)生成结构化结论。 -5. 返回**评估报告**(风险、合规差距、整改建议)。 +1. 用户选择 SSDLC 阶段并上传文档。 +2. **LangGraph 路由器**分发到对应的**阶段 Agent**。 +3. **解析器**将文件(PDF、Word、Excel、SAST/DAST 报告等)转为文本/Markdown。 +4. 阶段 Agent 检索**知识库**(阶段专属集合)并应用**技能**。 +5. **LLM**(通过 LangChain)生成结构化发现,支持跨阶段追溯。 +6. 
返回**评估报告**(风险、威胁、差距、整改建议)。 -*详细架构与组件说明见 [ARCHITECTURE.md](./ARCHITECTURE.md) 与 [docs/01-architecture-and-tech-stack.md](./docs/01-architecture-and-tech-stack.md)。* +*详细架构见 [ARCHITECTURE.md](./ARCHITECTURE.md) 与 [docs/01-architecture-and-tech-stack.md](./docs/01-architecture-and-tech-stack.md)。* --- -## 功能概览 - -| 领域 | 能力 | -| :------------- | :--------------------------------------------------------- | -| **文档解析** | Word、PDF、Excel、PPT、文本 → Markdown/JSON。 | -| **知识库** | 多格式上传、分块、向量化(Chroma)、RAG 检索。 | -| **评估** | 提交文件 → 获得结构化报告(风险项、合规差距、整改建议)。 | -| **LLM** | 可配置提供商:**Ollama**(本地)、OpenAI 等。 | -| **API** | REST API & **MCP Server** for Agent integration. | -| **安全与合规** | 内置 **RBAC**、**审计日志**与 **Prompt Injection** 防护。 | -| **Agent集成** | 支持 **MCP**,可被 OpenClaw、Claude Desktop 等智能体调用。 | - -路线图(如 AAD/SSO、ServiceNow 集成)见 [SPEC.md](./SPEC.md)。 - ---- +## 核心能力 -## 👀 功能预览 +### SSDLC 全生命周期覆盖 +六个专用 AI Agent,各配备阶段专属技能、提示词和知识库集合。支持单阶段运行或端到端全 SSDLC 评估。 -### 1. 评估工作台 -上传文档,选择评估角色(如 SOC2 审计员),即刻获取风险分析。 +### 智能 Agent 编排 (LangGraph) +- **有状态工作流**:LangGraph 状态机跨阶段维护上下文 +- **跨阶段追溯**:设计阶段的威胁关联到测试阶段的测试用例和运维阶段的监控规则 +- **条件路由**:Agent 根据项目风险等级、合规要求或用户选择激活 +- **人机协作**:阶段边界设置中断点,供人工审阅 +- **检查点**:长周期评估持久化状态,支持恢复 -![Assessment Workbench](docs/images/ui-dashboard.png) +### RAG 驱动的知识库 +上传组织内部安全策略、标准和历史审计报告。阶段专属集合确保每个 Agent 检索最相关的上下文: +- 需求阶段:合规框架、安全策略 +- 设计阶段:威胁目录、安全模式 +- 开发阶段:安全编码标准(OWASP) +- 测试阶段:漏洞数据库、修复指南 +- 部署阶段:CIS 基线、加固指南 +- 运维阶段:CVE 数据库、应急手册 -### 2. 结构化报告 -清晰的风险项、合规差距与整改建议视图。 - -![Structured Report](docs/images/ui-report.png) - -### 3. 
知识库管理 -上传策略文档至 RAG 知识库,Agent 将引用其作为评估依据。 - -![Knowledge Base](docs/images/ui-kb.png) +### API 优先 & MCP 就绪 +Headless 服务设计。通过 REST API 集成到 CI/CD 管道,或通过 MCP 作为 AI 智能体(Claude Desktop、Cursor、OpenClaw)的技能使用。 --- -## 🤖 Agent 集成 (MCP) +## Agent 集成 (MCP) -将 DocSentinel 连接到 **Claude Desktop** 或 **Cursor**,作为工具使用。 +将 DocSentinel 连接到 **Claude Desktop**、**Cursor** 或 **OpenClaw**,作为强大的 SSDLC 安全技能使用。 ### Claude Desktop 编辑 `claude_desktop_config.json`: @@ -176,8 +180,6 @@ flowchart TB ### 方式 A: 一键部署(推荐) -运行部署脚本以启动全栈服务(API + 仪表盘 + 向量库 + 可选 Ollama)。 - ```bash git clone https://github.com/arthurpanhku/DocSentinel.git cd DocSentinel @@ -185,10 +187,9 @@ chmod +x deploy.sh ./deploy.sh ``` -- **Dashboard**: [http://localhost:8501](http://localhost:8501) - **API Docs**: [http://localhost:8000/docs](http://localhost:8000/docs) -### 方式 B: 手动 Docker 部署 +### 方式 B: 手动部署 **前置条件**: **Python 3.10+**. 可选: [Ollama](https://ollama.ai) (`ollama pull llama2`). @@ -206,30 +207,31 @@ uvicorn app.main:app --reload --host 0.0.0.0 --port 8000 --- -### 示例:提交评估 - -可使用仓库内 [examples/](examples/) 下的示例文件快速试跑。 +### 示例:提交 SSDLC 评估 ```bash -# 使用示例文本 +# 运行设计阶段评估(威胁建模) curl -X POST "http://localhost:8000/api/v1/assessments" \ - -F "files=@examples/sample.txt" \ - -F "scenario_id=default" + -F "files=@examples/architecture-doc.pdf" \ + -F "phase=design" \ + -F "scenario_id=threat-modeling" -# 响应:{ "task_id": "...", "status": "accepted" } — 用返回的 task_id 查询结果 +# 响应:{ "task_id": "...", "status": "accepted" } curl "http://localhost:8000/api/v1/assessments/TASK_ID" ``` ### 示例:上传知识库并检索 ```bash -# 使用示例策略文件 -curl -X POST "http://localhost:8000/api/v1/kb/documents" -F "file=@examples/sample-policy.txt" +# 上传安全策略到需求阶段知识库集合 +curl -X POST "http://localhost:8000/api/v1/kb/documents" \ + -F "file=@examples/sample-policy.txt" \ + -F "collection=kb_requirements" # 检索(RAG) curl -X POST "http://localhost:8000/api/v1/kb/query" \ -H "Content-Type: application/json" \ - -d '{"query": "What are the access control requirements?", "top_k": 
5}' + -d '{"query": "访问控制有哪些要求?", "top_k": 5}' ``` --- @@ -239,36 +241,27 @@ curl -X POST "http://localhost:8000/api/v1/kb/query" \ ```text DocSentinel/ ├── app/ # Application code -│ ├── api/ # REST routes: assessments, knowledge base, health checks -│ ├── agent/ # Orchestration and assessment pipeline -│ ├── core/ # Configuration (pydantic-settings) -│ ├── kb/ # Knowledge base (Chroma, chunking, RAG) -│ ├── llm/ # LLM abstraction (OpenAI, Ollama) -│ ├── parser/ # Document parsing (PDF, Word, Excel, PPT, text) -│ ├── models/ # Pydantic models -│ └── main.py +│ ├── api/ # REST routes: assessments, knowledge base, health checks, skills +│ ├── agent/ # LangGraph orchestrator, phase agents, skills +│ │ ├── orchestrator.py # LangGraph state machine and phase routing +│ │ ├── agents/ # Phase-specific agent implementations +│ │ ├── skills_registry.py # Built-in skills for each SSDLC phase +│ │ └── skills_service.py # Skill CRUD management +│ ├── core/ # Configuration, guardrails, security, DB +│ ├── kb/ # Knowledge base (Chroma + LightRAG graph RAG) +│ ├── llm/ # LangChain LLM abstraction (OpenAI, Ollama) +│ ├── parser/ # Document parsing (Docling + SAST/DAST + fallback) +│ ├── models/ # Pydantic / SQLModel models +│ ├── main.py # FastAPI entry point +│ └── mcp_server.py # MCP Server ├── tests/ # Automated tests (pytest) -├── examples/ # Sample files (questionnaires, policy samples) +├── examples/ # Sample files ├── docs/ # Design and specification documents -│ ├── 01-architecture-and-tech-stack.md -│ ├── 02-api-specification.yaml -│ ├── 03-assessment-report-and-skill-contract.md -│ ├── 04-integration-guide.md -│ ├── 05-deployment-runbook.md -│ └── schemas/ ├── .github/ # Issue/PR templates, CI (Actions) -├── Dockerfile -├── docker-compose.yml # API only -├── docker-compose.ollama.yml # API + optional Ollama -├── CONTRIBUTING.md # Contribution guide -├── CODE_OF_CONDUCT.md # Code of conduct +├── SPEC.md # PRD with SSDLC phase definitions +├── ARCHITECTURE.md # System architecture with LangGraph design ├── CHANGELOG.md -├── SPEC.md -├── LICENSE -├── SECURITY.md ├── requirements.txt -├── requirements-dev.txt # Test and development dependencies -├── pytest.ini └── .env.example ``` @@ -276,34 +269,48 @@ DocSentinel/ ## Configuration -| Variable | Description | Default | -| :--------------------------------------------- | :------------------- | :---------------------------------- | -| `LLM_PROVIDER` | `ollama` or `openai` | `ollama` | -| `OLLAMA_BASE_URL` / `OLLAMA_MODEL` | Local LLM | `http://localhost:11434` / `llama2` | -| 
`OPENAI_API_KEY` / `OPENAI_MODEL` | OpenAI | — | -| `CHROMA_PERSIST_DIR` | Vector store path | `./data/chroma` | -| `UPLOAD_MAX_FILE_SIZE_MB` / `UPLOAD_MAX_FILES` | Upload limits | `50` / `10` | +| Variable | Description | Default | +| :--- | :--- | :--- | +| `LLM_PROVIDER` | `ollama` or `openai` | `ollama` | +| `OLLAMA_BASE_URL` / `OLLAMA_MODEL` | Local LLM | `http://localhost:11434` / `llama2` | +| `OPENAI_API_KEY` / `OPENAI_MODEL` | OpenAI | -- | +| `CHROMA_PERSIST_DIR` | Vector store path | `./data/chroma` | +| `PARSER_ENGINE` | Parser: `auto`, `docling`, `legacy` | `auto` | +| `ENABLE_GRAPH_RAG` | Enable LightRAG graph retrieval | `true` | +| `LANGGRAPH_CHECKPOINT_DIR` | LangGraph checkpoint storage | `./data/checkpoints` | +| `SSDLC_DEFAULT_PHASES` | Default phases for a full assessment | `requirements,design,development,testing,deployment,operations` | +| `UPLOAD_MAX_FILE_SIZE_MB` / `UPLOAD_MAX_FILES` | Upload limits | `50` / `10` | *For the full list of options, see [.env.example](./.env.example) and [docs/05-deployment-runbook.md](./docs/05-deployment-runbook.md).* --- -## Documentation & Specifications +## Tech Stack -- **[ARCHITECTURE.md](./ARCHITECTURE.md)** - System architecture: high-level diagram, Mermaid views (logical/component/sequence/integration/deployment), component design, data flow, security architecture. -- **[SPEC.md](./SPEC.md)** - Product requirements and specification: problem statement, solution, architecture summary, features, security controls, and open questions. -- **[CHANGELOG.md](./CHANGELOG.md)** - Version history; release notes. -- **Design docs** [docs/](./docs/): architecture and tech stack, API specification, assessment report and Skill contract, integration guide, deployment runbook. Q1 launch checklist: [docs/LAUNCH-CHECKLIST.md](./docs/LAUNCH-CHECKLIST.md). +| Layer | Technology | Purpose | +| :--- | :--- | :--- | +| **Agent orchestration** | LangGraph | Stateful, graph-based SSDLC workflow engine | +| **LLM framework** | LangChain | Unified LLM abstraction, prompts, tools, RAG | +| **Web/API** | FastAPI | Async REST API with automatic OpenAPI | +| **Vector store** | ChromaDB + LightRAG | Hybrid vector + graph RAG | +| **Parsing** | Docling + fallback parsers | Multi-format document parsing | +| **LLM providers** | OpenAI, Ollama | Cloud and local LLMs | +| **Language** | Python 3.10+ | Primary development language | --- -## Development & Testing +## Documentation & PRD -To verify your installation or contribute to development, run the test suite: +- **[ARCHITECTURE.md](./ARCHITECTURE.md)** - System architecture: LangGraph design, SSDLC agents, data flow, deployment. +- **[SPEC.md](./SPEC.md)** - Product requirements: SSDLC phases, features, security controls. +- **[CHANGELOG.md](./CHANGELOG.md)** - Version history; [releases](https://github.com/arthurpanhku/DocSentinel/releases). +- **Design docs** [docs/](./docs/): architecture, API 
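specification, contracts, integration guide, deployment runbook. -### Option A: One-Click Testing (Recommended) -Automatically sets up the test environment and runs all checks. +--- +## Development & Testing + +### Option A: One-Click Testing (Recommended) ```bash chmod +x test_integration.sh ./test_integration.sh @@ -311,30 +318,25 @@ chmod +x test_integration.sh ### Option B: Manual ```bash -# 1. Install development dependencies pip install -r requirements-dev.txt - -# 2. Run all tests pytest - -# 3. Run a specific test (e.g. Skills API) -pytest tests/test_skills_api.py +pytest tests/test_skills_api.py # Run a specific test ``` ## Contributing Issues and Pull Requests are welcome. Please read [CONTRIBUTING.md](CONTRIBUTING.md) first for the development environment, testing, and commit conventions. By participating, you agree to abide by the [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) code of conduct. -🤖 **AI-assisted contributions**: We also encourage using AI tools to contribute! See [CONTRIBUTING_WITH_AI.md](CONTRIBUTING_WITH_AI.md) for a best-practices guide. +AI-assisted contributions: We encourage using AI tools to contribute! See [CONTRIBUTING_WITH_AI.md](CONTRIBUTING_WITH_AI.md). -📜 **Contributing skill templates**: Have a good security assessment persona? Submit a [skill template issue](https://github.com/arthurpanhku/DocSentinel/issues/new?template=new_skill_template.md) or add it directly to `examples/templates/`. Sanitized real-world security questionnaire samples are especially welcome and help us improve the templates! +Contributing skill templates: Have a security persona suited to a particular SSDLC phase? Submit a [skill template issue](https://github.com/arthurpanhku/DocSentinel/issues/new?template=new_skill_template.md) or add it to `examples/templates/`. --- ## Security - **Vulnerability reporting**: For responsible disclosure, see [SECURITY.md](./SECURITY.md). -- **Security requirements**: The project follows the security controls defined in [SPEC §7.2](./SPEC.md) (identity, data protection, application security, operations, supply chain). +- **Security requirements**: The project follows the security controls in [SPEC §7.2](./SPEC.md). --- @@ -354,6 +356,5 @@ pytest tests/test_skills_api.py - **Author**: PAN CHAO (Arthur Pan) - **Repository**: [github.com/arthurpanhku/DocSentinel](https://github.com/arthurpanhku/DocSentinel) -- **Specification and design docs**: See the links above. If you use DocSentinel in your organization or would like to contribute, feel free to reach out via GitHub Discussions or Issues. diff --git a/SECURITY.md b/SECURITY.md index 9a12b95..9d97f1a 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -1,8 +1,8 @@ # Security Policy | 安全策略 -This document covers vulnerability disclosure and security-related practices for the **DocSentinel** project. It aligns with [**PRD §7.2 Security Requirements and Controls**](./SPEC.md). +This document covers vulnerability disclosure and security-related practices for the **DocSentinel** project, an AI-powered SSDLC platform. 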
It aligns with [**PRD §7.2 Security Requirements and Controls**](./SPEC.md). -本文档涵盖 **DocSentinel** 项目的漏洞披露与安全实践,遵循 [**PRD §7.2 安全需求与控制**](./SPEC.md)。 +本文档涵盖 **DocSentinel** 项目(AI 驱动的 SSDLC 平台)的漏洞披露与安全实践,遵循 [**PRD §7.2 安全需求与控制**](./SPEC.md)。 --- @@ -10,6 +10,7 @@ This document covers vulnerability disclosure and security-related practices for | Version | Supported | | :-------- | :----------------- | +| **4.0.x** | :white_check_mark: | | **3.1.x** | :white_check_mark: | | **3.0.x** | :white_check_mark: | | **2.0.x** | :warning: Limited | @@ -43,20 +44,26 @@ If you discover a security vulnerability, please report it responsibly: - **Input Validation**: File type and size limits are enforced (see `UPLOAD_MAX_FILE_SIZE_MB`, `UPLOAD_MAX_FILES`). Only allowed extensions are parsed (see `app/parser/service.py`). - **Prompt Injection Guardrails**: Input sanitization via regex pattern detection and length limits is enforced before content reaches the LLM (see `app/core/guardrails.py`). Malicious inputs are rejected with HTTP 400. - **TLS**: In production, use HTTPS and TLS 1.2+ for all endpoints and external calls ([PRD §7.2 DATA-01](./SPEC.md)). -- **Auth**: API currently does not enforce authentication in the MVP; add AAD/API Key as per [PRD §5.2.8 and §7.2 IAM](./SPEC.md) before exposing externally. +- **Auth**: API currently does not enforce authentication in the MVP; add AAD/API Key as per [PRD §7.2 IAM](./SPEC.md) before exposing externally. +- **LangGraph State**: Assessment state and checkpoints may contain sensitive document content. Ensure `LANGGRAPH_CHECKPOINT_DIR` is on encrypted storage in production. +- **SAST/DAST Integration**: When ingesting scan results from external tools, validate report integrity and source authenticity. 
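
The prompt-injection guardrail described above (regex pattern detection plus a length limit, with rejections surfaced as HTTP 400) could look roughly like the following. This is a minimal sketch, not the contents of `app/core/guardrails.py`: the pattern list, the `MAX_INPUT_CHARS` value, and the names `GuardrailViolation` and `check_input` are all assumptions made for illustration.

```python
import re

# Hypothetical deny-list. The real patterns used by the project are not
# shown in this document; these two are illustrative only.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"reveal\s+(your\s+)?system\s+prompt", re.IGNORECASE),
]

# Assumed length limit; choose a value that fits your documents.
MAX_INPUT_CHARS = 20_000


class GuardrailViolation(ValueError):
    """Raised when input should be rejected before it reaches the LLM."""


def check_input(text: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    """Return the text unchanged if it passes both checks, else raise."""
    # Length limit first: it is cheap and bounds the regex work below.
    if len(text) > max_chars:
        raise GuardrailViolation(f"input exceeds {max_chars} characters")
    # Pattern detection: reject on the first deny-list match.
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            raise GuardrailViolation("input matches a blocked pattern")
    return text
```

An API layer would typically catch `GuardrailViolation` and translate it into a 400 response. A deny-list of this kind is easy to bypass, so it is best treated as one layer alongside the other controls in this list rather than a complete defense.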
- **机密信息**:请勿提交 `.env` 或任何包含 `SECRET_KEY`、API Key、密码的文件。`.env.example` 仅作为模板使用。 - **输入验证**:强制执行文件类型与大小限制(见 `UPLOAD_MAX_FILE_SIZE_MB`、`UPLOAD_MAX_FILES`)。仅解析允许的扩展名(见 `app/parser/service.py`)。 - **提示注入防护**:通过正则模式检测和长度限制对输入进行清洗,在内容到达 LLM 之前执行(见 `app/core/guardrails.py`)。恶意输入将被 HTTP 400 拒绝。 - **TLS**:生产环境中,所有端点与外部调用必须使用 HTTPS 和 TLS 1.2+([PRD §7.2 DATA-01](./SPEC.md))。 -- **认证**:MVP 阶段 API 暂未强制认证;在对外暴露前,请根据 [PRD §5.2.8 与 §7.2 IAM](./SPEC.md) 添加 AAD/API Key 认证。 +- **认证**:MVP 阶段 API 暂未强制认证;在对外暴露前,请根据 [PRD §7.2 IAM](./SPEC.md) 添加 AAD/API Key 认证。 +- **LangGraph 状态**:评估状态和检查点可能包含敏感文档内容。生产环境中请确保 `LANGGRAPH_CHECKPOINT_DIR` 位于加密存储上。 +- **SAST/DAST 集成**:从外部工具接入扫描结果时,请验证报告完整性和来源真实性。 --- ## References | 参考 - [**SPEC.md Section 7.2**](./SPEC.md) — Security Requirements and Controls (identity, data, application, operations, supply chain). +- [**ARCHITECTURE.md**](./ARCHITECTURE.md) — System architecture with LangGraph design and security architecture section. - [**docs/05-deployment-runbook.md**](./docs/05-deployment-runbook.md) — Deployment, configuration, and network requirements. 
- [**SPEC.md 第 7.2 节**](./SPEC.md) — 安全需求与控制(身份、数据、应用、运维、供应链)。 +- [**ARCHITECTURE.md**](./ARCHITECTURE.md) — 系统架构,含 LangGraph 设计与安全架构章节。 - [**docs/05-deployment-runbook.md**](./docs/05-deployment-runbook.md) — 部署、配置与网络需求。 diff --git a/SPEC.md b/SPEC.md index 91df733..a32e65c 100644 --- a/SPEC.md +++ b/SPEC.md @@ -3,24 +3,24 @@ | | | | :---------- | :---------------------- | | **Version** | v4.0 | -| **Date** | 2026-03-29 | +| **Date** | 2026-03-30 | | **Author** | PAN CHAO | | **Contact** | u3638376@connect.hku.hk | > **System Architecture | 系统架构文档** > -> Full system architecture (diagrams, data flow, deployment) is maintained in: +> Full system architecture (diagrams, data flow, deployment) is maintained in: > 完整的系统架构说明(含图示、数据流、部署视图)已单独成文: > > **[ARCHITECTURE.md](./ARCHITECTURE.md)** > -> *Section 5 of this PRD contains only an architecture summary and index.* +> *Section 5 of this PRD contains only an architecture summary and index.* > *本文 PRD 第五节仅保留架构摘要与索引。* **History | 版本历史** -- **v4.0**: SSDLC + LangGraph. Full SSDLC lifecycle support (6 stages), LangChain/LangGraph as orchestration engine, stage-specific skills and assessment flows. - SSDLC + LangGraph。完整 SSDLC 生命周期支持(6 阶段),引入 LangChain/LangGraph 作为编排引擎,阶段专属 Skill 与评估流程。 +- **v4.0**: SSDLC + LangGraph. Full SSDLC lifecycle support (6 stages), LangChain/LangGraph as orchestration engine, stage-specific skills and assessment flows. Pivoted to full-phase support with phase-specific SSDLC agents. + SSDLC + LangGraph。完整 SSDLC 生命周期支持(6 阶段),引入 LangChain/LangGraph 作为编排引擎,阶段专属 Skill 与评估流程。转向全阶段支持,配备阶段专用 SSDLC Agent。 - **v3.1**: Performance & quality. Graph RAG, Docling parser, async pipeline, parallel orchestration, guardrails, singleton KB, cached LLM. 性能与质量优化。Graph RAG、Docling 解析器、异步流水线、并行编排、输入防护、单例 KB、缓存 LLM。 - **v3.0**: Headless pivot. Removed Streamlit frontend; pure API + MCP service. @@ -31,9 +31,9 @@ PRD 与系统架构文档分离。 - **v1.3**: Added "Security Requirements and Controls". 
新增非业务性「安全需求与安全控制」。 -- **v1.2**: KB multi-format upload & open-source parsing; Parser reuse. +- **v1.2**: KB multi-format upload & open-source parsing; Parser reuse. 知识库多格式上传与开源解析、Parser 复用。 -- **v1.1**: Enterprise integration (ServiceNow), IAM (AAD/SSO, RBAC), Deployment. +- **v1.1**: Enterprise integration (ServiceNow), IAM (AAD/SSO, RBAC), Deployment. 企业集成(ServiceNow)、IAM(AAD/SSO、RBAC)、部署与连通性。 --- @@ -42,11 +42,11 @@ **English** -This PRD is for the open-source "DocSentinel" project. It defines business pain points, solution approach, system architecture, and product scope to serve as a single source of truth for subsequent design and development. The project aims to use an AI Agent to automate the review of and recommendations for security-related documents, forms, and reports, reduce the burden on enterprise security teams, and support integration with mainstream and local LLMs, multi-format file parsing, and extensible Skills and knowledge bases. Starting from v4.0, DocSentinel provides **full SSDLC (Secure Software Development Lifecycle) coverage**, supporting automated assessment at every stage — from requirements and design through development, testing, deployment, and operations — powered by **LangChain and LangGraph** as the agent orchestration framework. +This PRD is for the open-source "DocSentinel" project. It defines business pain points, solution approach, system architecture, and product scope to serve as a single source of truth for subsequent design and development. The project aims to build an **AI-powered SSDLC (Secure Software Development Lifecycle) platform** that automates security activities across all six phases of the software development lifecycle — from requirements gathering to production operations. 
It automates the review of and recommendations for security-related documents, forms, and reports, reduces the burden on enterprise security teams, and supports integration with mainstream and local LLMs, multi-format file parsing, and extensible Skills and knowledge bases. Powered by **LangChain and LangGraph** for intelligent agent orchestration, it helps enterprise security teams embed security into every stage of delivery, not just the final review. **中文** -本 PRD 面向「DocSentinel」开源项目,用于明确业务痛点、解决方案、系统架构与产品范围,为后续设计与开发提供统一依据。项目目标是通过 AI Agent 自动化完成安全评估相关文档/表格/报告的审阅与建议,减轻企业安全团队负担,并支持对接主流与本地大模型、多格式文件解析及可扩展的 Skill 与知识库。自 v4.0 起,DocSentinel 提供**完整的 SSDLC(安全软件开发生命周期)覆盖**,支持从需求、设计、开发、测试、部署到运维每个阶段的自动化评估,并引入 **LangChain 与 LangGraph** 作为 Agent 编排框架。 +本 PRD 面向「DocSentinel」开源项目,用于明确业务痛点、解决方案、系统架构与产品范围,为后续设计与开发提供统一依据。项目目标是构建一个 **AI 驱动的 SSDLC(安全开发生命周期)平台**,自动化覆盖软件开发生命周期全部六个阶段的安全活动——从需求收集到生产运维。通过 AI Agent 自动化完成安全评估相关文档/表格/报告的审阅与建议,减轻企业安全团队负担,并支持对接主流与本地大模型、多格式文件解析及可扩展的 Skill 与知识库。通过 **LangChain 与 LangGraph** 实现智能 Agent 编排,帮助企业安全团队将安全内嵌到交付的每一个环节,而非仅在最终审阅时介入。 --- @@ -58,38 +58,45 @@ This PRD is for the open-source "DocSentinel" project. It defines business pain Enterprise Cyber Security teams operate under multiple constraints: -- **Diverse reference sources**: Internal security policies, industry best practices (e.g. NIST SSDF, OWASP, CISA), past project cases, and compliance frameworks (e.g. SOC2, ISO 27001). -- **Full SSDLC coverage**: Security review and control requirements exist at every stage—requirements/design, development, testing, deployment, and operations. -- **Wide variety of deliverables**: Security questionnaires, design documents, threat models, SAST/DAST reports, compliance evidence, and audit materials all require manual reading, comparison, and sign-off. +- **Diverse reference sources**: Internal security policies, industry best practices (e.g. NIST SSDF, OWASP, CISA), past project cases, and compliance frameworks (e.g. SOC2, ISO 27001, PCI DSS). 
+- **Full SSDLC coverage**: Security review and control requirements exist at every stage — requirements/design, development, testing, deployment, and operations — but most tools only address one or two stages. +- **Wide variety of deliverables**: Security questionnaires, threat models, architecture documents, secure coding guidelines, SAST/DAST reports, penetration test findings, deployment checklists, compliance evidence, and audit materials all require manual reading, comparison, and sign-off. +- **Shift-left pressure**: Modern DevSecOps demands security involvement early in the lifecycle, but security teams lack tooling to scale across requirements, design, and development phases. -In agile and DevOps environments, enterprises ship dozens to hundreds of projects per year. Security teams must complete large volumes of assessments and reviews with limited headcount, creating a clear bottleneck. +In agile and DevOps environments, enterprises ship dozens to hundreds of projects per year. Security teams must complete large volumes of assessments and reviews with limited headcount, creating a clear bottleneck — especially when coverage is expected across the entire SSDLC, not just pre-release reviews. 
**中文** 大型企业的 Cyber Security 团队需要在以下多维度约束下工作: -- **依据来源多样**:公司内部 Security Policy、行业最佳实践(如 NIST SSDF、OWASP、CISA 等)、历史项目案例与合规框架(如 SOC2、ISO 27001)。 -- **流程覆盖完整 SSDLC**:从需求/设计、开发、测试、部署到运维,每个阶段都有安全评审与管控要求。 -- **交付物类型繁多**:安全问卷(Security Questionnaire)、设计文档、威胁建模、SAST/DAST 报告、合规证明、审计材料等,需人工阅读、比对与签字(Sign-off)。 +- **依据来源多样**:公司内部 Security Policy、行业最佳实践(如 NIST SSDF、OWASP、CISA 等)、历史项目案例与合规框架(如 SOC2、ISO 27001、PCI DSS)。 +- **流程覆盖完整 SSDLC**:从需求/设计、开发、测试、部署到运维,每个阶段都有安全评审与管控要求——但大多数工具只覆盖一两个阶段。 +- **交付物类型繁多**:安全问卷、威胁建模、架构文档、安全编码规范、SAST/DAST 报告、渗透测试结果、部署检查清单、合规证明、审计材料等,需人工阅读、比对与签字(Sign-off)。 +- **左移压力**:现代 DevSecOps 要求安全尽早介入生命周期,但安全团队缺乏在需求、设计、开发阶段规模化覆盖的工具支持。 -在敏捷与 DevOps 环境下,企业每年上线项目数量从几十到几百不等,安全人员需要在有限人力下完成大量评估与审阅,成为明显瓶颈。 +在敏捷与 DevOps 环境下,企业每年上线项目数量从几十到几百不等,安全人员需要在有限人力下完成大量评估与审阅,成为明显瓶颈——尤其当覆盖范围从上线前审阅扩展到整个 SSDLC 时。 ### 2.2 Core Pain Points | 核心痛点 -| Pain Point (English) | 痛点描述 (中文) | -| :----------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------- | -| **Fragmented assessment criteria**
Teams must align with policies, industry standards, and project precedents; manual lookup and alignment cost is high. | **评估依据分散**
需同时参照 Policy、行业标准、项目案例;人工查找与对齐成本高。 | -| **Heavy questionnaire workflow**
Multiple rounds of questionnaire filling, assessment, evidence collection, and review; inconsistent templates. | **问卷与证据流程繁重**
问卷—评估—证据—审阅多轮往返;模板不统一、证据质量参差。 | -| **Development-phase control relies on people**
Policy definition, result interpretation, and exception approval still depend on security staff and are hard to scale. | **开发阶段管控依赖人工**
策略制定、结果解读、例外审批仍依赖安全人员,难以规模化。 | -| **Pre-release review pressure**
Security must review every file and sign off. Technical documents are hard for non-technical staff to interpret. | **上线前集中审阅压力大**
需 Review 全部文件并 Sign-off;技术文档阅读与理解成本高。 | -| **Scale vs. consistency**
Manual assessment tends to be inconsistent, incomplete, or delayed; reusable patterns are hard to institutionalize. | **规模与一致性矛盾**
人工评估易出现不一致、遗漏或延迟,且难以沉淀可复用的评估模式。 | +| Pain Point (English) | 痛点描述 (中文) | +| :--- | :--- | +| **Fragmented SSDLC coverage**
Most tools cover only testing/deployment; requirements, design, and development phases lack automated security support. | **SSDLC 覆盖碎片化**
大多数工具仅覆盖测试/部署阶段;需求、设计和开发阶段缺乏自动化安全支持。 | +| **Fragmented assessment criteria**
Teams must align with policies, industry standards, and project precedents; manual lookup and alignment cost is high. | **评估依据分散**
需同时参照 Policy、行业标准、项目案例;人工查找与对齐成本高。 | +| **No unified threat modeling**
Threat models are created ad hoc in the design phase; there is no automated STRIDE/DREAD analysis and no carry-forward of findings to testing. | **威胁建模无统一支持**
设计阶段威胁模型临时创建;无自动化 STRIDE/DREAD 分析,也无法延续至测试阶段。 | +| **Heavy questionnaire workflow**
Multiple rounds of questionnaire filling, assessment, evidence collection, and review; inconsistent templates. | **问卷与证据流程繁重**
问卷—评估—证据—审阅多轮往返;模板不统一、证据质量参差。 | +| **Development-phase control relies on people**
Secure coding guidance, SAST result interpretation, policy definition, and exception approval still depend on security staff and are hard to scale. | **开发阶段管控依赖人工**
安全编码指导、SAST 结果解读、策略制定、例外审批仍依赖安全人员,难以规模化。 | +| **Pre-release review pressure**
Security must review every file and sign off; DAST/pentest reports need interpretation. Technical documents are hard for non-technical staff to interpret. | **上线前集中审阅压力大**
需 Review 全部文件并 Sign-off;DAST/渗透测试报告需解读;技术文档阅读与理解成本高。 | +| **Post-deployment blind spots**
Vulnerability monitoring, incident response, and patch tracking are disconnected from the development lifecycle. | **上线后盲区**
漏洞监控、应急响应和补丁跟踪与开发生命周期脱节。 | +| **Scale vs. consistency**
Manual assessment tends to be inconsistent, incomplete, or delayed; reusable patterns are hard to institutionalize. | **规模与一致性矛盾**
人工评估易出现不一致、遗漏或延迟,且难以沉淀可复用的评估模式。 | | **SSDLC coverage gaps**
Security involvement is unevenly distributed across the SSDLC; requirements and design phases often get less scrutiny than pre-release review, leaving risks to accumulate. | **SSDLC 覆盖断层**
安全介入在 SSDLC 各阶段分布不均;需求与设计阶段审查不足,风险层层积累到上线前集中爆发。 | ### 2.3 Desired Change | 期望改变 -- **Automation / 自动化**: Automate analysis and initial assessment of forms, documents, and reports to reduce repetitive manual reading. +- **Full lifecycle coverage / 全生命周期覆盖**: Provide AI-assisted security support across all six SSDLC phases, not just testing and deployment. +- **Automation / 自动化**: Automate analysis and initial assessment of security artifacts at each phase — from requirements to operations. - **Consistency / 一致性**: Produce consistent assessment conclusions and remediation recommendations based on a unified knowledge base and policies. -- **Extensibility / 可扩展**: Support assessment scenarios for different compliance frameworks and customer/project types. +- **Intelligence / 智能化**: Use LangGraph-orchestrated agents to reason about cross-phase dependencies (e.g. a threat identified in design must be tested and monitored). +- **Extensibility / 可扩展**: Support custom SSDLC workflows, assessment scenarios, phase-specific skills, and different compliance frameworks and customer/project types. - **SSDLC coverage / 全生命周期覆盖**: Provide stage-aware assessment across the entire SSDLC — requirements, design, development, testing, deployment, and operations — with stage-specific skills and checklists. --- @@ -100,29 +107,109 @@ In agile and DevOps environments, enterprises ship dozens to hundreds of project **English** -Build a **dedicated AI Agent for security teams**, with the primary focus on **automating the assessment of all forms, documents, and reports that require security team review across the entire Secure Software Development Lifecycle (SSDLC)**. After security staff submit project-related files to the Agent, the Agent will: +Build an **AI-powered SSDLC platform for security teams**, with the primary focus on **automating security activities and assessment of all forms, documents, and reports across the entire secure software development lifecycle**. 
After security staff submit project-related files to the Agent, the platform: -1. **Parse multi-format files**: Convert Word, PDF, Excel, PPT, images, etc. into an intermediate format (e.g. JSON/Markdown). -2. **Use knowledge base and policy**: Rely on built-in or configurable compliance and policy knowledge to understand "what standards must be met." -3. **SSDLC-aware assessment**: Automatically determine or accept the SSDLC stage (Requirements, Design, Development, Testing, Deployment, Operations) and apply stage-specific assessment logic, checklists, and risk focus. -4. **Perform risk assessment and recommendations**: Identify security/compliance risks and provide security advice and actionable remediation. -5. **Produce structured output**: Enable security staff to quickly review, sign off, or hand off to business/development for remediation. +1. **Parses multi-format files**: Convert Word, PDF, Excel, PPT, SAST/DAST reports, images, etc. into an intermediate format (e.g. JSON/Markdown). +2. **Uses knowledge base and policy**: Rely on built-in or configurable compliance and policy knowledge to understand "what standards must be met." +3. **SSDLC-aware assessment**: Automatically determine or accept the SSDLC stage and apply stage-specific assessment logic, checklists, and risk focus. +4. **Performs risk assessment and recommendations**: Identify security/compliance risks and provide security advice and actionable remediation. +5. **Produces structured output**: Enable security staff to quickly review, sign off, or hand off to business/development for remediation. + +The platform covers six standard SSDLC phases with dedicated AI agents for each: + +1. **Requirements Phase Agent**: Analyze requirements documents to identify security requirements, compliance obligations (GDPR, PCI DSS, etc.), and perform initial risk analysis. +2. 
**Design Phase Agent**: Review architecture/design documents, perform automated threat modeling (STRIDE/DREAD), evaluate security architecture, encryption schemes, and access control designs. Conduct Security Design Review (SDR). +3. **Development Phase Agent**: Assess code against secure coding standards, review SAST findings, evaluate security controls (anti-injection, XSS prevention), and provide secure coding guidance. +4. **Testing Phase Agent**: Analyze SAST/DAST scan reports, interpret penetration test results, prioritize vulnerability fixes, and verify remediation completeness. +5. **Deployment Phase Agent**: Review deployment configurations, evaluate secret management, assess hardening measures, and perform pre-release security sign-off checks. +6. **Operations Phase Agent**: Monitor vulnerability feeds, assist incident response, track patch management, and audit security logs. + +The platform uses **LangGraph** to orchestrate these agents into configurable workflows — agents can run sequentially, in parallel, or conditionally based on project context. **LangChain** provides the unified LLM abstraction, tool integration, and RAG pipeline. **中文** -构建**安全团队专用 AI Agent**,首要方向为:**自动化评估所有需要安全团队审阅的表格、文档与报告,覆盖完整的安全软件开发生命周期(SSDLC)**。安全人员将项目相关文件提交给 Agent 后,Agent 能够: +构建一个**面向安全团队的 AI 驱动 SSDLC 平台**,首要方向为:**自动化覆盖安全软件开发生命周期的全部安全活动,评估所有需要安全团队审阅的表格、文档与报告**。安全人员将项目相关文件提交给 Agent 后,平台能够: -1. **解析多格式文件**:将 Word、PDF、Excel、PPT、图片等转为可被模型理解的中间格式(如 JSON/Markdown)。 +1. **解析多格式文件**:将 Word、PDF、Excel、PPT、SAST/DAST 报告、图片等转为可被模型理解的中间格式(如 JSON/Markdown)。 2. **结合知识库与策略**:基于内置/可配置的合规与策略知识库,理解「应该满足什么标准」。 -3. **SSDLC 阶段感知评估**:自动识别或接受 SSDLC 阶段(需求、设计、开发、测试、部署、运维),应用阶段专属评估逻辑、检查清单和风险关注点。 +3. **SSDLC 阶段感知评估**:自动识别或接受 SSDLC 阶段,应用阶段专属评估逻辑、检查清单和风险关注点。 4. **执行风险评估与建议**:识别与安全/合规相关的风险点,给出安全建议与可操作的整改方案。 5. **输出结构化结果**:便于安全人员快速复核、签字或转交业务/开发团队整改。 -### 3.2 Solution Value | 方案价值 +平台为六个标准 SSDLC 阶段配备专用 AI Agent: + +1. **需求阶段 Agent**:分析需求文档,识别安全需求、合规义务(GDPR、PCI DSS 等),执行初步风险分析。 +2. 
**设计阶段 Agent**:审阅架构/设计文档,执行自动化威胁建模(STRIDE/DREAD),评估安全架构、加密方案、权限设计。执行安全设计评审(SDR)。 +3. **开发阶段 Agent**:对照安全编码规范评估代码,审阅 SAST 发现,评估安全控件(防注入、XSS 防护),提供安全编码指导。 +4. **测试阶段 Agent**:分析 SAST/DAST 扫描报告,解读渗透测试结果,确定漏洞修复优先级,验证整改完整性。 +5. **部署阶段 Agent**:审阅部署配置,评估密钥管理,评估加固措施,执行上线前安全检查。 +6. **运维阶段 Agent**:监控漏洞情报,辅助应急响应,跟踪补丁管理,审计安全日志。 + +平台使用 **LangGraph** 将这些 Agent 编排为可配置的工作流——Agent 可根据项目上下文顺序执行、并行执行或条件执行。**LangChain** 提供统一的 LLM 抽象、工具集成和 RAG 管道。 + +### 3.2 SSDLC Phase Details | SSDLC 阶段详述 + +#### Phase 1: Requirements | 需求阶段 + +| Activity (English) | 活动 (中文) | Agent Capability | +| :--- | :--- | :--- | +| Define security requirements | 定义安全需求 | Extract security-relevant requirements from PRDs, user stories, BRDs | +| Identify compliance obligations | 识别合规要求 | Match requirements against GDPR, PCI DSS, SOC2, ISO 27001, etc. | +| Initial risk analysis | 初步风险分析 | Classify project risk level based on data sensitivity, exposure, and scope | +| Security requirements checklist | 安全需求清单 | Generate a checklist of security requirements that must be addressed | + +#### Phase 2: Design | 设计阶段 + +| Activity (English) | 活动 (中文) | Agent Capability | +| :--- | :--- | :--- | +| Security architecture review | 安全架构评审 | Evaluate architecture documents for security patterns and anti-patterns | +| Threat modeling (STRIDE/DREAD) | 威胁建模 | Automated STRIDE analysis on design documents; DREAD risk scoring | +| Access control & encryption design | 权限设计与加密方案 | Review IAM design, data flow encryption, key management proposals | +| Security Design Review (SDR) | 安全设计评审 | Structured SDR report with findings and recommendations | + +#### Phase 3: Development | 开发阶段 + +| Activity (English) | 活动 (中文) | Agent Capability | +| :--- | :--- | :--- | +| Secure coding standards assessment | 安全编码规范评估 | Check code/documents against OWASP Secure Coding Practices | +| SAST findings review | SAST 结果审阅 | Triage and interpret SAST tool output, reduce false positives | +| Built-in security controls | 内置安全控件 | Evaluate 
anti-injection, XSS prevention, CSRF protection implementations | +| Secure coding guidance | 安全编码指导 | Provide language-specific secure coding recommendations | + +#### Phase 4: Testing | 测试阶段 -- **Cost reduction / 降本**: Reduce time security staff spend on repetitive document review. -- **Speed / 提速**: Shorten the questionnaire → assessment → evidence → review cycle. -- **Reproducibility / 可复现**: Assessment logic and criteria can be captured in the knowledge base and Skills. +| Activity (English) | 活动 (中文) | Agent Capability | +| :--- | :--- | :--- | +| SAST report analysis | SAST 报告分析 | Parse and prioritize static analysis findings | +| DAST report analysis | DAST 报告分析 | Parse and interpret dynamic scan results | +| Penetration test review | 渗透测试审阅 | Analyze pentest reports, map findings to controls | +| Vulnerability fix verification | 漏洞修复验证 | Verify remediation evidence against original findings | + +#### Phase 5: Deployment / Release | 部署/发布阶段 + +| Activity (English) | 活动 (中文) | Agent Capability | +| :--- | :--- | :--- | +| Pre-release security review | 上线前安全评审 | Checklist-based review of all phase outputs | +| Configuration security | 配置安全 | Review deployment configs, secrets management, least privilege | +| Security hardening assessment | 安全加固评估 | Evaluate server/container hardening against CIS benchmarks | +| Release sign-off | 发布签字 | Generate structured sign-off report with risk summary | + +#### Phase 6: Operations / Maintenance | 运维/响应阶段 + +| Activity (English) | 活动 (中文) | Agent Capability | +| :--- | :--- | :--- | +| Vulnerability monitoring | 漏洞监控 | Analyze CVE feeds and vulnerability advisories against project stack | +| Incident response assistance | 应急响应辅助 | Provide structured incident analysis and response recommendations | +| Patch management tracking | 补丁管理跟踪 | Track vulnerability remediation progress and SLA compliance | +| Log audit analysis | 日志审计分析 | Analyze security logs for anomalies and compliance evidence | + +### 3.3 Solution Value | 
方案价值 + +- **Full lifecycle / 全生命周期**: Security coverage from day one (requirements) through production operations — not just pre-release review. +- **Cost reduction / 降本**: Reduce time security staff spend on repetitive document review across all SSDLC phases. +- **Speed / 提速**: Shorten the cycle time at each phase; enable parallel security review with development. +- **Intelligence / 智能化**: LangGraph-orchestrated agents maintain cross-phase context — a threat identified in design is automatically tracked through testing and deployment. +- **Reproducibility / 可复现**: Assessment logic and criteria are captured in the knowledge base, skills, and graph-based workflows. - **Openness / 开放**: Support multiple commercial and local LLMs to meet requirements for data residency and cost control. --- @@ -131,20 +218,23 @@ Build a **dedicated AI Agent for security teams**, with the primary focus on **a ### 4.1 Product Goals | 产品目标 -| Goal | Description | -| :------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------- | -| **Automated assessment**
自动化评估 | Support automatic parsing and risk assessment of common formats for security questionnaires, design documents, compliance evidence, and audit reports. | -| **Configurable scenarios**
可配置评估场景 | Use the knowledge base and Skills to configure different assessment criteria and check items by compliance framework, customer type, or project type. | -| **Multi-model support**
多模型支持 | Support mainstream commercial LLMs (e.g. ChatGPT, Qwen, Claude) and local/on-prem models (e.g. Ollama) through a unified interface. | -| **Actionable results**
结果可操作 | Output risk items, compliance gaps, concrete remediation suggestions, and (optionally) priority. | -| **SSDLC lifecycle**
SSDLC 全生命周期 | Cover all 6 SSDLC stages (Requirements, Design, Development, Testing, Deployment, Operations) with stage-specific skills, checklists, and flows. | +| Goal | Description | +| :--- | :--- | +| **SSDLC full coverage**
SSDLC 全阶段覆盖 | Provide AI-assisted security assessment across all 6 SSDLC phases with dedicated agents for each, with stage-specific skills, checklists, and flows. | +| **Intelligent orchestration**
智能编排 | Use LangGraph to create configurable, stateful agent workflows that maintain context across SSDLC phases. | +| **Automated assessment**
自动化评估 | Support automatic parsing and risk assessment of common document types: security questionnaires, design documents, SAST/DAST reports, pentest findings, deployment configs, compliance evidence, and audit reports. | +| **Configurable scenarios**
可配置评估场景 | Use the knowledge base and Skills to configure different assessment criteria by compliance framework, SSDLC phase, project type, customer type, or risk level. | +| **Multi-model support**
多模型支持 | Support mainstream commercial LLMs (e.g. ChatGPT, Qwen, Claude) and local/on-prem models (e.g. Ollama) through a unified LangChain interface. | +| **Actionable results**
结果可操作 | Output risk items, compliance gaps, threat models, remediation suggestions, and sign-off reports with traceability across phases. | ### 4.2 Success Metrics (Suggested) | 成功指标(建议) -- **Coverage**: Number of supported document types (e.g. 5+ common formats) and knowledge base entries. -- **Efficiency**: Average time from upload to report generation; time saved vs. manual review. -- **Usability**: Steps and time to complete one "upload → view report → make decision" loop. -- **Extensibility**: Configuration/development cost to add a new file type or assessment scenario. +- **SSDLC Coverage**: Number of SSDLC phases with active agent support (target: 6/6). +- **Coverage**: Number of supported document types (e.g. 8+ common formats) and knowledge base entries per phase. +- **Efficiency**: Average time from upload to report generation per phase; time saved vs. manual review. +- **Cross-phase traceability**: Percentage of findings that are tracked from identification to remediation across phases. +- **Usability**: Steps and time to complete one "upload → assess → review → sign-off" loop per phase. +- **Extensibility**: Configuration/development cost to add a new SSDLC phase workflow or assessment scenario. --- @@ -152,7 +242,7 @@ Build a **dedicated AI Agent for security teams**, with the primary focus on **a > **Full Architecture Document** > -> For detailed diagrams, data flow, deployment, and security architecture, see: +> For detailed diagrams, data flow, deployment, and security architecture, see: > 详细组件说明、Mermaid 架构图、数据流与时序图、集成视图、安全架构及部署视图见: > > **[ARCHITECTURE.md](./ARCHITECTURE.md)** @@ -161,11 +251,11 @@ Build a **dedicated AI Agent for security teams**, with the primary focus on **a **English** -The system uses a layered design: **Access** (REST API / MCP Server) → **Core** (Orchestrator, SSDLC Pipeline, Memory, Skills, Knowledge Base RAG, Parser) → **LLM abstraction** → **Cloud/local LLMs**. 
The orchestrator is built on **LangChain + LangGraph**, enabling stateful, graph-based agent workflows with conditional branching per SSDLC stage. Optional integrations: **AAD** (identity/SSO) and **ServiceNow** (project metadata). +The system uses a layered design: **Access** (REST API / MCP Server / CLI) → **SSDLC Orchestration** (LangGraph state machine with phase-specific agents, SSDLC Pipeline, Memory, Skills) → **Core Services** (Knowledge Base RAG, Parser) → **LLM Abstraction** (LangChain) → **Cloud/Local LLMs**. The orchestrator is built on **LangChain + LangGraph**, enabling stateful, graph-based agent workflows with conditional branching per SSDLC stage. Optional integrations: **AAD** (identity/SSO), **ServiceNow** (project metadata), and **SAST/DAST tools** (scan results ingestion). **中文** -系统采用分层设计:**接入层**(REST API / MCP Server)→ **核心**(任务编排、SSDLC 流水线、记忆体、Skill 层、知识库 RAG、文件解析)→ **LLM 抽象层** → **商用/本地 LLM**。编排引擎基于 **LangChain + LangGraph** 构建,支持有状态、图驱动的 Agent 工作流与 SSDLC 阶段条件分支。可选对接 **AAD**(身份/SSO)与 **ServiceNow**(项目元数据)。 +系统采用分层设计:**接入层**(REST API / MCP Server / CLI)→ **SSDLC 编排层**(LangGraph 状态机与阶段专用 Agent、SSDLC 流水线、记忆体、Skill 层)→ **核心服务**(知识库 RAG、文件解析)→ **LLM 抽象层**(LangChain)→ **商用/本地 LLM**。编排引擎基于 **LangChain + LangGraph** 构建,支持有状态、图驱动的 Agent 工作流与 SSDLC 阶段条件分支。可选对接 **AAD**(身份/SSO)、**ServiceNow**(项目元数据)及 **SAST/DAST 工具**(扫描结果接入)。 **High-Level Diagram | 架构图** @@ -175,50 +265,59 @@ The system uses a layered design: **Access** (REST API / MCP Server) → **Core* └───────────────────────────┬─────────────────────────────┘ │ ┌───────────────────────────▼─────────────────────────────┐ - │ Access Layer | 接入层 (API / MCP) │ + │ Access Layer | 接入层 (API / MCP / CLI) │ └───────────────────────────┬─────────────────────────────┘ │ ┌───────────────────────────────────────────▼───────────────────────────────────────────┐ - │ DocSentinel Core | 核心 │ - │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │ - │ │ Orchestrator│ │ Memory │ │ Skills │ 
│ KB (RAG) │ │ Parser │ │ - │ │ 任务编排 │ │ 记忆体 │ │ Skill 层 │ │ 知识库 │ │ 文件解析 │ │ - │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └─────┬───────┘ │ - │ │ │ │ │ │ │ - │ └────────────────┴────────────────┴────────────────┴────────────────┘ │ - │ │ │ - │ ┌───────────▼───────────┐ │ - │ │ LLM Abstraction Layer│ │ - │ └───────────┬───────────┘ │ - └──────────────────────────────────────────┼──────────────────────────────────────────────┘ - │ - ┌─────────────────────────────────────┼─────────────────────────────────────┐ - │ Commercial/Cloud LLM │ Local/On-prem LLM │ - │ ChatGPT / Claude / Qwen / Gemini │ Ollama / vLLM / ... │ - └─────────────────────────────────────────────────────────────────────────────┘ + │ SSDLC Orchestration (LangGraph) | SSDLC 编排层 │ + │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ + │ │ Require- │ │ Design │ │ Dev │ │ Test │ │ Deploy │ │ Ops │ │ + │ │ ments │ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │ + │ │ Agent │ │ │ │ │ │ │ │ │ │ │ │ + │ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │ + │ └─────────────┴────────────┴─────────────┴────────────┴────────────┘ │ + │ │ │ + │ ┌─────────────┐ ┌─────────────┐ │ ┌─────────────┐ ┌─────────────┐ │ + │ │ Memory │ │ Skills │ │ │ KB (RAG) │ │ Parser │ │ + │ │ 记忆体 │ │ Skill 层 │ │ │ 知识库 │ │ 文件解析 │ │ + │ └─────────────┘ └─────────────┘ │ └─────────────┘ └─────────────┘ │ + │ │ │ + │ ┌───────────▼───────────┐ │ + │ │ LLM Abstraction Layer │ │ + │ │ (LangChain) │ │ + │ └───────────┬───────────┘ │ + └────────────────────────────────────┼───────────────────────────────────────────────────┘ + │ + ┌───────────────────────────────┼───────────────────────────────────┐ + │ Commercial/Cloud LLM │ Local/On-prem LLM │ + │ ChatGPT / Claude / Qwen │ Ollama / vLLM / ... 
│ + └───────────────────────────────────────────────────────────────────┘ ``` ### 5.2 Component Index | 核心组件索引 -| Component | Role | Details | -| :------------------ | :-------------------------------------------------------- | :----------------------------------- | -| **Orchestrator** | LangGraph-based stateful agent graph; coordinates Parser, KB, Skills, LLM. | ARCHITECTURE.md § Component Design | -| **SSDLC Pipeline** | Stage-aware routing (6 stages); selects stage-specific skills and checklists. | ARCHITECTURE.md § SSDLC Pipeline | -| **Memory** | Manages working, episodic, and semantic memory. | ARCHITECTURE.md § Component Design | -| **Skills** | Reusable assessment capabilities (e.g. policy check, SSDLC stage skills). | ARCHITECTURE.md § Component Design | -| **Knowledge Base** | Multi-format ingestion, chunking, embedding, RAG. | ARCHITECTURE.md § Component Design | -| **Parser** | Converts files (PDF, Word, Excel, etc.) to Markdown/JSON. | ARCHITECTURE.md § Component Design | -| **LLM Abstraction** | Unified interface for model switching. | ARCHITECTURE.md § Component Design | -| **Integrations** | AAD (SSO), ServiceNow (metadata). | ARCHITECTURE.md § Integration Points | +| Component | Role | Details | +| :--- | :--- | :--- | +| **SSDLC Orchestrator** | LangGraph state machine coordinating phase agents with conditional routing and shared state. | ARCHITECTURE.md § Component Design | +| **SSDLC Pipeline** | Stage-aware routing (6 stages); selects stage-specific skills and checklists. | ARCHITECTURE.md § SSDLC Pipeline | +| **Phase Agents** | Six dedicated agents, each with phase-specific prompts, tools, and evaluation criteria. | ARCHITECTURE.md § SSDLC Agents | +| **Memory** | Manages working, episodic, cross-phase, and semantic state via LangGraph checkpointing. | ARCHITECTURE.md § Component Design | +| **Skills** | Reusable assessment capabilities (e.g. threat modeling, SAST triage, compliance check, SSDLC stage skills). 
| ARCHITECTURE.md § Component Design | +| **Knowledge Base** | Multi-format ingestion, chunking, embedding, hybrid RAG (vector + graph). | ARCHITECTURE.md § Component Design | +| **Parser** | Converts files (PDF, Word, Excel, SAST/DAST reports, etc.) to Markdown/JSON. | ARCHITECTURE.md § Component Design | +| **LLM Abstraction** | LangChain unified interface for model switching. | ARCHITECTURE.md § Component Design | +| **Integrations** | AAD (SSO), ServiceNow (metadata), SAST/DAST tool connectors. | ARCHITECTURE.md § Integration Points | ### 5.3 Data Flow (Summary) | 数据流(简要) -1. User submits **assessment task** (files + optional SSDLC stage / skill ID) via API or MCP. API returns `task_id` immediately (non-blocking). -2. **Parser** converts files to intermediate Markdown/text format (Docling or legacy). -3. **SSDLC Router** determines the lifecycle stage (auto-detect or user-specified) and selects stage-specific skill + checklist. -4. **LangGraph Orchestrator** executes the agent graph: Policy+History and Evidence nodes run **in parallel**, followed by Drafter and Reviewer nodes. -5. Structured **assessment report** (risks, gaps, remediations, confidence, sources, SSDLC stage) is stored. -6. User polls `GET /assessments/{task_id}` to retrieve the completed report. +1. User submits an **SSDLC assessment task** (files, plus optional SSDLC stage, skill ID, and project/scenario) via API or MCP. The API returns `task_id` immediately (non-blocking). +2. (Optional) Fetch **project metadata** from ServiceNow. +3. **Parser** converts files to intermediate Markdown/text format (Docling or legacy). +4. **SSDLC Router** determines the lifecycle stage (auto-detect or user-specified) and selects stage-specific skill + checklist. +5. **LangGraph Orchestrator** routes to the appropriate **Phase Agent(s)**. Policy+History and Evidence nodes run **in parallel**, followed by Drafter and Reviewer nodes. +6. 
Phase Agent(s) load **Knowledge Base** chunks (RAG) and **Skills**, call **LLM** with context. +7. Generate structured **assessment report** (risks, gaps, threat model, remediations, confidence, sources, SSDLC stage) with cross-phase traceability. +8. Results stored for **human-in-the-loop** review and sign-off. User polls `GET /assessments/{task_id}` to retrieve the completed report. --- @@ -226,51 +325,62 @@ The system uses a layered design: **Access** (REST API / MCP Server) → **Core* ### 6.1 Core Feature List | 核心功能列表 -| Module | Feature | Priority | -| :----------------- | :---------------------------------------------------------------------- | :------- | -| **Parser** | Upload Word / PDF / Excel / PPT and convert to JSON/Markdown. | P0 | -| **Parser** | OCR / Vision support for images. | P1 | -| **Parser** | Ingest architecture diagrams as text inputs (e.g. Mermaid.js `.mmd`) for Design-stage reviews. | P1 | -| **Knowledge Base** | Upload multi-format docs, parse, chunk, embed, and retrieve (RAG). | P0 | -| **Knowledge Base** | Metadata filtering (e.g. by framework, customer). | P1 | -| **Knowledge Base** | **Graph RAG**: Map relationships across internal policies and controls (e.g., Cloud Policy ↔ Data Privacy Policy) for deeper compliance insights. | P1 | -| **Assessment** | Select scenario, upload files, trigger assessment. | P0 | -| **Assessment** | Output structured report (Risks, Gaps, Remediation, **Confidence**). | P0 | -| **Assessment** | **Human-in-the-Loop**: Review, approve, reject, comment workflow. | P0 | -| **Assessment** | HITL feedback learning: allow auditors to **correct** findings and feed accepted corrections back into history/KB to reduce future false positives. | P1 | -| **Assessment** | Per-finding **Confidence Scores** + evidence links (page/paragraph citations) to speed up manual verification and benchmarking. | P1 | -| **LLM** | Configurable commercial LLMs (OpenAI, Claude, etc.). 
| P0 | -| **LLM** | Configurable local models (Ollama). | P0 | -| **Skill** | **Skill/Persona Management**: Create custom roles and import templates. | P0 | -| **Skill** | Built-in personas (e.g. SOC2 Auditor, AppSec Engineer). | P0 | -| **Orchestrator** | **LangGraph**: Stateful graph-based agent orchestration with conditional branching. | P0 | -| **SSDLC** | **Requirements Stage**: Security requirements, compliance mapping, threat modeling inputs. | P0 | -| **SSDLC** | **Design Stage**: Security architecture review, STRIDE/DREAD, encryption/permission design, SDR. | P0 | -| **SSDLC** | Design-stage threat modeling integration: support PyTM exports and Mermaid.js diagrams to help the agent “see” architecture, data flows, and trust boundaries. | P1 | -| **SSDLC** | **Development Stage**: Secure coding standards, built-in controls (anti-injection, XSS). | P0 | -| **SSDLC** | **Testing Stage**: SAST/DAST report review, penetration test findings, vulnerability verification. | P0 | -| **SSDLC** | **Deployment Stage**: Release readiness review, config security, key management, hardening. | P0 | -| **SSDLC** | **Operations Stage**: Vulnerability monitoring, incident response, patch management, log audit. | P0 | -| **SSDLC** | **Auto-detect stage** from document content or accept explicit stage parameter. | P1 | -| **Memory** | **History Reuse**: Retrieve past similar answers. | P1 | -| **Access** | REST API + MCP Server. | P0 | -| **Integrations** | ServiceNow: Read project metadata. | P0 | -| **Integrations** | ServiceNow: Write back results / Webhook trigger. | P1 | -| **Integrations** | Automated remediation tracking: create and sync remediation items to Jira or GitHub Issues (ticket links in report). | P1 | -| **IAM** | AAD (Azure AD) Login & SSO. | P0 | -| **IAM** | RBAC (Analyst, Lead, Project Owner, Admin, API Consumer). | P0 | -| **IAM** | API Authentication (Bearer Token / API Key). | P0 | -| **IAM** | Data isolation by project/role. 
| P0 | +| Module | Feature | Priority | +| :--- | :--- | :--- | +| **SSDLC Orchestrator** | LangGraph-based state machine with 6 phase agents and conditional routing. | P0 | +| **SSDLC Orchestrator** | Cross-phase state management and finding traceability. | P0 | +| **SSDLC Orchestrator** | Configurable workflows: sequential, parallel, or selective phase execution. | P1 | +| **Requirements Agent** | Analyze requirements docs for security requirements and compliance obligations. | P0 | +| **Design Agent** | Automated threat modeling (STRIDE/DREAD) from architecture documents. | P0 | +| **Design Agent** | Security Design Review (SDR) report generation. | P0 | +| **Design Agent** | Threat modeling integration: support PyTM exports and Mermaid.js diagrams to help the agent “see” architecture, data flows, and trust boundaries. | P1 | +| **Development Agent** | Secure coding assessment against OWASP standards. | P0 | +| **Development Agent** | SAST findings triage and interpretation. | P1 | +| **Testing Agent** | SAST/DAST report parsing and vulnerability prioritization. | P0 | +| **Testing Agent** | Penetration test report analysis and remediation tracking. | P1 | +| **Deployment Agent** | Pre-release security checklist and configuration review. | P0 | +| **Deployment Agent** | CIS benchmark assessment for hardening. | P1 | +| **Operations Agent** | Vulnerability monitoring and CVE analysis against project stack. | P1 | +| **Operations Agent** | Incident response assistance and log audit. | P2 | +| **SSDLC** | **Auto-detect stage** from document content or accept explicit stage parameter. | P1 | +| **Parser** | Upload Word / PDF / Excel / PPT / SAST/DAST reports and convert to JSON/Markdown. | P0 | +| **Parser** | OCR / Vision support for images. | P1 | +| **Parser** | Ingest architecture diagrams as text inputs (e.g. Mermaid.js `.mmd`) for Design-stage reviews. | P1 | +| **Knowledge Base** | Upload multi-format docs, parse, chunk, embed, and retrieve (RAG). 
| P0 | +| **Knowledge Base** | Metadata filtering (e.g. by framework, SSDLC phase, project, customer). | P1 | +| **Knowledge Base** | Phase-specific knowledge collections (requirements policies, design patterns, coding standards, etc.). | P0 | +| **Knowledge Base** | **Graph RAG**: Map relationships across internal policies and controls for deeper compliance insights. | P1 | +| **Assessment** | Select SSDLC phase and scenario, upload files, trigger assessment. | P0 | +| **Assessment** | Output structured report (Risks, Gaps, Threat Model, Remediation, Confidence). | P0 | +| **Assessment** | **Human-in-the-Loop**: Review, approve, reject, comment workflow. | P0 | +| **Assessment** | HITL feedback learning: allow auditors to **correct** findings and feed accepted corrections back into history/KB to reduce future false positives. | P1 | +| **Assessment** | Per-finding **Confidence Scores** + evidence links (page/paragraph citations) to speed up manual verification and benchmarking. | P1 | +| **LLM** | Configurable commercial LLMs (OpenAI, Claude, etc.) via LangChain. | P0 | +| **LLM** | Configurable local models (Ollama) via LangChain. | P0 | +| **Skill** | **Skill/Persona Management**: Create custom roles and import templates. | P0 | +| **Skill** | Built-in personas per SSDLC phase (e.g. Threat Modeler, Secure Code Reviewer, Pentest Analyst, SOC2 Auditor, AppSec Engineer). | P0 | +| **Memory** | LangGraph checkpointing for cross-phase state persistence. | P0 | +| **Memory** | **History Reuse**: Retrieve past similar assessments. | P1 | +| **Access** | REST API + MCP Server for agent integration. | P0 | +| **Integrations** | ServiceNow: Read project metadata. | P0 | +| **Integrations** | ServiceNow: Write back results / Webhook trigger. | P1 | +| **Integrations** | SAST/DAST tool connectors (SonarQube, Checkmarx, Burp, etc.). 
| P1 | +| **Integrations** | Automated remediation tracking: create and sync remediation items to Jira or GitHub Issues (ticket links in report). | P1 | +| **IAM** | AAD (Azure AD) Login & SSO. | P0 | +| **IAM** | RBAC (Analyst, Lead, Project Owner, Admin, API Consumer). | P0 | +| **IAM** | API Authentication (Bearer Token / API Key). | P0 | +| **IAM** | Data isolation by project/role. | P0 | ### 6.2 User Stories (Examples) | 用户故事(示例) -- **As a security team member**, I want to upload a Security Questionnaire (Excel/Word) and an architecture document (PDF) **so that** the Agent can automatically identify gaps vs. policy/standards and suggest remediation. -- **As a security lead**, I want to select or link a project from ServiceNow when starting an assessment **so that** the system auto-fills project type and compliance scope. -- **As enterprise IT**, I want to configure the Agent to use only a local Ollama model **so that** assessment content never leaves the internal network. -- **As a developer**, I want to submit documents via REST API and receive assessment results in JSON **so that** the Agent can be integrated into existing ticketing workflows. -- **As a security architect**, I want to submit a system design document and specify "Design" as the SSDLC stage **so that** the Agent applies STRIDE/DREAD threat modeling and checks encryption/permission design against our standards. -- **As a DevSecOps engineer**, I want to submit SAST/DAST scan results at the "Testing" stage **so that** the Agent triages findings, maps them to compliance requirements, and prioritizes remediation. -- **As an operations engineer**, I want to submit incident response logs at the "Operations" stage **so that** the Agent evaluates our response procedures against best practices and identifies process gaps. 
+- **As a security team member**, I want to upload a project's requirements document (or a Security Questionnaire) and have the Requirements Agent automatically identify missing security requirements, compliance obligations, and gaps vs. policy/standards **so that** I can provide early feedback before design begins. +- **As a security architect**, I want to submit an architecture document to the Design Agent and receive an automated STRIDE threat model **so that** I can focus on reviewing and validating threats rather than creating the initial model from scratch. +- **As a security lead**, I want to run a full SSDLC assessment across multiple phases for a project (or select a project from ServiceNow) **so that** I get a unified view of security posture from requirements through deployment. +- **As a developer**, I want to submit my code review package and SAST results to the Development Agent via REST API **so that** I get prioritized findings with secure coding guidance specific to my language and framework, in JSON format for integration into ticketing workflows. +- **As a pentest manager**, I want to upload penetration test reports to the Testing Agent **so that** findings are automatically mapped to the original threat model and remediation is tracked. +- **As an operations engineer**, I want the Operations Agent to analyze new CVE feeds against our deployment stack and evaluate incident response logs **so that** I know which vulnerabilities require immediate patching and can identify process gaps. +- **As enterprise IT**, I want to configure the platform to use only a local Ollama model **so that** all assessment data stays within the internal network. +- **As a DevSecOps engineer**, I want to integrate the assessment API into our CI/CD pipeline **so that** security checks run automatically at each stage. 
- **As a project manager**, I want the Agent to auto-detect the SSDLC stage from the uploaded document type **so that** I don't need to manually specify it every time. ### 6.3 SSDLC Stage Definitions | SSDLC 阶段定义 @@ -294,15 +404,15 @@ Each stage maps to one or more **built-in SSDLC Skills** that define stage-speci ### 7.1 General NFRs | 通用非功能需求 -| Category | Requirement (English) | 要求 (中文) | -| :--------------------- | :-------------------------------------------------------------------------------------------- | :------------------------------------------ | -| **Security & Privacy** | Support fully local/on-prem deployment and local LLM; support audit logs. | 支持纯本地部署与本地 LLM;支持审计日志。 | -| **Performance** | Acceptable end-to-end latency for a standard assessment (e.g. 10-page PDF + 1 questionnaire). | 单次评估时延可接受(具体目标待定)。 | -| **Maintainability** | KB, Skills, and LLM config maintainable via config/UI without code changes. | 知识库、Skill、LLM 可配置,无需改代码扩展。 | -| **Observability** | Log model usage, tokens, duration, and errors. | 记录模型、token、耗时与错误。 | -| **Auth & Isolation** | RBAC and data isolation by project/role; fine-grained auth via AAD/ServiceNow. | 按角色与项目隔离数据;细粒度授权。 | -| **Deployment** | Support on-prem/private deployment; connectivity to AAD/ServiceNow/LLM. | 支持内网部署;需连通 AAD/ServiceNow/LLM。 | -| **Open Source** | Architecture aligns with mainstream open-source Agent projects. | 架构参考主流开源项目,便于社区贡献。 | +| Category | Requirement (English) | 要求 (中文) | +| :--- | :--- | :--- | +| **Security & Privacy** | Support fully local/on-prem deployment and local LLM; support audit logs. | 支持纯本地部署与本地 LLM;支持审计日志。 | +| **Performance** | Acceptable end-to-end latency for single-phase assessment; parallel phase execution for full SSDLC. | 单阶段评估时延可接受;全 SSDLC 评估支持并行执行。 | +| **Maintainability** | KB, Skills, LangGraph workflows, and LLM config maintainable via config/API without code changes. 
| 知识库、Skill、LangGraph 工作流、LLM 可配置,无需改代码扩展。 | +| **Observability** | Log model usage, tokens, duration, errors, and agent state transitions. | 记录模型、token、耗时、错误及 Agent 状态转换。 | +| **Auth & Isolation** | RBAC and data isolation by project/role; fine-grained auth via AAD/ServiceNow. | 按角色与项目隔离数据;细粒度授权。 | +| **Deployment** | Support on-prem/private deployment; connectivity to AAD/ServiceNow/LLM/SAST/DAST tools. | 支持内网部署;需连通 AAD/ServiceNow/LLM/SAST/DAST 工具。 | +| **Open Source** | Architecture aligns with mainstream open-source Agent projects (LangChain/LangGraph ecosystem). | 架构对齐 LangChain/LangGraph 生态,便于社区贡献。 | ### 7.2 Security Requirements and Controls (Non-Functional) | 安全需求与控制 @@ -322,7 +432,7 @@ This section defines security controls for the **system itself** (not the docume - **IAM-02**: Strong auth: AAD/OIDC SSO; API Bearer JWT or API Key (no secrets in URL). - **IAM-03**: RBAC with least privilege default. - **IAM-04**: Session/Token timeout and revocation. -- **IAM-05**: Sensitive operations (e.g. delete KB) require confirmation or higher privilege. +- **IAM-05**: Sensitive operations (e.g. delete KB, modify workflows) require confirmation or higher privilege. **7.2.3 Data Security | 数据安全** @@ -343,9 +453,9 @@ This section defines security controls for the **system itself** (not the docume **7.2.5 Operations and Audit | 运维与审计** - **OPS-01**: Audit logs (who, what, when, resource) protected from tampering. -- **OPS-02**: Operational logs (performance, errors) without sensitive content. +- **OPS-02**: Operational logs (performance, errors, agent state transitions) without sensitive content. - **OPS-03**: Security event detection and alerting. -- **OPS-04**: Backup and recovery for critical data. +- **OPS-04**: Backup and recovery for critical data (KB, assessment history, LangGraph checkpoints). **7.2.6 Supply Chain | 供应链** @@ -355,45 +465,77 @@ This section defines security controls for the **system itself** (not the docume --- -## 8. 
References (Open-Source AI Agent Projects) | 参考与借鉴 +## 8. Technology Stack | 技术栈 + +### 8.1 Agent Orchestration | Agent 编排 + +| Component | Technology | Purpose | +| :--- | :--- | :--- | +| **Workflow Engine** | LangGraph | Stateful, graph-based agent orchestration with conditional routing, parallel execution, and checkpointing | +| **LLM Framework** | LangChain | Unified LLM abstraction, prompt management, tool integration, RAG chains | +| **State Management** | LangGraph Checkpointing | Cross-phase state persistence, conversation memory, assessment context | + +### 8.2 Core Stack | 核心技术栈 + +| Component | Technology | Purpose | +| :--- | :--- | :--- | +| **Language** | Python 3.10+ | Primary development language | +| **Web/API** | FastAPI | Async REST API with auto OpenAPI | +| **Vector DB** | ChromaDB | Chunk-level similarity search | +| **Graph RAG** | LightRAG | Entity-relationship aware retrieval | +| **Embeddings** | sentence-transformers | Vector embeddings for RAG | +| **Parsing** | Docling (primary) + legacy fallback | Multi-format document parsing | +| **LLM Providers** | OpenAI, Ollama | Cloud and local LLM support | + +--- + +## 9. References | 参考与借鉴 -- **Docker Agent (cagent)**: Reference for "pluggable LLM + tools". -- **VoltAgent AI Agent Platform**: Reference for "production Agent platform" architecture. -- **GitHub Agentic Workflows**: Reference for "human-in-the-loop" and safety policies. -- **Agent Memory Architecture**: Reference for layered memory and RAG. +- **LangGraph Documentation**: Reference for stateful agent orchestration, conditional routing, and multi-agent patterns. +- **LangChain Documentation**: Reference for LLM abstraction, RAG patterns, and tool integration. +- **NIST SSDF (Secure Software Development Framework)**: Reference for SSDLC phase definitions and security activities. +- **OWASP SAMM (Software Assurance Maturity Model)**: Reference for security practice areas across the SDLC. 
+- **Microsoft SDL**: Reference for security development lifecycle practices. +- **STRIDE/DREAD**: Reference for threat modeling methodology. --- -## 9. Next Steps | 后续步骤 +## 10. Next Steps | 后续步骤 -1. **Technology Choices**: Finalize Python, LangChain/LangGraph, Vector DB, Parsing libs, LLM SDK. -2. **MVP Scope**: "One file type + Single KB + One Skill + 1 LLM" end-to-end loop. Then add AAD & ServiceNow. -3. **Enterprise Integration**: Align with IT on AAD registration and ServiceNow API access. -4. **Pilot**: Run with 1-2 teams to gather feedback. -5. **Open Source**: Release as "DocSentinel" after MVP stabilization. +1. **LangGraph Integration**: Implement LangGraph state machine with phase agent nodes, conditional edges, and shared state. +2. **Phase Agent MVP**: Implement Requirements and Design phase agents first (highest Shift-Left value). +3. **Knowledge Base per Phase**: Build phase-specific knowledge collections (requirements policies, design patterns, coding standards, testing guides, deployment checklists, operations playbooks). +4. **SAST/DAST Connectors**: Build parsers for common tool output formats (SonarQube, Checkmarx, Burp Suite, OWASP ZAP). +5. **Cross-Phase Traceability**: Implement finding linkage from threat model → test case → deployment check → monitoring rule. +6. **Enterprise Integration**: Align with IT on AAD registration and ServiceNow API access. +7. **Pilot**: Run with 1-2 teams across a full SSDLC cycle to gather feedback. +8. **Open Source**: Release as "DocSentinel" after MVP stabilization. --- -## 10. Open Questions and Deliverables | 待澄清问题与建议产出 +## 11. Open Questions and Deliverables | 待澄清问题与建议产出 -### 10.1 Open Questions | 待澄清问题 +### 11.1 Open Questions | 待澄清问题 -- **API Contract**: Request/response shape for core APIs? -- **Report Schema**: Concrete JSON schema for findings? -- **Skill Contract**: Input/output for the first Skill? -- **KB Chunking**: Strategy and parameters? -- **ServiceNow**: Concrete tables/APIs mapping? 
-- **Limits**: File size, concurrency, rate limits? -- **License**: Project license (Apache 2.0 / MIT)? +- **LangGraph Workflow Schema**: How to define and persist custom SSDLC workflow configurations? +- **Phase Agent Granularity**: Should each phase have a single agent or multiple sub-agents (e.g. Design → Threat Modeler + Architecture Reviewer)? +- **SAST/DAST Integration**: Which tool output formats to support first? Standard SARIF format? +- **Cross-Phase State**: How much context to carry between phases? Full report or summarized findings? +- **Report Schema**: Concrete JSON schema for phase-specific and cross-phase findings? +- **Skill Contract**: Input/output for the first phase-specific Skills? +- **KB Partitioning**: Separate vector collections per SSDLC phase or unified with metadata filtering? +- **Limits**: File size, concurrency, rate limits per phase? -### 10.2 Recommended Deliverables | 建议产出文档 +### 11.2 Recommended Deliverables | 建议产出文档 1. **Technology & Architecture**: `docs/01-architecture-and-tech-stack.md` 2. **API Specification**: `docs/02-api-specification.yaml` 3. **Report & Skill Contract**: `docs/03-assessment-report-and-skill-contract.md` 4. **Integration Guide**: `docs/04-integration-guide.md` 5. **Deployment Runbook**: `docs/05-deployment-runbook.md` -6. **Security Implementation**: `SECURITY.md` and secure coding guidelines. +6. **Agent Integration (MCP)**: `docs/06-agent-integration.md` +7. **SSDLC Workflow Guide**: `docs/07-ssdlc-workflow-guide.md` *(new)* +8. **Security Implementation**: `SECURITY.md` and secure coding guidelines. 
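
The open **Report Schema** question and the **Cross-Phase Traceability** step (threat model → test case → deployment check → monitoring rule) could be explored with a small prototype before a schema is fixed. The following is a minimal sketch under stated assumptions, not the project's schema: the six phase names come from this spec, while `Finding`, `derived_from`, `trace_to_origin`, and the sample ids are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# The six phase names come from the spec; every other name below is hypothetical.
SSDLC_PHASES = (
    "requirements", "design", "development",
    "testing", "deployment", "operations",
)


@dataclass(frozen=True)
class Finding:
    """One assessment finding, optionally linked to the finding it derives from."""
    finding_id: str
    phase: str
    title: str
    derived_from: Optional[str] = None  # id of the upstream finding, if any

    def __post_init__(self) -> None:
        # Reject phases outside the six SSDLC phases named in the spec.
        if self.phase not in SSDLC_PHASES:
            raise ValueError(f"unknown SSDLC phase: {self.phase!r}")


def trace_to_origin(findings: List[Finding], finding_id: str) -> List[str]:
    """Follow derived_from links upstream; return ids ordered origin-first."""
    by_id: Dict[str, Finding] = {f.finding_id: f for f in findings}
    chain: List[str] = []
    current: Optional[str] = finding_id
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle detected at {current!r}")
        chain.append(current)
        # An unknown id raises KeyError here, which surfaces broken links early.
        current = by_id[current].derived_from
    return chain[::-1]


# Hypothetical sample data: a design threat, its test case, and a deployment check.
findings = [
    Finding("THREAT-1", "design", "Spoofing risk on login API"),
    Finding("TEST-7", "testing", "Auth bypass test case", derived_from="THREAT-1"),
    Finding("DEPLOY-3", "deployment", "MFA enforced in config", derived_from="TEST-7"),
]
print(trace_to_origin(findings, "DEPLOY-3"))  # ['THREAT-1', 'TEST-7', 'DEPLOY-3']
```

A single upstream link per finding keeps the sketch short; if one finding can derive from several upstream findings, a list of ids and a graph traversal would likely be needed instead.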
--- diff --git a/docs/01-architecture-and-tech-stack.md b/docs/01-architecture-and-tech-stack.md index b290450..3eeba83 100644 --- a/docs/01-architecture-and-tech-stack.md +++ b/docs/01-architecture-and-tech-stack.md @@ -3,8 +3,8 @@ | | | | :-------------- | :------------------------------------------------------- | | **Status** | [x] Updated (v4.0 aligned) \| [ ] In Review \| [ ] Approved | -| **Version** | 0.5 | -| **Related PRD** | Section 5 System Architecture, Section 9 Next Steps | +| **Version** | 1.0 | +| **Related PRD** | Section 5 System Architecture, Section 8 Tech Stack | --- @@ -22,55 +22,65 @@ | Item | Choice | Version | Notes | | :------------ | :---------- | :------ | :---------------------------------------------------- | -| **Framework** | FastAPI | ≥0.109 | Async, auto OpenAPI | -| **Server** | Uvicorn | ≥0.27 | ASGI server | +| **Framework** | FastAPI | >=0.109 | Async, auto OpenAPI | +| **Server** | Uvicorn | >=0.27 | ASGI server | | **Docs** | OpenAPI 3.x | — | Generated by FastAPI; see `02-api-specification.yaml` | -### 1.3 Agent and LLM | Agent 与 LLM +### 1.3 Agent Orchestration | Agent 编排 + +| Item | Choice | Version | Notes | +| :------------------ | :-------- | :------ | :------------------------------------------ | +| **Workflow Engine** | LangGraph | Latest | Stateful graph-based agent orchestration; StateGraph with conditional routing, parallel execution, checkpointing | +| **LLM Framework** | LangChain | Latest | Unified LLM abstraction, prompt templates, tool integration, RAG chains | +| **State Management** | LangGraph Checkpointing | — | Cross-phase state persistence; MemorySaver (MVP) or DB-backed (production) | + +### 1.4 LLM Providers | LLM 提供商 | Item | Choice | Version | Notes | | :------------------ | :------------------- | :------ | :------------------------------------------ | -| **Orchestrator** | LangGraph (`app/agent`) | Latest | Stateful graph-based agent workflow; SSDLC-aware routing with conditional edges | -| **Agent 
Framework** | LangChain | Latest | Foundation for LangGraph nodes; Runnable interface for LLM calls | -| **LLM Abstraction** | LangChain | Latest | Unified interface for OpenAI/Ollama | -| **Supported LLMs** | OpenAI, Ollama | — | OpenAI (and compatible APIs) + Ollama; LLM client is cached via @lru_cache | +| **Cloud LLM** | OpenAI (ChatGPT) | — | Via LangChain `ChatOpenAI`; compatible with Azure OpenAI, Claude, Qwen | +| **Local LLM** | Ollama | — | Via LangChain `ChatOllama`; data stays on-prem | +| **LLM Client** | Cached | — | `@lru_cache` — one client per process lifetime | -### 1.4 Vector Store and RAG | 向量库与 RAG +### 1.5 Vector Store and RAG | 向量库与 RAG | Item | Choice | Version | Notes | | :------------- | :----------------- | :------ | :------------------------------------------ | -| **Vector DB** | Chroma | ≥0.4 | Embedded, persisted to `CHROMA_PERSIST_DIR` | +| **Vector DB** | Chroma | >=0.4 | Embedded, persisted to `CHROMA_PERSIST_DIR`; phase-specific collections | | **Embeddings** | HuggingFace | — | `sentence-transformers/all-MiniLM-L6-v2` | | **Chunking** | RecursiveCharacter | — | 1024 chars, 128 overlap (configurable) | | **Graph RAG** | LightRAG | — | Entity-relationship aware retrieval; `ENABLE_GRAPH_RAG` | -### 1.5 Document Parsing | 文档解析 +### 1.6 Document Parsing | 文档解析 | Format | Library | Version | Notes | | :---------- | :------------- | :------ | :---------------------------------------- | | **All (primary)** | Docling | Latest | Table/heading preserving; OCR capable; `PARSER_ENGINE=auto` | -| **PDF** | PyMuPDF (fitz) | ≥1.23 | Fallback when Docling unavailable | -| **Word** | python-docx | ≥1.1 | | -| **Excel** | openpyxl | ≥3.1 | | -| **PPT** | python-pptx | ≥0.6 | | +| **PDF** | PyMuPDF (fitz) | >=1.23 | Fallback when Docling unavailable | +| **Word** | python-docx | >=1.1 | | +| **Excel** | openpyxl | >=3.1 | | +| **PPT** | python-pptx | >=0.6 | | +| **SAST/DAST** | Custom parsers | — | SARIF, SonarQube JSON, Checkmarx XML, Burp XML, ZAP 
| | **Text/MD** | Built-in | — | `.txt`, `.md` | | **Router** | Custom | — | Dispatches by extension in `parse_file()` | -### 1.6 Identity and Integrations | 身份与集成 +### 1.7 Identity and Integrations | 身份与集成 | Item | Choice | Notes | | :----------- | :---------------- | :-------------------------------------------------------------------- | | **Auth** | OAuth2/OIDC (AAD) | Placeholder in `app/integrations`; see `docs/04-integration-guide.md` | | **Metadata** | ServiceNow | Placeholder in `app/integrations`; see `docs/04-integration-guide.md` | +| **SAST/DAST** | Tool connectors | SonarQube, Checkmarx, Burp Suite, OWASP ZAP; see `docs/04-integration-guide.md` | | **Config** | pydantic-settings | `app/core/config.py` reads `.env` | -### 1.7 Storage and Cache | 存储与缓存 +### 1.8 Storage and Cache | 存储与缓存 -| Item | Choice | Notes | -| :--------------- | :---------- | :------------------------------------------------- | -| **Task State** | Memory dict | MVP only; assessments run as background tasks; replace with Redis/DB for production | -| **Vector Store** | Local disk | Persisted to `CHROMA_PERSIST_DIR` | -| **Files** | Transient | Stream processing; parsed content goes to KB/Agent | +| Item | Choice | Notes | +| :--------------- | :-------------------- | :------------------------------------------------- | +| **Task State** | LangGraph Checkpoints | Persistent state across phases; MemorySaver for MVP, DB-backed for production | +| **Vector Store** | Local disk | Persisted to `CHROMA_PERSIST_DIR`; separate collections per SSDLC phase | +| **Files** | Transient | Stream processing; parsed content goes to KB/Agent | +| **Checkpoints** | Local disk / DB | `LANGGRAPH_CHECKPOINT_DIR` for MVP; PostgreSQL for production | --- @@ -81,20 +91,26 @@ Aligned with PRD Section 5.1. 
```text -[ Access Layer ] API (FastAPI) / MCP Server (stdio) +[ Access Layer ] API (FastAPI) / MCP Server (stdio) / CLI | -[ Core Layer ] LangGraph Orchestrator - | ├── SSDLC Pipeline (6-stage router) - | ├── Memory (in-memory dict) - | ├── Skills (persona + SSDLC stage skills) - | ├── Knowledge Base (Vector + Graph RAG) - | └── Parser (Docling / legacy) +[ SSDLC Orchestration ] LangGraph StateGraph + | ├── Phase Router (conditional edges) + | ├── SSDLC Pipeline (6-stage router) + | ├── Requirements Agent + | ├── Design Agent + | ├── Development Agent + | ├── Testing Agent + | ├── Deployment Agent + | ├── Operations Agent + | └── Reviewer Agent | -[ LLM Layer ] Abstraction (LangChain) - | ├── OpenAI (and compatible APIs) - | └── Ollama (Local) +[ Core Services ] Knowledge Base (Vector + Graph RAG) | Parser (Docling / legacy) | Memory | Skills (persona + SSDLC stage skills) | -[ Integrations ] AAD (Auth, placeholder) | ServiceNow (Metadata, placeholder) +[ LLM Layer ] LangChain Abstraction + | ├── OpenAI / Claude / Qwen (Cloud) + | └── Ollama / vLLM (Local) + | +[ Integrations ] AAD (Auth) | ServiceNow (Metadata) | SAST/DAST Tools ``` ### 2.2 Components and Interfaces | 组件职责与接口 @@ -102,12 +118,13 @@ Aligned with PRD Section 5.1. | Component | Responsibility | Interface | | :----------------- | :--------------------------------------------- | :-------------------------------------------- | | **API Layer** | Auth, routing, rate limiting, validation. | REST, see `02-api-specification.yaml` | -| **Orchestrator** | Task lifecycle, invoking Parser/KB/Skill/LLM. | Internal Python API | -| **Memory** | Session context, working memory. | Read/Write (e.g. `get/set_session`) | -| **Skills** | Specific assessment logic (e.g. Policy Check). | I/O Contract, see `03-assessment-report...md` | -| **Knowledge Base** | Ingest (Parse→Chunk→Embed) and Retrieve. | `upload()`, `query(text)` | -| **Parser** | File to unified JSON/Markdown. 
| `parse(file_stream)` → Schema 03 | -| **LLM Layer** | Unified chat/completion API. | `invoke(prompt, context)` | +| **LangGraph Orchestrator** | SSDLC workflow, phase routing, state management, checkpointing. | Internal Python API; `StateGraph` definition | +| **Phase Agents** | Phase-specific assessment logic via LangChain tools. | LangGraph node functions; shared `SSDLCState` | +| **Memory** | Cross-phase context, LangGraph checkpoints. | LangGraph state + checkpointer | +| **Skills** | Phase-specific assessment capabilities. | I/O Contract, see `03-assessment-report...md` | +| **Knowledge Base** | Ingest (Parse->Chunk->Embed) and Retrieve per phase. | `upload()`, `query(text, collection)` | +| **Parser** | File to unified JSON/Markdown (including SAST/DAST reports). | `parse(file_stream)` -> Schema 03 | +| **LLM Layer** | Unified chat/completion API via LangChain. | `invoke(prompt, context)` / LangChain Tools | ### 2.3 KB Chunking Strategy | 知识库切块策略 @@ -116,29 +133,40 @@ Aligned with PRD Section 5.1. | **Chunk Size** | 1024 | Characters or tokens per chunk. | | **Overlap** | 128 | Overlap to maintain context at boundaries. | | **Splitter** | Recursive | Splits by paragraphs, then sentences. | -| **Metadata** | Yes | Filename, page number, section headers. | +| **Metadata** | Yes | Filename, page number, section headers, SSDLC phase tag. | +| **Collections** | Per phase | `kb_requirements`, `kb_design`, `kb_development`, `kb_testing`, `kb_deployment`, `kb_operations` | --- ## 3. 
Module Layout | 目录结构 -Current implementation structure: +Target implementation structure: ```text DocSentinel/ ├── app/ -│ ├── api/ # FastAPI routes: health, assessments, kb -│ ├── core/ # Configuration (pydantic-settings) -│ ├── agent/ # LangGraph orchestrator -│ │ ├── orchestrator.py # LangGraph graph definition -│ │ ├── ssdlc/ # SSDLC pipeline: router, stage skills, checklists -│ │ ├── skills_registry.py -│ │ └── skills_service.py -│ ├── kb/ # KnowledgeBaseService (Chroma + chunking) +│ ├── api/ # FastAPI routes: health, assessments, kb, skills +│ ├── core/ # Configuration (pydantic-settings), guardrails +│ ├── agent/ # LangGraph orchestrator and phase agents +│ │ ├── orchestrator.py # LangGraph StateGraph definition +│ │ ├── state.py # SSDLCState TypedDict +│ │ ├── router.py # Phase routing logic +│ │ ├── ssdlc/ # SSDLC pipeline: router, stage skills, checklists +│ │ ├── agents/ # Phase agent implementations +│ │ │ ├── requirements.py +│ │ │ ├── design.py +│ │ │ ├── development.py +│ │ │ ├── testing.py +│ │ │ ├── deployment.py +│ │ │ └── operations.py +│ │ ├── reviewer.py # Cross-phase review agent +│ │ ├── skills_registry.py # Built-in skills per SSDLC phase +│ │ └── skills_service.py # Skill CRUD and management +│ ├── kb/ # KnowledgeBaseService (Chroma + chunking + phase collections) │ │ └── graph_rag.py # LightRAG integration -│ ├── llm/ # LLM factory and invocation -│ ├── parser/ # Parsers for PDF, docx, xlsx, pptx, txt -│ ├── integrations/ # Placeholders: AAD, ServiceNow +│ ├── llm/ # LangChain LLM factory and invocation +│ ├── parser/ # Parsers: Docling + legacy + SAST/DAST report parsers +│ ├── integrations/ # AAD, ServiceNow, SAST/DAST tool connectors │ ├── models/ # Pydantic models for API and internal data │ ├── main.py # App entry point │ └── mcp_server.py # MCP Server @@ -159,29 +187,33 @@ Maintained in `requirements.txt`. 
Key architectural dependencies: fastapi>=0.109.0 uvicorn[standard]>=0.27.0 -# Agent / LLM -langchain>=0.1.0 +# Agent Orchestration +langgraph>=0.2.0 +langchain>=0.2.0 langchain-community langchain-openai langgraph # Graph-based agent orchestration # Vector Store & Graph RAG -chromadb +chromadb>=0.4.22 lightrag-hku # Graph RAG (entity-relationship retrieval) # Parsing -docling # Primary parser (table/heading/OCR) -pymupdf # PDF fallback -python-docx # Word -openpyxl # Excel -python-pptx # PPT +docling>=2.0.0 # Primary parser (table/heading/OCR) +pymupdf>=1.23 # PDF fallback +python-docx>=1.1 # Word fallback +openpyxl>=3.1 # Excel fallback +python-pptx>=0.6 # PPT fallback + +# Embeddings +sentence-transformers # MCP mcp[cli] # Model Context Protocol server # Utils httpx -pydantic-settings +pydantic-settings>=2.1 python-multipart ``` @@ -191,7 +223,7 @@ python-multipart | Version | Date | Changes | | :------ | :------ | :--------------------------------------------- | -| **0.5** | 2026-03 | SSDLC pipeline (6 stages), LangGraph orchestrator, SSDLC stage skills. | +| **1.0** | 2026-03 | Major rewrite: LangGraph orchestration, SSDLC phase agents, phase-specific KB collections, SAST/DAST parsers, SSDLC stage skills. | | **0.4** | 2026-03 | Added Graph RAG, Docling parser, MCP Server, singleton KB, async assessment. | | **0.2** | 2025-03 | Updated tech stack versions and module layout. | | **0.1** | Initial | Draft selection. 
| diff --git a/docs/03-assessment-report-and-skill-contract.md b/docs/03-assessment-report-and-skill-contract.md index 0bb9686..e1e4136 100644 --- a/docs/03-assessment-report-and-skill-contract.md +++ b/docs/03-assessment-report-and-skill-contract.md @@ -1,31 +1,39 @@ # 03 — Assessment Report and Skill Contract | 评估报告与 Skill 契约 -| | | -| :-------------- | :----------------------------------------- | -| **Status** | [x] Updated (v4.0 aligned) \| [ ] In Review \| [ ] Approved | -| **Version** | 0.3 | -| **Related PRD** | Section 5.2.3 Skill, Section 6 Features | +| | | +| :-------------- | :------------------------------------------- | +| **Status** | [x] Updated (v4.0 aligned, v2.0) \| [ ] In Review \| [ ] Approved | +| **Version** | 2.0 | +| **Related PRD** | Section 3.2 SSDLC Phases, Section 6 Features | --- ## 1. Assessment Report Schema | 评估报告结构 -Agent outputs a **structured report** conforming to this schema. It is used for API responses and optional ServiceNow write-back. +Agent outputs a **structured report** conforming to this schema. It is used for API responses, cross-phase traceability, sign-off workflows, and optional ServiceNow write-back. Each report is tagged with the SSDLC phase that generated it. 
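As a rough illustration of the phase-tagged report described above, the following Python sketch checks the top-level required fields and the `phase` and `status` enums of a report. The field names and enum values are copied from the schema in section 1.1; the helper name `check_report` and the sample values are hypothetical, and a real implementation would more likely validate against the full JSON Schema with a schema validator.

```python
import uuid

# Values copied from the report schema (version 2.0); everything else is illustrative.
REQUIRED_FIELDS = ("version", "task_id", "phase", "status", "summary")
PHASES = {
    "requirements", "design", "development", "testing",
    "deployment", "operations", "full_ssdlc",
}
STATUSES = {
    "completed", "partial", "failed", "review_pending",
    "approved", "rejected", "escalated",
}


def check_report(report: dict) -> list[str]:
    """Hypothetical helper: return a list of problems in the top-level fields."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in report]
    if "phase" in report and report["phase"] not in PHASES:
        problems.append(f"unknown phase: {report['phase']}")
    if "status" in report and report["status"] not in STATUSES:
        problems.append(f"unknown status: {report['status']}")
    if "version" in report and report["version"] != "2.0":
        problems.append(f"unexpected version: {report['version']}")
    return problems


# Minimal report carrying only the required fields (sample values are made up).
sample_report = {
    "version": "2.0",
    "task_id": str(uuid.uuid4()),
    "phase": "design",
    "status": "review_pending",
    "summary": "Hypothetical summary text.",
}
```

Calling `check_report(sample_report)` returns an empty list for a well-formed report; any missing field or out-of-enum value shows up as one entry in the returned list.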
### 1.1 JSON Schema ```json { "$schema": "https://json-schema.org/draft/2020-12/schema", - "$id": "https://security-ai-agent.example/schemas/assessment-report.json", - "title": "AssessmentReport", + "$id": "https://docsentinel.example/schemas/assessment-report.json", + "title": "SSDLCAssessmentReport", "type": "object", - "required": ["version", "task_id", "status", "summary"], + "required": ["version", "task_id", "phase", "status", "summary"], "properties": { - "version": { "type": "string", "const": "1.0" }, + "version": { "type": "string", "const": "2.0" }, "task_id": { "type": "string", "format": "uuid" }, - "status": { "type": "string", "enum": ["completed", "partial", "failed"] }, - "summary": { "type": "string", "description": "Executive summary of findings" }, + "phase": { + "type": "string", + "enum": ["requirements", "design", "development", "testing", "deployment", "operations", "full_ssdlc"], + "description": "SSDLC phase that produced this report" + }, + "status": { + "type": "string", + "enum": ["completed", "partial", "failed", "review_pending", "approved", "rejected", "escalated"] + }, + "summary": { "type": "string", "description": "Executive summary of findings for this phase" }, "confidence": { "type": "number", "minimum": 0, "maximum": 1, "description": "Overall confidence score (0.0–1.0)" }, "risk_items": { "type": "array", @@ -35,10 +43,25 @@ Agent outputs a **structured report** conforming to this schema. 
It is used for "type": "array", "items": { "$ref": "#/$defs/ComplianceGap" } }, + "threat_model": { + "type": "object", + "description": "STRIDE/DREAD threat model (primarily from Design phase)", + "$ref": "#/$defs/ThreatModel" + }, + "vulnerabilities": { + "type": "array", + "description": "Parsed SAST/DAST/Pentest findings (primarily from Testing phase)", + "items": { "$ref": "#/$defs/Vulnerability" } + }, "remediations": { "type": "array", "items": { "$ref": "#/$defs/Remediation" } }, + "cross_phase_refs": { + "type": "array", + "description": "References linking findings across SSDLC phases", + "items": { "$ref": "#/$defs/CrossPhaseRef" } + }, "sources": { "type": "array", "description": "Citation evidence from parsed documents", @@ -51,7 +74,9 @@ Agent outputs a **structured report** conforming to this schema. It is used for "project_id": { "type": "string" }, "ssdlc_stage": { "type": "string", "enum": ["requirements", "design", "development", "testing", "deployment", "operations"], "description": "SSDLC stage this assessment covers" }, "model_used": { "type": "string" }, - "completed_at": { "type": "string", "format": "date-time" } + "completed_at": { "type": "string", "format": "date-time" }, + "ssdlc_phase": { "type": "string" }, + "skill_id": { "type": "string" } } }, "format": { "type": "string", "enum": ["json", "markdown"], "default": "json" } @@ -59,7 +84,7 @@ Agent outputs a **structured report** conforming to this schema. It is used for "$defs": { "RiskItem": { "type": "object", - "required": ["id", "title", "severity"], + "required": ["id", "title", "severity", "phase"], "properties": { "id": { "type": "string" }, "title": { "type": "string" }, @@ -68,7 +93,8 @@ Agent outputs a **structured report** conforming to this schema. 
It is used for "source_ref": { "type": "string", "description": "Reference to source doc/section" }, "confidence": { "type": "number", "minimum": 0, "maximum": 1, "description": "Finding-level confidence score (0.0–1.0)" }, "citation_ids": { "type": "array", "items": { "type": "string" }, "description": "IDs referencing entries in top-level sources[]" }, - "category": { "type": "string" } + "category": { "type": "string" }, + "phase": { "type": "string", "description": "SSDLC phase where this risk was identified" } } }, "ComplianceGap": { @@ -81,7 +107,56 @@ Agent outputs a **structured report** conforming to this schema. It is used for "evidence_suggestion": { "type": "string" }, "confidence": { "type": "number", "minimum": 0, "maximum": 1, "description": "Gap-level confidence score (0.0–1.0)" }, "citation_ids": { "type": "array", "items": { "type": "string" }, "description": "IDs referencing entries in top-level sources[]" }, - "framework": { "type": "string" } + "framework": { "type": "string" }, + "phase": { "type": "string" } + } + }, + "ThreatModel": { + "type": "object", + "properties": { + "methodology": { "type": "string", "enum": ["STRIDE", "DREAD", "STRIDE_DREAD"] }, + "threats": { + "type": "array", + "items": { + "type": "object", + "required": ["id", "category", "description"], + "properties": { + "id": { "type": "string" }, + "category": { "type": "string", "enum": ["Spoofing", "Tampering", "Repudiation", "InformationDisclosure", "DenialOfService", "ElevationOfPrivilege"] }, + "description": { "type": "string" }, + "affected_component": { "type": "string" }, + "dread_score": { + "type": "object", + "properties": { + "damage": { "type": "integer", "minimum": 1, "maximum": 10 }, + "reproducibility": { "type": "integer", "minimum": 1, "maximum": 10 }, + "exploitability": { "type": "integer", "minimum": 1, "maximum": 10 }, + "affected_users": { "type": "integer", "minimum": 1, "maximum": 10 }, + "discoverability": { "type": "integer", "minimum": 1, 
"maximum": 10 }, + "total": { "type": "number" } + } + }, + "mitigations": { "type": "array", "items": { "type": "string" } } + } + } + } + } + }, + "Vulnerability": { + "type": "object", + "required": ["id", "title", "severity", "source_tool"], + "properties": { + "id": { "type": "string" }, + "title": { "type": "string" }, + "severity": { "type": "string", "enum": ["info", "low", "medium", "high", "critical"] }, + "source_tool": { "type": "string", "description": "SAST/DAST tool that found this (e.g. SonarQube, Burp)" }, + "cwe_id": { "type": "string" }, + "cvss_score": { "type": "number" }, + "location": { "type": "string", "description": "File path, URL, or component" }, + "description": { "type": "string" }, + "remediation": { "type": "string" }, + "status": { "type": "string", "enum": ["open", "in_progress", "fixed", "accepted", "false_positive"] }, + "linked_threat_id": { "type": "string", "description": "Cross-ref to threat model threat ID" } } }, "Remediation": { @@ -90,12 +165,26 @@ Agent outputs a **structured report** conforming to this schema. It is used for "properties": { "id": { "type": "string" }, "action": { "type": "string" }, - "priority": { "type": "string", "enum": ["low", "medium", "high"] }, + "priority": { "type": "string", "enum": ["low", "medium", "high", "critical"] }, + "phase": { "type": "string", "description": "SSDLC phase this remediation applies to" }, "related_risk_ids": { "type": "array", "items": { "type": "string" } }, "related_gap_ids": { "type": "array", "items": { "type": "string" } }, + "related_vuln_ids": { "type": "array", "items": { "type": "string" } }, + "related_threat_ids": { "type": "array", "items": { "type": "string" } }, "external_ticket": { "type": "string", "description": "Optional external tracking reference (e.g. 
Jira key or GitHub Issue URL)" } } }, + "CrossPhaseRef": { + "type": "object", + "required": ["source_phase", "source_id", "target_phase", "target_id"], + "properties": { + "source_phase": { "type": "string" }, + "source_id": { "type": "string" }, + "target_phase": { "type": "string" }, + "target_id": { "type": "string" }, + "relationship": { "type": "string", "description": "e.g. 'threat_to_test', 'risk_to_remediation'" } + } + }, "SourceCitation": { "type": "object", "required": ["id", "file", "excerpt"], @@ -120,34 +209,51 @@ Agent outputs a **structured report** conforming to this schema. It is used for When `format == "markdown"`, the output should follow: ```markdown -# Assessment Report | 安全评估报告 -**Task ID**: {task_id} +# SSDLC Assessment Report | 安全评估报告 +**Task ID**: {task_id} +**Phase**: {phase} **Completed**: {completed_at} ## Summary | 摘要 {summary} ## Risk Items | 风险项 -| ID | Title | Severity | Confidence | Description | Citations | -| --- | ----- | -------- | ---------- | ----------- | --------- | -| ... | ... | ... | ... | ... | ... | +| ID | Title | Severity | Phase | Confidence | Description | Citations | +| --- | ----- | -------- | ----- | ---------- | ----------- | --------- | +| ... | ... | ... | ... | ... | ... | ... | + +## Threat Model | 威胁建模 (Design Phase) +### Methodology: {methodology} +| ID | Category | Description | Affected Component | DREAD Score | +| --- | -------- | ----------- | ------------------ | ----------- | +| ... | ... | ... | ... | ... | + +## Vulnerabilities | 漏洞 (Testing Phase) +| ID | Title | Severity | Source Tool | CWE | Location | Status | +| --- | ----- | -------- | ----------- | --- | -------- | ------ | +| ... | ... | ... | ... | ... | ... | ... | ## Compliance Gaps | 合规差距 -| Control/Clause | Gap Description | Confidence | Evidence Suggestion | Citations | -| -------------- | --------------- | ---------- | ------------------- | --------- | -| ... | ... | ... | ... | ... 
| +| Control/Clause | Gap Description | Framework | Phase | Confidence | Evidence Suggestion | Citations | +| -------------- | --------------- | --------- | ----- | ---------- | ------------------- | --------- | +| ... | ... | ... | ... | ... | ... | ... | ## Remediations | 整改建议 -| Priority | Action | Related Risks/Gaps | -| -------- | ------ | ------------------ | -| ... | ... | ... | +| Priority | Action | Phase | Related Risks/Threats/Vulns | +| -------- | ------ | ----- | --------------------------- | +| ... | ... | ... | ... | + +## Cross-Phase Traceability | 跨阶段追溯 +| Source Phase | Source ID | Target Phase | Target ID | Relationship | +| ----------- | --------- | ------------ | --------- | ------------ | +| ... | ... | ... | ... | ... | ``` --- ## 2. Parser Output Schema | 文件解析输出结构 -Unified output format for both Assessment Input and Knowledge Base Ingestion. +Unified output format for both assessment input and knowledge base ingestion. Extended to support SAST/DAST report formats. ```json { @@ -156,16 +262,21 @@ Unified output format for both Assessment Input and Knowledge Base Ingestion. 
"type": "object", "required": ["metadata", "content"], "properties": { + "format": { "type": "string", "enum": ["markdown", "json", "sarif"] }, "metadata": { "type": "object", "required": ["filename", "type"], "properties": { "filename": { "type": "string" }, - "type": { "type": "string", "enum": ["pdf", "docx", "xlsx", "pptx", "txt", "md", "mmd", "mermaid"] }, + "type": { "type": "string", "description": "MIME type or extension (pdf, docx, xlsx, pptx, txt, md, mmd, mermaid, sarif, etc.)" }, "parser_engine": { "type": "string", "enum": ["docling", "legacy"], "default": "legacy" }, + "pages": { "type": "integer" }, + "language": { "type": "string" }, "upload_time": { "type": "string", "format": "date-time" }, "scenario_id": { "type": "string", "description": "Optional scenario context" }, - "file_hash": { "type": "string", "description": "SHA hash for deduplication" } + "file_hash": { "type": "string", "description": "SHA hash for deduplication" }, + "source_tool": { "type": "string", "description": "For SAST/DAST reports: tool name" }, + "ssdlc_phase_hint": { "type": "string", "description": "Suggested SSDLC phase for this document" } } }, "content": { "type": "string", "description": "Markdown or plain-text content" }, @@ -216,21 +327,40 @@ When `POST /assessments` is called, the API returns an `AssessmentTaskCreated` i ### 4.1 Skill Template Schema -Each skill (or persona) is defined by a JSON template. +Each skill (or persona) is defined by a JSON template. Skills are now organized by SSDLC phase. 
```json { - "id": "iso-27001-auditor", - "name": "ISO 27001 Lead Auditor", - "description": "Formal ISMS audit focusing on process, documentation, and controls.", - "system_prompt": "You are an ISO 27001 Lead Auditor...", - "risk_focus": ["Access Control", "Supplier Security"], - "compliance_frameworks": ["ISO/IEC 27001:2013"], + "id": "design-threat-modeler", + "name": "Threat Modeler", + "description": "Performs STRIDE/DREAD threat modeling on architecture and design documents.", + "ssdlc_phase": "design", + "system_prompt": "You are a security threat modeling expert. Analyze the provided architecture document using the STRIDE methodology...", + "risk_focus": ["Spoofing", "Tampering", "Information Disclosure", "Elevation of Privilege"], + "compliance_frameworks": ["OWASP", "NIST SP 800-53"], + "tools": ["stride_analyzer", "dread_scorer"], "is_builtin": true } ``` -### 4.2 SSDLC Stage Skills (Built-in) +### 4.2 Built-in Skills by SSDLC Phase + +| SSDLC Phase | Skill ID | Name | Focus | +| :--- | :--- | :--- | :--- | +| **Requirements** | `req-compliance-analyst` | Compliance Analyst | GDPR, PCI DSS, SOC2, ISO 27001 compliance mapping | +| **Requirements** | `req-risk-assessor` | Risk Assessor | Project risk classification, data sensitivity analysis | +| **Design** | `design-threat-modeler` | Threat Modeler | STRIDE/DREAD analysis, attack surface mapping | +| **Design** | `design-security-architect` | Security Architect | Architecture patterns, encryption, IAM design review | +| **Development** | `dev-secure-code-reviewer` | Secure Code Reviewer | OWASP Secure Coding Practices, language-specific guidance | +| **Development** | `dev-sast-analyst` | SAST Analyst | SAST findings triage, false positive reduction | +| **Testing** | `test-pentest-analyst` | Pentest Analyst | Penetration test report analysis, finding prioritization | +| **Testing** | `test-vuln-manager` | Vulnerability Manager | SAST/DAST triage, remediation tracking | +| **Deployment** | 
`deploy-release-reviewer` | Release Security Reviewer | Pre-release checklist, configuration audit | +| **Deployment** | `deploy-hardening-specialist` | Hardening Specialist | CIS benchmarks, container/server hardening | +| **Operations** | `ops-vuln-monitor` | Vulnerability Monitor | CVE analysis, patch priority assessment | +| **Operations** | `ops-incident-responder` | Incident Responder | Incident analysis, response recommendations | + +### 4.3 SSDLC Stage Skills (Built-in) Each SSDLC stage has a dedicated built-in skill: @@ -243,12 +373,12 @@ Each SSDLC stage has a dedicated built-in skill: | **Deployment** | `ssdlc-deployment` | Release readiness, config security, key management, hardening | CIS Benchmarks, DISA STIG | | **Operations** | `ssdlc-operations` | Vulnerability monitoring, incident response, patch management, log audit | NIST CSF, SOC2, ISO 27001 | -### 4.3 Skill Execution Contract +### 4.4 Skill Execution Contract -- **Input**: `parsed_documents` + `kb_chunks` + `history_chunks` + `skill_focus` + `ssdlc_stage` (optional) -- **Output**: Structured `AssessmentReport` fragment (JSON). +- **Input**: `parsed_documents` + `kb_chunks` (phase-specific collection) + `history_chunks` + `skill_focus` + `ssdlc_stage` (optional) + `ssdlc_state` (cross-phase context from LangGraph) +- **Output**: Structured `SSDLCAssessmentReport` fragment (JSON) with phase tag and cross-phase references. -The LangGraph orchestrator injects the `system_prompt`, `risk_focus`, `ssdlc_stage`, and stage-specific checklist into the LLM context to guide the generation. +The LangGraph orchestrator injects the `system_prompt`, `risk_focus`, `tools`, `ssdlc_stage`, stage-specific checklist, and cross-phase state into the LLM context to guide the generation. 
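The execution contract above could be sketched in Python roughly as follows. This is a minimal illustration under assumptions, not the project's implementation: the `SkillInput` dataclass, the `build_llm_context` helper, the field types, and the message layout are all hypothetical; only the input names (`parsed_documents`, `kb_chunks`, `history_chunks`, `skill_focus`, `ssdlc_stage`, `ssdlc_state`) and the skill template keys come from this document.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SkillInput:
    """Inputs named in the execution contract; the types are assumptions."""
    parsed_documents: list[str]
    kb_chunks: list[str]  # assumed to come from the phase-specific collection
    history_chunks: list[str]
    skill_focus: list[str]
    ssdlc_stage: Optional[str] = None
    ssdlc_state: dict = field(default_factory=dict)  # cross-phase context


def build_llm_context(skill: dict, inputs: SkillInput, checklist: list[str]) -> dict:
    """Hypothetical helper: merge a skill template and its inputs into two prompts."""
    system_parts = [skill["system_prompt"]]
    if inputs.skill_focus:
        system_parts.append("Risk focus: " + ", ".join(inputs.skill_focus))
    if skill.get("tools"):
        system_parts.append("Available tools: " + ", ".join(skill["tools"]))
    if inputs.ssdlc_stage:
        system_parts.append("SSDLC stage: " + inputs.ssdlc_stage)
    if checklist:
        system_parts.append("Checklist:\n" + "\n".join(f"- {item}" for item in checklist))

    user_parts = [
        "## Documents\n" + "\n\n".join(inputs.parsed_documents),
        "## Knowledge base\n" + "\n\n".join(inputs.kb_chunks),
        "## History\n" + "\n\n".join(inputs.history_chunks),
    ]
    if inputs.ssdlc_state:
        # Only the names of earlier phases are listed here, to keep the sketch short.
        user_parts.append("## Prior phases\n" + ", ".join(sorted(inputs.ssdlc_state)))

    return {"system": "\n\n".join(system_parts), "user": "\n\n".join(user_parts)}


# Usage with a trimmed copy of the skill template from section 4.1.
skill = {
    "id": "design-threat-modeler",
    "system_prompt": "You are a security threat modeling expert.",
    "risk_focus": ["Spoofing", "Tampering"],
    "tools": ["stride_analyzer", "dread_scorer"],
}
inputs = SkillInput(
    parsed_documents=["(architecture document text)"],
    kb_chunks=["(retrieved policy excerpt)"],
    history_chunks=[],
    skill_focus=skill["risk_focus"],
    ssdlc_stage="design",
    ssdlc_state={"requirements": {"risk_items": []}},
)
context = build_llm_context(skill, inputs, checklist=["Trust boundaries identified"])
```

Keeping the skill template and the per-task inputs as separate arguments mirrors the split in the contract: the template is static per skill, while the inputs change with every assessment and phase.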
 ---
@@ -256,6 +386,5 @@ The LangGraph orchestrator injects the `system_prompt`, `risk_focus`, `ssdlc_sta
 | Version | Date | Changes |
 | :------ | :------ | :------------------------------------------------------ |
-| **0.3** | 2026-03 | Added SSDLC stage skills (6 stages), `ssdlc_stage` field in report metadata, LangGraph execution contract. |
-| **0.2** | 2026-03 | Aligned ParsedDocument with code (removed `format`, added `parser_engine`, `raw_structure`, `chunk_ids`). Added `SourceCitation` and `sources` to report schema. Added Task Lifecycle Models section. Fixed `AssessmentReport.status` enum to match code. |
+| **2.0** | 2026-03 | Major rewrite: SSDLC phase-tagged reports, ThreatModel schema, Vulnerability schema, CrossPhaseRef, SourceCitation, phase-specific skills, SARIF parser support, SSDLC stage skills (6 stages), `ssdlc_stage` field in report metadata, LangGraph execution contract. |
 | **0.1** | Initial | Draft Report Schema, Parser Output, and Skill Contract. |
diff --git a/docs/README.md b/docs/README.md
index b3d9a34..7aebf79 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -10,14 +10,14 @@ This directory holds **executable design and specification** artifacts that acco
 ## Document List | 文档列表
-| ID | Document | Purpose | Timing |
-| :----- | :----------------------------------------------------------------------------------- | :------------------------------------------------------------------ | :------------------- |
-| **01** | [Architecture and Tech Stack](./01-architecture-and-tech-stack.md) | Technology choices, high-level architecture, interfaces, data flow. | Start / Design Phase |
-| **02** | [API Specification](./02-api-specification.yaml) | REST API Contract (OpenAPI 3.x). | Parallel with 01 |
-| **03** | [Assessment Report and Skill Contract](./03-assessment-report-and-skill-contract.md) | JSON Schemas for Reports and Skills. | Pre-Development |
-| **04** | [Integration Guide](./04-integration-guide.md) | AAD, ServiceNow configuration and mapping. | Integration Phase |
-| **05** | [Deployment Runbook](./05-deployment-runbook.md) | Deployment, config reference, ops. | Pre-Release |
-| **06** | [Agent Integration (MCP)](./06-agent-integration.md) | MCP setup for Claude Desktop, Cursor, OpenClaw. | Integration Phase |
+| ID | Document | Purpose | Timing |
+| :----- | :----------------------------------------------------------------------------------- | :------------------------------------------------------------------------------- | :------------------- |
+| **01** | [Architecture and Tech Stack](./01-architecture-and-tech-stack.md) | Technology choices (LangGraph, LangChain), architecture, interfaces, data flow. | Start / Design Phase |
+| **02** | [API Specification](./02-api-specification.yaml) | REST API Contract (OpenAPI 3.x) with SSDLC phase endpoints. | Parallel with 01 |
+| **03** | [Assessment Report and Skill Contract](./03-assessment-report-and-skill-contract.md) | JSON Schemas for SSDLC phase reports and phase-specific Skills. | Pre-Development |
+| **04** | [Integration Guide](./04-integration-guide.md) | AAD, ServiceNow, SAST/DAST tool configuration and mapping. | Integration Phase |
+| **05** | [Deployment Runbook](./05-deployment-runbook.md) | Deployment, config reference, ops. | Pre-Release |
+| **06** | [Agent Integration (MCP)](./06-agent-integration.md) | MCP server setup for Claude Desktop, Cursor, OpenClaw. | Integration Phase |
 ---
@@ -27,11 +27,12 @@ Aligned with PRD and current implementation:
 - **Language**: Python 3.10+
 - **Web/API**: FastAPI + MCP Server (stdio)
-- **Agent**: LangChain + LangGraph (stateful graph-based orchestration with SSDLC routing)
-- **SSDLC**: 6-stage pipeline (Requirements → Design → Development → Testing → Deployment → Operations)
-- **Vector DB**: Chroma (+ LightRAG for Graph RAG)
+- **Agent Orchestration**: LangGraph (stateful graph-based workflows with SSDLC routing)
+- **LLM Framework**: LangChain (unified LLM abstraction, prompts, tools, RAG)
+- **SSDLC Phases**: 6-stage pipeline (Requirements → Design → Development → Testing → Deployment → Operations)
+- **Vector DB**: Chroma + LightRAG (hybrid vector/graph retrieval)
 - **Parsing**: Docling (primary) + PyMuPDF, python-docx, openpyxl (legacy fallback)
-- **LLM**: LangChain Abstraction (OpenAI / Ollama)
+- **LLM Providers**: OpenAI / Ollama (via LangChain)
 *See [01-architecture-and-tech-stack.md](./01-architecture-and-tech-stack.md) for details.*
@@ -39,10 +40,10 @@ Aligned with PRD and current implementation:
 ## How to Use | 使用方式
-1. **Start with 01**: Confirm stack and architecture.
-2. **Sync 02**: Use FastAPI to generate OpenAPI or write YAML first.
-3. **Validate 03**: Use `schemas/assessment-report.json` for validation.
-4. **Refine 04/05**: Update when integrating with real environments.
+1. **Start with 01**: Confirm stack and architecture (LangGraph + LangChain).
+2. **Sync 02**: Use FastAPI to generate OpenAPI or write YAML first; ensure SSDLC phase endpoints.
+3. **Validate 03**: Use `schemas/assessment-report.json` for validation; verify phase-specific report fields.
+4. **Refine 04/05**: Update when integrating with real environments (AAD, ServiceNow, SAST/DAST tools).
 ### Directory Structure