How to set up a full detection validation pipeline -- from threat intel to validated, PR-ready detections -- on your own system.
Threat Intelligence
│
▼
┌─────────────────┐
│ CTI Analyst │ Extract MITRE ATT&CK TTPs
│ (LLM node) │
└────────┬────────┘
▼
┌─────────────────┐
│ Coverage Analyzer│ Query security-detections MCP
│ (MCP + LLM) │ for existing coverage → find gaps
└────────┬────────┘
▼
┌─────────────────┐
│ Detection Eng. │ Generate detection rules
│ (LLM node) │ (SPL / KQL / Sigma / Elastic)
└────────┬────────┘
▼
┌─────────────────┐
│ Atomic Executor │ Run Atomic Red Team tests
│ (Attack Range) │ against lab environment
└────────┬────────┘
▼
┌─────────────────┐
│ SIEM Validator │ Query SIEM to confirm
│ (MCP) │ detection fires
└────────┬────────┘
▼
┌─────────────────┐
│ Data Dump + PR │ Export attack data,
│ (git + gh) │ stage DRAFT PR
└─────────────────┘
Each box is a LangGraph node (in agents/nodes/) and a Cursor subagent (in .cursor/agents/). You can run the full pipeline automatically via the LangGraph CLI, or step through it interactively with Cursor agents.
Before diving into setup, here are the 3-5 key things you must configure for your environment. Everything else has sensible defaults.
Set SIEM_PLATFORM in your .env file. This controls what detection format the pipeline generates:
| Value | Detection Format | Validation Tool | Lab Environment |
|---|---|---|---|
| `splunk` | SPL YAML (security_content) | contentctl validate | Attack Range (AWS) or any Splunk instance |
| `sentinel` | KQL analytics rules | Azure CLI (`az monitor log-analytics query`) | Azure VMs with Defender/Sentinel |
| `elastic` | EQL/TOML detection rules | Elastic detection-rules CLI or Kibana API | Elastic Stack (Docker or cloud) |
| `sigma` | Sigma YAML (platform-agnostic) | pySigma + backend conversion | Any SIEM (convert at deploy time) |
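The pipeline branches on this one variable; a minimal shell sketch of that dispatch (the mapping just restates the table above, and any variable name beyond `SIEM_PLATFORM` is illustrative):

```shell
# Sketch: map SIEM_PLATFORM to the detection format the pipeline generates.
SIEM_PLATFORM="${SIEM_PLATFORM:-splunk}"   # default matches the pipeline's origin

case "$SIEM_PLATFORM" in
  splunk)   DETECTION_FORMAT="SPL YAML" ;;
  sentinel) DETECTION_FORMAT="KQL analytics rules" ;;
  elastic)  DETECTION_FORMAT="EQL/TOML detection rules" ;;
  sigma)    DETECTION_FORMAT="Sigma YAML" ;;
  *) echo "Unknown SIEM_PLATFORM: $SIEM_PLATFORM" >&2; exit 1 ;;
esac

echo "Generating $DETECTION_FORMAT detections"
```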
Point to where your detection content lives:
```shell
# Required: where your detections are stored
SECURITY_CONTENT_PATH=/path/to/your/detection-repo

# Optional: attack data repo (Splunk-specific)
ATTACK_DATA_PATH=/path/to/attack_data
```

You need a target machine that generates telemetry your SIEM can ingest. Choose one:
| Option | Best For | Cost | Setup Time |
|---|---|---|---|
| Attack Range (AWS) | Splunk users, full automation | ~$5-15/day | ~30 min |
| Azure VM + Sentinel | Microsoft shops | Free tier available | ~1 hour |
| Docker Elastic Stack | Elastic users, local testing | Free | ~20 min |
| Existing VM + Sysmon | Any SIEM, reuse infrastructure | Free | ~15 min |
| No lab (Sigma only) | Rule authoring without live validation | Free | ~5 min |
```shell
# Default: Anthropic Claude
ANTHROPIC_API_KEY=sk-ant-...
LLM_MODEL=claude-sonnet-4-20250514

# Alternative: OpenAI (requires code change in node files)
# OPENAI_API_KEY=sk-...
# LLM_MODEL=gpt-4o
```

See docs/MODELS.md for full provider setup.
Each SIEM has its own validation tool. Set VALIDATION_TOOL or rely on the defaults:
| SIEM | Validation Tool | Install |
|---|---|---|
| Splunk | contentctl | `cd $SECURITY_CONTENT_PATH && source venv/bin/activate` |
| Sentinel | Azure CLI | `az monitor log-analytics query` |
| Elastic | detection-rules CLI | `pip install detection-rules` |
| Sigma | pySigma | `pip install pySigma pySigma-backend-splunk` |
```shell
# Copy and edit the env file
cp agents/.env.example agents/.env

# The 3 critical settings:
SIEM_PLATFORM=sentinel                 # or splunk, elastic, sigma
SECURITY_CONTENT_PATH=./my-detections
ANTHROPIC_API_KEY=sk-ant-...           # or OPENAI_API_KEY
```

The pipeline was developed against Splunk's security_content repository using:
- Attack Range (Splunk's open-source lab) -- spins up a Splunk instance + Windows/Linux targets in AWS with Sysmon, PowerShell logging, and Zeek pre-configured.
- Atomic Red Team -- standard + custom atomics (T9999.XXX series) executed via Attack Range CLI.
- contentctl -- Splunk's validation tool that checks detection YAMLs against a strict schema.
- Splunk MCP -- the `splunk-mcp` MCP server for live SPL queries and data export.
- security-detections MCP -- indexes ~7,200+ detections across Sigma, Splunk ESCU, Elastic, and KQL for coverage analysis.
The workflow: parse threat report → check existing coverage via MCP → write Splunk SPL detections → run atomics on Attack Range → validate in Splunk → export attack data → stage DRAFT PR.
Everything below shows how to replicate this for your SIEM.
- AWS account (Attack Range provisions EC2 instances)
- Python 3.10+ with Poetry
- Terraform
- Ansible
- ~$5-15/day in AWS costs while the range is running
```shell
# 1. Clone repos
git clone https://github.com/splunk/attack_range.git
git clone https://github.com/splunk/security_content.git
git clone https://github.com/splunk/attack_data.git

# 2. Install Attack Range
cd attack_range
poetry install
poetry shell

# 3. Configure (edit attack_range.yml)
#    - Set your AWS region, key name, IP whitelist
#    - Enable windows_servers, linux_servers as needed

# 4. Build the range (~15-30 min)
python attack_range.py build

# 5. Check status
python attack_range.py show
# → Splunk URL, Windows target name, Linux target IP
```

```shell
export SECURITY_CONTENT_PATH=/path/to/security_content
export ATTACK_DATA_PATH=/path/to/attack_data
export ATTACK_RANGE_PATH=/path/to/attack_range
export ATTACK_RANGE_VENV=/path/to/poetry/venv/bin/activate
export ATTACK_RANGE_DEFAULT_TARGET=ar-win-yourlab-0
export SPLUNK_MCP_ENABLED=true
export ANTHROPIC_API_KEY=sk-ant-...
```

```shell
cd agents
npm install
npm run orchestrate -- --type threat_report --url https://www.cisa.gov/news-events/alerts/...
```

Or step-by-step with Cursor: open the project, invoke the orchestrator agent, and follow its phases.
```shell
# Standard Atomic Red Team test
python attack_range.py simulate -e ART -te T1003.001 -t <TARGET_NAME>

# Custom atomic (deploy first, then execute)
ansible-playbook -i '<TARGET_IP>,' deploy_custom_atomics.yml
python attack_range.py simulate -e ART -te T9999.001 -t <TARGET_NAME>

# Wait 2-3 minutes for Splunk ingestion, then validate
```

Use the Splunk MCP:

```
splunk-mcp:run_detection(detection_path="detections/endpoint/your_detection.yml")
```

Or run the SPL manually in the Splunk UI / via REST API.
```shell
# Stop range when not testing (saves money, keeps config)
python attack_range.py stop

# Resume when ready
python attack_range.py resume

# Destroy completely
python attack_range.py destroy
```

- Azure subscription (free tier works for small labs)
- Log Analytics workspace
- Microsoft Sentinel enabled on the workspace
- One or more VMs sending logs (Windows with MDE, or Linux with Azure Monitor Agent)
```shell
# 1. Create a Log Analytics workspace
az monitor log-analytics workspace create \
  --resource-group mylab-rg \
  --workspace-name detection-lab \
  --location eastus

# 2. Enable Sentinel on the workspace
az sentinel onboarding-state create \
  --resource-group mylab-rg \
  --workspace-name detection-lab

# 3. Deploy a Windows VM with Defender for Endpoint
#    (or use Azure Arc to onboard an existing machine)
#    Ensure data connectors are enabled:
#    - Microsoft Defender for Endpoint
#    - Windows Security Events via AMA
#    - Sysmon (optional but recommended)

# 4. Verify data flowing
#    In Sentinel > Logs, run:
#    DeviceProcessEvents | take 10
```

KQL analytics rules go in your detection repo as .kql or .yaml files:
```
// Detection: Suspicious LSASS Access
// MITRE: T1003.001
DeviceProcessEvents
| where Timestamp > ago(1h)
| where FileName == "procdump.exe" or FileName == "procdump64.exe"
| where ProcessCommandLine has "lsass"
| project Timestamp, DeviceName, AccountName, FileName, ProcessCommandLine
```

To deploy as a Sentinel Analytics Rule:
```shell
# Using Azure CLI
az sentinel alert-rule create \
  --resource-group mylab-rg \
  --workspace-name detection-lab \
  --rule-name "Suspicious LSASS Access" \
  --query "DeviceProcessEvents | where FileName in~ ('procdump.exe','procdump64.exe') | where ProcessCommandLine has 'lsass'"
```

Install Invoke-AtomicRedTeam on the target VM:
```powershell
# On the Windows VM
IEX (IWR 'https://raw.githubusercontent.com/redcanaryco/invoke-atomicredteam/master/install-atomicredteam.ps1' -UseBasicParsing)
Install-AtomicRedTeam -getAtomics

# Run a test
Invoke-AtomicTest T1003.001
```

Or use Azure Automation / Run Command:
```shell
az vm run-command invoke \
  --resource-group mylab-rg \
  --name mywin-vm \
  --command-id RunPowerShellScript \
  --scripts "Invoke-AtomicTest T1003.001 -Confirm:\$false"
```

```
// Check if the detection would have fired
DeviceProcessEvents
| where Timestamp > ago(30m)
| where FileName in~ ("procdump.exe", "procdump64.exe")
| where ProcessCommandLine has "lsass"
| count
// count > 0 = detection fires
```

To point the LangGraph pipeline at Sentinel instead of Splunk:
- Set environment variables in `agents/.env`:

  ```shell
  SIEM_PLATFORM=sentinel
  LLM_MODEL=claude-sonnet-4-20250514  # or gpt-4o

  # Sentinel-specific (optional)
  AZURE_SUBSCRIPTION_ID=your-sub-id
  AZURE_RESOURCE_GROUP=mylab-rg
  AZURE_WORKSPACE_NAME=detection-lab

  # Not needed for Sentinel:
  # SPLUNK_MCP_ENABLED=false (default)
  # ATTACK_RANGE_PATH (not used)
  # ATTACK_DATA_PATH (not used)
  ```

- The detection-engineer node generates KQL instead of SPL when `SIEM_PLATFORM=sentinel`.
- For validation, replace Splunk MCP calls with Azure Log Analytics REST API queries, or use the Azure Monitor MCP if available.
- Data export: use `az monitor log-analytics query` to export matching events.
- Validation alternative (no Azure MCP): run KQL queries manually in Sentinel > Logs, or use the Azure CLI:

  ```shell
  az monitor log-analytics query \
    --workspace $AZURE_WORKSPACE_NAME \
    --analytics-query "$(cat your_detection.kql)" \
    --timespan PT1H
  ```
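If you want the fire/no-fire decision scripted rather than eyeballed, the CLI output can be checked mechanically. A sketch assuming `az monitor log-analytics query` prints a JSON array of matching rows; `validate_detection` is a hypothetical helper name, and the canned input stands in for live query output:

```shell
# Hypothetical helper: decide whether a detection "fired" from the JSON array
# of rows printed by `az monitor log-analytics query`.
validate_detection() {
  result_json="$1"
  # A non-empty array means the detection fired. Count rows with python3
  # (avoids a jq dependency).
  count=$(printf '%s' "$result_json" | python3 -c 'import json,sys; print(len(json.load(sys.stdin)))')
  if [ "$count" -gt 0 ]; then
    echo "DETECTION FIRED ($count events)"
  else
    echo "NO MATCH"
  fi
}

# Canned input for illustration; in the real pipeline you would feed it
# the live CLI output, e.g. validate_detection "$(az monitor log-analytics query ...)".
validate_detection '[{"TimeGenerated":"2025-01-01T00:00:00Z","FileName":"procdump.exe"}]'
# → DETECTION FIRED (1 events)
```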
| Table | What It Contains | Equivalent Splunk Source |
|---|---|---|
| `DeviceProcessEvents` | Process creation | Sysmon EventID 1 |
| `DeviceNetworkEvents` | Network connections | Sysmon EventID 3 |
| `DeviceFileEvents` | File operations | Sysmon EventID 11 |
| `DeviceRegistryEvents` | Registry changes | Sysmon EventID 13 |
| `DeviceLogonEvents` | Authentication | Security 4624/4625 |
| `SigninLogs` | Azure AD sign-ins | Azure AD logs |
| `EmailEvents` | Email activity | O365 logs |
- Elastic Stack 8.x (Elasticsearch + Kibana)
- Elastic Agent with Fleet (for endpoint telemetry)
- At least one Windows/Linux endpoint enrolled
```shell
# 1. Start Elastic Stack (Docker is easiest for a lab)
docker compose up -d  # with elastic's official docker-compose

# 2. Enable Elastic Security in Kibana
#    Navigate to Security > Overview
#    Install the Elastic Defend integration via Fleet

# 3. Enroll an agent on your test VM
#    Fleet > Add agent > copy the enrollment command
#    Run on target: ./elastic-agent install --url=... --enrollment-token=...

# 4. Verify data
#    In Kibana Dev Tools:
#    GET .ds-logs-endpoint.events.process-*/_search?size=1
```

Elastic uses TOML for detection rules, or you can write EQL/ES|QL:
EQL (Event Query Language):

```
process where process.name == "procdump.exe" and process.args : "*lsass*"
```

ES|QL:

```
FROM logs-endpoint.events.process-*
| WHERE process.name == "procdump.exe" AND process.command_line LIKE "*lsass*"
| KEEP @timestamp, host.name, user.name, process.name, process.command_line
```

Detection rule TOML:
```toml
[rule]
name = "Suspicious LSASS Access via Procdump"
rule_id = "your-uuid-here"
risk_score = 73
severity = "high"
type = "eql"
query = '''
process where process.name == "procdump.exe" and process.args : "*lsass*"
'''

[rule.threat]
framework = "MITRE ATT&CK"

[[rule.threat.technique]]
id = "T1003"
name = "OS Credential Dumping"

[[rule.threat.technique.subtechnique]]
id = "T1003.001"
name = "LSASS Memory"
```

Same as any platform -- install Invoke-AtomicRedTeam on the target endpoint and run tests. Elastic Agent picks up the telemetry automatically.
```shell
# Query Elasticsearch directly
curl -XGET "localhost:9200/.ds-logs-endpoint.events.process-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{"query":{"bool":{"must":[{"match":{"process.name":"procdump.exe"}},{"match_phrase":{"process.command_line":"lsass"}}]}}}'
```

Or use Kibana Security > Detections to see if the rule fired.
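To turn that curl check into a pass/fail signal, extract the hit count from the `_search` response. A sketch in which a canned response stands in for live `curl` output (the field paths follow the standard Elasticsearch response shape):

```shell
# Sketch: pull the hit count out of an _search response to decide pass/fail.
# The canned response below stands in for live curl output.
response='{"hits":{"total":{"value":3,"relation":"eq"},"hits":[]}}'
hits=$(printf '%s' "$response" | python3 -c 'import json,sys; print(json.load(sys.stdin)["hits"]["total"]["value"])')
if [ "$hits" -gt 0 ]; then
  echo "detection fires: $hits matching events"
else
  echo "no matching events -- check agent enrollment and the time range"
fi
# → detection fires: 3 matching events
```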
- Set environment variables in `agents/.env`:

  ```shell
  SIEM_PLATFORM=elastic
  LLM_MODEL=claude-sonnet-4-20250514

  # Elastic-specific (optional)
  ELASTICSEARCH_URL=http://localhost:9200
  KIBANA_URL=http://localhost:5601

  # Not needed for Elastic:
  # SPLUNK_MCP_ENABLED=false (default)
  # ATTACK_RANGE_PATH (not used)
  ```

- Detection engineer outputs EQL or TOML instead of SPL.
- Validation uses the Elasticsearch REST API instead of Splunk MCP:

  ```shell
  # Query Elasticsearch for detection results
  curl -XGET "$ELASTICSEARCH_URL/.ds-logs-endpoint.events.process-*/_search" \
    -H 'Content-Type: application/json' \
    -d @your_query.json
  ```

- Data export: use the Elasticsearch `_search` API with scroll/PIT.
- Validation alternative (CLI): use the Elastic detection-rules CLI:

  ```shell
  git clone https://github.com/elastic/detection-rules.git
  cd detection-rules && pip install .
  python -m detection_rules validate-rule path/to/rule.toml
  ```
| Index Pattern | What It Contains | Equivalent |
|---|---|---|
| `logs-endpoint.events.process-*` | Process events | Sysmon 1 / CIM Processes |
| `logs-endpoint.events.network-*` | Network events | Sysmon 3 / CIM Network_Traffic |
| `logs-endpoint.events.file-*` | File events | Sysmon 11 / CIM Filesystem |
| `logs-endpoint.events.registry-*` | Registry events | Sysmon 13 |
| `logs-system.auth-*` | Auth events | Security 4624 |
Sigma rules are SIEM-agnostic -- write once, convert to any backend. This is the best approach if you support multiple SIEMs or want maximum portability.
- Python 3.8+
- pySigma (`pip install pySigma`)
- Backend plugins for your target SIEM
```shell
# Install pySigma and backends
pip install pySigma
pip install pySigma-backend-splunk                # for Splunk output
pip install pySigma-backend-microsoft365defender  # for KQL output
pip install pySigma-backend-elasticsearch         # for Elastic output

# Install pipelines (field mapping)
pip install pySigma-pipeline-sysmon
pip install pySigma-pipeline-windows
```

```yaml
title: Suspicious LSASS Access via Procdump
id: your-uuid-here
status: stable
level: high
description: Detects procdump being used to dump LSASS memory.
author: Your Name
date: 2026/02/06
references:
    - https://attack.mitre.org/techniques/T1003/001/
tags:
    - attack.credential_access
    - attack.t1003.001
logsource:
    category: process_creation
    product: windows
detection:
    selection:
        Image|endswith: '\procdump.exe'
        CommandLine|contains: 'lsass'
    condition: selection
falsepositives:
    - Legitimate memory dump for debugging
```

```shell
# To Splunk SPL
sigma convert -t splunk -p sysmon rule.yml

# To Microsoft 365 Defender KQL
sigma convert -t microsoft365defender rule.yml

# To Elasticsearch Lucene query
sigma convert -t elasticsearch rule.yml

# To Elasticsearch EQL
sigma convert -t elasticsearch -f eql rule.yml
```

- Write Sigma rule
- Convert to your SIEM's query language
- Run the converted query against your lab
- If it fires, the rule is valid
- Commit the Sigma rule (not the converted output)
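The convert step of that loop can be scripted across all backends at once. A sketch assuming the `sigma` CLI (from the sigma-cli package) is on PATH; it degrades to a per-backend notice when the CLI is missing, so it also works as a dry run:

```shell
# Sketch: convert one Sigma rule for every target backend in a single pass.
# Conversion errors are reported but do not abort the loop.
convert_all() {
  rule="$1"
  for backend in splunk microsoft365defender elasticsearch; do
    echo "== $backend =="
    if command -v sigma >/dev/null 2>&1; then
      sigma convert -t "$backend" -p sysmon "$rule" || echo "conversion failed for $backend"
    else
      echo "(sigma CLI not installed: pip install sigma-cli)"
    fi
  done
}

convert_all rule.yml
```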
- Set environment variables in `agents/.env`:

  ```shell
  SIEM_PLATFORM=sigma
  LLM_MODEL=claude-sonnet-4-20250514

  # Sigma-specific: which backend to convert to for validation
  SIGMA_TARGET_BACKEND=splunk  # or microsoft365defender, elasticsearch

  # Not needed for Sigma:
  # SPLUNK_MCP_ENABLED=false (default)
  # ATTACK_RANGE_PATH (not used)
  ```

- Detection engineer outputs Sigma YAML.
- Validation step converts to your SIEM's format, then queries:

  ```shell
  # Convert and validate in one step
  sigma convert -t splunk -p sysmon rule.yml | \
    splunk search "$(cat -)"
  ```

- Store the Sigma rule in your repo; conversion happens at deploy time.
- Best practice: use Sigma as the source of truth and maintain a CI/CD pipeline that converts to your SIEM's native format on merge.
Regardless of SIEM, the atomic testing workflow is the same:
```powershell
# Install Invoke-AtomicRedTeam
IEX (IWR 'https://raw.githubusercontent.com/redcanaryco/invoke-atomicredteam/master/install-atomicredteam.ps1' -UseBasicParsing)
Install-AtomicRedTeam -getAtomics

# List available tests for a technique
Invoke-AtomicTest T1003.001 -ShowDetailsBrief

# Run a specific test
Invoke-AtomicTest T1003.001 -TestNumbers 1

# Run all tests for a technique
Invoke-AtomicTest T1003.001
```

```shell
# Install Invoke-AtomicRedTeam (PowerShell required)
pwsh -c "IEX (IWR 'https://raw.githubusercontent.com/redcanaryco/invoke-atomicredteam/master/install-atomicredteam.ps1' -UseBasicParsing); Install-AtomicRedTeam -getAtomics"

# Or use the bash runner
git clone https://github.com/redcanaryco/atomic-red-team.git
cd atomic-red-team/atomics/T1003.001
bash T1003.001.sh  # if a bash variant exists
```

When standard Atomic Red Team tests don't cover your specific detection behavior:
- Create a custom atomic YAML following the Atomic Red Team format
- Use the T9999.XXX numbering convention (avoids conflicts with official tests)
- Deploy via Ansible playbook or direct copy to the target
- Execute and wait for log ingestion
You don't need actual malware. Focus on generating telemetry that matches your detection logic:
- Copy legit binaries to suspicious names (`cmd.exe` -> `svchost.exe`)
- Run with suspicious command-line flags
- Create files in monitored paths
- The goal is validating detection logic, not replicating malware perfectly
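A minimal example of that first bullet on Linux, using only benign binaries (the `kworker` name and the flag are illustrative):

```shell
# Benign telemetry sketch: copy a legitimate binary to a suspicious-looking
# name and run it, generating a process event without any real malware.
workdir=$(mktemp -d)
cp /bin/ls "$workdir/kworker"            # legit binary, suspicious name
"$workdir/kworker" --help >/dev/null 2>&1 || true
echo "spawned $workdir/kworker -- check your SIEM for the process event"
rm -rf "$workdir"                        # clean up the lab artifact
```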
Minimum viable setup for your first e2e detection validation:
- Pick your SIEM -- Splunk, Sentinel, Elastic, or Sigma
- Set up a lab target -- One Windows VM with Sysmon (or equivalent logging)
- Install atomics -- Invoke-AtomicRedTeam on the target
- Configure env vars -- Copy `agents/.env.example` to `.env` and fill in paths
- Install MCP servers -- At minimum `security-detections` for coverage analysis
- Run a test -- Execute `T1003.001` on the target, wait 3 minutes
- Query your SIEM -- Confirm the events show up
- Write a detection -- Create a rule that matches the telemetry
- Validate -- Run the detection query, confirm count > 0
- Export -- Save the validated data and stage a PR
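Before running anything, a quick preflight can confirm the three critical settings are actually set. A sketch; the variable names are the ones used throughout this guide:

```shell
# Preflight sketch: verify the critical env vars before launching the pipeline.
missing=0
for var in SIEM_PLATFORM SECURITY_CONTENT_PATH ANTHROPIC_API_KEY; do
  # Indirect expansion via eval keeps this POSIX-sh compatible.
  if [ -z "$(eval "printf '%s' \"\$$var\"")" ]; then
    echo "MISSING: $var"
    missing=$((missing + 1))
  fi
done
if [ "$missing" -eq 0 ]; then
  echo "preflight OK -- ready to run the orchestrator"
else
  echo "$missing required setting(s) unset"
fi
```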
| Feature | Splunk + Attack Range | Sentinel | Elastic | Sigma |
|---|---|---|---|---|
| Cost | ~$5-15/day AWS | Free tier available | Free (self-hosted) | Free |
| Setup time | ~30 min | ~1 hour | ~20 min (Docker) | ~5 min |
| Built-in atomics | Yes (via Attack Range) | Manual install | Manual install | N/A |
| Detection format | SPL YAML | KQL | EQL/TOML | Sigma YAML |
| Validation tool | contentctl | Azure CLI | Detection Engine | pySigma |
| MCP available | splunk-mcp | azure-mcp (community) | Not yet | N/A |
| Best for | Splunk shops, security_content contributors | Microsoft/Azure shops | Elastic shops | Multi-SIEM, portability |
To run the LangGraph pipeline for your SIEM:
```shell
# 1. Set your platform
export SIEM_PLATFORM=sentinel  # or splunk, elastic, sigma

# 2. Set paths
export SECURITY_CONTENT_PATH=./my-detections
export ANTHROPIC_API_KEY=sk-ant-...

# 3. Run
cd agents
npm install
npm run orchestrate -- --type technique --input "T1566.004 Spearphishing Voice"
```

The pipeline adapts its detection output format based on `SIEM_PLATFORM`. The CTI analysis and coverage check steps are platform-agnostic (they use the security-detections MCP, which indexes rules from all formats).