This project automates the creation of a knowledge graph from security reports using Large Language Models (LLMs). The extracted entities and relationships are structured based on defined schemas and stored as nodes and edges in a Neo4j database, allowing for a detailed, queryable graph representation of threat intelligence.
The Threat Intelligence Knowledge Graph is a Python-based tool that extracts contextual structured intelligence from unstructured cyber security reports (fetched from URLs) and generates a knowledge graph using Neo4j. It supports multiple LLM providers (OpenAI, Google Gemini, Anthropic Claude) and aims to structure data based on concepts like the MITRE ATT&CK taxonomy and STIX 2.1 objects. This tool automates the process of identifying and structuring entities and their relationships within security contexts.
Example graph:
Generated from the following article: https://cloud.google.com/blog/topics/threat-intelligence/sandworm-disrupts-power-ukraine-operational-technology
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ │ │ Model Extractors │ │ │
│ Threat Intel │ │ ┌─────────────────┐ │ │ Knowledge Graph │
│ Reports │───>│ │OpenAI │ │───>│ Storage │
│ │ │ │Claude │ │ │ │
│ - URLs │ │ │Gemini │ │ │ - Neo4j Database │
│ - Local Files │ │ │NER/STIXnet │ │ │ - Query Interface │
└─────────────────────┘ │ └─────────────────┘ │ └─────────────────────┘
└─────────────────────┘
The system consists of three main components:
- Input processing - Handles document loading and text chunking from various sources
- Entity extraction - Uses LLMs or pattern-based extractors to identify entities and relationships
- Storage and visualisation - Persists the knowledge graph in Neo4j for visualization and analysis
Note, STIXnet is a modified submodule.
- Python 3.8+
- An active Neo4j instance (local or cloud-based, AuraDB free tier works well)
- API Key for your chosen LLM provider
- Python libraries via
requirements.txt
Clone the repository and install the dependencies:
git clone https://github.com/F3dai/Threat-intelligence-knowledge-graph.git
cd Threat-intelligence-knowledge-graph
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
pip install -r requirements.txt
# If using Named Entity Recognition, download NLTK resources
python STIXnet/download_nltk.py- Configure a
.envfile in the root directory. Template provided. - (Optional) Adjust default model names, chunking parameters, allowed nodes/relationships in
src/threat_intel_kg/config.py.
The tool is run from the command line using src/cli.py.
usage: cli.py [-h] {process,version} ...
Extract threat intelligence from security reports and build a knowledge graph.
positional arguments:
{process,version} Command to run
process Process URLs to extract threat intelligence
version Show version information
options:
-h, --help show this help message and exitWhen using process:
usage: cli.py process [-h] [--verbose]
[--model {gpt-3.5-turbo,gpt-4-turbo,gpt-4o,gemini-2.5-pro,gemini-2.0-flash,gemini-2.5-flash-preview-04-17,claude-3-5-haiku,claude-3-5-sonnet-20240620,ner}]
urls [urls ...]
cli.py process: error: the following arguments are required: urls
Example usage:
python src/cli.py process --model gemini-2.5-flash-preview-04-17 https://cloud.google.com/blog/topics/threat-intelligence/sandworm-disrupts-power-ukraine-operational-technology
The system supports a variety of models for entity extraction:
| Provider | Models | Context Window | Features |
|---|---|---|---|
| OpenAI | gpt-3.5-turbo gpt-4-turbo gpt-4o |
16K 128K 128K |
High accuracy Good relationship extraction Fast response times |
| gemini-2.5-pro gemini-2.0-flash gemini-2.5-flash |
1M 128K 128K |
Very large context window Handles long reports Rate-limited (5 RPM, 25 RPD) |
|
| Anthropic | claude-3-5-haiku claude-3-5-sonnet |
200K 400K |
Strong at nuanced extraction Good with technical details |
| STIXnet | ner | Unlimited | No API key needed Specialized IoC detection Limited to predefined patterns |
We've conducted comprehensive evaluations of different models to measure their performance in threat intelligence extraction:
This summary table shows that:
- GPT-3.5-Turbo extracts the most entities, but likely because of chunking
- Gemini models offer good cost efficiency with varying levels of entity extraction
- NER extractor is free but extracts fewer entities and relationships. It is also poor at understanding wider context, resulting in fewer relationships.
- Claude models occupy a middle ground in terms of cost and extraction performance
Note: The number of entities extracted does not necessarily indicate better quality. We are working on enhancing our evaluation framework to include qualitative measures such as relevance, accuracy, and completeness of the extracted knowledge graph. The current metrics focus on quantitative aspects only.
The project includes a comprehensive evaluation framework for comparing different models:
# Run model evaluation with default settings
python evaluation/model_evaluation.py
# Evaluate specific models
python evaluation/model_evaluation.py --models gpt-4o claude-3-5-sonnet-20240620
# Run evaluations in parallel. Caution for hitting rate limits with this option.
python evaluation/model_evaluation.py --parallelThe evaluation generates an interactive HTML report with:
- Processing time comparison
- Entity extraction metrics
- Cost efficiency analysis
- Detailed performance statistics
Results are stored in timestamped directories under evaluation/results/ with interactive visualisations in the charts/ subdirectory.
- Enforced STIX taxonomy
- Imeplented additional models, including STIXnet, a local Named Entity Recognition model
- Created evaluation framework to compare models
- Added MIT Licesne
- Evaluation framework:
- Improving metrics
- High quality anotation for benchmarking
- Prompt engineering now that we have an evaluation framework
- Implementing additional NER models
- Possibly a web interface
This project is in an early stage (alpha) - you should review and test if using in any production environment. It interacts with external APIs (potentially incurring costs), processes data from untrusted external URLs, and modifies a database. Use with caution and at your own risk!

