CHARON

Setup | Usage | Configuration | Repo Architecture

CHARON is a polyglot static analysis engine for detecting vulnerabilities in native extensions of scripting languages. It can detect taint-style vulnerabilities whose dataflows cross the language boundary several times and span multiple functions on either side.

🏭 Setup

Dependencies: CHARON requires the following components to be installed on your system in order to function.

|-- python interpreter (>=3.10.0)
|-- joern (==1.1.1277)
|   |-- java development kit (e.g., OpenJDK)
|   |-- gcc (*)
|   |-- g++ (*)
|-- pip (*)
|-- npm (*)
|-- tar (*)
|-- jq (*)
|-- curl (*)

Note: If you only intend to analyze pre-downloaded code, python and joern are sufficient. The dependencies marked with (*) are used by the download process.
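
Before the first run, it can help to check which of the listed tools are reachable on your PATH. The following is a minimal sketch and not part of CHARON: the executable names (python3, java, and so on) are assumptions that may differ on your system, and the split into two groups follows the asterisk markers in the list above. It does not check versions.

```python
import shutil
from typing import Callable, Iterable, Optional

# Executable names are assumptions; adjust them to your system.
CORE_TOOLS = ("python3", "joern", "java")
DOWNLOAD_TOOLS = ("gcc", "g++", "pip", "npm", "tar", "jq", "curl")


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    print("missing core tools:", missing_tools(CORE_TOOLS))
    print("missing download tools:", missing_tools(DOWNLOAD_TOOLS))
```

The `which` parameter exists only so the lookup can be replaced in tests; by default the standard `shutil.which` is used.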

Joern Installation

1. Download the latest joern installer script.
2. Run the installer: ./joern-install --interactive.
3. When prompted for the version, enter 1.1.1277.
4. Allow symlink creation if desired (requires elevated privileges).
5. Set JOERN_DIR in $CHARON_DIR/core/common/config.py to your joern installation path.

🚀 Usage

CHARON needs a list of packages to analyze, at least one operation to perform, the language system and a desired analysis type:

usage: python charon.py [-h] [-p PACKAGES] [-s] [-a] [-l {node,python}] [-c]
                 [-t {complete,verification}]

options:
  -h, --help            show this help message and exit
  -p PACKAGES, --packages PACKAGES
                        Package List (file)
  -s, --setup           Prepare Analysis
  -a, --analysis        Perform Analysis
  -l {node,python}, --language {node,python}
                        Scripting Language/Framework to analyze
  -c, --clean           Delete downloaded packages and generated output files
  -t {complete,verification}, --type {complete,verification}
                        Analysis type

Package List

CLI argument: -p|--packages, expected value: file path

The list of packages to analyze should be passed as a file containing one package name per line.

Note: The package lists we used are provided in npm_ne.csv (npm) and pypi_ne.csv (PyPI).
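
To illustrate the expected file format, here is a minimal sketch of how such a list could be read. The function name read_package_list is hypothetical and not part of CHARON, and skipping blank lines is an assumption; CHARON's own parser may behave differently.

```python
from pathlib import Path


def read_package_list(path: str) -> list[str]:
    """Read one package name per line, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```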

Operation

Setup

CLI argument: -s|--setup, expected value: none

Providing this flag enables the preparation phase for the provided packages, i.e., package downloading, dataset filtering, metadata extraction, and Code Property Graph (CPG) construction for both native and script code.

Analysis

CLI argument: -a|--analysis, expected value: none

Providing this flag enables the analysis phase for the provided packages. The analysis phase depends on outputs from the preparation phase. Therefore, the preparation phase must be run on a set of packages p before the first analysis of p. It does not need to be re-run for subsequent analyses, provided that both the package set and the studied vulnerabilities remain identical.

It is possible to combine the preparation and analysis phases in a single run by passing both flags at the same time.
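
The ordering rule between the two phases can be sketched as follows. This is an illustration only: plan_phases and the already_prepared parameter are hypothetical names, and how CHARON itself detects existing preparation outputs is not described here.

```python
def plan_phases(setup: bool, analysis: bool, already_prepared: bool) -> list[str]:
    """Return the phases to run, in order.

    Raises ValueError if analysis is requested without any preparation
    outputs, i.e., neither a previous nor a combined setup run.
    """
    if analysis and not (setup or already_prepared):
        raise ValueError("analysis needs preparation outputs; run with -s first")
    phases = []
    if setup:
        phases.append("setup")  # preparation always runs before analysis
    if analysis:
        phases.append("analysis")
    return phases
```

Passing both flags yields both phases in order, while an analysis-only run is accepted only if the same package set was prepared before.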

Language System

CLI argument: -l|--language, expected value: node|python

This argument specifies the language system used by the studied packages. It is used by the framework to load the correct plugin files.

Analysis Type

CLI argument: -t|--type, expected value: complete|verification

This argument influences how the identified flows are visualized in the final report. In a complete analysis, all cross-language dataflows are displayed in the report. A verification analysis displays a single cross-language flow per sink node. This may greatly reduce the size of the final report, but information about other attacker-controlled data sources is dropped.
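
The difference between the two analysis types can be sketched on a simplified flow list. The Flow type and select_flows function are hypothetical and not CHARON's data model. In particular, which flow CHARON keeps per sink is not stated above, so this sketch simply keeps the first one encountered.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    source: str  # hypothetical identifier of an attacker-controlled source
    sink: str    # hypothetical identifier of a sink node


def select_flows(flows: list[Flow], analysis_type: str) -> list[Flow]:
    """Keep all flows ("complete") or the first flow per sink ("verification")."""
    if analysis_type == "complete":
        return list(flows)
    if analysis_type == "verification":
        first_per_sink: dict[str, Flow] = {}
        for flow in flows:
            # setdefault only stores the first flow seen for each sink
            first_per_sink.setdefault(flow.sink, flow)
        return list(first_per_sink.values())
    raise ValueError(f"unknown analysis type: {analysis_type!r}")
```

With two sources reaching the same sink, the verification variant reports one entry for that sink and drops the second source, which is the information loss mentioned above.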

Example

Running the full pipeline of CHARON on a set of Node.js native extensions node.csv, including preparation and complete analysis:

$ python charon.py -p node.csv -sa -l node -t complete &
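
The combined -sa in the command above is the short form of -s -a. The sketch below rebuilds an argparse parser from the usage text shown earlier to illustrate how such a command line is interpreted. It is a reconstruction from the help output, not the code of charon.py, which may define its options differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Approximate the documented CLI; not taken from charon.py itself."""
    parser = argparse.ArgumentParser(prog="charon.py")
    parser.add_argument("-p", "--packages", help="Package List (file)")
    parser.add_argument("-s", "--setup", action="store_true",
                        help="Prepare Analysis")
    parser.add_argument("-a", "--analysis", action="store_true",
                        help="Perform Analysis")
    parser.add_argument("-l", "--language", choices=["node", "python"],
                        help="Scripting Language/Framework to analyze")
    parser.add_argument("-c", "--clean", action="store_true",
                        help="Delete downloaded packages and generated output files")
    parser.add_argument("-t", "--type", choices=["complete", "verification"],
                        help="Analysis type")
    return parser


# Parse the example command line from above (without the trailing "&").
args = build_parser().parse_args(
    ["-p", "node.csv", "-sa", "-l", "node", "-t", "complete"]
)
print(args.setup, args.analysis, args.language, args.type)
```

Both args.setup and args.analysis are set, because argparse allows value-less short options to be joined behind a single dash.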

⚙️ Configuration

CHARON supports a range of configuration options to tailor its analysis to your environment and system specs.

Configuration options are defined in $CHARON_DIR/core/common/config.py. Below, we list the most important options.

Performance and Resource Management

- PROCESSES (default: 10): the number of parallel instances used when downloading packages and verifying their cross-language nature and the presence of sinks.
- PROCESSES_PREP (default: 1): the number of parallel instances during the graph construction phase. Depending on the size of the package codebase, CPG construction can be costly in both CPU and memory. We recommend increasing this value incrementally and carefully until you reach the optimal value for your system.
- PROCESSES_ANALYSIS (default: 1): the number of parallel instances during the PPG analysis phase. Depending on the size of the package codebase, PPGs may occupy large sections of memory. We recommend increasing this value incrementally and carefully until you reach the optimal value for your system.
- TIMEOUT (default: 2500): a time limit for processing a package, in seconds. If an analysis or graph construction does not terminate before the timeout, the current instance is killed and CHARON continues with the next package.
- MEM_CAP (default: 16): the maximum amount of memory an individual JVM is allowed to allocate, in GB.

Logging Verbosity

- LEVEL (default: logging.DEBUG): the logging verbosity. The available levels are documented in Python's logging documentation.

Environment

- JOERN_DIR: the path to your top-level joern installation directory. This value is used as a fallback in case CHARON fails to find joern's executables, e.g., when symlinks are missing and joern is not included in $PATH.
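
For orientation, the options above could look as follows in a Python configuration module. This is a sketch under assumptions, not a copy of config.py. The values repeat the documented defaults, the JOERN_DIR path is a placeholder, and the two helper functions are hypothetical. Mapping MEM_CAP to the JVM's -Xmx flag and looking for the executable directly inside JOERN_DIR are assumed mechanisms that CHARON may implement differently.

```python
import logging
import os
import shutil
from typing import Optional

# Defaults as listed above; the real config.py may contain more options.
PROCESSES = 10
PROCESSES_PREP = 1
PROCESSES_ANALYSIS = 1
TIMEOUT = 2500            # seconds per package
MEM_CAP = 16              # GB per JVM
LEVEL = logging.DEBUG
JOERN_DIR = "/opt/joern"  # placeholder path, not a documented default


def jvm_heap_flag(mem_cap_gb: int = MEM_CAP) -> str:
    """Standard JVM maximum-heap flag for a cap given in GB (assumed mechanism)."""
    return f"-Xmx{mem_cap_gb}G"


def find_joern(joern_dir: str = JOERN_DIR, name: str = "joern") -> Optional[str]:
    """Prefer an executable on PATH, then fall back to joern_dir."""
    on_path = shutil.which(name)
    if on_path is not None:
        return on_path
    # Assumption: the executable sits directly inside joern_dir.
    candidate = os.path.join(joern_dir, name)
    return candidate if os.path.isfile(candidate) else None
```

The fallback order mirrors the description of JOERN_DIR: the configured directory is consulted only when the executable is not found on PATH.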

🗂️ Repository Architecture

Folder: apps

This folder contains data related to analyzed packages. Package source code is downloaded into the code folder. For local applications, copy the code here. After graph construction, Code Property Graphs (CPGs) for native and script code are stored in cpg/addon and cpg/script.

Folder: core

Contains CHARON’s Python wrapper.

analysis: Manages analysis threads, starts the Joern engine, and collects results.
common: Shared functionalities, including file management, configuration, and data structures.
modules: Plugins for cross-language systems, managing language-specific configurations, result directories, and API lists for PPG linking.
preparation: Scripts for package download, filtering, metadata extraction, and graph construction.
fin.py: Finalizes analysis, moves results/logs to prevent overwriting.
init.py: Initializes the pipeline, validates vulnerability plugins, and registers them.

Folder: include

Contains source code for native extension APIs (e.g., NAN, N-API) added to the native CPG during construction.

Folder: scala

Contains CHARON’s PPG linking and cross-language analysis algorithm, executed by the Joern engine.

import: Joern query language extensions for linking and mitigation filtering.
main: Compiled CHARON versions for analysis, invoked by the Python wrapper.
templates: Core cross-language analysis and intra-language components.

Folder: vln

Contains plugins defining vulnerability-specific elements, with a template for adding new vulnerability types.

Folder: output

Contains CHARON’s log file and analysis results.

$LANG/callmap, $LANG/native, $LANG/script: PPG edges and dataflow lists for each package.
APIMAP.json: Metadata of native API presence for CPG and PPG linking.
CHARON.log: Logs analysis progress, vulnerabilities, mitigations, and potential failures.
SINKMAP.json: Sink presence information for query optimization.
VLNMAP.json: Cross-language dataflows detected, comprising the final analysis report.

📜 License

This project is available as open source under the terms of the GNU Affero General Public License v3.0. See LICENSE for more information.

VainlyStrain/charon

Folders and files

Latest commit

History

Repository files navigation

CHARON

🏭 Setup

Joern Installation

🚀 Usage

Package List

Operation

Language System

Analysis Type

Example

⚙️ Configuration

🗂️ Repository Architecture

Folder: apps

Folder: core

Folder: include

Folder: scala

Folder: vln

Folder: output

📜 License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages