das-anomaly is an open-source Python package for unsupervised anomaly detection in distributed acoustic sensing (DAS) datasets using an autoencoder-based deep learning algorithm. It is being developed by Ahmad Tourei under the supervision of Dr. Eileen R. Martin at Colorado School of Mines.
If you use das-anomaly in your work, please cite the following:
Ahmad Tourei. (2025). DASDAE/das-anomaly: latest (Concept). Zenodo. http://doi.org/10.5281/zenodo.12747212
Required:
- Python 3.10, 3.11, or 3.12
- pip
Optional:
- MPI4Py (for parallel processing with MPI)
Dependency notes:
- Installation and loading of Open MPI is required prior to MPI4Py installation. Ensure proper installation using a hello-world example, such as the one below.
- If you'd like to train the model on a GPU, make sure you install TensorFlow with GPU support in your environment. More information can be found here.
- Currently waiting on TensorFlow to support Python 3.13 before we can support it as well.
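A minimal MPI4Py hello-world check (the script name hello_mpi.py is just an example) confirms that all ranks are visible; the same mpirun launch pattern applies to the run_parallel() calls described later in this README:

```python
# hello_mpi.py: minimal check that MPI4Py and Open MPI work together.
# Run with, e.g.: mpirun -n 4 python hello_mpi.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
print(f"Hello from rank {comm.Get_rank()} of {comm.Get_size()}")
```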
For clean dependency management, use a virtual environment or a fresh Conda environment. To install the package in editable mode with the required dependencies, run the following after cloning the repository and navigating to the repo directory:
```bash
pip install -e .
```

To install the package in editable mode with all optional dependencies, run:

```bash
pip install -e '.[all]'
```

To uninstall the package, run:

```bash
pip uninstall das_anomaly
```

The package implements a convolutional autoencoder designed to compress and reconstruct power spectral density (PSD) inputs; a rough sketch of the architecture follows the component list below.
- Encoder: A lightweight convolutional neural network reduces the input dimensionality, mapping it into a compact latent space.
- Decoder: A symmetric decoder reconstructs the data by upsampling the latent representation back to the original resolution.
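For illustration, here is a minimal Keras sketch of this encoder-decoder shape. The layer counts, filter sizes, and 256x256x3 input shape are assumptions chosen for the example, not the package's actual architecture (see das_anomaly.train for that):

```python
# Hypothetical convolutional autoencoder for RGB PSD images.
# Layer counts, filter sizes, and the 256x256x3 input shape are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(256, 256, 3))

# Encoder: convolution + pooling blocks map the image into a compact latent space.
x = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D(2)(x)

# Decoder: a symmetric upsampling path reconstructs the original resolution.
x = layers.Conv2D(16, 3, activation="relu", padding="same")(encoded)
x = layers.UpSampling2D(2)(x)
x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D(2)(x)
decoded = layers.Conv2D(3, 3, activation="sigmoid", padding="same")(x)

autoencoder = keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")
```

Because the model is trained only on anomaly-free PSDs, images it reconstructs poorly (high MSE) are candidates for anomalies; this is the basis of the thresholds set later in the workflow.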
The overall workflow for using the package is illustrated below:
The main steps are:
- Define constants and create a Spool of data:
Using the config_user script in the das_anomaly directory, define the constants and directory paths for the data, power spectral density (PSD) images, detected anomaly results, etc. (a rough sketch of these entries appears at the end of this step). You can fill in the values and paths as you work through the steps below. Then, using DASCore, create an index file for the spool of data the first time you read the DAS data directory:
```python
import dascore as dc
from das_anomaly.settings import SETTINGS

data_path = SETTINGS.DATA_PATH
# update() creates an index of the contents for fast querying/access;
# there is no need to call update() again in the future.
spool = dc.spool(data_path).update()
```

Note: Creating the spool for the first time may take some time if your directory contains hundreds of gigabytes or terabytes of DAS data. However, DASCore creates an index file, allowing it to quickly query the directory on subsequent accesses.
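For orientation, the config_user constants referenced throughout this README might look roughly like the sketch below. The names follow the constants mentioned in the steps, but every value is a placeholder you must replace for your own dataset:

```python
# Hypothetical excerpt of config_user; values are placeholders, not defaults.
DATA_PATH = "/path/to/das/data"                   # raw DAS data directory
BN_DATA_PATH = "/path/to/background/noise"        # anomaly-free example patches
ANOMALY_IMAGES_PATH = "/path/to/known/anomalies"  # known-anomaly PSD images
RESULTS_PATH = "/path/to/results"                 # detected anomalies are copied here
TIME_WINDOW = 10                                  # seconds averaged per PSD image
CLIP_VALUE_MAX = 1.0e-7                           # upper bound of the PSD colorbar
MSE_THRESHOLD = 0.005                             # reconstruction-error threshold
DENSITY_THRESHOLD = 1000.0                        # latent density-score threshold
```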
- Set a consistent upper bound for PSD amplitude values:
To ensure all PSD images share the same colorbar scale (in RGB), determine an appropriate CLIP_VALUE_MAX in the config_user input file. This can be done using the get_psd_max_clip function, which computes the mean of the maximum amplitudes across TIME_WINDOW-long segments of the data that do not include obvious anomalies (a quick exploratory data analysis is therefore needed here):
```python
from das_anomaly.psd import PSDConfig, PSDGenerator
from das_anomaly.settings import SETTINGS

# path to one or a few background noise patches
bn_data_path = SETTINGS.BN_DATA_PATH

cfg = PSDConfig(data_path=bn_data_path)
gen = PSDGenerator(cfg)
percentile = 90  # data dependent - needs visual inspection
clip_val = gen.run_get_psd_val(percentile=percentile)
print(f"Mean {percentile}-percentile amplitude across all patches: {clip_val:.3e}")
```

- Generate PSD plots:
Use the das_anomaly.psd module to create PSD plots in RGB format and in plain mode (with no axes or colorbar). das_anomaly.psd.PSDGenerator reads DAS data, creates a spool using the DASCore library, applies a detrend function to each patch of the chunked spool, averages the energy over a desired time window, and stacks all channels together to create a spatial PSD image with channels on the X-axis and frequency on the Y-axis. Because each PSD is computed independently, you can use MPI to distribute reading data and plotting PSDs across CPUs (an embarrassingly parallel workload).
```python
from das_anomaly.psd import PSDConfig, PSDGenerator

cfg = PSDConfig()
# serial processing with a single processor:
PSDGenerator(cfg).run()
# parallel processing with multiple processors using MPI:
PSDGenerator(cfg).run_parallel()
```

Note: If you'd like to use PSDs for purposes other than training the model, setting hide_axes=False will plot the PSDs with axes and a colorbar (the default is True):
```python
from das_anomaly.psd import PSDConfig, PSDGenerator

cfg = PSDConfig(hide_axes=False)
# serial processing with a single processor:
PSDGenerator(cfg).run()
# parallel processing with multiple processors using MPI (first, make sure
# you've installed the package with all optional dependencies, as explained above):
PSDGenerator(cfg).run_parallel()
```

- Select and copy known anomaly PSD plots:
From the generated PSD plots, visually identify and then copy examples of known anomalies to the ANOMALY_IMAGES_PATH specified in the config_user input script. These anomalies can include events such as earthquakes from an existing catalog, instrument noise, anthropogenic disturbances, etc. Including these examples helps improve thresholding during the detection process.
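If you keep a list of the anomalous PSD filenames, this copy step can be scripted. Below is a minimal sketch; the psd_dir value and the anomaly_files list are placeholders you supply, and SETTINGS.ANOMALY_IMAGES_PATH is assumed to expose the ANOMALY_IMAGES_PATH constant from config_user:

```python
# Minimal sketch: copy visually identified anomaly PSDs into the
# known-anomaly folder. psd_dir and anomaly_files are placeholders;
# SETTINGS.ANOMALY_IMAGES_PATH is an assumed attribute name.
import shutil
from pathlib import Path

from das_anomaly.settings import SETTINGS

psd_dir = Path("/path/to/psd/images")  # where the PSD plots were written
anomaly_dir = Path(SETTINGS.ANOMALY_IMAGES_PATH)
anomaly_dir.mkdir(parents=True, exist_ok=True)

# filenames flagged during visual inspection (placeholders)
anomaly_files = ["psd_2024-06-01T12-00-00.png", "psd_2024-06-01T12-05-00.png"]

for name in anomaly_files:
    shutil.copy2(psd_dir / name, anomaly_dir / name)
```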
- Train:
The das_anomaly.train module helps with randomly selecting training and test PSD images and with training the model (on CPU or GPU) on anomaly-free PSD images.
```python
from das_anomaly.settings import SETTINGS
from das_anomaly.train import TrainAEConfig, AutoencoderTrainer, TrainSplitConfig, ImageSplitter

# select and copy train and test datasets from the PSD images
cfg = TrainSplitConfig()
ImageSplitter(cfg).run()

# train the autoencoder model
cfg = TrainAEConfig()
AutoencoderTrainer(cfg).run()
```

Note: Since ImageSplitter randomly selects PSD images from the generated plots, you must ensure the training and testing datasets do not include obvious anomalies. If you have a spreadsheet with the timestamps of known anomalies (such as a catalog), use the exclude_known_events_from_training example in the examples directory to exclude them. Otherwise, manually inspect both the training and testing sets to ensure they do not contain apparent anomalies: review their time- and frequency-domain plots, and remove any suspicious samples to maintain the quality of training.
- Test and set thresholds:
Using the validate_and_plot_density and thresholding_f_score Jupyter notebooks in the examples directory, validate the trained model and find appropriate MSE and density-score thresholds for anomaly detection. Make sure to set the DENSITY_THRESHOLD and MSE_THRESHOLD parameters in the config_user script accordingly.
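The notebooks are the authoritative workflow, but the idea behind the MSE threshold can be sketched briefly: compute reconstruction errors on anomaly-free validation PSDs and set the threshold from their distribution. In the sketch below, the model path, image folder, and the 99th-percentile choice are all placeholder assumptions:

```python
# Hypothetical sketch of deriving an MSE threshold from anomaly-free
# validation PSDs. Paths and the percentile are placeholders; the image
# size must match the model's input shape.
import glob

import numpy as np
from tensorflow import keras

model = keras.models.load_model("autoencoder.keras")  # placeholder path
paths = glob.glob("validation_psds/*.png")            # placeholder folder

errors = []
for p in paths:
    img = keras.utils.img_to_array(keras.utils.load_img(p)) / 255.0
    recon = model.predict(img[np.newaxis, ...], verbose=0)[0]
    errors.append(float(np.mean((img - recon) ** 2)))  # per-image MSE

# e.g., flag anything above the 99th percentile of normal reconstruction error
print(f"Candidate MSE_THRESHOLD: {np.percentile(errors, 99):.3e}")
```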
- Run the trained model:
The das_anomaly.detect module uses the trained model to detect anomalies in the PSD images and writes out their information (e.g., timestamps). It also copies the detected anomalies to the RESULTS_PATH. MPI can be used to distribute PSDs across CPUs. Then, using the das_anomaly.count module, count the number of detected anomalies and display their details and file paths.
```python
from das_anomaly.count.counter import CounterConfig, AnomalyCounter
from das_anomaly.detect import DetectConfig, AnomalyDetector

cfg = DetectConfig()
# serial processing with a single processor:
AnomalyDetector(cfg).run()
# parallel processing with multiple processors using MPI:
AnomalyDetector(cfg).run_parallel()

# count the number of detected anomalies
cfg = CounterConfig(keyword="anomaly", classify_types=True, max_gap_seconds=0)
anomalies = AnomalyCounter(cfg).run()
print(anomalies)  # prints info on the number of anomalies and paths to them
```

Still under development. Use with caution.
Ahmad Tourei
Colorado School of Mines
[email protected]



