This project implements CroSysLog, an AIOps tool for log-entry-level anomaly detection (i.e., detecting anomalous log entries) across different software systems.
- Learner.py: Defines the base model that performs log-event level anomaly detection in CroSysLog.
- Meta.py: Contains the implementation of the MAML algorithm. The MAML class manages the meta-training and meta-testing phases for CroSysLog. It trains the base LSTM model defined in Learner.py. Our implementation draws on the MAML-Pytorch repository.
- MetaDataset.py: Responsible for sampling, loading, and pre-processing datasets for the source and target systems. It defines the MetaDataset class, which samples log data from the source/target systems for the meta-training and meta-testing phases and creates log embeddings using the neural representation method defined in the NeuralParser class.
- NeuralParser.py: Implements BERT-based embedding generation for logs using BertTokenizer (WordPiece tokenization) and BertModel (base BERT). The BertEmbeddings class creates sentence embeddings for the logs, which are used as input to the meta-learning models.
- train_sample.py: Contains the training script for CroSysLog. This script uses MetaDataset to load the data, then trains and evaluates CroSysLog. It also handles the training configuration and hyperparameter optimization using ray[tune].
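The episode sampling attributed to MetaDataset.py can be illustrated with a small sketch. It uses only the standard library and is not the repository's code: the function name `sample_episode`, the `(message, label)` tuple layout, the parameters `k_support` and `k_query`, and the uniform random draw are all assumptions made for this example. The actual MetaDataset class may sample differently, for instance in contiguous windows of log entries.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical record layout: (log message, label), where 1 = anomalous, 0 = normal.
LogEntry = Tuple[str, int]


def sample_episode(
    logs_by_system: Dict[str, List[LogEntry]],
    k_support: int,
    k_query: int,
    rng: random.Random,
) -> Tuple[str, List[LogEntry], List[LogEntry]]:
    """Pick one system at random and draw support and query sets from it."""
    system = rng.choice(sorted(logs_by_system))
    entries = logs_by_system[system]
    if len(entries) < k_support + k_query:
        raise ValueError(f"{system} has too few log entries for one episode")
    # Sampling without replacement: no entry position is drawn twice,
    # so the support and query sets never share a position.
    drawn = rng.sample(entries, k_support + k_query)
    return system, drawn[:k_support], drawn[k_support:]


# Toy data standing in for the real datasets, which are not redistributed here.
toy_logs = {
    "BGL": [(f"bgl message {i}", int(i % 5 == 0)) for i in range(20)],
    "Spirit": [(f"spirit message {i}", int(i % 7 == 0)) for i in range(20)],
}

system, support, query = sample_episode(
    toy_logs, k_support=4, k_query=2, rng=random.Random(0)
)
print(system, len(support), len(query))
```

In a MAML-style setup, the support set would typically drive the inner-loop adaptation of the base model and the query set would drive the outer-loop update. How CroSysLog wires these together is defined in Meta.py, not in this sketch.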
The project requires the following libraries:
- Python 3.10
- Torch 2.0.1
- Transformers 4.30.2 (for BERT)
- Ray 2.4.0 (for hyperparameter tuning)
- Pandas 2.0.3, NumPy 1.25.0, Polars 0.19.8 (for data manipulation)
- Scikit-learn 1.3.0 (for data normalization)
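
As a convenience, the pinned versions can be checked against the current environment with a short script. This is a sketch rather than part of the project: the helper name `find_mismatches` is hypothetical, and the keys in `PINNED` are assumed to be the PyPI distribution names of the libraries listed above.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions copied from the requirements list; keys are assumed PyPI distribution names.
PINNED = {
    "torch": "2.0.1",
    "transformers": "4.30.2",
    "ray": "2.4.0",
    "pandas": "2.0.3",
    "numpy": "1.25.0",
    "polars": "0.19.8",
    "scikit-learn": "1.3.0",
}


def find_mismatches(pinned: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {name: (expected, installed)} for packages that are missing or differ."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # A local build tag such as "2.0.1+cu118" still counts as a match.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version. The Python version itself is not checked here.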
This project uses software log datasets from four large-scale distributed supercomputing systems—BGL, Thunderbird, Liberty, and Spirit—sourced from the Usenix CFDR repository. We do not hold the right to publicly share these datasets here. Please refer to the original source for downloading the datasets.
- Prepare the Dataset
- Install Dependencies
- Train CroSysLog