# 🔥MIDLog: Weakly-supervised Log-based Anomaly Detection with Inexact Labels via Multi-instance Learning (ICSE 2025)
This is the basic implementation of our paper in ICSE 2025 (Research Track): Weakly-supervised Log-based Anomaly Detection with Inexact Labels via Multi-instance Learning
Log-based anomaly detection is essential for maintaining software availability. However, existing log-based anomaly detection approaches rely heavily on fine-grained, exact labels of individual log entries, which are very hard to obtain in real-world systems. This creates a key problem: anomaly detection models require supervision signals, yet labeled log entries are unavailable. To address this problem, we propose a new labeling strategy called inexact labeling: instead of labeling a single log entry, system experts label a bag of log entries within a time span. Furthermore, we propose MIDLog, a weakly-supervised log-based anomaly detection approach that works with inexact labels. We leverage the multi-instance learning paradigm to explicitly separate anomalous log entries from inexactly labeled anomalous log sets, thereby deducing exact per-entry anomaly labels from inexact bag labels. Extensive evaluation on three public datasets shows that our approach achieves an F1 score of over 85% with inexact labels.
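For context on the multi-instance learning paradigm mentioned above, here is a minimal, hypothetical sketch (toy parameters, not the paper's actual model) of attention-based MIL pooling: per-entry features in a bag are aggregated with learned attention weights into one bag-level representation, so a single bag label can supervise the model while the attention weights point at the entries most likely to be anomalous.

```python
import numpy as np

def attention_mil_pool(instance_feats, w, v):
    """Attention-based MIL pooling (illustrative sketch only).

    instance_feats: (n, d) features for the n log entries in one bag.
    w: (d,) projection vector, v: (d,) scoring vector -- toy parameters.
    Returns the bag-level feature and per-instance attention weights.
    """
    # Unnormalized attention score per instance: v . tanh(h * w)
    scores = np.tanh(instance_feats * w) @ v          # shape (n,)
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()                                # softmax over instances
    bag_feat = attn @ instance_feats                  # (d,) weighted average
    return bag_feat, attn

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))   # 5 log entries, 8-dim toy features
w = rng.normal(size=8)
v = rng.normal(size=8)
bag, attn = attention_mil_pool(feats, w, v)
print(bag.shape, attn.shape)
```

Entries with high attention weight dominate the bag representation, which is what lets a bag-level anomaly label be traced back to individual entries.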
```
├─checkpoint   # Saved models
├─data         # Log data
├─glove        # Pre-trained language models for log embedding
├─src
|  ├─dataset.py  # Load dataset
|  ├─models.py   # Attention-based Patch Encoder, PatchMixer model definition
|  └─utils.py    # Log embedding
├─main.py      # Entry point
└─process.py   # Data preprocessing
```
We used three open-source log datasets for evaluation: Spirit, Thunderbird, and Hadoop.
| Software System | Description | Time Span | # Messages | Data Size | Link |
|---|---|---|---|---|---|
| Spirit | Spirit (ICC2) supercomputer log | 558 days | 272,298,969 | 30.289 GB | Usenix-CFDR Data |
| Thunderbird | Thunderbird supercomputer log | 244 days | 211,212,192 | 27.367 GB | Usenix-CFDR Data |
| Hadoop | Hadoop MapReduce job log | N.A. | 394,308 | 48.61 MB | LogHub |
Note: Considering the huge scale of the Spirit and Thunderbird datasets, we followed the settings of the previous study LogADEmpirical and selected the earliest 1 GB and 10 million log messages from the Spirit and Thunderbird datasets, respectively, for experimentation.
Key packages:

```
numpy==1.20.3
pandas==1.3.5
pytorch_lightning==1.1.2
scikit_learn==0.24.2
torch==1.13.1+cu116
tqdm==4.62.3
```
Follow these steps to run MIDLog end to end:
- Step 1: Download the log data and put it under the `data` folder.
- Step 2: Use Drain to parse the unstructured logs.
- Step 3: Download `glove.6B.300d.txt` from the Stanford NLP word embeddings and put it under the `glove` folder.
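As a rough illustration of what Step 2 produces, log parsing turns each raw message into a template by masking variable tokens. The toy masker below is not Drain itself (Drain uses a fixed-depth parse tree and token-similarity clustering), but it shows the input/output shape of the step:

```python
import re

def to_template(message):
    """Toy log parser: mask any token containing a digit with <*>.

    Real parsers like Drain are far more robust, but the result has the
    same form: a constant template plus extracted variable parts.
    """
    tokens = message.split()
    masked = ["<*>" if re.search(r"\d", t) else t for t in tokens]
    return " ".join(masked)

print(to_template("Received block blk_3587 of size 67108864 from 10.0.0.1"))
# → "Received block <*> of size <*> from <*>"
```

The template strings are what `process.py` expects in `data/log_templates`.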
You can run MIDLog on the Spirit dataset with the following commands:
```shell
python process.py --log_file data/log_structured --temp_file data/log_templates
python main.py --stage ulnl --mode train --epochs 10 --batch_size 256 --num_keys 1278
python main.py --stage ulnl --mode gen --epochs 10 --batch_size 256 --num_keys 1278 --candidates 150 --load_checkpoint True
python main.py --stage gnd --mode train --epochs 10 --batch_size 256
python main.py --stage midl --mode train --epochs 150 --batch_size 32 --num_keys 1278
python main.py --stage midl --mode auc_eval --epochs 150 --batch_size 32 --num_keys 1278 --load_checkpoint True
```
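The GloVe step above boils down to giving each log template a fixed-size vector, typically by averaging the word vectors of its tokens. The following is a minimal sketch with a toy 3-dimensional vector table standing in for `glove.6B.300d.txt` (the function name and behavior are illustrative, not the repo's `utils.py`):

```python
import numpy as np

def embed_template(template, glove, dim=300):
    """Average the GloVe vectors of a template's words.

    Words missing from the vocabulary (e.g. the <*> wildcard) are
    skipped; an all-OOV template maps to the zero vector.
    """
    vecs = [glove[w] for w in template.lower().split() if w in glove]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

# Toy stand-in for the real 300-dim GloVe table
glove = {"error": np.array([1.0, 0.0, 0.0]),
         "connection": np.array([0.0, 1.0, 0.0])}
vec = embed_template("Error connection <*>", glove, dim=3)
print(vec)  # → [0.5 0.5 0. ]
```

With the real `glove.6B.300d.txt`, the table would be built by reading one `word v1 v2 ... v300` line at a time into the dict.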
