# Explainable Valuation of Log Data for Deep Learning Based Anomaly Detection
*Figure 1. End-to-end xLoad workflow.*

*Figure 2. Aggregation of SHAP values for template-level relevance.*
This repository provides the end-to-end pipeline to:
- parse raw logs into templates,
- train DL-based log anomaly detectors,
- explain model inputs with SHAP,
- rank template relevance,
- distill (purge) low-relevance logs,
- re-train and evaluate the resulting models.
- The open dataset (HDFS) can be downloaded from LogHub.
- The proprietary production dataset (Raptor) is not publicly released.
- **Keeps detection quality with less data.** Across representative DL models and two large-scale datasets, up to about 30% of low-relevance logs can be removed while anomaly detection metrics (Accuracy / Recall / F1) remain largely stable.
- **Clearly better than random truncation.** At the same truncation ratio, relevance-based truncation (xLoad) preserves F1 and Recall better than random truncation: the paper reports that random truncation causes a clear metric decline, while xLoad remains robust.
- **Substantial training-time reduction.** With around 30% truncation, training time often drops by about 50% (model- and dataset-dependent). This benefit is shown consistently in the training-time table.
- **Good temporal durability.** Models trained on distilled logs stay effective on later time slices, reducing re-training frequency in practical settings.
*Figure 3. HDFS template relevance patterns across models.*

*Figure 4. Raptor top/bottom relevance ranking comparison.*
```
.
|-- README.md
|-- conf
|   |-- config.yaml        # Main xLoad settings
|   |-- drain3.ini         # Template mining settings (HDFS example)
|   `-- log.yaml           # Logging config
|-- epurger
|   |-- __main__.py        # CLI entry
|   |-- parser.py          # Template mining / parsing
|   |-- preprocessor.py    # Windowing, feature preparation, train/test split
|   |-- trainer.py         # Model training / evaluation
|   |-- explainer.py       # SHAP explanation
|   |-- purger.py          # Relevance-based log purging
|   |-- figure.py          # Figure generation utilities
|   `-- models             # DeepLog, LogRobust, CNN, Logsy implementations
`-- requirements.txt
```
- Python 3.10 is recommended.
- A Conda environment is recommended.
- Pure CPU execution is supported, but an NVIDIA GPU with CUDA is strongly recommended for performance.
- Runs on Windows, macOS, and Linux.
Install dependencies:

```shell
pip install -r requirements.txt
```

For LogRobust, download and place the FastText vectors in `data/pretrain`:
Note: `LogRobust` can require very large RAM (about 50 GB of free RAM for HDFS in our practice).
First, download `HDFS.log` (about 1.5 GB) from LogHub.
```shell
python -m epurger par -s HDFS -i data/HDFS.log
```

Key parameters:

- `-s, --dataset`: dataset type (`HDFS` or `Raptor`).
- `-i, --input`: raw log file/directory path.
- `-o, --output`: optional output directory for parsed logs. If omitted, defaults to `<input>-par`.
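As a rough illustration of what template mining does (the pipeline itself uses Drain3, configured via `conf/drain3.ini`), variable tokens are masked so structurally identical lines collapse into one template. This is a minimal sketch with hypothetical masking rules, not the Drain3 algorithm:

```python
import re

def to_template(line: str) -> str:
    """Mask variable tokens (block IDs, IPs, numbers) so log lines
    that share the same structure collapse into one template."""
    line = re.sub(r"blk_-?\d+", "<*>", line)                   # HDFS block IDs
    line = re.sub(r"\d+\.\d+\.\d+\.\d+(:\d+)?", "<*>", line)   # IP[:port]
    line = re.sub(r"\b\d+\b", "<*>", line)                     # bare numbers
    return line

logs = [
    "Receiving block blk_-1608999687919862906 src: /10.250.19.102:54106",
    "Receiving block blk_7503483334202473044 src: /10.250.10.6:40524",
]
# both lines collapse into the single template "Receiving block <*> src: /<*>"
templates = {to_template(l) for l in logs}
```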
You need `anomaly_label.csv` (also from LogHub) to build labeled structured data.

```shell
python -m epurger pre -s HDFS -f data/HDFS.log-par/HDFS_structured.csv -l data/anomaly_label.csv
```

After this step, you will get a preprocessed dataset such as:

```
data/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/
```

Key parameters:

- `-f, --file`: structured log CSV generated by `par`.
- `-l, --label`: anomaly label file.
- `-tr, --test-ratio`: test split ratio (default `0.2`).
- `-a, --anomaly-ratio`: anomaly ratio in the training set (default `0.0`).
- `-ta, --testAnomalyRatio`: anomaly ratio in the test set (default `1.0`).
- `-t, --template`: optional template file for filtering/preprocessing.
- `--threshold`: template filtering threshold.
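Conceptually, preprocessing turns each session's sequence of template event IDs into fixed-size model inputs. A minimal sketch of sliding-window sample construction (window size and stepping here are illustrative assumptions; the actual logic lives in `preprocessor.py`):

```python
def sliding_windows(event_ids, size=10, step=1):
    """Split a session's event-ID sequence into fixed-size windows;
    each window becomes one input sample for the detector."""
    last_start = max(len(event_ids) - size + 1, 1)
    return [event_ids[i:i + size] for i in range(0, last_start, step)]

# a toy session of 12 template event IDs -> 3 overlapping windows of length 10
seq = [3, 7, 7, 1, 9, 2, 3, 7, 4, 4, 8, 1]
wins = sliding_windows(seq, size=10)
```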
When `-t` (template path) is provided to `tras`, trained models are automatically explained.
SHAP value generation itself does not require a template file, but generating the ranking files does.
```shell
python -m epurger tras \
    -m CNN_Logsy_DeepLog_RobustLog \
    -d data/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/ \
    -t data/HDFS.log-par/HDFS_templates.csv
```

Outputs:

- Models: `workspace/models/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/`
- SHAP results: `workspace/shap/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/`
Key parameters:

- `-m, --model`: model list joined by `_`, for example `CNN_Logsy_DeepLog_RobustLog`.
- `-d, --data`: preprocessed dataset directory.
- `-e, --epochs`: training epochs (default follows internal config).
- `-t, --template`: template file path; used for ranking generation after explanation.
- `-l, --limit`: max data size for the explanation stage.
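The explanation stage assigns SHAP values to individual template occurrences; the template-level relevance shown in Figure 2 comes from aggregating them. A minimal sketch, assuming mean absolute SHAP per template as the relevance score (the exact aggregation in `explainer.py` may differ):

```python
from collections import defaultdict

def template_relevance(samples):
    """Aggregate per-occurrence SHAP values into a template-level
    relevance score (mean |SHAP|), then rank templates descending."""
    sums, counts = defaultdict(float), defaultdict(int)
    for window in samples:
        for template_id, shap_value in window:
            sums[template_id] += abs(shap_value)
            counts[template_id] += 1
    scores = {t: sums[t] / counts[t] for t in sums}
    return sorted(scores, key=scores.get, reverse=True)

# toy SHAP output: two windows of (template_id, shap_value) pairs
samples = [[(1, 0.9), (2, -0.1)], [(1, 0.7), (3, 0.05)]]
ranking = template_relevance(samples)  # template 1 (mean |SHAP| 0.8) ranks first
```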
Because paths differ across environments, here is a practical multi-model example.
Before purging, place a copy of `anomaly_label.csv` in `data/HDFS.log-par/`.
```shell
nohup python -m epurger prg \
    -f data/HDFS.log-par/HDFS_structured.csv \
    -r workspace/shap/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/ab852d2e-CNN-16384-20230724-175732,8ad0b317-Logsy-16384-20230724-184401,8a5adec2-DeepLog-16384-20230724-175216 \
    -m workspace/models/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/ab852d2e-CNN,8ad0b317-Logsy,8a5adec2-DeepLog \
    --dataset HDFS \
    -b 5 \
    -s 30 \
    --recursive 1 \
    -t data/HDFS.log-par/HDFS_templates.csv \
    > nohup.out &
```

This command purges logs and retrains/evaluates model pairs at purge ratios 5%, 10%, 15%, 20%, 25%, and 30%.
Key parameters:

- `-f, --file`: structured log file to purge.
- `-r, --ranking`: ranking result directory (single) or comma-separated directories (multiple).
- `-m, --model`: corresponding base model directory (single) or comma-separated directories.
- `-s, --separator`: purge separator value. In the default percentage mode, it is treated as a purge ratio (%).
- `-b, --bottom`: lower-bound separator in recursive mode.
- `--recursive`: enable recursive purging from `bottom` to `separator`.
- `-t, --template`: template file used for preprocessing purged data.
- `--dataset`: dataset type (`HDFS`/`Raptor`).
- `--percentage`: whether the separator is a percentage (default `True`).
- `--reversed`: reverse the purge direction.
- `--both`: execute both directions.
- `-g, --grouped`: enable the grouped random purge policy.
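Conceptually, purging drops the log lines whose templates fall in the least-relevant fraction of the ranking. A minimal sketch under that assumption (function and field names here are hypothetical, not `purger.py`'s API):

```python
def purge(structured_logs, ranking, ratio=0.30):
    """Drop log lines whose template sits in the bottom `ratio`
    (least relevant part) of the relevance ranking."""
    n_drop = int(len(ranking) * ratio)
    low_relevance = set(ranking[-n_drop:]) if n_drop else set()
    return [line for line in structured_logs
            if line["template"] not in low_relevance]

logs = [{"template": t} for t in [1, 1, 2, 3, 4, 5, 5, 5]]
ranking = [1, 2, 3, 4, 5]           # most -> least relevant
kept = purge(logs, ranking, 0.30)   # bottom 30% of 5 templates -> drops template 5
```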
You can evaluate a specified model on a given dataset as follows:
```shell
python -m epurger evl \
    -m workspace/models/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/31b7ce21-Logsy \
    -d data/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/
```

Key parameters:

- `-m, --model`: model directory to evaluate.
- `-d, --data`: dataset directory for evaluation.
- `-k, --topk`: top-k for next-event-based evaluation logic.
- `--directory`: if set, evaluate a directory of models instead of a single model.
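The reported Recall and F1 follow the standard binary definitions with anomalies as the positive class. A self-contained sketch of the metric computation (illustrative, not `trainer.py`'s code):

```python
def prf1(y_true, y_pred):
    """Precision, recall, and F1 for binary anomaly labels (1 = anomaly)."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# toy labels: 2 true positives, 1 false positive, 1 false negative
p, r, f = prf1([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```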
`epurger` is used to explain log-based anomaly detection models and mine feature significance.

Main subcommands:

- `parse (par)`
- `preprocess (pre)`
- `train (tra)` / `trains (tras)`
- `explain (exp)` / `explains (exps)`
- `rank (rnk)`
- `purge (prg)`
- `purgeDraw (prgD)` / `purgeDraws (prgDs)`
- `evaluate (evl)`
Use the command below for complete options and latest argument details:
```shell
python -m epurger -h
```