
xLoad

Explainable Valuation of Log Data for Deep Learning Based Anomaly Detection

Figure 1. End-to-end xLoad workflow.
Figure 2. Aggregation of SHAP values for template-level relevance.
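The aggregation idea behind Figure 2 can be illustrated with a minimal sketch (not the repository's implementation): per-occurrence SHAP attributions are grouped by template ID and averaged on absolute value, yielding a template-level relevance ranking.

```python
from collections import defaultdict

def template_relevance(shap_values, template_ids):
    """Aggregate per-event SHAP values into template-level relevance.

    shap_values:  one SHAP attribution per log event occurrence
    template_ids: the template ID of each corresponding occurrence
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, tid in zip(shap_values, template_ids):
        sums[tid] += abs(value)   # magnitude matters, not sign
        counts[tid] += 1
    # mean absolute SHAP per template, ranked high-to-low
    relevance = {tid: sums[tid] / counts[tid] for tid in sums}
    return sorted(relevance.items(), key=lambda kv: kv[1], reverse=True)

ranking = template_relevance(
    shap_values=[0.9, -0.8, 0.05, 0.1, 0.02],
    template_ids=["E5", "E5", "E22", "E5", "E22"],
)
# template E5 (mean |SHAP| 0.6) ranks above E22 (0.035)
```

Templates at the bottom of such a ranking are the candidates for purging in the later steps.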

This repository provides the end-to-end pipeline to:

  • parse raw logs into templates,
  • train DL-based log anomaly detectors,
  • explain model inputs with SHAP,
  • rank template relevance,
  • distill (purge) low-relevance logs,
  • re-train and evaluate the resulting models.

Artifact Notes

  • The open dataset (HDFS) can be downloaded from LogHub.
  • The proprietary production dataset (Raptor) is not publicly released.

Why xLoad Is Stronger Than Random Truncation

  1. Keeps detection quality with less data
    Across representative DL models and two large-scale datasets, up to roughly 30% of low-relevance logs can be removed while anomaly detection metrics (Accuracy / Recall / F1) remain largely stable.

  2. Clearly better than random truncation
    At the same truncation ratio, relevance-based truncation (xLoad) preserves F1 and Recall better than random truncation: the paper reports a clear metric decline under random truncation, while xLoad remains robust.

  3. Substantial training-time reduction
    With around 30% truncation, training-time reduction is often around 50% (model- and dataset-dependent).
    This benefit is shown consistently in the training-time table.

  4. Good temporal durability
    Models trained on distilled logs can stay effective over later time slices, reducing re-training frequency in practical settings.

Figure 3. HDFS template relevance patterns across models.
Figure 4. Raptor top/bottom relevance ranking comparison.

Metrics

Performance curves under truncation (xLoad relevance-based vs. random truncation)

Figure 5. HDFS / DeepLog.
Figure 6. HDFS / CNN.
Figure 7. HDFS / LogRobust.
Figure 8. HDFS / Logsy.
Figure 9. Raptor / DeepLog.
Figure 10. Raptor / CNN.
Figure 11. Raptor / LogRobust.
Figure 12. Raptor / Logsy.
Figure 13. Legend used in the performance metric figures.

Durability curves over time (model stability after distillation)

Figure 14. Durability (30% truncation) / DeepLog.
Figure 15. Durability (30% truncation) / CNN.
Figure 16. Durability (30% truncation) / LogRobust.
Figure 17. Durability (30% truncation) / Logsy.
Figure 18. Durability (20% truncation) / DeepLog.
Figure 19. Durability (20% truncation) / CNN.
Figure 20. Durability (20% truncation) / LogRobust.
Figure 21. Durability (20% truncation) / Logsy.
Figure 22. Legend used in the durability figures.

Project Structure

.
|-- README.md
|-- conf
|   |-- config.yaml        # Main xLoad settings
|   |-- drain3.ini         # Template mining settings (HDFS example)
|   `-- log.yaml           # Logging config
|-- epurger
|   |-- __main__.py        # CLI entry
|   |-- parser.py          # Template mining / parsing
|   |-- preprocessor.py    # Windowing, feature preparation, train/test split
|   |-- trainer.py         # Model training / evaluation
|   |-- explainer.py       # SHAP explanation
|   |-- purger.py          # Relevance-based log purging
|   |-- figure.py          # Figure generation utilities
|   `-- models             # DeepLog, LogRobust, CNN, Logsy implementations
`-- requirements.txt

Environment

  • Python 3.10 is recommended.
  • A Conda environment is recommended.
  • CPU-only execution is supported, but an NVIDIA GPU with CUDA is strongly recommended for performance.
  • Runs on Windows, macOS, and Linux.

Install dependencies:

pip install -r requirements.txt

For LogRobust, download pretrained FastText vectors and place them in data/pretrain.

Note: LogRobust can require very large RAM (about 50 GB free RAM on HDFS in our practice).
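FastText commonly ships vectors in the plain-text .vec format: a header line with the vocabulary size and dimensionality, then one word followed by its floats per line. A short illustrative reader (not the repository's loader; in practice a library such as gensim is typically used):

```python
import io

def load_vec(fileobj, limit=None):
    """Read word vectors in FastText .vec text format:
    first line "<count> <dim>", then "<word> f1 f2 ... fdim" per line."""
    header = fileobj.readline().split()
    count, dim = int(header[0]), int(header[1])
    vectors = {}
    for i, line in enumerate(fileobj):
        if limit is not None and i >= limit:
            break  # loading only the top of a large vocabulary saves RAM
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors

# tiny in-memory example standing in for a real .vec file
sample = io.StringIO("2 3\nblock 0.1 0.2 0.3\nreceived -0.5 0.0 0.25\n")
dim, vecs = load_vec(sample)
```

Capping the vocabulary with `limit` is one practical way to reduce the memory pressure noted above.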

Reproduction Workflow (HDFS)

Example in HDFS dataset

Parse(par): Get structured logs

First, download HDFS.log (about 1.5 GB) from LogHub.

python -m epurger par -s HDFS -i data/HDFS.log

Key parameters:

  • -s, --dataset: dataset type (HDFS or Raptor).
  • -i, --input: raw log file/directory path.
  • -o, --output: optional output directory for parsed logs. If omitted, default is <input>-par.
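Template mining in par is done with Drain3 (configured via conf/drain3.ini). Conceptually, mining replaces variable fields with a wildcard so that structurally identical lines collapse into one template. A toy sketch of that idea, far simpler than Drain, that only masks numeric-looking tokens:

```python
import re

def to_template(line):
    """Toy template extraction: mask block IDs, IPs, and integers with
    the <*> wildcard. Drain builds templates far more robustly."""
    line = re.sub(r"blk_-?\d+", "<*>", line)                  # HDFS block IDs
    line = re.sub(r"\d+\.\d+\.\d+\.\d+(:\d+)?", "<*>", line)  # IP[:port]
    line = re.sub(r"\b\d+\b", "<*>", line)                    # bare integers
    return line

logs = [
    "Received block blk_3587508140051953248 of size 67108864 from 10.251.42.84",
    "Received block blk_-1608999687919862906 of size 91178 from 10.250.10.6",
]
templates = {to_template(l) for l in logs}
# both raw lines collapse into the single template
# "Received block <*> of size <*> from <*>"
```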

Preprocess(pre): Preprocess data and split train/test dataset

You need anomaly_label.csv (also from LogHub) to build labeled structured data.

python -m epurger pre -s HDFS -f data/HDFS.log-par/HDFS_structured.csv -l data/anomaly_label.csv

After this step, you will get a preprocessed dataset such as: data/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/

Key parameters:

  • -f, --file: structured log csv generated by par.
  • -l, --label: anomaly label file.
  • -tr, --test-ratio: test split ratio (default 0.2).
  • -a, --anomaly-ratio: anomaly ratio in training set (default 0.0).
  • -ta, --testAnomalyRatio: anomaly ratio in test set (default 1.0).
  • -t, --template: optional template file for filtering/preprocess.
  • --threshold: template filtering threshold.
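For HDFS, preprocessing groups events into per-block sessions and labels each session from anomaly_label.csv. A simplified sketch of the grouping and a deterministic train/test split (hypothetical field layout, not the repository's code; the real pipeline shuffles and also controls anomaly ratios via -a / -ta):

```python
import re
from collections import defaultdict

def build_sessions(structured_rows):
    """Group event IDs into per-block sessions (HDFS style).
    structured_rows: (log content, event/template ID) pairs."""
    sessions = defaultdict(list)
    for content, event_id in structured_rows:
        match = re.search(r"blk_-?\d+", content)
        if match:
            sessions[match.group()].append(event_id)
    return dict(sessions)

def split_sessions(session_ids, test_ratio=0.2):
    """Deterministic tail split by session count."""
    n_test = int(len(session_ids) * test_ratio)
    cut = len(session_ids) - n_test
    return session_ids[:cut], session_ids[cut:]

rows = [
    ("Receiving block blk_1 src ...", "E5"),
    ("Receiving block blk_2 src ...", "E5"),
    ("PacketResponder for blk_1 terminating", "E9"),
]
sessions = build_sessions(rows)
train, test = split_sessions(sorted(sessions), test_ratio=0.5)
```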

Train(tra) and Explain(exp/exps): Train and explain models

When -t (a template file path) is provided to tras, trained models are explained automatically. Generating SHAP values itself does not require a template file, but generating ranking files does.

python -m epurger tras \
  -m CNN_Logsy_DeepLog_RobustLog \
  -d data/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/ \
  -t data/HDFS.log-par/HDFS_templates.csv

Outputs:

  • models: workspace/models/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/
  • SHAP results: workspace/shap/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/

Key parameters:

  • -m, --model: model list joined by _, for example CNN_Logsy_DeepLog_RobustLog.
  • -d, --data: preprocessed dataset directory.
  • -e, --epochs: training epochs (default follows internal config).
  • -t, --template: template file path; used for ranking generation after explanation.
  • -l, --limit: max data size for explanation stage.

Purge(prg): Purge logs then train and evaluate models

Because paths differ across environments, here is a practical multi-model example. Prepare a copy of anomaly_label.csv in data/HDFS.log-par/ before purging.

nohup python -m epurger prg \
  -f data/HDFS.log-par/HDFS_structured.csv \
  -r workspace/shap/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/ab852d2e-CNN-16384-20230724-175732,8ad0b317-Logsy-16384-20230724-184401,8a5adec2-DeepLog-16384-20230724-175216 \
  -m workspace/models/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/ab852d2e-CNN,8ad0b317-Logsy,8a5adec2-DeepLog \
  --dataset HDFS \
  -b 5 \
  -s 30 \
  --recursive 1 \
  -t data/HDFS.log-par/HDFS_templates.csv \
  > nohup.out &

This command purges logs and retrains/evaluates model pairs at purge ratios 5%, 10%, 15%, 20%, 25%, 30%.

Key parameters:

  • -f, --file: structured log file to purge.
  • -r, --ranking: ranking result directory (single) or comma-separated directories (multiple).
  • -m, --model: corresponding base model directory (single) or comma-separated directories.
  • -s, --separator: purge separator value. With default percentage mode, it is treated as a purge ratio (%).
  • -b, --bottom: lower bound separator in recursive mode.
  • --recursive: enable recursive purge from bottom to separator.
  • -t, --template: template file used for preprocessing purged data.
  • --dataset: dataset type (HDFS/Raptor).
  • --percentage: whether separator is percentage (default True).
  • --reversed: reverse purge direction.
  • --both: execute both directions.
  • -g, --grouped: enable grouped random purge policy.

Evaluation(evl): Evaluate models

You can evaluate a specified model on a given dataset as follows:

python -m epurger evl \
  -m workspace/models/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/31b7ce21-Logsy \
  -d data/HDFS.log-par/HDFS_structured_test_0.2_anomaly_0.0_0/

Key parameters:

  • -m, --model: model directory to evaluate.
  • -d, --data: dataset directory for evaluation.
  • -k, --topk: top-k for next-event based evaluation logic.
  • --directory: if set, evaluate a directory of models instead of a single model.
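The -k option applies to DeepLog-style next-event evaluation: a sequence is flagged anomalous when the observed next event falls outside the model's k most probable predictions. A minimal sketch with a hypothetical prediction format:

```python
def is_anomalous(predicted_scores, actual_event, k):
    """DeepLog-style check: anomaly if the actual next event is
    not among the k most probable predicted events."""
    topk = sorted(predicted_scores, key=predicted_scores.get, reverse=True)[:k]
    return actual_event not in topk

# hypothetical next-event probabilities from a trained model
scores = {"E5": 0.55, "E9": 0.30, "E22": 0.10, "E11": 0.05}
normal = is_anomalous(scores, "E9", k=2)    # False: E9 is in the top-2
anomaly = is_anomalous(scores, "E22", k=2)  # True: E22 is outside the top-2
```

A larger k makes the check more permissive, trading recall for precision.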

Command Line Manual Examples

epurger is used to explain log-based anomaly detection models and mine feature significance.

Main subcommands:

  • parse (par)
  • preprocess (pre)
  • train (tra) / trains (tras)
  • explain (exp) / explains (exps)
  • rank (rnk)
  • purge (prg)
  • purgeDraw (prgD) / purgeDraws (prgDs)
  • evaluate (evl)

Use the command below for complete options and latest argument details:

python -m epurger -h
