
MD-ViSCo: A Unified Model for Multi-Directional Vital Sign Waveform Conversion

Python 3.10+ PyTorch Hydra Config License: MIT

MD-ViSCo is a unified deep learning framework that converts vital sign waveforms (ECG, PPG, ABP) with a single model. It combines a 1D U-Net with a Swin Transformer, applies AdaIN for waveform style adaptation, and integrates patient demographic information via text embeddings for enhanced predictions.

Published in IEEE Journal of Biomedical and Health Informatics (2026).

Features

  • Multi-directional conversion between ECG, PPG, ABP, and IMP waveforms
  • Unified model architecture for all conversion tasks
  • Demographic integration for enhanced predictions (PulseDB only)
  • Multiple baseline implementations (MD-ViSCo, NABNet, PatchTST, PPG2ABP, P2E-WGAN, WaveNet)
  • Blood pressure estimation from PPG/ECG signals
  • Feature extraction capabilities for physiological analysis
  • Atrial fibrillation classification from ECG using MIMIC PERform AF dataset

Requirements

  • Python 3.10
  • CUDA 12.1+ (optional, for GPU support)
  • Conda or Miniconda

Installation

  1. Clone the repository:

    git clone https://github.com/fr-meyer/MD-ViSCo
    cd MD-ViSCo
  2. Create conda environment:

    conda env create -f environment.yml
  3. Activate environment:

    conda activate mdvisco
  4. Verify installation:

    Check CUDA/GPU support (optional):

    python -c "import torch; print(torch.cuda.is_available())"

    For CPU-only training, set CUDA_VISIBLE_DEVICES to an empty string (CUDA_VISIBLE_DEVICES="") and use --nproc_per_node=1.

    Verify dataset paths:

    • Check dataset_path, dataset_folder, and file_name in train_dataset / test_dataset configs under src/conf/
    • Ensure preprocessing was run and the output file exists at the configured location

Tutorials

This section provides a minimal end-to-end flow to get you from raw data to a trained and evaluated model. For more detailed and task-specific instructions, see How-To Guides.

Quick Start

1. Preprocess Dataset

Prepare the dataset once by converting the raw files into the HDF5 format expected by the training and evaluation pipelines:

# PulseDB: preprocessor=pulsedb_preprocessing, input_file=path/to/PulseDB_data.mat
# UCI: preprocessor=uci_preprocessing, input_file=path/to/UCI_data.h5
python -m src.script.preprocess.preprocess \
    preprocessor=pulsedb_preprocessing \
    preprocessor.input_file=path/to/PulseDB_data.mat \
    preprocessor.output_file=path/to/output/directory/processed_pulsedb.h5

2. Train a Model

Train a baseline model on PulseDB. You can later swap train_dataset / test_dataset, model, or trainer for other experiments:

# Single GPU: --nproc_per_node=1
# Multi-GPU: --nproc_per_node=2 (or number of GPUs)
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    train_dataset=train_pulsedb \
    test_dataset=test_pulsedb \
    model=patchtst \
    trainer=approximation_trainer_patchtst

3. Evaluate

Run evaluation on the test split using the saved checkpoint.

torchrun --standalone --nproc_per_node=1 --module src.test -m \
    evaluator=waveform_reconstruction_evaluator \
    test_dataset=test_pulsedb \
    model=patchtst \
    evaluator.checkpoint_epoch=100

How-To Guides

This section focuses on task-oriented recipes: pick the guide that matches the problem you want to solve.

Training Models

Waveform Reconstruction

Train models to convert between waveforms (e.g., PPG→ABP, ECG→PPG):

# MD-ViSCo: model=mdvisco_approximation, trainer=approximation_trainer_mdvisco
# PatchTST: model=patchtst, trainer=approximation_trainer_patchtst
# NABNet: model=nabnet_approximation, trainer=approximation_trainer_nabnet
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    train_dataset=train_pulsedb \
    test_dataset=test_pulsedb \
    model=patchtst \
    trainer=approximation_trainer_patchtst

Blood Pressure Prediction

Train models to predict BP scalars (SBP/DBP) from waveforms:

# PPG→BP: trainer.directions=ppg2bp
# ECG→BP: trainer.directions=ecg2bp
# PPG+ECG→BP: trainer.directions=ppg2bp_ecg2bp, trainer=refinement_trainer_mdvisco
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    trainer=refinement_trainer_nabnet \
    trainer.directions=ppg2bp \
    train_dataset=train_pulsedb

AF Classification

Train AF classification models on the MIMIC PERform AF dataset:

torchrun --standalone --nproc_per_node=1 --module src.train -m \
    trainer=classification_trainer \
    train_dataset=train_mimic_perform_af_1024

Evaluating Models

Evaluate trained models on test datasets:

# Waveform reconstruction: evaluator=waveform_reconstruction_evaluator
# Blood pressure: evaluator=blood_pressure_evaluator, test_dataset=test_pulsedb_refinement_bp
torchrun --standalone --nproc_per_node=1 --module src.test -m \
    evaluator=waveform_reconstruction_evaluator \
    test_dataset=test_pulsedb \
    model=patchtst \
    evaluator.checkpoint_epoch=100

Common overrides: evaluator.direction_mode=single, evaluator.input_preprocessing.source.vital=ppg, model@evaluator.model=...

Note: Checkpoint paths include training parameters (batch_size, num_epochs, learning_rate, seed). When evaluating, these must match the training configuration used to create the checkpoint.
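To make the note above concrete: the checkpoint location is effectively a function of those hyperparameters, so changing any one of them at evaluation time points the evaluator at a directory that was never created. The naming scheme below is hypothetical, for illustration only; the repository's actual layout is defined by its configs:

```python
def checkpoint_dir(model, batch_size, num_epochs, lr, seed):
    """Illustrative only: build a checkpoint directory name from the
    training hyperparameters. A mismatch at evaluation time means the
    evaluator looks in a directory that does not exist."""
    return f"checkpoints/{model}/bs{batch_size}_ep{num_epochs}_lr{lr}_seed{seed}"

train_dir = checkpoint_dir("patchtst", 64, 100, 0.0005, 42)
eval_dir = checkpoint_dir("patchtst", 64, 100, 0.001, 42)  # lr differs
assert train_dir != eval_dir  # evaluation would fail to find the checkpoint
```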

Using Demographics

Demographic information (age, gender, height, weight, BMI) is available only for the PulseDB dataset and can improve predictions:

# With demographics: model.use_demographics=true, model.num_demographic_channels=5
# Without demographics: model.use_demographics=false
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    model=patchtst \
    model.use_demographics=true \
    model.num_demographic_channels=5 \
    trainer=refinement_trainer_patchtst \
    train_dataset=train_pulsedb_refinement_bp
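A common way to fuse per-patient scalars with a waveform is to broadcast each demographic value into a constant channel and concatenate it with the signal channels, which is consistent with num_demographic_channels=5 above. The sketch below illustrates the general technique in plain Python; it is an assumption for intuition, not MD-ViSCo's exact fusion code:

```python
def add_demographic_channels(waveform, demographics):
    """waveform: list of channels, each a list of samples.
    demographics: per-patient scalars (e.g. age, gender, height,
    weight, BMI), each broadcast to a constant channel."""
    length = len(waveform[0])
    return waveform + [[d] * length for d in demographics]

signal = [[0.1, 0.2, 0.3]]             # 1 signal channel, 3 samples
demo = [63.0, 1.0, 172.0, 70.0, 23.7]  # 5 demographic scalars
fused = add_demographic_channels(signal, demo)
assert len(fused) == 6                 # 1 signal + 5 demographic channels
```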

Feature Extraction

Extract physiological features from waveforms:

# ECG features: processor=waveform_processor_ecg_features, evaluator.directions=ppg2ecg
# PPG features: processor=waveform_processor_ppg_features, evaluator.directions=ecg2ppg
torchrun --standalone --nproc_per_node=1 --module src.test -m \
    evaluator=feature_extraction_evaluator \
    processor=waveform_processor_ecg_features \
    test_dataset=test_uci \
    evaluator.directions=ppg2ecg \
    evaluator.checkpoint_epoch=100

# Feature analysis
python -m src.script.features.feature_analysis \
    feature_analysis.gt_features_file=path/to/features/DATASET/ground_truth/seed_SEED/features_DIRECTION.h5 \
    feature_analysis.model_name=mdvisco \
    feature_analysis.seed=42 \
    feature_analysis.direction=PPG2ECG \
    feature_analysis.dataset_name=PulseDB

Custom Experiments

CLI Overrides (Quick Experiments)

Use CLI overrides for quick parameter tuning:

torchrun --standalone --nproc_per_node=1 --module src.train -m \
    train_dataset=train_uci \
    test_dataset=test_uci \
    model=patchtst \
    model.d_model=256 \
    model.num_encoder_layers=6 \
    trainer=approximation_trainer_patchtst \
    trainer.optimizer.lr=0.0005 \
    trainer.num_epochs=150

Configuration Structure

MD-ViSCo uses Hydra with ConfigStore for type-safe, composable configuration management. Configurations in src/conf/ are organized by concern: model/, processor/, train_dataset/, test_dataset/, trainer/, criterion/, optimizer/, scheduler/, early_stopping/, directions/.

Common Configuration Issues

  • InterpolationKeyError: A referenced key like ${train_dataset.input_size} cannot be resolved. Solution: Always specify train_dataset and test_dataset when training.
  • Shape Mismatch Errors: Occur when input_length or num_targets are manually overridden. Solution: Let Hydra interpolation handle these values automatically from dataset configs; don't manually set model.input_length or model.num_targets.
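To see why the first error occurs, consider a simplified stand-in for Hydra/OmegaConf ${...} resolution (a hypothetical resolver, not Hydra's actual code): if the train_dataset group was never composed into the config, the referenced key simply does not exist, and resolution fails.

```python
import re

def resolve(value, cfg):
    """Resolve ${dotted.key} references against a nested dict,
    loosely mimicking OmegaConf interpolation."""
    def lookup(match):
        node = cfg
        for part in match.group(1).split("."):
            if part not in node:
                raise KeyError(f"InterpolationKeyError: {match.group(1)}")
            node = node[part]
        return str(node)
    return re.sub(r"\$\{([^}]+)\}", lookup, value)

cfg = {"train_dataset": {"input_size": 1280}}
print(resolve("input_length=${train_dataset.input_size}", cfg))  # input_length=1280
# With an empty config (train_dataset never specified), the same
# reference raises, analogous to Hydra's InterpolationKeyError.
```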

Reference

Supported Models

  • MD-ViSCo (proposed): Unified model combining 1D U-Net with Swin Transformer and AdaIN for multi-directional vital sign waveform conversion. Based on: Swin Transformer | Swin-Unet | PatchTST Time Series | PatchTST BP Estimation

  • NABNet: Baseline model for vital sign conversion. Note: On MIMIC PERform Large, NABNet requires overrides: model.model_depth=5, model.model_width=32, model.attention_type=lstm. Paper | Code

  • PatchTST: Time series transformer baseline with optional demographic fusion. Paper | PatchTST BP Estimation | Code | Docs

  • PPG2ABP: Baseline models (UNetDS64, MultiResUNet1D). Paper | Code

  • P2E-WGAN: Generative adversarial network baseline. Paper | Code

  • WaveNet: WaveNet architecture for waveform generation. Paper | Code

Datasets

The same model config can be reused across datasets via Hydra variable interpolation. Changing train_dataset automatically adapts input_length, BP normalization bounds, and processor configuration.

  • PulseDB: Large, cleaned dataset based on MIMIC-III and VitalDB for benchmarking cuff-less blood pressure estimation methods. Includes demographics (age_raw, gender_raw, height_raw, weight_raw, bmi_raw). Input length: 1280, BP bounds: dbp_min=2.34, sbp_max=286.58. Paper | Repository

  • UCI: Cuff-Less Blood Pressure Estimation dataset from UCI Machine Learning Repository. Input length: 1024, BP bounds: dbp_min=50.0, sbp_max≈189.98. Repository | Preprocessing

  • MIMIC PERform AF: Dataset for atrial fibrillation classification tasks. Contains AF labels required for AF Classifier models. Paper | Repository

  • MIMIC PERform Large: Large-scale dataset for vital sign analysis. Note: Lacks ABP waveforms, so refinement models (which require ABP for BP prediction) cannot be used. Approximation models can convert between ECG, PPG, and IMP waveforms. Paper | Repository

Explanation

Architecture Overview

MD-ViSCo combines:

  • 1D U-Net: Encoder-decoder architecture for waveform processing
  • Swin Transformer: Hierarchical vision transformer for feature extraction
  • AdaIN: Adaptive instance normalization for waveform style adaptation
  • Text Embeddings: Patient demographic information integration (PulseDB only)
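For intuition on the AdaIN component: it re-normalizes content features to carry the statistics of a style input. The framework-free sketch below shows the standard AdaIN operation on 1D feature lists; it is illustrative only, not MD-ViSCo's implementation:

```python
from statistics import mean, pstdev

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization: whiten the content features,
    then re-scale and shift them with the style's mean and std."""
    c_mu, c_sigma = mean(content), pstdev(content)
    s_mu, s_sigma = mean(style), pstdev(style)
    return [s_sigma * (x - c_mu) / (c_sigma + eps) + s_mu for x in content]

# The output inherits the style's channel statistics.
out = adain([0.0, 1.0, 2.0], [10.0, 20.0, 30.0])
```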

Two-Stage vs Single-Stage Models

Two-Stage Scaling Models

MD-ViSCo, PatchTST, NABNet:

  • Stage 1: Produces normalized ABP waveform from source signals (PPG/ECG)
  • Stage 2: Predicts SBP/DBP scalars for unscaling the normalized waveform to mmHg
  • Checkpoint loading: Separate managers for stage1 (approximation) and stage2 (refinement)
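As a concrete illustration of Stage 2, the sketch below unscales a normalized ABP waveform using predicted SBP/DBP scalars. This is a simplified, hypothetical version of the unscaling step, assuming the Stage 1 output lies in [0, 1]:

```python
def unscale_abp(normalized, sbp, dbp):
    """Map a [0, 1]-normalized ABP waveform to mmHg using predicted
    SBP/DBP scalars (simplified illustration, not the repo's exact code)."""
    return [dbp + x * (sbp - dbp) for x in normalized]

# A normalized trough of 0.0 maps to DBP, a peak of 1.0 to SBP.
wave_mmhg = unscale_abp([0.0, 0.5, 1.0], sbp=120.0, dbp=80.0)
print(wave_mmhg)  # [80.0, 100.0, 120.0]
```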

Single-Stage Models

PPG2ABP, WaveNet, P2E-WGAN:

  • Direct ABP waveform output: No separate scaling step required
  • Cascade architecture: Stage1 approximation → Stage2 refinement
  • Checkpoint loading: Standard single-model manager

Contributing

Code Quality

Before submitting changes, ensure your code passes formatting, linting, and type checks:

ruff format src/
ruff check src/ --fix
pyright src/

How to Contribute

  • Issues: Report bugs or request features via GitHub Issues
  • Discussions: Ask questions and share ideas in GitHub Discussions
  • Pull requests: Open a pull request against the appropriate branch. Ensure changes pass formatting and type checks

Citation

If you use MD-ViSCo in your research, please cite:

@ARTICLE{11366001,
  author={Meyer, Franck and Hur, Kyunghoon and Choi, Edward},
  journal={IEEE Journal of Biomedical and Health Informatics},
  title={MD-ViSCo: A Unified Model for Multi-Directional Vital Sign Waveform Conversion},
  year={2026},
  volume={},
  number={},
  pages={1-15},
  doi={10.1109/JBHI.2025.3639315},
  ISSN={2168-2208},
  url={https://ieeexplore.ieee.org/document/11366001}
}

License

This project is licensed under the MIT License. See the LICENSE file for the full text.

Acknowledgments