MD-ViSCo is a unified deep learning framework for converting vital sign waveforms (ECG, PPG, ABP) using a single model. It combines a 1D U-Net with a Swin Transformer using AdaIN for waveform style adaptation, and integrates patient demographic information via text embeddings for enhanced predictions.
Published in IEEE Journal of Biomedical and Health Informatics (2026).
- Features
- Requirements
- Installation
- Tutorials
- How-To Guides
- Reference
- Explanation
- Contributing
- Citation
- License
- Acknowledgments
## Features

- Multi-directional conversion between ECG, PPG, ABP, and IMP waveforms
- Unified model architecture for all conversion tasks
- Demographic integration for enhanced predictions (PulseDB only)
- Multiple baseline implementations (MD-ViSCo, NABNet, PatchTST, PPG2ABP, P2E-WGAN, WaveNet)
- Blood pressure estimation from PPG/ECG signals
- Feature extraction capabilities for physiological analysis
- Atrial fibrillation classification from ECG using MIMIC PERform AF dataset
## Requirements

- Python 3.10
- CUDA 12.1+ (optional, for GPU support)
- Conda or Miniconda
## Installation

1. Clone the repository:

   ```shell
   git clone https://github.com/fr-meyer/MD-ViSCo
   cd MD-ViSCo
   ```

2. Create the conda environment:

   ```shell
   conda env create -f environment.yml
   ```

3. Activate the environment:

   ```shell
   conda activate mdvisco
   ```

4. Verify the installation.

   Check CUDA/GPU support (optional):

   ```shell
   python -c "import torch; print(torch.cuda.is_available())"
   ```

   For CPU-only training, unset `CUDA_VISIBLE_DEVICES` and use `--nproc_per_node=1`.

   Verify dataset paths:

   - Check `dataset_path`, `dataset_folder`, and `file_name` in the `train_dataset`/`test_dataset` configs under `src/conf/`
   - Ensure preprocessing was run and the output file exists at the configured location
## Tutorials

This section provides a minimal end-to-end flow to get you from raw data to a trained and evaluated model. For more detailed and task-specific instructions, see How-To Guides.
Prepare the dataset once by converting the raw files into the HDF5 format expected by the training and evaluation pipelines:
```shell
# PulseDB: preprocessor=pulsedb_preprocessing, input_file=path/to/PulseDB_data.mat
# UCI: preprocessor=uci_preprocessing, input_file=path/to/UCI_data.h5
torchrun_unused= # (not needed here)
python -m src.script.preprocess.preprocess \
    preprocessor=pulsedb_preprocessing \
    preprocessor.input_file=path/to/PulseDB_data.mat \
    preprocessor.output_file=path/to/output/directory/processed_pulsedb.h5
```

Train a baseline model on PulseDB. You can later swap `train_dataset`/`test_dataset`, `model`, or `trainer` for other experiments:
```shell
# Single GPU: --nproc_per_node=1
# Multi-GPU: --nproc_per_node=2 (or the number of GPUs)
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    train_dataset=train_pulsedb \
    test_dataset=test_pulsedb \
    model=patchtst \
    trainer=approximation_trainer_patchtst
```

Run evaluation on the test split using the saved checkpoint:
```shell
torchrun --standalone --nproc_per_node=1 --module src.test -m \
    evaluator=waveform_reconstruction_evaluator \
    test_dataset=test_pulsedb \
    model=patchtst \
    evaluator.checkpoint_epoch=100
```

## How-To Guides

This section focuses on task-oriented recipes: pick the guide that matches the problem you want to solve.
Train models to convert between waveforms (e.g., PPG→ABP, ECG→PPG):
```shell
# MD-ViSCo: model=mdvisco_approximation, trainer=approximation_trainer_mdvisco
# PatchTST: model=patchtst, trainer=approximation_trainer_patchtst
# NABNet: model=nabnet_approximation, trainer=approximation_trainer_nabnet
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    train_dataset=train_pulsedb \
    test_dataset=test_pulsedb \
    model=patchtst \
    trainer=approximation_trainer_patchtst
```

Train models to predict BP scalars (SBP/DBP) from waveforms:
```shell
# PPG→BP: trainer.directions=ppg2bp
# ECG→BP: trainer.directions=ecg2bp
# PPG+ECG→BP: trainer.directions=ppg2bp_ecg2bp, trainer=refinement_trainer_mdvisco
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    trainer=refinement_trainer_nabnet \
    trainer.directions=ppg2bp \
    train_dataset=train_pulsedb
```

Train AF classification models on the MIMIC PERform AF dataset:
```shell
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    trainer=classification_trainer \
    train_dataset=train_mimic_perform_af_1024
```

Evaluate trained models on test datasets:
```shell
# Waveform reconstruction: evaluator=waveform_reconstruction_evaluator
# Blood pressure: evaluator=blood_pressure_evaluator, test_dataset=test_pulsedb_refinement_bp
torchrun --standalone --nproc_per_node=1 --module src.test -m \
    evaluator=waveform_reconstruction_evaluator \
    test_dataset=test_pulsedb \
    model=patchtst \
    evaluator.checkpoint_epoch=100
```

Common overrides: `evaluator.direction_mode=single`, `evaluator.input_preprocessing.source.vital=ppg`, `model@evaluator.model=...`
Note: Checkpoint paths include training parameters (`batch_size`, `num_epochs`, `learning_rate`, `seed`). When evaluating, these must match the training configuration used to create the checkpoint.
Demographic information (age, gender, height, weight, BMI) is available only for the PulseDB dataset and can improve predictions:

```shell
# With demographics: model.use_demographics=true, model.num_demographic_channels=5
# Without demographics: model.use_demographics=false
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    model=patchtst \
    model.use_demographics=true \
    model.num_demographic_channels=5 \
    trainer=refinement_trainer_patchtst \
    train_dataset=train_pulsedb_refinement_bp
```

Extract physiological features from waveforms:
```shell
# ECG features: processor=waveform_processor_ecg_features, evaluator.directions=ppg2ecg
# PPG features: processor=waveform_processor_ppg_features, evaluator.directions=ecg2ppg
torchrun --standalone --nproc_per_node=1 --module src.test -m \
    evaluator=feature_extraction_evaluator \
    processor=waveform_processor_ecg_features \
    test_dataset=test_uci \
    evaluator.directions=ppg2ecg \
    evaluator.checkpoint_epoch=100
```
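The actual feature extractors build on pyPPG and NeuroKit2 (see Acknowledgments); as a self-contained stand-in, a basic physiological feature such as heart rate can be derived from detected R-peaks with plain SciPy:

```python
import numpy as np
from scipy.signal import find_peaks

fs = 125  # Hz; an illustrative sampling rate, not the project's setting
t = np.arange(0, 10, 1 / fs)
# Synthetic "ECG": sharp periodic spikes at 60 bpm standing in for R-peaks
ecg = np.exp(-((t % 1.0 - 0.5) ** 2) / 0.001)

# R-peak detection via simple thresholded peak finding
peaks, _ = find_peaks(ecg, height=0.5, distance=int(0.4 * fs))
rr_intervals = np.diff(peaks) / fs          # seconds between beats
heart_rate = 60.0 / rr_intervals.mean()     # beats per minute
print(f"{len(peaks)} beats, HR ~ {heart_rate:.1f} bpm")
```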
Then run the feature analysis:

```shell
python -m src.script.features.feature_analysis \
    feature_analysis.gt_features_file=path/to/features/DATASET/ground_truth/seed_SEED/features_DIRECTION.h5 \
    feature_analysis.model_name=mdvisco \
    feature_analysis.seed=42 \
    feature_analysis.direction=PPG2ECG \
    feature_analysis.dataset_name=PulseDB
```

Use CLI overrides for quick parameter tuning:
```shell
torchrun --standalone --nproc_per_node=1 --module src.train -m \
    train_dataset=train_uci \
    test_dataset=test_uci \
    model=patchtst \
    model.d_model=256 \
    model.num_encoder_layers=6 \
    trainer=approximation_trainer_patchtst \
    trainer.optimizer.lr=0.0005 \
    trainer.num_epochs=150
```

## Reference

MD-ViSCo uses Hydra with ConfigStore for type-safe, composable configuration management. Configurations in `src/conf/` are organized by concern: `model/`, `processor/`, `train_dataset/`, `test_dataset/`, `trainer/`, `criterion/`, `optimizer/`, `scheduler/`, `early_stopping/`, `directions/`.
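As a loose standard-library sketch of the idea behind structured configs and `key=value` overrides (Hydra's actual ConfigStore and override grammar are far richer; the field names and defaults here are illustrative):

```python
from dataclasses import dataclass, fields, replace

@dataclass(frozen=True)
class PatchTSTConfig:
    # Field names mirror the CLI overrides above; defaults are illustrative.
    d_model: int = 128
    num_encoder_layers: int = 4

def apply_overrides(cfg, overrides):
    """Apply Hydra-style `field=value` override strings to a structured
    config, casting each value to the declared field type."""
    types = {f.name: f.type for f in fields(cfg)}
    updates = {}
    for item in overrides:
        key, _, raw = item.partition("=")
        if key not in types:
            raise KeyError(f"unknown config field: {key}")
        updates[key] = types[key](raw)  # e.g. int("256") -> 256
    return replace(cfg, **updates)

tuned = apply_overrides(PatchTSTConfig(), ["d_model=256", "num_encoder_layers=6"])
print(tuned)
```

Because the config is a typed dataclass, a misspelled override or an unparseable value fails immediately rather than silently producing a wrong model.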
Common configuration errors:

- InterpolationKeyError: a referenced key like `${train_dataset.input_size}` cannot be resolved. Solution: always specify `train_dataset` and `test_dataset` when training.
- Shape mismatch errors: occur when `input_length` or `num_targets` are manually overridden. Solution: let Hydra interpolation fill these values automatically from the dataset configs; don't manually set `model.input_length` or `model.num_targets`.
- MD-ViSCo (proposed): Unified model combining a 1D U-Net with a Swin Transformer and AdaIN for multi-directional vital sign waveform conversion. Based on: Swin Transformer | Swin-Unet | PatchTST Time Series | PatchTST BP Estimation
- NABNet: Baseline model for vital sign conversion. Note: on MIMIC PERform Large, NABNet requires the overrides `model.model_depth=5`, `model.model_width=32`, `model.attention_type=lstm`. Paper | Code
- PatchTST: Time series transformer baseline with optional demographic fusion. Paper | PatchTST BP Estimation | Code | Docs
- PPG2ABP: Baseline models (UNetDS64, MultiResUNet1D). Paper | Code
- P2E-WGAN: Generative adversarial network baseline. Paper | Code
- WaveNet: WaveNet architecture for waveform generation. Paper | Code
The same model config can be reused across datasets via Hydra variable interpolation. Changing `train_dataset` automatically adapts `input_length`, BP normalization bounds, and the processor configuration.
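A minimal illustration of how such `${...}` interpolation behaves (a toy resolver written for this example, not OmegaConf's):

```python
import re

def resolve(value: str, root: dict) -> str:
    """Resolve `${a.b}` references in a string against a nested dict config."""
    def lookup(match: re.Match) -> str:
        node = root
        for part in match.group(1).split("."):
            node = node[part]
        return str(node)
    return re.sub(r"\$\{([^}]+)\}", lookup, value)

config = {
    "train_dataset": {"name": "train_pulsedb", "input_length": 1280},
    "model": {"input_length": "${train_dataset.input_length}"},
}
print(resolve(config["model"]["input_length"], config))  # "1280"

# Swapping the dataset config automatically changes the resolved model input.
config["train_dataset"] = {"name": "train_uci", "input_length": 1024}
print(resolve(config["model"]["input_length"], config))  # "1024"
```

This is why the troubleshooting advice above says to let interpolation supply `model.input_length` rather than overriding it by hand.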
- PulseDB: Large, cleaned dataset based on MIMIC-III and VitalDB for benchmarking cuff-less blood pressure estimation methods. Includes demographics (`age_raw`, `gender_raw`, `height_raw`, `weight_raw`, `bmi_raw`). Input length: 1280; BP bounds: `dbp_min=2.34`, `sbp_max=286.58`. Paper | Repository
- UCI: Cuff-Less Blood Pressure Estimation dataset from the UCI Machine Learning Repository. Input length: 1024; BP bounds: `dbp_min=50.0`, `sbp_max≈189.98`. Repository | Preprocessing
- MIMIC PERform AF: Dataset for atrial fibrillation classification tasks. Contains the AF labels required for AF classifier models. Paper | Repository
- MIMIC PERform Large: Large-scale dataset for vital sign analysis. Note: lacks ABP waveforms, so refinement models (which require ABP for BP prediction) cannot be used; approximation models can convert between ECG, PPG, and IMP waveforms. Paper | Repository
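The BP bounds listed for each dataset are min-max normalization constants; a sketch of how such bounds are typically applied (the repository's exact scaling code may differ):

```python
# PulseDB min-max bounds from the dataset list above (illustrative usage)
DBP_MIN, SBP_MAX = 2.34, 286.58

def normalize_bp(value_mmhg: float) -> float:
    """Scale a pressure in mmHg to [0, 1] using the dataset-wide bounds."""
    return (value_mmhg - DBP_MIN) / (SBP_MAX - DBP_MIN)

def denormalize_bp(value: float) -> float:
    """Invert the scaling: map [0, 1] back to mmHg."""
    return value * (SBP_MAX - DBP_MIN) + DBP_MIN

x = normalize_bp(120.0)   # a typical systolic value
print(round(x, 4))
```

Since the bounds come from the dataset config, swapping `train_dataset` changes the normalization automatically via interpolation.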
## Explanation

MD-ViSCo combines:
- 1D U-Net: Encoder-decoder architecture for waveform processing
- Swin Transformer: Hierarchical vision transformer for feature extraction
- AdaIN: Adaptive instance normalization for waveform style adaptation
- Text Embeddings: Patient demographic information integration (PulseDB only)
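A conceptual sketch of 1D AdaIN (illustrative only, not the model's actual layers): the content features are re-normalized to carry the style features' per-channel mean and standard deviation:

```python
import numpy as np

def adain_1d(content: np.ndarray, style: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Adaptive instance normalization for (channels, length) feature maps:
    whiten the content features per channel, then re-scale and shift them
    to match the style features' per-channel statistics."""
    c_mean = content.mean(axis=-1, keepdims=True)
    c_std = content.std(axis=-1, keepdims=True)
    s_mean = style.mean(axis=-1, keepdims=True)
    s_std = style.std(axis=-1, keepdims=True)
    return s_std * (content - c_mean) / (c_std + eps) + s_mean

rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(4, 256))   # e.g. source-waveform features
style = rng.normal(2.0, 3.0, size=(4, 256))     # e.g. target-waveform statistics
out = adain_1d(content, style)
```

This statistic transfer is what lets one set of weights adapt its output "style" to different target waveform types.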
MD-ViSCo, PatchTST, NABNet:
- Stage 1: Produces normalized ABP waveform from source signals (PPG/ECG)
- Stage 2: Predicts SBP/DBP scalars for unscaling the normalized waveform to mmHg
- Checkpoint loading: Separate managers for stage1 (approximation) and stage2 (refinement)
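The stage-2 unscaling step can be sketched as a linear map from the normalized waveform to mmHg, anchored at the predicted DBP and SBP (an illustrative reading of the two-stage design, not the exact implementation):

```python
import numpy as np

def unscale_abp(norm_wave: np.ndarray, sbp: float, dbp: float) -> np.ndarray:
    """Map a [0, 1]-normalized ABP waveform to mmHg using predicted
    SBP/DBP scalars: 0 -> DBP (diastole), 1 -> SBP (systole)."""
    return dbp + norm_wave * (sbp - dbp)

t = np.linspace(0, 1, 128)
norm_wave = 0.5 * (1 - np.cos(2 * np.pi * t))  # toy normalized pulse in [0, 1]
abp = unscale_abp(norm_wave, sbp=118.0, dbp=76.0)
print(abp.min(), abp.max())
```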
PPG2ABP, WaveNet, P2E-WGAN:
- Direct ABP waveform output: No separate scaling step required
- Cascade architecture: Stage1 approximation → Stage2 refinement
- Checkpoint loading: Standard single-model manager
## Contributing

Before submitting changes, ensure your code passes formatting, linting, and type checks:

```shell
ruff format src/
ruff check src/ --fix
pyright src/
```

- Issues: Report bugs or request features via GitHub Issues
- Discussions: Ask questions and share ideas in GitHub Discussions
- Pull requests: Open a pull request against the appropriate branch. Ensure changes pass formatting and type checks
## Citation

If you use MD-ViSCo in your research, please cite:

```bibtex
@ARTICLE{11366001,
  author={Meyer, Franck and Hur, Kyunghoon and Choi, Edward},
  journal={IEEE Journal of Biomedical and Health Informatics},
  title={MD-ViSCo: A Unified Model for Multi-Directional Vital Sign Waveform Conversion},
  year={2026},
  volume={},
  number={},
  pages={1-15},
  doi={10.1109/JBHI.2025.3639315},
  ISSN={2168-2208},
  url={https://ieeexplore.ieee.org/document/11366001}
}
```

## License

This project is licensed under the MIT License. See the LICENSE file for the full text.
## Acknowledgments

- GenHPF Framework: The code architecture and organization are based on GenHPF: General Healthcare Predictive Framework for Multi-Task Multi-Source Learning (code)
- pyPPG: PPG feature extraction uses pyPPG: A Python toolbox for comprehensive photoplethysmography signal analysis (code)
- NeuroKit2: ECG feature extraction uses NeuroKit2: A Python toolbox for neurophysiological signal processing (code)