ddsp2sc

Train a DDSP (Differentiable Digital Signal Processing) model on your own audio, export its learned spectral controls, and sonify / transform them in SuperCollider through a graphical control panel.

This repo bridges Google's DDSP training stack and a custom SuperCollider performance environment. DDSP learns how a sound is built (harmonics + filtered noise), Python extracts and reshapes that data, and SuperCollider plays it back with real-time pitch, stretch, bounce, and mix controls through a GUI.

What is DDSP?

DDSP (Engel et al., ICLR 2020) is a neural audio synthesis framework. Instead of generating raw waveforms directly, a small RNN predicts control parameters for classical DSP building blocks:

Component	Role
Harmonic oscillator bank	Up to 60 sinusoidal partials with time-varying frequency and amplitude
Filtered noise	65 STFT-like noise bands shaping the noisy / breathy / percussive part of the timbre
Reverb	Trainable room effect (solo-instrument model)

The model is trained on short audio chunks (default 4 s at 16 kHz). At inference, f0 and loudness are extracted from the input audio, the decoder predicts the harmonic and noise controls frame by frame, and the harmonic + noise processors resynthesize audio from those controls.

In this project, the solo-instrument architecture is defined in ginn/models/solo_instrument.gin:

RnnFcDecoder.output_splits = (('amps', 1),
                              ('harmonic_distribution', 60),
                              ('noise_magnitudes', 65))
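
To make the layout concrete, here is a minimal sketch of what these splits amount to: each frame yields 1 + 60 + 65 = 126 values, sliced into three named groups. The helper `split_decoder_output` is hypothetical and written only for illustration; it is not DDSP's own code.

```python
# Layout copied from the gin snippet above.
OUTPUT_SPLITS = (
    ("amps", 1),
    ("harmonic_distribution", 60),
    ("noise_magnitudes", 65),
)


def split_decoder_output(vector, splits=OUTPUT_SPLITS):
    """Slice one flat per-frame vector into named control groups.

    Hypothetical helper for illustration only.
    """
    expected = sum(size for _, size in splits)
    if len(vector) != expected:
        raise ValueError(f"expected {expected} values, got {len(vector)}")
    controls, start = {}, 0
    for name, size in splits:
        controls[name] = vector[start:start + size]
        start += size
    return controls


frame = [0.0] * 126  # placeholder values, not real model output
controls = split_decoder_output(frame)
print({name: len(values) for name, values in controls.items()})
# {'amps': 1, 'harmonic_distribution': 60, 'noise_magnitudes': 65}
```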

The goal is not only to reproduce training audio, but to extract the control data and use it as a flexible spectral instrument in SuperCollider.

Pipeline overview

source_audio/*.wav
       │
       ▼  [Docker + DDSP training]
training_out/          ← TensorFlow checkpoints
       │
       ▼  [2-chunks2envelopes.py]
envelopes/*.npz        ← per-chunk harmonic + noise control data (+ WAV previews)
       │
       ▼  [optional: enrich_spectrum.py, modify_envelopes.py]
envelopes/*_enriched.npz
       │
       ▼  [3-envelopes2csv.py]
csv_exports/*_unified_*.csv   ← canonical format for SuperCollider
       │
       ▼  [SuperCollider GUI app]
real-time sonification, pitch/stretch/bounce, visualization

DDSP training outputs (important)

After training and envelope extraction, the repo produces three layers of data. Understanding these formats is central to using the system.

1. Checkpoints — `training/training_out/`

TensorFlow model checkpoints and gin operative configs (operative_config-*.gin). Used only for re-extracting envelopes; not consumed by SuperCollider directly.

2. Envelope archives — `training/envelopes/*_envelopes.npz`

The primary numpy export from a trained checkpoint. Each chunk (4 s of source audio) becomes one .npz file plus companion WAVs.

.npz arrays:

Key	Shape	Description
`frequency_envelopes`	`[n_samples, 60]`	Per-harmonic instantaneous frequency (Hz), sample-rate resolution
`amplitude_envelopes`	`[n_samples, 60]`	Per-harmonic amplitude
`noise_magnitudes_frames`	`[1, n_frames, 65]`	Filtered-noise band energies at model frame rate
`f0_hz_frames`	`[n_frames]`	Fundamental frequency per frame
`sample_rate`	scalar	e.g. 16000
`frame_rate`	scalar	model control rate, e.g. ~250 Hz
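
As a rough illustration of how the two `[n_samples, 60]` arrays relate to sound, the sketch below turns per-sample frequency and amplitude envelopes into audio by phase accumulation. It is a plain-Python approximation for reading purposes, not the repo's or DDSP's synthesizer; `oscillator_bank` is a hypothetical name, and muting partials above Nyquist is an assumption.

```python
import math


def oscillator_bank(frequency_envelopes, amplitude_envelopes, sample_rate):
    """Additive synthesis by phase accumulation.

    Both inputs are [n_samples][n_partials] nested sequences. For arrays
    loaded with numpy, .tolist() yields this shape.
    """
    n_partials = len(frequency_envelopes[0])
    phases = [0.0] * n_partials
    nyquist = sample_rate / 2.0
    audio = []
    for freqs, amps in zip(frequency_envelopes, amplitude_envelopes):
        sample = 0.0
        for k in range(n_partials):
            # Advance each partial's phase by its instantaneous frequency.
            phases[k] += 2.0 * math.pi * freqs[k] / sample_rate
            if freqs[k] < nyquist:  # assumption: mute partials that would alias
                sample += amps[k] * math.sin(phases[k])
        audio.append(sample)
    return audio


# Toy input: two steady partials for 10 ms at 16 kHz (not real model data).
n = 160
audio = oscillator_bank([[220.0, 440.0]] * n, [[0.5, 0.25]] * n, 16000)
print(len(audio))  # 160
```

With real data the same loop would run over all 60 columns, but a vectorized numpy version is far faster for full 4 s chunks.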

Companion WAV files (per chunk):

File	Content
`chunk_XX_1oscbank.wav`	Harmonics only
`chunk_XX_1noise.wav`	Noise only
`chunk_XX_1oscbank_noise.wav`	Harmonic + noise mix

These are useful for auditioning what DDSP learned before moving to SuperCollider.

3. Unified CSV — `training/csv_exports/*_unified_*.csv`

The canonical interchange format for SuperCollider. Harmonics and noise bands are merged into one file, synchronized by frame_index.

Metadata line (comment):

# frame_rate=250.57,sample_rate=44100

Columns:

frame_index,f0_hz,component_type,component_index,frequency,value
0,203.77,harmonic,0,203.77,0.000577
0,203.77,harmonic,1,407.53,0.006631
0,203.77,noise_band,0,123.08,0.001200

Column	Meaning
`frame_index`	Time frame (harmonics and noise aligned)
`f0_hz`	Fundamental at this frame
`component_type`	`harmonic` or `noise_band`
`component_index`	Partial index (0–59 harmonics, 0–64 noise bands)
`frequency`	Hz (instantaneous for harmonics; band center for noise)
`value`	Amplitude (harmonics) or magnitude (noise)
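
For readers who want to inspect a unified CSV outside SuperCollider, here is a minimal standard-library sketch that reads the metadata comment and groups rows by `frame_index`. The function name `read_unified_csv` is hypothetical, and the parser assumes metadata values are numeric, as in the example line above.

```python
import csv
import io
from collections import defaultdict


def read_unified_csv(handle):
    """Return (metadata, frames) from an open unified-CSV text stream."""
    metadata, data_lines = {}, []
    for line in handle:
        if line.startswith("#"):
            # e.g. "# frame_rate=250.57,sample_rate=44100"
            for pair in line.lstrip("# ").strip().split(","):
                if "=" in pair:
                    key, _, value = pair.partition("=")
                    metadata[key.strip()] = float(value)
        elif line.strip():
            data_lines.append(line)

    frames = defaultdict(
        lambda: {"f0_hz": None, "harmonic": [], "noise_band": []}
    )
    for row in csv.DictReader(data_lines):
        frame = frames[int(row["frame_index"])]
        frame["f0_hz"] = float(row["f0_hz"])
        # Store (component_index, frequency, value) under its component type.
        frame[row["component_type"]].append(
            (int(row["component_index"]), float(row["frequency"]), float(row["value"]))
        )
    return metadata, dict(frames)


sample = """# frame_rate=250.57,sample_rate=44100
frame_index,f0_hz,component_type,component_index,frequency,value
0,203.77,harmonic,0,203.77,0.000577
0,203.77,harmonic,1,407.53,0.006631
0,203.77,noise_band,0,123.08,0.001200
"""
metadata, frames = read_unified_csv(io.StringIO(sample))
print(metadata["frame_rate"], len(frames[0]["harmonic"]), len(frames[0]["noise_band"]))
# 250.57 2 1
```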

By default, 3-envelopes2csv.py downsamples from sample rate to frame rate to keep files manageable. Use --full-resolution or python/downsample_csv.py to tune size vs. fidelity.
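
The size difference is easy to see with a little arithmetic. The sketch below picks the sample nearest to each control-frame time, which is one plausible way to go from sample rate to frame rate. Whether `3-envelopes2csv.py` decimates, averages, or interpolates is not stated here, so treat the method and both function names as assumptions.

```python
def frame_sample_indices(n_samples, sample_rate, frame_rate):
    """Indices of the samples nearest to each control-frame time."""
    hop = sample_rate / frame_rate  # samples per frame, may be fractional
    n_frames = int(n_samples / hop)
    return [min(int(round(i * hop)), n_samples - 1) for i in range(n_frames)]


def decimate(envelope, indices):
    """Keep only the rows of a per-sample envelope at the given indices."""
    return [envelope[i] for i in indices]


# One 4 s chunk at 16 kHz, reduced to a 250 Hz control rate.
indices = frame_sample_indices(n_samples=64000, sample_rate=16000, frame_rate=250)
print(len(indices), indices[:3])  # 1000 [0, 64, 128]
```

In this example each harmonic column shrinks from 64000 values to 1000, a 64-fold reduction per chunk.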

See docs/CSV_FORMAT_GUIDE.md for full specification.

Modifying outputs (Python)

Between .npz and CSV export, several tools reshape the learned spectra:

Script	Purpose
`python/enrich_spectrum.py`	Extend DDSP's 8 kHz bandwidth (the Nyquist limit at 16 kHz) to full-band audio at a 44.1 kHz sample rate via harmonic extrapolation and shaped noise
`python/modify_envelopes.py`	Rescale frequency range, resample duration, fix pitch, remap partials
`python/downsample_csv.py`	Shrink CSVs for faster SC experimentation (`--frame-step`, `--max-harmonics`, `--max-noise-bands`)
`python/single_batch_export.py`	Batch-combine multiple chunks into one unified CSV
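
To give a feel for what the `downsample_csv.py` options trade away, here is a sketch of the kind of row filter those flag names suggest: keep every N-th frame and cap the number of components per type. The exact semantics of the real script (for example, whether it renumbers `frame_index`) are not documented here, so `thin_rows` is a hypothetical stand-in, not the script's logic.

```python
def thin_rows(rows, frame_step=1, max_harmonics=None, max_noise_bands=None):
    """Keep every frame_step-th frame and cap components per type.

    `rows` are dicts with the unified CSV column names. Parameters mirror
    the flag names, but their behavior here is an assumption.
    """
    limits = {"harmonic": max_harmonics, "noise_band": max_noise_bands}
    kept = []
    for row in rows:
        if int(row["frame_index"]) % frame_step != 0:
            continue
        limit = limits.get(row["component_type"])
        if limit is not None and int(row["component_index"]) >= limit:
            continue
        kept.append(row)
    return kept


# Toy table: 4 frames, 3 harmonics and 2 noise bands each (20 rows).
rows = [
    {"frame_index": f, "component_type": kind, "component_index": i}
    for f in range(4)
    for kind, count in (("harmonic", 3), ("noise_band", 2))
    for i in range(count)
]
small = thin_rows(rows, frame_step=2, max_harmonics=2, max_noise_bands=1)
print(len(rows), len(small))  # 20 6
```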

Typical enrichment workflow:

cd training
python enrich_spectrum.py \
  --input envelopes/chunk_00_envelopes.npz \
  --output envelopes/chunk_00_enriched.npz

python 3-envelopes2csv.py --chunk-key envelopes/chunk_00_enriched

SuperCollider: GUI sonification app

The heart of the performance layer is sc/harmonic_noise_unified_controller_bouncing.scd (recommended). Evaluating this file opens a full graphical application — no need to edit code for everyday use.

Launch

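1. Open sc/harmonic_noise_unified_controller_bouncing.scd in the SuperCollider IDE.
2. Evaluate the entire file (Cmd+Enter on macOS, Ctrl+Enter on Windows/Linux, with the cursor inside the block).
3. Two windows appear: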
- Unified Controller — control panel (sliders, buttons, file import)
- Unified Harmonic + Noise Visualization — live spectral plot

GUI features

The control panel is organized into sections:

Section	Controls
Data file	Import CSV button — load any unified CSV at runtime
Temperament	TET size (12 = semitones, 24 = quarter-tones, 53 = Turkish/Arabic, etc.)
Master controls	Master pitch (±48 steps), spectral stretch (0.25×–4×), anchor mode (center partial / fixed f0 / manual Hz)
Volume & mix	Harmonic volume, partial gain, noise volume, noise magnitude scale, control smoothing
Conductor	Start frame, frame count, update rate — read a window of the CSV on a clock
Sonification	Playback rate, Start / Stop
Bouncing	Timed spectral-stretch "bounce" with min/max scale, speed, jitter, duration
Utilities	Reset all, open visualization, preset stretch factors

All transforms apply in real time to both harmonics and noise bands in parallel. Pitch uses configurable equal temperament; stretch uses log-frequency scaling around a selectable anchor.
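
The two transforms can be written down compactly. The sketch below is in Python rather than SuperCollider, purely to show the arithmetic: a shift of `steps` in an N-tone equal temperament multiplies frequency by 2^(steps/N), and a log-frequency stretch scales each partial's distance from the anchor on a log axis. The function names and the order in which stretch and pitch are applied are assumptions, not taken from the .scd file.

```python
def tet_ratio(steps, tet=12):
    """Frequency ratio for a shift of `steps` in `tet`-tone equal temperament."""
    return 2.0 ** (steps / tet)


def stretch_about_anchor(freq_hz, anchor_hz, stretch):
    """Scale the log-frequency distance from the anchor by `stretch`.

    Equivalent to log(f') = log(anchor) + stretch * (log(f) - log(anchor)).
    Frequencies must be positive.
    """
    return anchor_hz * (freq_hz / anchor_hz) ** stretch


def transform(freq_hz, anchor_hz, steps=0, tet=12, stretch=1.0):
    """Assumed order: stretch around the anchor first, then transpose."""
    return stretch_about_anchor(freq_hz, anchor_hz, stretch) * tet_ratio(steps, tet)


partials = [200.0, 400.0, 600.0]
print([round(transform(f, anchor_hz=200.0, steps=12, stretch=2.0), 2) for f in partials])
# [400.0, 1600.0, 3600.0]
```

A partial at the anchor frequency is unaffected by stretch, which is why the choice of anchor (center partial, fixed f0, or manual Hz) changes how the spectrum appears to open or close.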

Optional modules

File	Role
`sc/harmonic_noise_unified_controller.scd`	Earlier version (hardcoded CSV path)
`sc/spectral_mapping.scd`	Map learned partials onto user-defined target spectra

Quick start with SuperCollider

# After CSV export:
cp training/csv_exports/chunk_00_unified_*.csv examples/

# In SuperCollider: evaluate sc/harmonic_noise_unified_controller_bouncing.scd
# Then click "Import CSV..." and select your file

Training workflow

Full step-by-step details: training/README.md.

# 1. Place .wav files in training/source_audio/

# 2. Start GPU Docker container
cd training
./0-start_docker.sh
# inside container:
./1-start_training.sh

# 3. Extract envelopes (host, with DDSP env)
# activate your DDSP environment
python 2-chunks2envelopes.py

# 4. Export unified CSV
python 3-envelopes2csv.py --all-chunks

# 5. Play in SuperCollider (see above)

Repository layout

ddsp2sc/
├── ginn/                  # DDSP gin configs (solo_instrument, datasets, eval)
├── training/              # Training workspace (Docker, scripts, outputs)
│   ├── source_audio/      # Your training WAVs
│   ├── prepared/          # TFRecord shards
│   ├── training_out/      # Checkpoints
│   ├── envelopes/         # .npz + preview WAVs
│   └── csv_exports/       # Unified CSVs for SuperCollider
├── python/                # Export, enrichment, modification utilities
├── sc/                    # SuperCollider GUI sonification apps
├── docs/                  # CSV format & downsampling guides
├── docker/                # Docker image definition
└── examples/              # Sample CSVs for SC (copy exports here)

Requirements

Training (Docker): TensorFlow + DDSP — provided by the training Docker image (training/0-start_docker.sh).

Export / manipulation (host):

pip install numpy scipy pandas soundfile

Sonification: SuperCollider 3.12+

Documentation

docs/CSV_FORMAT_GUIDE.md — unified CSV specification
docs/DOWNSAMPLE_README.md — reducing CSV size for experimentation
training/README.md — training directory layout and commands

Citation

DDSP:

@inproceedings{engel2020ddsp,
  title={DDSP: Differentiable Digital Signal Processing},
  author={Jesse Engel and Lamtharn (Hanoi) Hantrakul and Chenjie Gu and Adam Roberts},
  booktitle={ICLR},
  year={2020}
}
