Phase 2 refinement: multi-task concat AMICA (improve k-factor) #33

@neuromechanist

Problem

Phase 2 AMICA on ThePresent alone is under-determined. With ~210 s × 100 Hz × ~120 channels:

  • Samples available: 21,000
  • Weights to learn: 128² = 16,384
  • k-factor: 1.1 (samples per weight)
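The k-factor arithmetic above can be reproduced with a short sketch. It is written in Python purely for illustration (the pipeline itself is MATLAB), and the function name `k_factor` is hypothetical. With the nominal inputs listed above it gives roughly 1.28 rather than the quoted 1.1; the gap may come from samples rejected during cleaning, which is an assumption here and not something the issue states.

```python
def k_factor(n_samples: int, n_channels: int) -> float:
    """Samples per unmixing weight, assuming a full n_channels x n_channels matrix."""
    if n_channels <= 0:
        raise ValueError("n_channels must be positive")
    return n_samples / (n_channels ** 2)


duration_s = 210   # approximate ThePresent duration, taken from the issue
srate_hz = 100     # sampling rate of the ICA input, taken from the issue
n_channels = 128   # channel count used for the weight estimate in the issue

n_samples = duration_s * srate_hz
k_single = k_factor(n_samples, n_channels)
print(f"samples={n_samples}, weights={n_channels ** 2}, k={k_single:.2f}")
```

Using the post-cleaning sample count and the actual retained channel count for each subject would give the per-subject value.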

Standard ICA practice wants k ≥ 20. At k ≈ 1, AMICA cannot decompose the data: the weights collapse toward identity and each "component" maps to a single electrode. Visual inspection confirms this: the 3-subject AMICA outputs from PR #31 are essentially channel-level, not source-level. All 3 subjects hit reached_max_iter=true at iteration 2000 with a median residual variance (RV) of 0.37–0.42; both observations are consistent with the under-determined diagnosis.

Solution

Match the reference pipeline (~/Documents/git/HBN_BIDS_analysis/study_handy_scripts.m): concatenate multiple HBN tasks for ICA training, apply the resulting weights back to ThePresent-only for analysis. The analysis stays ThePresent-only; only the training set grows.

Locked decision: use the four passive-viewing movie tasks (DespicableMe, DiaryOfAWimpyKid, FunwithFractals, ThePresent), ≈11 min total per subject. This lifts k from 1.1 to ≈3.5, which is still tight but is the cleanest expansion that stays within the "movie watching" ecological context.
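As a rough check on the projected gain, the sketch below scales the single-task k-factor by the increase in recording time and also estimates how much data k = 20 would require. It is illustrative Python with hypothetical helper names, and it assumes 128² weights, a 100 Hz input, and that every sample from all four tasks is retained.

```python
import math

SRATE_HZ = 100        # sampling rate of the ICA input, from the issue
N_WEIGHTS = 128 ** 2  # full unmixing matrix, from the issue


def scaled_k(k_before: float, seconds_before: float, seconds_after: float) -> float:
    """k grows linearly with recording time when the channel count is fixed."""
    return k_before * seconds_after / seconds_before


def samples_for_k(target_k: float, n_weights: int = N_WEIGHTS) -> int:
    """Smallest sample count that reaches target_k samples per weight."""
    return math.ceil(target_k * n_weights)


k_multi = scaled_k(1.1, 210, 11 * 60)           # four movies, about 11 min
needed = samples_for_k(20)                      # samples for the k >= 20 guideline
minutes_needed = needed / SRATE_HZ / 60

print(f"projected k = {k_multi:.2f}")
print(f"k = 20 needs {needed} samples, about {minutes_needed:.1f} min at {SRATE_HZ} Hz")
```

Under these assumptions the four movie tasks fall well short of the k ≥ 20 guideline, which matches the "still tight" caveat in the decision.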

Changes

  • src/matlab/phase2_amica.m: add an IcaTasks option (default ["DespicableMe","DiaryOfAWimpyKid","FunwithFractals","ThePresent"]). When numel(IcaTasks) > 1: ensure a Phase 1 .set exists for each task (re-running phase1 if one is missing), load all of them, intersect their channel sets, merge with pop_mergeset, run AMICA on the merged dataset, and attach the resulting weights to the Task (ThePresent) .set.
  • src/matlab/+hbn/: new helper for the per-subject multi-task pre-merge step.
  • qa_amica.csv gains ica_tasks, ica_samples, k_factor columns.
  • params.json records IcaTasks and the per-subject samples used for ICA.
  • tests/matlab/test_phase2_smoke.m: pass IcaTasks=["ThePresent"] so the fixture-based smoke test (which only has ThePresent) still works.
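The channel-intersection and merge logic described in the changes above could look roughly like the following. This is a language-neutral sketch in plain Python, not the MATLAB/EEGLAB implementation: `merge_tasks`, `qa_row`, the in-memory data layout, and the `;` separator for `ica_tasks` are all hypothetical, and the toy numbers are placeholders rather than EEG data.

```python
def merge_tasks(recordings, ica_tasks):
    """Keep channels present in every task, then concatenate tasks along time.

    recordings maps task name -> (channel labels, rows of samples per channel).
    """
    missing = [t for t in ica_tasks if t not in recordings]
    if missing:
        raise KeyError(f"no Phase 1 output for: {missing}")

    first_labels, _ = recordings[ica_tasks[0]]
    common = set(first_labels)
    for task in ica_tasks[1:]:
        common &= set(recordings[task][0])

    # Preserve the channel order of the first task.
    keep = [lab for lab in first_labels if lab in common]
    if not keep:
        raise ValueError("no channels shared by all tasks")

    merged = [[] for _ in keep]
    for task in ica_tasks:
        labels, rows = recordings[task]
        index = {lab: i for i, lab in enumerate(labels)}
        for out_row, lab in zip(merged, keep):
            out_row.extend(rows[index[lab]])
    return keep, merged


def qa_row(ica_tasks, merged):
    """Values for the new ica_tasks, ica_samples and k_factor columns."""
    n_channels, n_samples = len(merged), len(merged[0])
    return {
        "ica_tasks": ";".join(ica_tasks),
        "ica_samples": n_samples,
        "k_factor": round(n_samples / n_channels ** 2, 3),
    }


# Toy input: the second task lacks channel E2 and lists channels in another order.
recordings = {
    "DespicableMe": (["E1", "E2", "E3"], [[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
    "ThePresent": (["E3", "E1"], [[70, 80], [10, 20]]),
}
keep, merged = merge_tasks(recordings, ["DespicableMe", "ThePresent"])
row = qa_row(["DespicableMe", "ThePresent"], merged)
print(keep, merged, row)
```

Passing a single-element task list skips the intersection loop entirely, which is the behaviour the smoke test relies on when it passes only ThePresent.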

Acceptance

  • 3-subject re-run produces topographies that look component-like (bilateral / dipolar) rather than single-electrode hotspots.
  • Median RV improves vs single-task pass (target <0.30 across most ICs, not just the variance-dominant top).
  • reached_max_iter=false for at least 2 of 3 subjects (convergence within budget at the higher k).
  • eeg-qa-neuroscientist Phase 2 review: PASS or PASS-WITH-FINDINGS (no critical findings).
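The two numeric criteria above (RV target and convergence count) lend themselves to an automated check; the topography inspection and the reviewer verdict do not. The sketch below is one possible reading, in illustrative Python with hypothetical names: it interprets "most ICs" as more than half of the components in each subject, which is an assumption, and the input values are made-up placeholders, not results from any run.

```python
from statistics import median

RV_TARGET = 0.30    # RV target from the acceptance list
MIN_CONVERGED = 2   # at least 2 of 3 subjects must converge within budget


def evaluate_subject(ic_rvs, reached_max_iter):
    """Summarise one subject from per-IC residual variances and the AMICA flag."""
    below = sum(rv < RV_TARGET for rv in ic_rvs)
    return {
        "median_rv": median(ic_rvs),
        # Assumption: "most ICs" means more than half of them.
        "most_ics_below_target": below > len(ic_rvs) / 2,
        "converged": not reached_max_iter,
    }


def evaluate_run(subjects):
    """subjects is a list of (per-IC RV list, reached_max_iter) pairs."""
    results = [evaluate_subject(rvs, flag) for rvs, flag in subjects]
    n_converged = sum(r["converged"] for r in results)
    return {
        "rv_ok": all(r["most_ics_below_target"] for r in results),
        "convergence_ok": n_converged >= MIN_CONVERGED,
        "n_converged": n_converged,
    }


# Placeholder values for illustration only.
toy_subjects = [
    ([0.10, 0.20, 0.25, 0.40], False),
    ([0.15, 0.22, 0.28, 0.35, 0.50], False),
    ([0.05, 0.12, 0.29, 0.45], True),
]
summary = evaluate_run(toy_subjects)
print(summary)
```

A stricter reading, such as requiring a fixed fraction of ICs below the target, only changes the comparison in `evaluate_subject`.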

    Labels

    feature (New feature), matlab (MATLAB / EEGLAB pipeline), phase (Individual phase of an epic)
