Problem
Phase 2 AMICA on ThePresent alone is under-determined. With ~210 s × 100 Hz × ~120 channels:
- Samples available: 21,000
- Weights to learn: 128² = 16,384
- k-factor: 1.1 (samples per weight)
Standard ICA practice wants k ≥ 20. At k ≈ 1, AMICA cannot decompose the data; weights collapse toward identity and each "component" maps to a single electrode. Visual inspection confirms: the 3-subject AMICA outputs from PR #31 are essentially channel-level, not source-level. All 3 subjects hit reached_max_iter=true at iter 2000 with median RV 0.37-0.42, both consistent with the under-determined diagnosis.
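For reference, the samples-per-weight arithmetic can be reproduced with a short sketch. Variable names are illustrative and not taken from phase2_amica.m. With the nominal figures above (21,000 samples, 128² weights) it gives roughly 1.3, the same k ≈ 1 regime as the 1.1 quoted; the exact value depends on how many samples and channels survive Phase 1, which this sketch does not model.

```matlab
% Sketch: samples-per-weight (k-factor) for an ICA decomposition.
% All names are illustrative, not from phase2_amica.m.
durationSec = 210;    % approximate ThePresent duration
srateHz     = 100;    % sampling rate quoted above
nChans      = 128;    % nominal channel count behind the 128^2 figure

nSamples = durationSec * srateHz;    % 21000
nWeights = nChans^2;                 % 16384
kFactor  = nSamples / nWeights;      % about 1.28

% Recording time that the k >= 20 rule of thumb would require
kTarget       = 20;
minutesNeeded = kTarget * nWeights / srateHz / 60;   % about 54.6 min

fprintf('k = %.2f; k >= %d needs %.1f min of data\n', ...
        kFactor, kTarget, minutesNeeded);
```

The second number shows why full compliance with k ≥ 20 is out of reach at this channel count, and why any added training data helps.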
Solution
Match the reference pipeline (~/Documents/git/HBN_BIDS_analysis/study_handy_scripts.m): concatenate multiple HBN tasks for ICA training, apply the resulting weights back to ThePresent-only for analysis. The analysis stays ThePresent-only; only the training set grows.
Locked decision: use the four passive-viewing movie tasks (DespicableMe, DiaryOfAWimpyKid, FunwithFractals, ThePresent), ≈11 min total per subject. This lifts k from 1.1 to ≈3.5, which is still tight but is the cleanest expansion that stays within the "movie watching" ecological context.
Changes
- src/matlab/phase2_amica.m: add an IcaTasks option (default ["DespicableMe","DiaryOfAWimpyKid","FunwithFractals","ThePresent"]). When numel(IcaTasks) > 1: ensure a Phase 1 .set exists for each task (re-running phase1 if missing), load all of them, intersect their channel sets, merge with pop_mergeset, run AMICA on the merged data, and attach the resulting weights to the Task (ThePresent) .set.
- src/matlab/+hbn/: new helper for the per-subject multi-task pre-merge step.
- qa_amica.csv gains ica_tasks, ica_samples, and k_factor columns.
- params.json records IcaTasks and the per-subject sample count used for ICA.
- tests/matlab/test_phase2_smoke.m: pass IcaTasks=["ThePresent"] so the fixture-based smoke test (which only has ThePresent) still works.
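The pre-merge step (intersect channel sets, then concatenate in time) can be sketched in plain MATLAB as below. This is a toy illustration, not the helper itself: the structs stand in for Phase 1 datasets, the field names and channel labels are assumptions, and the real code would presumably do the equivalent with EEGLAB's pop_select and pop_mergeset on loaded .set files.

```matlab
% Sketch: intersect channel sets across tasks, then concatenate in time.
% Toy structs stand in for Phase 1 datasets (hypothetical fields:
% labels = cellstr of channel names, data = channels x samples).
tasks = ["DespicableMe","DiaryOfAWimpyKid","FunwithFractals","ThePresent"];
sets  = struct('labels', {}, 'data', {});
sets(1).labels = {'E1','E2','E3','E4'};  sets(1).data = rand(4, 50);
sets(2).labels = {'E1','E2','E4'};       sets(2).data = rand(3, 60);
sets(3).labels = {'E1','E2','E3','E4'};  sets(3).data = rand(4, 40);
sets(4).labels = {'E2','E1','E4','E3'};  sets(4).data = rand(4, 30);

% Keep only channels present in every task, in the order of the first task
common = sets(1).labels;
for i = 2:numel(sets)
    common = intersect(common, sets(i).labels, 'stable');
end

% Reorder each task to the common channel order and append along time
merged = [];
for i = 1:numel(sets)
    [found, idx] = ismember(common, sets(i).labels);
    assert(all(found), 'common channel missing from task %d', i);
    merged = [merged, sets(i).data(idx, :)]; %#ok<AGROW>
end

% Values that would feed the new QA columns
icaTasks   = strjoin(tasks, ';');
icaSamples = size(merged, 2);
kFactor    = icaSamples / numel(common)^2;
fprintf('%s: %d samples, %d channels, k = %.2f\n', ...
        icaTasks, icaSamples, numel(common), kFactor);
```

Matching by label rather than by row index matters because a channel rejected in one task shifts the row positions of every channel after it.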
Acceptance
- 3-subject re-run produces topographies that look component-like (bilateral / dipolar) rather than single-electrode hotspots.
- Median RV improves vs single-task pass (target <0.30 across most ICs, not just the variance-dominant top).
- reached_max_iter=false for at least 2 of 3 subjects (convergence within budget at the higher k).
- eeg-qa-neuroscientist Phase 2 review: PASS or PASS-WITH-FINDINGS (no critical findings).
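The numeric criteria above could be checked along these lines. This is a sketch only: every value below is made up for illustration and is not a result, and reading "most ICs" as "more than half" is an assumption.

```matlab
% Sketch: check the numeric acceptance criteria on per-subject QA values.
% All numbers are hypothetical placeholders, not results.
rvPerIC = { [0.05 0.12 0.22 0.28 0.45], ...   % subject 1, RV per IC
            [0.10 0.35 0.40 0.20 0.25], ...   % subject 2
            [0.08 0.15 0.29 0.33 0.50] };     % subject 3
baselineMedRV  = [0.37 0.40 0.42];            % hypothetical single-task medians
reachedMaxIter = [false true false];          % hypothetical convergence flags

rvTarget = 0.30;
medRV     = cellfun(@median, rvPerIC);                  % median RV per subject
fracBelow = cellfun(@(r) mean(r < rvTarget), rvPerIC);  % share of ICs under target

rvImproved  = all(medRV < baselineMedRV);   % median RV better than single-task
mostBelow   = fracBelow > 0.5;              % "most ICs" read as more than half
convergedOK = sum(~reachedMaxIter) >= 2;    % at least 2 of 3 converged

fprintf('RV improved: %d, most ICs under %.2f: %s, converged: %d\n', ...
        rvImproved, rvTarget, mat2str(mostBelow), convergedOK);
```

The topography and reviewer criteria remain manual; only the RV and convergence items lend themselves to a scripted check.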
Refs
.context/research.md L92-94 (escalation path for low-IC subjects)