feat: EEGLAB Community Assistant (Epic #97) #124

Merged
neuromechanist merged 12 commits into develop from epic/issue-97-eeglab on Jan 28, 2026

neuromechanist commented Jan 27, 2026

Closes #97

Epic Summary

Complete implementation of the EEGLAB community assistant with comprehensive knowledge base integration.

Phases Completed

Phase 1: Basic Community Setup (#99)

  • Community configuration (repos, docs, mailing lists)
  • YAML-based registry system
  • Standard knowledge tools (GitHub, papers, docs)

Phase 1.5: YAML Testing Framework (#111)

  • Generic YAML-based tests for ALL communities
  • Automatic test coverage for new communities
  • No community-specific test code required

Phase 2: Docstring Extraction (#115)

  • Generic docstring extraction for MATLAB and Python
  • Function signature parsing
  • GitHub source linking
  • FTS5 full-text search
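
As a rough illustration of the MATLAB side of Phase 2, a regex-based extractor might look like the sketch below. It is not the code in `src/knowledge/matlab_parser.py`: the function name `extract_matlab_help` is hypothetical, and it assumes the help comment block directly follows the `function` line, which will not hold for every file.

```python
import re
from typing import Optional, Tuple

# Matches "function out = name(...)", "function [a, b] = name(...)" and "function name(...)".
_FUNC_RE = re.compile(r"^\s*function\s+(?:\[?[^=\]]*\]?\s*=\s*)?(\w+)")


def extract_matlab_help(source: str) -> Optional[Tuple[str, str]]:
    """Return (function_name, help_text) for the first function found, else None.

    Hypothetical helper: only comment lines immediately after the function
    line are treated as the help text.
    """
    lines = source.splitlines()
    for i, line in enumerate(lines):
        match = _FUNC_RE.match(line)
        if not match:
            continue
        help_lines = []
        for following in lines[i + 1:]:
            stripped = following.strip()
            if not stripped.startswith("%"):
                break  # the help block ends at the first non-comment line
            help_lines.append(stripped.lstrip("%").strip())
        return match.group(1), "\n".join(help_lines)
    return None
```

A header comment placed before the `function` line would need an extra pass; that case is left out here to keep the sketch short.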

Phase 3: Mailing List FAQ Agent (#101, #121)

  • Mailman archive scraper (2004-present)
  • LLM-based FAQ summarization
  • Quality scoring and categorization
  • Thread linking

Phase 4: Integration & Testing (#102, #123)

  • Plugin tools (search_eeglab_docstrings, search_eeglab_faqs)
  • Comprehensive integration tests (19 tests, 100% coverage on plugin tools)
  • User and developer documentation
  • All PR review issues addressed

Changes

New Files:

  • src/assistants/eeglab/config.yaml - Community configuration
  • src/assistants/eeglab/tools.py - Plugin tools
  • src/knowledge/docstring_sync.py - Docstring extraction
  • src/knowledge/matlab_parser.py - MATLAB parser
  • src/knowledge/python_parser.py - Python parser
  • src/knowledge/mailman_sync.py - Mailing list scraper
  • src/knowledge/faq_summarizer.py - FAQ generation
  • .context/eeglab-developer-guide.md - Developer docs
  • .context/eeglab-user-guide.md - User docs
  • tests/test_assistants/test_eeglab_integration.py - Integration tests
  • tests/test_assistants/test_community_yaml_generic.py - Generic YAML tests

Modified Files:

  • src/knowledge/db.py - Schema for docstrings and FAQs
  • src/knowledge/search.py - Search functions
  • Various CLI and sync utilities

Testing

  • ✅ Generic YAML-based tests (automatically covers EEGLAB + all communities)
  • ✅ 19 EEGLAB-specific integration tests passing
  • ✅ 100% coverage on plugin tools
  • ✅ All review issues addressed
  • ✅ Real database operations (no mocks)

Documentation

  • ✅ User guide with examples
  • ✅ Developer guide with architecture
  • ✅ Sync workflow documentation
  • ✅ Troubleshooting guide

neuromechanist and others added 12 commits January 27, 2026 08:02
* Make langfuse optional to fix Python 3.14 compatibility

- Move langfuse from dependencies to optional-dependencies[observability]
- Add lazy import with try/except in get_langfuse_handler()
- Provide a helpful warning if langfuse is not installed
- Fix Pydantic v1 compatibility issues on Python 3.14
- Relates to #108
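
The lazy-import pattern described in this commit could be sketched along these lines. This is a minimal illustration under assumptions, not the body of `get_langfuse_handler()`: the helper name `load_optional` and the warning text are hypothetical.

```python
import importlib
import logging
from types import ModuleType
from typing import Optional

logger = logging.getLogger(__name__)


def load_optional(module_name: str, extra: str) -> Optional[ModuleType]:
    """Import an optional dependency on first use; return None if it is missing.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Warn instead of failing so the core package works without the extra.
        logger.warning(
            "%s is not installed; install the '%s' extra to enable it.",
            module_name,
            extra,
        )
        return None
```

Callers then check for `None` and skip the observability hook when the package is absent.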

* feat: add EEGLAB community assistant configuration

- Create config.yaml with complete EEGLAB assistant setup
- 25+ documentation sources from sccn.github.io
- 6 GitHub repos (eeglab, ICLabel, clean_rawdata, EEG-BIDS, LSL)
- 3 core paper DOIs and 6 citation queries
- Custom system prompt with EEG workflow guidance
- Auto-generated knowledge tools (docs, discussions, papers)
- Add comprehensive unit tests (26 tests, all passing)

Follows HED assistant pattern for Phase 1 implementation.

* docs: add Phase 1 implementation summary

* fix: address PR review findings

- Fix DOI typo in summary (ffinf -> fninf)
- Fix line counts (340 lines, not 504)
- Remove duplicate doc entry (Installation/quickstart same URL)
- Update doc counts (26 total: 2 preloaded + 24 on-demand)

All review issues addressed, tests passing.

* docs: add local testing and epic branch workflow guides

- Add comprehensive local testing guide for backend/CLI
- Add epic branch workflow for multi-phase features
- Add quick test script for verification

* docs: update epic workflow to use worktrees

- Epic branch should be a worktree, not in main repo
- Update all instructions to reflect worktree structure
- Add worktree management section

* Simplify EEGLAB tests with shared fixtures

- Consolidate fixtures at module level (setup_registry, eeglab_config, eeglab_assistant)
- Remove duplicate fixture definitions across test classes
- Reduce test file from 449 to 317 lines (29% reduction)
- Improve test readability and maintainability
- All 26 tests still pass

* fix: address all PR review findings

**Critical Fixes:**
- Fix 23/26 broken documentation URLs to match actual SCCN GitHub structure
- Move test_eeglab_interactive.py to scripts/ directory
- Fix documentation count inconsistencies (25 → 26 sources)

**Documentation Improvements:**
- Update config.yaml line count (340 → 334)
- Clarify preload limit is a guideline (2-3 recommended, not required)
- Mark knowledge base numbers as projections (~150+ issues, ~470+ PRs expected)
- Update doc titles to match actual files (e.g., channel spectra, ERP images)

**Test Enhancements (26 → 33 tests):**
- Add database error handling test (graceful degradation)
- Add SSRF validation test (security - rejects localhost/private IPs)
- Add GitHub repo format validation test (rejects invalid formats)
- Add system prompt substitution test (no unfilled placeholders)
- Add tool input validation tests (empty queries, long queries)
- Add preload handling test (creation without preload)

**All tests passing:** 33/33 ✅

Addresses review findings from code-reviewer, pr-test-analyzer, and comment-analyzer agents.

* feat: add generic YAML-based testing framework

Create unified test suite that automatically validates ALL communities:
- Parametrized tests run against HED, EEGLAB, and future communities
- No community-specific hardcoded values
- Validates configuration structure, metadata, URLs, repos, DOIs
- Auto-validates system prompt completeness and tool generation
- Includes slow tests for URL/GitHub accessibility validation

**Results:**
- 30 generic tests (all passing for structure validation)
- Works for both HED and EEGLAB without any community-specific code
- URL validation caught 10 broken EEGLAB URLs and 1 broken HED URL
- Eliminates need for ~30 hardcoded tests per community

**Test markers:**
- Fast tests (default): Structure and format validation
- Slow tests (-m slow): HTTP requests for URL/GitHub validation
- Security tests: SSRF protection (localhost/private IP detection)
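
For context, an SSRF guard of the kind these security tests exercise might be sketched as follows. The function name `is_potentially_unsafe_url` is hypothetical and the project's actual validator may apply different rules; this version uses only the standard library and does not resolve DNS.

```python
import ipaddress
from urllib.parse import urlparse


def is_potentially_unsafe_url(url: str) -> bool:
    """Return True for URLs that point at localhost or a literal private IP.

    Hypothetical helper: hostnames that merely resolve to private addresses
    are not caught, because no DNS lookup is performed.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return True  # only plain web URLs are accepted in this sketch
    host = parsed.hostname
    if host is None or host.lower() == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; treated as an ordinary hostname
    return ip.is_private or ip.is_loopback or ip.is_link_local
```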

Implements Phase 1.5.1 of #111

* fix: update broken documentation URLs for HED and EEGLAB

EEGLAB (10 URLs fixed):
- Extensions: others/EEGLAB_Extensions.md
- Re-referencing: rereferencing.md (lowercase)
- Resampling: resampling.md (lowercase)
- Channel rejection: Channel_rejection.md
- ICLabel: plugins/ICLabel/index.md
- clean_rawdata: plugins/clean_rawdata/index.md
- Scrolling data: Scrolling_data.md
- Selecting epochs: removed (doesn't exist)
- BIDS: plugins/EEG-BIDS/index.md
- LSL: README.rst

HED (1 URL fixed):
- HedAndEEGLAB: removed refs/heads, changed .md to .html

* refactor: remove hardcoded YAML tests, keep minimal behavioral tests

EEGLAB Phase 1 has no custom tools or unique behavioral logic.
All YAML configuration validation is now handled by the generic
test_community_yaml_generic.py parametrized tests.

Changes:
- Removed ~420 lines of hardcoded YAML value tests
- Kept 1 behavioral test confirming standard CommunityAssistant usage
- Added documentation explaining separation of concerns

HED already covered by generic tests (no test_hed_config.py existed).

Benefits:
- Reduced test duplication (90 tests, same coverage)
- Future communities get full coverage automatically
- Maintainable: single generic suite vs N duplicated suites

* feat: add --community flag to validate command

Extends 'osa validate' with community mode for full test suite validation.

Usage:
- File mode: osa validate <config_path>
  - YAML syntax, schema validation, env vars
- Community mode: osa validate --community <id>
  - Full pytest suite including URL accessibility, GitHub repos
  - Shows community info before running tests
  - Supports --verbose flag for detailed output

Features:
- Auto-discovers available communities
- Lists available communities if ID not found
- Prevents using both modes together
- Returns exit code 0 on success, 1 on failure
- Color-coded output for easy scanning

Examples:
  osa validate --community eeglab
  osa validate --community hed --verbose
  osa validate src/assistants/eeglab/config.yaml

Closes Phase 1.5.3 requirement from issue #111

* fix: address all PR review findings

Documentation fixes:
- Fix incorrect pytest marker usage instructions (use -m 'not slow' to skip)
- Add --verbose flag documentation to validate() docstring
- Enhance docstrings with implementation details (subprocess, registry clearing)
- Document MagicMock exception in testing guidelines

Code improvements:
- Use public registry.list_all() instead of private _assistants attribute
- Add behavioral tests for tool descriptions, system prompt content, tool callability
- Add configuration summary output to validate --community command

Testing enhancements:
- Add test_retrieve_docs_tool_description_includes_doc_list
- Add test_system_prompt_contains_actual_github_repos
- Add test_knowledge_tools_are_callable
- Document why registry clearing is needed at collection time

All tests passing (36/36 for generic tests, 86/86 overall excluding slow HTTP tests)

* chore: bump version to 0.5.3.dev0

* fix: comprehensive error handling improvements for streaming

Fixes all critical and important issues from PR review #107:

**Critical Fixes:**
1. Fix message state corruption on streaming errors
   - Track message indices explicitly to avoid race conditions
   - Prevent wrong messages from being removed on error
   - Properly handle streaming vs non-streaming error paths

2. Improve saveHistory() error handling
   - Distinguish error types (QuotaExceededError, SecurityError)
   - Provide actionable user feedback for each error type
   - Re-throw errors so callers know save failed
   - Log errors with structured context

3. Fix backend exception masking
   - Add specific handlers for ValueError (input errors)
   - Preserve HTTPException for proper HTTP error codes
   - Add error IDs to all backend errors for support tracking
   - Include structured logging context

**Important Fixes:**
4. Add fetch timeout to streaming requests
   - 120 second timeout for connection + streaming
   - Prevents indefinite hangs on connection failures

5. Fix worker error detail loss
   - Check response.ok before processing
   - Pass through backend HTTP error codes (400, 403, 429, etc.)
   - Extract and forward backend error messages
   - Only categorize actual network/proxy errors

6. Improve reader resource leak handling
   - Check if reader exists before releasing
   - Log cleanup failures (indicates serious issues)
   - Don't silently swallow releaseError

**Impact:**
- Prevents data loss from message corruption
- Better user feedback on storage failures
- Proper error categorization for debugging
- Support can track errors with error IDs
- No more indefinite hangs
- Users see actual backend errors (not generic 500)

All syntax checks pass. Ready for production.

* fix: preserve prompt caching in tool-bound models (#113)

* fix: preserve prompt caching when tools are bound to models

- Update CachingLLMWrapper.bind_tools() to wrap tool-bound models
- Add stream() and astream() methods with caching support
- Allow wrapping Runnable types (e.g., RunnableBinding) not just BaseChatModel
- Add comprehensive test suite for caching functionality

Fixes #104

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add comprehensive error handling and validation to CachingLLMWrapper

- Add logging to all wrapper methods for debugging
- Add validation to __init__ to prevent double-wrapping
- Add error handling to bind_tools with tool validation
- Add error handling to _add_cache_control with message validation
- Add error handling to stream/astream with input validation
- Add unit tests for all error conditions
- Add async tests for ainvoke and astream methods
- Coverage improved from 41% to 60% for litellm_llm.py

* fix: replace unsafe eval with safe AST-based calculator in tests

- Replace eval() with ast.parse() and safe operator mapping
- Only allow basic arithmetic operations (+, -, *, /, %, **)
- Prevents code injection in test calculator tool
- Maintains full test functionality
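
A sketch of what such an AST-based calculator can look like is shown below. It is an illustration rather than the test tool itself: `safe_calculate` is a hypothetical name, and support for unary plus and minus is an assumption added here so that negative literals work.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelist of binary operators: +, -, *, /, %, **
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
# Assumption: unary +/- are allowed so expressions like "-7" parse.
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> Number:
    """Evaluate a basic arithmetic expression without calling eval()."""

    def _eval(node: ast.AST) -> Number:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attributes and everything else are rejected.
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)
```

Because only whitelisted node types are evaluated, an input such as `__import__('os')` raises `ValueError` instead of executing.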

* fix: eliminate silent failures and improve error handling

Critical fixes addressing PR review:
- Replace silent message skipping with hard errors (ValueError)
- Replace silent None content fallback with fail-fast validation
- Replace overly broad exception catching with specific exception types
- Add type annotations to stream/astream methods (Iterator/AsyncIterator)
- Add comprehensive error logging to invoke/ainvoke/_generate/_agenerate
- Improve hasattr checks with callable() validation
- Add actionable guidance to NotImplementedError messages
- Change non-list input logging from DEBUG to WARNING (cost impact)

All tests still pass. Coverage remains 36% (error paths require real LLM tests).

* docs: improve docstrings for clarity and accuracy

Addressing documentation issues from PR review:

- Expand class docstring to explain nested wrapper chain
- Clarify bind_tools() two-step process (delegate then wrap)
- Update _add_cache_control() to document strict fail-fast validation
- Clarify is_cacheable_model() uses permissive heuristic
- Add clarifying comment to CACHEABLE_MODELS constant

All docstrings now accurately reflect implementation behavior.

* docs: document two-tier testing approach for NO MOCKS policy

Add comprehensive module docstring explaining:
- Why FakeListChatModel is used for unit tests (wrapper mechanics)
- Why real API calls are used for integration tests (LLM behavior)
- Clear separation between testing wrapper logic vs LLM responses

This addresses the code reviewer's recommendation to document the
exception to the NO MOCKS policy for wrapper mechanics testing.

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add generic docstring extraction tools

Add MATLAB and Python docstring parsers, sync system, and search
tools for indexing code documentation from any community repository.

Key components:
- MATLAB parser: regex-based extraction of function/script comments
- Python parser: AST-based extraction of docstrings
- Sync system: fetch files from GitHub and index docstrings
- Search: FTS5-powered docstring search with GitHub links
- CLI: 'osa sync docstrings' command with language filters
- Tools: LangChain tool factory for community integration

Tests: 28 new tests (parsers + integration)
Verified: 720 docstrings synced from sccn/eeglab
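
The AST-based approach on the Python side can be sketched as follows. This is a simplified illustration, not the contents of `src/knowledge/python_parser.py`; the function name and the shape of the returned dictionaries are hypothetical.

```python
import ast
from typing import Dict, List, Union


def extract_docstrings(source: str) -> List[Dict[str, Union[str, int]]]:
    """Collect docstrings of functions and classes in a Python source string.

    Raises SyntaxError if the source cannot be parsed.
    """
    tree = ast.parse(source)
    entries: List[Dict[str, Union[str, int]]] = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)
            if doc:  # skip definitions without a docstring
                entries.append(
                    {"name": node.name, "line": node.lineno, "docstring": doc}
                )
    return entries
```

The recorded line number is what makes it possible to link an entry back to its position in the source file.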

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: address PR review findings

Critical fixes:
- Add GitHub auth headers to avoid rate limiting
- Improve error handling with specific exception types
- Track and report failed files to users
- Make Python parser raise SyntaxError properly

Important fixes:
- Optimize method detection using parent map (O(n))
- Update FTS5 docs to clarify phrase-only search
- Document branch hardcoding limitation

Tests:
- Update test to expect SyntaxError
- Remove mocked error tests (violates NO MOCKS rule)
- All 71 tests passing
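
To illustrate what phrase-only FTS5 search means in practice, here is a small self-contained sketch. The table name `docstrings_fts`, its columns, and the two demo rows are made up for this example and do not reflect the real schema in `src/knowledge/db.py`; it also assumes the local SQLite build includes FTS5.

```python
import sqlite3
from typing import List, Tuple


def build_demo_index() -> sqlite3.Connection:
    """Create an in-memory FTS5 table with two illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docstrings_fts USING fts5(name, docstring)")
    conn.executemany(
        "INSERT INTO docstrings_fts (name, docstring) VALUES (?, ?)",
        [
            ("demo_ica", "Run independent component analysis on continuous data"),
            ("demo_filter", "Apply a high-pass filter before component analysis"),
        ],
    )
    return conn


def phrase_search(
    conn: sqlite3.Connection, query: str, limit: int = 5
) -> List[Tuple[str, str]]:
    """Search for the query as one exact phrase.

    Wrapping the input in double quotes stops FTS5 from interpreting
    characters such as '-' or words such as AND/OR as query syntax.
    """
    phrase = '"' + query.replace('"', '""') + '"'
    return conn.execute(
        "SELECT name, docstring FROM docstrings_fts "
        "WHERE docstrings_fts MATCH ? ORDER BY rank LIMIT ?",
        (phrase, limit),
    ).fetchall()
```

The trade-off is that word order matters: the tokens must appear consecutively and in the order given, so a reordered query finds nothing.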

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: implement mailing list FAQ agent (Phase 3)

Add generic, reusable tools to scrape Mailman archives and generate
searchable FAQ database using LLM summarization.

Database schema:
- Add mailing_list_messages table with FTS5 for raw message archive
- Add faq_entries table with FTS5 for LLM-generated FAQ summaries
- Add summarization_status table for tracking progress

Mailman scraper (src/knowledge/mailman_sync.py):
- HTML fetching with caching and rate limiting
- Parse year index, thread index, and message pages
- Convert HTML to markdown for clean storage
- Sync by year with progress tracking

FAQ summarization (src/knowledge/faq_summarizer.py):
- Two-stage approach: Haiku for scoring, Sonnet for summarization
- Quality scoring to filter valuable threads
- Cost estimation and tracking
- JSON-based summary extraction (question, answer, tags, category)

Search and tools:
- Add search_faq_entries function with FTS5
- Create FAQ search tool factory for LangChain integration
- Update create_knowledge_tools to include FAQ search

CLI commands:
- osa sync mailman: Sync mailing list archives
- osa sync faq: Generate FAQ summaries with LLM

Configuration:
- Add MailmanConfig class for community config
- Configure EEGLAB mailing list (eeglablist, 2004-present)

Tested:
- Database schema creation verified
- Mailman scraper tested with 2026 EEGLAB data (42 messages)
- FAQ search API verified
- Tool factory validated

Closes #101

* test: add comprehensive tests for Phase 3 mailing list FAQ tools

Add tool-centered tests (not community-specific) that validate:
- Mailman archive scraping (11 tests)
- FAQ summarization with LLM (18 tests)
- End-to-end pipeline integration (4 tests)

Total: 33 tests, all passing

Test coverage:
- faq_summarizer.py: 89%
- mailman_sync.py: 56%
- Full pipeline tested with mocked HTTP and LLM responses

Tests validate:
- HTML parsing (year index, thread index, messages)
- LLM quality scoring and summarization
- Cost estimation
- FTS5 search functionality
- Error handling and edge cases
- Complete scrape → store → search → summarize pipeline

Follows .rules/testing_guidelines.md:
- Dynamic tests (generic, reusable)
- No hardcoded community names
- Tests work for any Mailman-based mailing list

* fix: address critical PR review findings

- Replace broad exception catching with specific exception types
- Add proper error handling for database operations
- Return None vs 0.0 for LLM scoring errors to distinguish failures
- Implement basic thread reconstruction using subject normalization
- Remove manual thread_id workaround from E2E tests
- Add sqlite3 import for database error handling
- Improve error logging with structured context

Fixes critical issues identified in PR review:
- Thread reconstruction now works (subject-based grouping)
- Silent failures replaced with actionable errors
- Database errors separated from parsing errors
- LLM errors distinguishable from low-quality scores

All 33 Phase 3 tests passing.
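
Subject-based thread grouping of the kind described here might be sketched as below. The prefix rules (reply and forward markers plus bracketed list tags) and the function names are assumptions for illustration; the real implementation in `src/knowledge/mailman_sync.py` may normalize differently.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Leading reply/forward markers ("Re:", "Fwd:", "AW:") and bracketed list tags.
_PREFIX_RE = re.compile(r"^\s*((re|fwd?|aw)\s*:\s*|\[[^\]]*\]\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Strip reply prefixes and list tags, collapse whitespace, lowercase."""
    stripped = _PREFIX_RE.sub("", subject)
    return re.sub(r"\s+", " ", stripped).strip().lower()


def group_threads(messages: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (message_id, subject) pairs by normalized subject."""
    threads: Dict[str, List[str]] = defaultdict(list)
    for message_id, subject in messages:
        threads[normalize_subject(subject)].append(message_id)
    return dict(threads)
```

Grouping by subject alone will merge unrelated threads that happen to share a title, which is why this counts as basic reconstruction.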

* test: add comprehensive test coverage for PR review findings

- Add HTTP error scenario tests for mailman_sync
  - Test fetch failures (404, 500, timeout, network errors)
  - Test partial batch failures and malformed HTML handling

- Add database transaction safety tests
  - Test idempotency of sync operations
  - Test duplicate message handling
  - Test partial sync resumption
  - Test commit batching at boundaries

- Add LLM response variation tests for faq_summarizer
  - Test trailing commas, extra text, nested quotes
  - Test newlines, unicode characters, empty arrays
  - Test decimal format variations and verbose responses

All 122 Phase 3 tests passing
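
The LLM response variations listed above (extra text around the JSON, trailing commas) suggest a tolerant parser roughly like the following sketch. `parse_llm_json` is a hypothetical name, and the cleanup rules here are illustrative rather than a copy of what `faq_summarizer.py` does.

```python
import json
import re
from typing import Any, Dict, Optional


def parse_llm_json(text: str) -> Optional[Dict[str, Any]]:
    """Extract the outermost JSON object from an LLM reply, or return None."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    candidate = text[start:end + 1]
    try:
        parsed = json.loads(candidate)
    except json.JSONDecodeError:
        # Retry once after dropping trailing commas before '}' or ']'.
        # Caveat: this would also alter such sequences inside string values.
        cleaned = re.sub(r",\s*([}\]])", r"\1", candidate)
        try:
            parsed = json.loads(cleaned)
        except json.JSONDecodeError:
            return None
    return parsed if isinstance(parsed, dict) else None
```

Returning `None` rather than an empty dict keeps a parse failure distinguishable from a valid but empty summary.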

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add plugin tools for EEGLAB docstring and FAQ search

- Create src/assistants/eeglab/tools.py with two plugin tools
  - search_eeglab_docstrings: Search MATLAB/Python function docs
  - search_eeglab_faqs: Search 22 years of mailing list Q&A
- Update config.yaml to register plugin tools in extensions section
- Update system prompt with tool usage guidelines
- Tools use @tool decorator pattern for auto-discovery

* test: add comprehensive integration tests for EEGLAB assistant

- Test config loading and validation
- Test standard tool creation (discussions, papers, docs, recent)
- Test plugin tool loading (docstrings, faqs)
- Test system prompt includes tool references
- Test tool count (6 total: 4 standard + 2 plugin)
- Test plugin system integration
- Test individual tool implementations with empty DB
- All 13 tests passing, 3 skipped (require populated DB)

* docs: add comprehensive EEGLAB assistant documentation

- User guide: tool descriptions, example questions, tips, sync commands
- Developer guide: architecture, adding tools, maintenance, troubleshooting
- Covers all 6 tools (4 standard + 2 plugin)
- Documents sync workflows for all knowledge bases
- Includes performance monitoring and debugging tips

* fix: address all PR review issues (critical, important, nice-to-haves)

- Fix critical AttributeError bug in search_eeglab_docstrings tool
  - Changed from result.name/language/file_path to result.title/source/url
  - Updated docstring examples to match actual output format
- Standardize error messages with multi-line format and admin guidance
- Fix comment rot by changing '2004-2026' to 'since 2004' throughout
- Add performance context to benchmarks in developer guide
- Fix hardcoded paths in examples (/path/to/eeglab instead of ~/git/eeglab)
- Make test assertions resilient (check for required tools instead of count)
- Add populated_test_db fixture with sample docstring and FAQ entries
- Add 6 new tests for populated database scenarios
- All 19 tests passing with 100% coverage on tools.py

* docs: fix temporal references to prevent documentation rot

- Change '2004-2026' to 'since 2004' in config and user guide
- Change '22 years' to 'over 20 years' for longevity
- Add noqa comments for fixture side-effect patterns

- Add docstrings config section to YAML with branch per repo
- Store branch in docstrings table for correct GitHub URLs
- Separate docstring repos from issue/PR tracking repos
- EEGLAB now uses 'develop' branch, ICLabel uses 'master'

Resolves hardcoded 'main' branch issue for repos with different defaults.

- Add d.branch to SQL SELECT in search_docstrings (fixes KeyError)
- Add NULL fallback for branch column (backward compatibility)
- Remove broad Exception catches in sync loops (let bugs propagate)
- Update branch parameter help text to mention repo defaults

Issues addressed:
- Critical: SQL query missing branch column
- Critical: Broad exception catches hiding programming bugs
- Important: Misleading branch parameter documentation

- Add validation for FAQ category and answer length
- Validate category against allowed values, fallback to 'discussion'
- Truncate answers >10KB to prevent bloat
- Add test for branch parameter in GitHub URLs
- Add test for NULL branch fallback to 'main'
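
The branch handling covered by these tests can be pictured with a small sketch. The function name `github_source_url` and the example path are hypothetical; only the fallback to 'main' for a NULL branch comes from the commit notes.

```python
from typing import Optional


def github_source_url(repo: str, file_path: str, branch: Optional[str]) -> str:
    """Build a GitHub blob URL, falling back to 'main' when no branch is stored.

    Hypothetical helper: rows written before the branch column existed carry
    NULL (None in Python), so they keep the previous default.
    """
    effective_branch = branch or "main"
    return f"https://github.com/{repo}/blob/{effective_branch}/{file_path}"
```

For example, a row synced from the `develop` branch of `sccn/eeglab` links to `.../blob/develop/...`, while a legacy row with no branch still resolves under `main`.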

Issues addressed:
- Important: FAQ field validation (prevent XSS, handle oversized data)
- Test coverage: Branch validation tests (criticality 9/10)

- Add _migrate_db() to handle schema changes for existing databases
- Automatically add branch column to docstrings table on deployment
- Create comprehensive deployment checklist for dev environment
- Includes verification steps, smoke tests, and rollback plan

This ensures existing dev/prod databases are migrated smoothly when
the epic is merged to develop.
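
An idempotent column migration of this kind could look like the sketch below. It is not the actual `_migrate_db()`: the helper name `add_column_if_missing` and the demo table layout are assumptions, and the table and column names are interpolated into SQL, so they must come from trusted code rather than user input.

```python
import sqlite3


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, definition: str
) -> bool:
    """Add a column only if the table does not already have it.

    Returns True when the column was added, False when it already existed.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {definition}")
    conn.commit()
    return True
```

Running it on every startup is safe because the second call is a no-op, and existing rows simply get NULL in the new column.
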
@neuromechanist neuromechanist merged commit da50aa0 into develop Jan 28, 2026
5 checks passed
@neuromechanist neuromechanist deleted the epic/issue-97-eeglab branch January 28, 2026 15:18