Add session workflow profiles and archetypes#94

Merged
ccf merged 5 commits into main from
feat/session-workflow-fingerprints
Mar 15, 2026

Conversation

@ccf
Owner

@ccf ccf commented Mar 13, 2026

Summary

  • add persisted session workflow profiles with fingerprints, archetypes, and supporting workflow signals
  • derive and expose workflow profiles on session detail, and reuse them in project workflow and playbook flows when available
  • update the roadmap and add backend/frontend coverage plus migration round-trip validation

Validation

  • pytest -q
  • pytest -q tests/test_workflow_profile_service.py tests/test_ingest.py tests/test_analytics.py tests/test_project_workspace.py tests/test_engineer_profile.py
  • ruff check .
  • ruff format --check src/primer/common/models.py src/primer/common/schemas.py src/primer/server/services/ingest_service.py src/primer/server/services/project_workspace_service.py src/primer/server/services/workflow_patterns.py src/primer/server/services/workflow_playbook_service.py src/primer/server/services/workflow_profile_service.py tests/test_workflow_profile_service.py tests/test_analytics.py alembic/versions/44c9b01ccad2_add_session_workflow_profiles.py
  • cd frontend && npm run test -- src/components/sessions/tests/session-detail-panel.test.tsx src/pages/tests/session-detail.test.tsx
  • cd frontend && npm run build
  • PRIMER_DATABASE_URL=sqlite:////tmp/primer_workflow_roundtrip.db alembic upgrade head
  • PRIMER_DATABASE_URL=sqlite:////tmp/primer_workflow_roundtrip.db alembic downgrade -1
  • PRIMER_DATABASE_URL=sqlite:////tmp/primer_workflow_roundtrip.db alembic upgrade head
  • PRIMER_DATABASE_URL=sqlite:////tmp/primer_workflow_roundtrip.db alembic current

Note

Medium Risk
Adds a new persisted derived-data table and wires it into ingestion and analytics paths; mistakes could impact session ingest performance or produce incorrect workflow fingerprint/archetype analytics until backfilled.

Overview
Adds persisted workflow profiles for sessions. Introduces session_workflow_profiles (Alembic migration + SQLAlchemy model) to store workflow fingerprint/label, step sequence, archetype (+ source/reason), top tools, delegation count, and verification run count.

Derives and upserts profiles during ingest and surfaces them in product. Ingest now computes workflow_profile via new workflow_profile_service heuristics and includes it in SessionDetailResponse; the session detail UI renders a new “Workflow Profile” card when present.

Reuses stored profiles in analytics. Project workflow summary and workflow playbook generation prefer persisted profile steps/fingerprint_id/label and fall back to step inference when missing; infer_workflow_steps is expanded to incorporate execution evidence, recovery strategies, and change-shape mutation signals.
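
The prefer-persisted, fall-back-to-inference behaviour described above might look roughly like the following. This is a minimal sketch, not the code in the PR: `StoredProfile`, `resolve_steps`, and the rule that an empty step list counts as "missing" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StoredProfile:
    """Hypothetical stand-in for a persisted session workflow profile row."""

    steps: list[str]
    fingerprint_id: Optional[str] = None


def resolve_steps(
    profile: Optional[StoredProfile],
    infer: Callable[[], list[str]],
) -> list[str]:
    """Prefer persisted steps; fall back to inference when none are stored."""
    # Assumption: a missing profile and an empty step list both trigger the fallback.
    if profile is not None and profile.steps:
        return list(profile.steps)
    return infer()
```

Passing the inference as a callable means it is only run when no stored steps are available, which is one plausible way to keep the analytics path cheap for sessions that already have a profile.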

Written by Cursor Bugbot for commit 8ed9c3d.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for all 3 issues found in the latest run.

  • ✅ Fixed: Substring "doc" falsely matches "docker" as docs archetype
    • Replaced bare substring hint matching with compiled regex using word boundaries (\bdocs?\b), so "doc"/"docs" match as standalone words but not inside "docker" or "docstring".
  • ✅ Fixed: Substring "port" falsely matches common words like "import"
    • Replaced bare substring hint matching with compiled regex using word boundaries (\bport\b), so "port" matches as a standalone word but not inside "import", "export", or "support".
  • ✅ Fixed: Verification run count tallies types not actual runs
    • Changed verification_run_count to iterate over the full execution_evidence list instead of the deduplicated execution_types set, so it counts actual verification runs rather than distinct type categories.
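
The three fixes above can be illustrated with a small standalone sketch. The docs pattern and the set of verification types are taken from the diff preview in this comment; the helper names (`substring_match`, `regex_match`, `count_distinct_types`, `count_runs`) and the sample inputs are hypothetical and exist only for this illustration.

```python
import re

# Hint tuple and regex as they appear before and after the autofix.
DOC_HINTS = ("doc", "docs", "documentation", "readme", "changelog", "guide")
DOC_RE = re.compile(r"\b(?:docs?|documentation|readme|changelog|guide)\b")

VERIFICATION_TYPES = {"test", "lint", "build", "verification"}


def substring_match(text: str) -> bool:
    """Old behaviour: any hint appearing anywhere in the text counts."""
    return any(hint in text for hint in DOC_HINTS)


def regex_match(text: str) -> bool:
    """New behaviour: hints must appear as standalone words."""
    return DOC_RE.search(text) is not None


def count_distinct_types(evidence_types: list[str]) -> int:
    """Old behaviour: deduplicate first, so repeated runs collapse to one."""
    return sum(1 for t in set(evidence_types) if t in VERIFICATION_TYPES)


def count_runs(evidence_types: list[str]) -> int:
    """New behaviour: every verification entry counts as one run."""
    return sum(1 for t in evidence_types if t in VERIFICATION_TYPES)
```

For example, `substring_match("fix docker build")` is true while `regex_match("fix docker build")` is false, and a session with evidence types `["test", "test", "lint", "read"]` yields 2 under the old count and 3 under the new one.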

Or push these changes by commenting:

@cursor push ffd74e90d1
Preview (ffd74e90d1)
diff --git a/src/primer/server/services/workflow_profile_service.py b/src/primer/server/services/workflow_profile_service.py
--- a/src/primer/server/services/workflow_profile_service.py
+++ b/src/primer/server/services/workflow_profile_service.py
@@ -1,5 +1,6 @@
 from __future__ import annotations
 
+import re
 from collections import Counter
 from dataclasses import dataclass
 
@@ -25,8 +26,8 @@
     "refactoring": "refactor",
     "research": "investigation",
 }
-_DOC_TEXT_HINTS = ("doc", "docs", "documentation", "readme", "changelog", "guide")
-_MIGRATION_TEXT_HINTS = ("migrate", "migration", "upgrade", "modernize", "deprecat", "port")
+_DOC_TEXT_RE = re.compile(r"\b(?:docs?|documentation|readme|changelog|guide)\b")
+_MIGRATION_TEXT_RE = re.compile(r"\b(?:migrat\w*|upgrade|moderniz\w*|deprecat\w*|port)\b")
 
 
 @dataclass
@@ -94,8 +95,8 @@
     )
     verification_run_count = sum(
         1
-        for evidence_type in execution_types
-        if evidence_type in {"test", "lint", "build", "verification"}
+        for evidence in execution_evidence
+        if _string_attr(evidence, "evidence_type") in {"test", "lint", "build", "verification"}
     )
 
     if not (
@@ -250,7 +251,7 @@
 
 
 def _looks_like_docs(text: str, named_files: list[str]) -> bool:
-    if any(hint in text for hint in _DOC_TEXT_HINTS):
+    if _DOC_TEXT_RE.search(text):
         return True
     if not named_files:
         return False
@@ -263,7 +264,7 @@
 
 
 def _looks_like_migration(text: str, change_shape: object | None) -> bool:
-    if not any(hint in text for hint in _MIGRATION_TEXT_HINTS):
+    if not _MIGRATION_TEXT_RE.search(text):
         return False
     return (
         _int_attr(change_shape, "files_touched_count") >= 2

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Skill tools double-classified as both delegate and integrate
    • Removed 'skill' from _INTEGRATE_HINTS so Skill-prefixed tools are only classified as delegates (via classify_tool), not also as integrate tools via substring matching.
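
A small sketch of the double classification, assuming that `classify_tool` returns `"skill"` for `Skill:`-prefixed names. `classify_tool_stub`, `steps_for`, and the tool name `Skill:review` are hypothetical stand-ins; the hint tuples are the ones shown in the diff preview.

```python
DELEGATE_HINTS = ("task", "agent", "delegate", "team", "sendmessage", "send_message")
OLD_INTEGRATE_HINTS = ("mcp", "plugin", "skill")
NEW_INTEGRATE_HINTS = ("mcp", "plugin")


def classify_tool_stub(name: str) -> str:
    """Hypothetical stand-in for primer's classify_tool."""
    # Assumption: Skill:-prefixed tools are classified as "skill".
    return "skill" if name.lower().startswith("skill:") else "other"


def steps_for(name: str, integrate_hints: tuple[str, ...]) -> list[str]:
    """Return the delegate/integrate steps a single tool name would trigger."""
    normalized = name.lower()
    steps = []
    if classify_tool_stub(name) in {"orchestration", "skill"} or any(
        hint in normalized for hint in DELEGATE_HINTS
    ):
        steps.append("delegate")
    if any(hint in normalized for hint in integrate_hints):
        steps.append("integrate")
    return steps
```

With the old hints, `steps_for("Skill:review", OLD_INTEGRATE_HINTS)` yields both `delegate` and `integrate`; with the new hints it yields only `delegate`.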

Or push these changes by commenting:

@cursor push fada5769f4
Preview (fada5769f4)
diff --git a/src/primer/server/services/workflow_patterns.py b/src/primer/server/services/workflow_patterns.py
--- a/src/primer/server/services/workflow_patterns.py
+++ b/src/primer/server/services/workflow_patterns.py
@@ -7,7 +7,7 @@
 _EDIT_HINTS = ("edit", "write", "patch", "replace", "insert", "delete", "remove", "rename", "move")
 _EXECUTE_HINTS = ("bash", "terminal", "exec", "command")
 _DELEGATE_HINTS = ("task", "agent", "delegate", "team", "sendmessage", "send_message")
-_INTEGRATE_HINTS = ("mcp", "plugin", "skill")
+_INTEGRATE_HINTS = ("mcp", "plugin")
 
 
 def infer_workflow_steps(

@ccf
Owner Author

ccf commented Mar 13, 2026

@cursor push fada576

Remove 'skill' from _INTEGRATE_HINTS to prevent Skill:-prefixed tools
from matching _is_integrate_tool via substring matching. These tools are
already correctly classified as delegates via classify_tool returning
'skill' in _is_delegate_tool.

Applied via @cursor push command

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Short "rg" hint causes false search tool classification
    • Removed "rg" from _SEARCH_HINTS since "ripgrep" already covers the intended case and the 2-character substring falsely matches tools containing common substrings like "merge" or "target".
  • ✅ Fixed: Delegation count misses Skill-prefixed tools unlike step detection
    • Added classify_tool check to _is_delegation_tool in workflow_profile_service.py so it matches the logic in _is_delegate_tool from workflow_patterns.py, ensuring Skill:xxx and orchestration tools are counted in delegation_count.
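
The false positive from the two-character hint is easy to reproduce in isolation. In this sketch the hint tuples come from the diff preview, while `is_search_tool` and the tool names `merge_branches` and `target_lookup` are hypothetical.

```python
OLD_SEARCH_HINTS = ("grep", "glob", "search", "find", "ripgrep", "rg")
NEW_SEARCH_HINTS = ("grep", "glob", "search", "find", "ripgrep")


def is_search_tool(name: str, hints: tuple[str, ...]) -> bool:
    """Substring-based check: true if any hint occurs inside the tool name."""
    normalized = name.lower()
    return any(hint in normalized for hint in hints)
```

Under the old hints both `merge_branches` and `target_lookup` are treated as search tools because each contains the letters "rg"; under the new hints neither is, while `ripgrep` still matches.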

Or push these changes by commenting:

@cursor push 75f092099f
Preview (75f092099f)
diff --git a/src/primer/server/services/workflow_patterns.py b/src/primer/server/services/workflow_patterns.py
--- a/src/primer/server/services/workflow_patterns.py
+++ b/src/primer/server/services/workflow_patterns.py
@@ -2,7 +2,7 @@
 
 from primer.common.tool_classification import classify_tool
 
-_SEARCH_HINTS = ("grep", "glob", "search", "find", "ripgrep", "rg")
+_SEARCH_HINTS = ("grep", "glob", "search", "find", "ripgrep")
 _READ_HINTS = ("read", "fetch", "open", "cat", "view")
 _EDIT_HINTS = ("edit", "write", "patch", "replace", "insert", "delete", "remove", "rename", "move")
 _EXECUTE_HINTS = ("bash", "terminal", "exec", "command")

diff --git a/src/primer/server/services/workflow_profile_service.py b/src/primer/server/services/workflow_profile_service.py
--- a/src/primer/server/services/workflow_profile_service.py
+++ b/src/primer/server/services/workflow_profile_service.py
@@ -4,6 +4,7 @@
 from collections import Counter
 from dataclasses import dataclass
 
+from primer.common.tool_classification import classify_tool
 from primer.server.services.workflow_patterns import (
     infer_workflow_steps,
     workflow_fingerprint_id,
@@ -324,7 +325,7 @@
 
 def _is_delegation_tool(tool_name: str) -> bool:
     normalized = tool_name.lower()
-    return any(
+    return classify_tool(tool_name) in {"orchestration", "skill"} or any(
         hint in normalized
         for hint in (
             "task",

@ccf
Owner Author

ccf commented Mar 13, 2026

@cursor push 75f0920

…tion count with step detection

- Remove 'rg' from _SEARCH_HINTS; 'ripgrep' already covers the intended case
  and the 2-char substring falsely matches tools containing 'merge', 'target', etc.

- Add classify_tool check to _is_delegation_tool in workflow_profile_service
  so Skill:xxx and orchestration tools are counted in delegation_count,
  matching the logic used by _is_delegate_tool in workflow_patterns.

Applied via @cursor push command

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Migration regex matches common "port" network term
    • Changed regex from \bport\b to \bport(?:ing|ed)\b to only match verb forms relevant to code migration, avoiding false matches on the networking term "port".
  • ✅ Fixed: Docs heuristic too aggressive on text-only matching
    • Removed the text-only early return so that a text regex match now requires at least one documentation file in named_files to classify as docs archetype.
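
A rough sketch of the docs heuristic as it reads after this autofix, reconstructed from the diff preview in this comment. The function is named without its leading underscore and the counting loop is condensed into a generator expression; the sample inputs in the note after the code are hypothetical.

```python
import re

DOC_TEXT_RE = re.compile(r"\b(?:docs?|documentation|readme|changelog|guide)\b")


def looks_like_docs(text: str, named_files: list[str]) -> bool:
    """Classify as docs only when at least one documentation file is involved."""
    text_match = bool(DOC_TEXT_RE.search(text))
    if not named_files:
        # A text hint alone is no longer enough.
        return False
    doc_files = sum(
        1
        for path in named_files
        if path.lower().endswith((".md", ".mdx", ".rst", ".txt"))
        or "/docs/" in path.lower()
    )
    if text_match and doc_files > 0:
        return True
    # Without a text hint, doc files must reach half of the touched files
    # (integer division, so the threshold rounds down).
    return doc_files > 0 and doc_files >= max(1, len(named_files) // 2)
```

In this sketch, "update the docs" with no named files is not classified as docs, while the same text with a single `README.md` among several source files is.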

Or push these changes by commenting:

@cursor push c3bb06aea5
Preview (c3bb06aea5)
diff --git a/src/primer/server/services/workflow_profile_service.py b/src/primer/server/services/workflow_profile_service.py
--- a/src/primer/server/services/workflow_profile_service.py
+++ b/src/primer/server/services/workflow_profile_service.py
@@ -28,7 +28,7 @@
     "research": "investigation",
 }
 _DOC_TEXT_RE = re.compile(r"\b(?:docs?|documentation|readme|changelog|guide)\b")
-_MIGRATION_TEXT_RE = re.compile(r"\b(?:migrat\w*|upgrade|moderniz\w*|deprecat\w*|port)\b")
+_MIGRATION_TEXT_RE = re.compile(r"\b(?:migrat\w*|upgrade|moderniz\w*|deprecat\w*|port(?:ing|ed))\b")
 
 
 @dataclass
@@ -252,8 +252,7 @@
 
 
 def _looks_like_docs(text: str, named_files: list[str]) -> bool:
-    if _DOC_TEXT_RE.search(text):
-        return True
+    text_match = bool(_DOC_TEXT_RE.search(text))
     if not named_files:
         return False
     doc_files = 0
@@ -261,6 +260,8 @@
         normalized = path.lower()
         if normalized.endswith((".md", ".mdx", ".rst", ".txt")) or "/docs/" in normalized:
             doc_files += 1
+    if text_match and doc_files > 0:
+        return True
     return doc_files > 0 and doc_files >= max(1, len(named_files) // 2)

    "research": "investigation",
}
_DOC_TEXT_RE = re.compile(r"\b(?:docs?|documentation|readme|changelog|guide)\b")
_MIGRATION_TEXT_RE = re.compile(r"\b(?:migrat\w*|upgrade|moderniz\w*|deprecat\w*|port)\b")

Migration regex matches common "port" network term

Medium Severity

_MIGRATION_TEXT_RE includes \bport\b which matches the standalone word "port" — an extremely common technical term for network ports (e.g., "listen on port 3000", "bind to port 8080"). Combined with the weak secondary check in _looks_like_migration requiring only files_touched_count >= 2, this causes sessions about server configuration or networking to be falsely classified as migration archetype. Since migration is checked before debugging and feature delivery in _infer_archetype, affected sessions get the wrong archetype.
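
To see the difference concretely, the pattern quoted above can be compared with the narrowed version from the autofix. Both patterns are copied from this thread; the names `OLD_MIGRATION_RE` and `NEW_MIGRATION_RE` and the sample sentences are hypothetical, and the sketch assumes the text has already been lowercased.

```python
import re

OLD_MIGRATION_RE = re.compile(
    r"\b(?:migrat\w*|upgrade|moderniz\w*|deprecat\w*|port)\b"
)
NEW_MIGRATION_RE = re.compile(
    r"\b(?:migrat\w*|upgrade|moderniz\w*|deprecat\w*|port(?:ing|ed))\b"
)


def matches(pattern: re.Pattern[str], text: str) -> bool:
    """True if the migration pattern finds a hit anywhere in the text."""
    return pattern.search(text) is not None
```

With these patterns, "listen on port 3000" matches the old regex but not the new one, and "porting the client to python 3" matches the new one. One consequence visible in this sketch is that the bare imperative "port" (as in "port the module to rust") no longer matches either, so the narrower pattern trades some recall for fewer false positives.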

Additional Locations (1)

@ccf ccf merged commit 7fe97cb into main Mar 15, 2026
6 checks passed

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Duplicated delegation tool classification function across files
    • Removed the duplicate _is_delegation_tool from workflow_profile_service.py and reused the canonical is_delegate_tool (renamed from _is_delegate_tool) imported from workflow_patterns.py.
  • ✅ Fixed: Overly broad .txt extension in docs file detection
    • Removed .txt from the doc file extension check in _looks_like_docs, keeping only reliably documentation-specific extensions (.md, .mdx, .rst) to prevent false positives on files like requirements.txt.

Or push these changes by commenting:

@cursor push 7d768e86a8
Preview (7d768e86a8)
diff --git a/src/primer/server/services/workflow_patterns.py b/src/primer/server/services/workflow_patterns.py
--- a/src/primer/server/services/workflow_patterns.py
+++ b/src/primer/server/services/workflow_patterns.py
@@ -32,7 +32,7 @@
         steps.append("test")
     if recovery_strategies and recovery_strategies.intersection({"edit_fix", "revert_or_reset"}):
         steps.append("fix")
-    if any(_is_delegate_tool(name) for name in tool_names):
+    if any(is_delegate_tool(name) for name in tool_names):
         steps.append("delegate")
     if any(_is_integrate_tool(name) for name in tool_names):
         steps.append("integrate")
@@ -79,7 +79,7 @@
     return normalized == "bash" or any(hint in normalized for hint in _EXECUTE_HINTS)
 
 
-def _is_delegate_tool(name: str) -> bool:
+def is_delegate_tool(name: str) -> bool:
     normalized = _normalized_tool_name(name)
     return classify_tool(name) in {"orchestration", "skill"} or any(
         hint in normalized for hint in _DELEGATE_HINTS

diff --git a/src/primer/server/services/workflow_profile_service.py b/src/primer/server/services/workflow_profile_service.py
--- a/src/primer/server/services/workflow_profile_service.py
+++ b/src/primer/server/services/workflow_profile_service.py
@@ -4,9 +4,9 @@
 from collections import Counter
 from dataclasses import dataclass
 
-from primer.common.tool_classification import classify_tool
 from primer.server.services.workflow_patterns import (
     infer_workflow_steps,
+    is_delegate_tool,
     workflow_fingerprint_id,
     workflow_fingerprint_label,
 )
@@ -92,7 +92,7 @@
     label = workflow_fingerprint_label(fingerprint_type, steps) if fingerprint_id else None
     top_tools = [tool_name for tool_name, _count in tool_counts.most_common(4)]
     delegation_count = sum(
-        count for tool_name, count in tool_counts.items() if _is_delegation_tool(tool_name)
+        count for tool_name, count in tool_counts.items() if is_delegate_tool(tool_name)
     )
     verification_run_count = sum(
         1
@@ -258,7 +258,7 @@
     doc_files = 0
     for path in named_files:
         normalized = path.lower()
-        if normalized.endswith((".md", ".mdx", ".rst", ".txt")) or "/docs/" in normalized:
+        if normalized.endswith((".md", ".mdx", ".rst")) or "/docs/" in normalized:
             doc_files += 1
     if text_match and doc_files > 0:
         return True
@@ -324,21 +324,6 @@
     )
 
 
-def _is_delegation_tool(tool_name: str) -> bool:
-    normalized = tool_name.lower()
-    return classify_tool(tool_name) in {"orchestration", "skill"} or any(
-        hint in normalized
-        for hint in (
-            "task",
-            "agent",
-            "delegate",
-            "team",
-            "sendmessage",
-            "send_message",
-        )
-    )
-
-
 def _bool_attr(value: object | None, field: str) -> bool:
     if value is None:
         return False

            "sendmessage",
            "send_message",
        )
    )

Duplicated delegation tool classification function across files

Low Severity

_is_delegation_tool in workflow_profile_service.py is functionally identical to _is_delegate_tool in workflow_patterns.py — both check classify_tool membership in {"orchestration", "skill"} and the same set of substring hints. Since workflow_profile_service.py already imports from workflow_patterns.py, this is pure duplication that risks the two implementations drifting apart.
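
A sketch of the single-predicate arrangement this comment argues for: one delegate check, reused both for step detection and for the delegation count. `classify_tool_stub` is a hypothetical stand-in for `classify_tool`, and `has_delegate_step` and the sample tool names are made up for illustration; the `delegation_count` expression follows the shape shown in the diff preview.

```python
from collections import Counter

DELEGATE_HINTS = ("task", "agent", "delegate", "team", "sendmessage", "send_message")


def classify_tool_stub(name: str) -> str:
    """Hypothetical stand-in for primer's classify_tool."""
    # Assumption: Skill:-prefixed tools are classified as "skill".
    return "skill" if name.lower().startswith("skill:") else "other"


def is_delegate_tool(name: str) -> bool:
    """The one shared definition of what counts as delegation."""
    normalized = name.lower()
    return classify_tool_stub(name) in {"orchestration", "skill"} or any(
        hint in normalized for hint in DELEGATE_HINTS
    )


def has_delegate_step(tool_names: list[str]) -> bool:
    """Step detection: does any tool in the session delegate?"""
    return any(is_delegate_tool(name) for name in tool_names)


def delegation_count(tool_counts: Counter) -> int:
    """Profile metric: total calls made to delegating tools."""
    return sum(count for name, count in tool_counts.items() if is_delegate_tool(name))
```

Because both callers go through `is_delegate_tool`, a change to the hint list or the classification rule cannot leave the step sequence and the count disagreeing about the same tool.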

Additional Locations (1)

            doc_files += 1
    if text_match and doc_files > 0:
        return True
    return doc_files > 0 and doc_files >= max(1, len(named_files) // 2)

Overly broad .txt extension in docs file detection

Low Severity

_looks_like_docs counts any file ending in .txt as a documentation file. Common non-doc files like requirements.txt or LICENSE.txt would trigger the docs archetype when they make up a majority of touched files — even without any text hints about documentation — because the final return path (doc_files >= max(1, len(named_files) // 2)) doesn't require text_match.
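
The file-only path can be isolated to show the effect of the `.txt` extension. In this sketch the extension tuples and the final comparison follow the snippet quoted above; `is_docs_by_files` and the sample file list are hypothetical.

```python
OLD_DOC_EXTENSIONS = (".md", ".mdx", ".rst", ".txt")
NEW_DOC_EXTENSIONS = (".md", ".mdx", ".rst")


def is_docs_by_files(named_files: list[str], doc_extensions: tuple[str, ...]) -> bool:
    """File-only path of the docs heuristic, with no text hint involved."""
    doc_files = sum(
        1
        for path in named_files
        if path.lower().endswith(doc_extensions) or "/docs/" in path.lower()
    )
    # len(named_files) // 2 rounds down, so with three files one doc file is enough.
    return doc_files > 0 and doc_files >= max(1, len(named_files) // 2)
```

For a session touching `requirements.txt`, `LICENSE.txt`, and `setup.py`, the old extension list classifies the change as docs and the new list does not.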

2 participants