Add AI-driven 3D scene editing via agent tools and chat panel #3913

Merged
georgi merged 3 commits into main from claude/fervent-allen-x7hmqo on Jun 26, 2026

georgi commented Jun 24, 2026

Collaborator

Summary

Adds a comprehensive agent-tool interface for the 3D model editor, enabling the AI assistant to programmatically build and modify scenes. Introduces a new chat panel in the editor's left sidebar where users can converse with the assistant to drive scene creation and editing through natural language.

Key Changes

  • Model3D Tool Bridge (model3DToolBridge.ts): New module that registers a handler interface (Model3DToolHandler) exposing scene operations (list, add, select, delete, transform, visibility, rename, color, frame). The editor registers its implementation on mount and clears it on unmount, ensuring tools only work when an editor is open.

  • Frontend Tools (builtin/model3d.ts): Nine new ui_3d_* tools registered with the FrontendToolRegistry:

    • ui_3d_list_scene — enumerate all objects with their transforms and properties
    • ui_3d_add_object — create primitives (box, sphere, plane, cylinder, torus) and lights
    • ui_3d_select_object — drive the transform gizmo and Properties panel
    • ui_3d_delete_object — remove objects from the scene
    • ui_3d_set_transform — modify position, rotation (in degrees), and scale
    • ui_3d_set_visibility — show/hide objects
    • ui_3d_rename_object — change object names
    • ui_3d_set_material_color — set mesh material colors via hex strings
    • ui_3d_frame_scene — fit camera to scene bounds
  • Model3DChatPanel (Model3DChatPanel.tsx): New React component that embeds ChatView in the editor's left sidebar, wired to the shared GlobalChatStore. Reuses the existing chat infrastructure so the assistant can call the registered ui_3d_* tools. Includes a welcome placeholder with usage hints.

  • Editor Layout Refactor (Model3DEditor.tsx):

    • Left panel now hosts a TabGroup switching between "Scene" (existing SceneOutliner) and "Chat" (new Model3DChatPanel)
    • Increased left panel width from 240px to 320px to accommodate chat
    • Both panes stay mounted (toggled via CSS display) to preserve chat connection and scroll state across tab switches
    • Added helper functions: findObject() (lookup by uuid or name), toNode() (serialize Three.js objects to Model3DSceneNode), uniqueName() (collision-free naming), addPrimitiveObject() (extracted from handleAdd)
    • Registered the tool handler on mount via useEffect, using a selectedUuidRef to let stable callbacks read the latest selection without re-registering
  • Tests (model3dTools.test.ts): Comprehensive test suite covering tool registration, error handling (no editor open), parameter validation, and handler delegation for all nine tools.

  • Integration: Updated index.tsx and frontendToolsIpc.ts to import and register the new model3d tools module.
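
As a rough illustration of the bridge pattern described above, here is a minimal sketch of a module-level handler slot that the editor fills on mount and clears on unmount, with a tool body that delegates to it. The names `SceneNode`, `ToolHandler`, `registerHandler`, `requireHandler`, and `runSetVisibilityTool` are hypothetical stand-ins, and the handler is cut down to two operations. The actual `Model3DToolHandler` interface and `FrontendToolRegistry` API are not reproduced here.

```typescript
// Hypothetical, simplified shapes. The real handler exposes more operations.
interface SceneNode {
  uuid: string;
  name: string;
  visible: boolean;
}

interface ToolHandler {
  listScene(): SceneNode[];
  setVisibility(target: string, visible: boolean): SceneNode;
}

// Module-level slot: null whenever no editor is mounted.
let activeHandler: ToolHandler | null = null;

// Called by the editor on mount. The returned function fits a useEffect
// cleanup and only clears the slot if it still holds this handler.
function registerHandler(handler: ToolHandler): () => void {
  activeHandler = handler;
  return () => {
    if (activeHandler === handler) {
      activeHandler = null;
    }
  };
}

// Gate used by every tool: fail with a descriptive message when no editor is open.
function requireHandler(): ToolHandler {
  if (activeHandler === null) {
    throw new Error("No 3D model editor is open");
  }
  return activeHandler;
}

// A tool body delegates to the bridge instead of touching the scene directly.
function runSetVisibilityTool(args: { target: string; visible: boolean }): SceneNode {
  return requireHandler().setVisibility(args.target, args.visible);
}
```

Keeping the gate in one place means each tool reports the same error when the editor is closed, which gives the agent a consistent message to recover from.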

Notable Implementation Details

  • Rotation in degrees: All tool parameters and Model3DSceneNode use degrees for rotation (matching the Properties panel UI), while Three.js internally uses radians. Conversion happens at the handler boundary.
  • Object lookup: Tools accept either uuid or case-insensitive name, enabling natural language references ("the red box") without requiring users to know internal IDs.
  • Stable handler registration: The handler is registered once with callbacks that read selectedUuidRef.current, avoiding re-registration churn as selection changes.
  • Chat persistence: Both scene and chat panes remain mounted when switching tabs, preserving the WebSocket connection and message history.
  • Error messages: Tool errors (e.g., "No 3D model editor is open", "Object not found") are descriptive and surface back to the agent for recovery.
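
The rotation and lookup points above could be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: `Vec3`, `NamedObject`, `degreesToRadians`, and `radiansToDegrees` are hypothetical names, and `findObject` here works on a flat array rather than a Three.js scene graph. Trying the uuid before the name is also an assumption about ordering.

```typescript
// Hypothetical plain-data shapes so the sketch runs without Three.js.
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

interface NamedObject {
  uuid: string;
  name: string;
}

const DEG_TO_RAD = Math.PI / 180;

// Tool parameters arrive in degrees; the renderer side works in radians.
function degreesToRadians(deg: Vec3): Vec3 {
  return { x: deg.x * DEG_TO_RAD, y: deg.y * DEG_TO_RAD, z: deg.z * DEG_TO_RAD };
}

// Inverse conversion, used when serializing an object back to the agent.
function radiansToDegrees(rad: Vec3): Vec3 {
  return { x: rad.x / DEG_TO_RAD, y: rad.y / DEG_TO_RAD, z: rad.z / DEG_TO_RAD };
}

// Exact uuid match first (assumed ordering), then case-insensitive name match.
function findObject<T extends NamedObject>(objects: T[], target: string): T | undefined {
  const byUuid = objects.find((o) => o.uuid === target);
  if (byUuid) {
    return byUuid;
  }
  const wanted = target.toLowerCase();
  return objects.find((o) => o.name.toLowerCase() === wanted);
}
```

In a Three.js codebase the conversions would more likely use `THREE.MathUtils.degToRad` and `THREE.MathUtils.radToDeg`. The plain arithmetic above just keeps the sketch dependency-free.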

https://claude.ai/code/session_018KZo7SELSMLZaB47hJ6nmA

georgi left a comment

Collaborator Author

Code quality is excellent — the tool bridge pattern, tests, and Anthropic provider changes all look solid. However, this has a non-trivial merge conflict in packages/websocket/src/unified-websocket-runner.ts:

Main was refactored to use isProviderMessageEvent with a different control flow for handling tool calls and streaming (the old onToolCall callback / Phase 1+2 pattern was replaced). The image-content routing logic (extractToolResultImageContent, image-aware tool result handling) needs to be adapted to work with the new architecture.

The conflict can't be auto-resolved — it requires understanding where in the new isProviderMessageEvent flow to insert the image content extraction and routing. The anthropic-provider.ts changes merged cleanly.

To resolve: rebase onto main and port the image-content features to the new streaming pattern in unified-websocket-runner.ts. The extractToolResultImageContent method itself is still valid, just needs to be wired into the new flow.


Generated by Claude Code

Add a Chat tab alongside the Scene outliner in the 3D model editor's
left panel, reusing the existing ChatView wired to GlobalChatStore so the
assistant can build and edit the scene conversationally.

Expose scene operations to the agent via a UI tooling bridge:
- model3DToolBridge: a singleton the open editor populates on mount with
  scene-operation handlers (list/add/select/delete/transform/visibility/
  rename/material-color/frame) and clears on unmount.
- ui_3d_* frontend tools delegate to the bridge and are registered with
  FrontendToolRegistry, so they flow through the same manifest + tool-call
  path as the existing ui_* graph tools (both chat and agent sockets).

The tools fail with a clear message when no editor is open. Includes unit
tests for tool registration, bridge gating, and delegation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_018KZo7SELSMLZaB47hJ6nmA
georgi force-pushed the claude/fervent-allen-x7hmqo branch from b6577d7 to 3e75d6f on June 25, 2026 09:25

georgi commented Jun 25, 2026

Collaborator Author

Rebased onto main and ported the image-content routing to the new provider agent loop. Force-pushed as 3e75d6f3.

The runner no longer owns a Phase 1/2 loop — provider.generateLoop({ executeTool }) orchestrates and executeTool returns a string. To carry an image through that, I widened the contract rather than working around it:

  • BaseProvider.generateLoop: executeTool now returns string | MessageContent[]. An array is fed straight into the tool Message, so vision providers see it. The legacy inline onToolCall channel and the Claude Agent SDK's text-only MCP tools collapse image content to its text via a new toolResultToText helper (graceful fallback).
  • unified-websocket-runner: executeTool returns image MessageContent blocks for results carrying image_content (extractToolResultImageContent); the persisted/echoed tool message keeps only the note text so base64 never bloats history.
  • anthropic-provider: unchanged from before — renders MessageContent[] (text + image) inside tool_result blocks (merged cleanly).

Tests added for toolResultToText and for generateLoop routing an array result into the tool message, alongside the existing runner-extraction and Anthropic tool_result tests. Full suites green: runtime (base-provider 40, anthropic-coverage 36, claude-agent 22), websocket runner (29), web model3d (29); web typecheck + lint clean.
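
For readers unfamiliar with the widened contract, the text fallback might look something like the sketch below. The `MessageContent` union shown here is an assumed, simplified shape (the repository's real type may carry other variants and fields), and this `toolResultToText` is a guess at the helper's behavior based on the description above, not a copy of it.

```typescript
// Assumed, simplified content shapes for illustration only.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image"; data: string; mimeType: string };
type MessageContent = TextPart | ImagePart;

// The widened executeTool return type described above.
type ToolResult = string | MessageContent[];

// Collapse a tool result to plain text for channels that cannot carry images.
// Strings pass through; arrays keep only their text parts, joined by newlines.
function toolResultToText(result: ToolResult): string {
  if (typeof result === "string") {
    return result;
  }
  return result
    .filter((part): part is TextPart => part.type === "text")
    .map((part) => part.text)
    .join("\n");
}
```

With this shape, a text-only consumer can call `toolResultToText` on whatever `executeTool` returned, while a vision-capable provider receives the array unchanged.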


Generated by Claude Code

…re_view

Add an on-demand screenshot tool so a vision-capable assistant can visually
inspect the 3D scene, not just its structured graph.

Frontend:
- ui_3d_capture_view renders the live R3F viewport (gl.render + toDataURL)
  and returns it as { image_content: { data, mimeType } }.
- CaptureBridge publishes the renderer/scene/camera from inside the Canvas;
  the editor's captureView() grabs a PNG on demand.

Backend — image tool-results threaded through the provider agent loop:
- BaseProvider.generateLoop's executeTool may now return string | MessageContent[];
  an array is fed straight into the tool Message so vision providers see it.
  The legacy inline onToolCall channel and the Claude Agent SDK's text-only MCP
  tools collapse image content to its text via the new toolResultToText helper.
- unified-websocket-runner's executeTool returns image MessageContent blocks for
  results carrying image_content (extractToolResultImageContent); the persisted/
  echoed tool message keeps only the note text so base64 never bloats history.
- anthropic-provider renders MessageContent[] (text + image) inside tool_result
  blocks; the image-block conversion is factored into convertImagePart and
  reused by the user-message path. Other providers keep the text fallback.

Tests cover the tool contract, the runner's image extraction, toolResultToText,
generateLoop routing array results into the tool message, and the Anthropic
tool_result block conversion.
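
The split between what the model sees and what is persisted could be sketched as below. Everything here is an assumption for illustration: `splitToolResult` and `hasImageContent` are hypothetical names (the actual method is extractToolResultImageContent, whose signature is not shown in this PR text), the `note` field and its default wording are invented, and `MessageContent` is a simplified shape.

```typescript
// Assumed, simplified content shapes for illustration only.
type MessageContent =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

// Shape returned by a capture-style tool, per the description above.
// The optional note field is hypothetical.
interface ImageToolResult {
  image_content: { data: string; mimeType: string };
  note?: string;
}

// Runtime check, since tool results arrive as untyped JSON.
function hasImageContent(value: unknown): value is ImageToolResult {
  if (typeof value !== "object" || value === null) return false;
  const image = (value as { image_content?: unknown }).image_content;
  if (typeof image !== "object" || image === null) return false;
  const { data, mimeType } = image as { data?: unknown; mimeType?: unknown };
  return typeof data === "string" && typeof mimeType === "string";
}

// Returns the payload for the provider and a text-only version for history,
// so the base64 image is never written into the persisted message.
function splitToolResult(result: unknown): {
  forModel: string | MessageContent[];
  forHistory: string;
} {
  if (!hasImageContent(result)) {
    const text = typeof result === "string" ? result : JSON.stringify(result ?? null);
    return { forModel: text, forHistory: text };
  }
  const note = result.note ?? "Captured the current 3D view.";
  return {
    forModel: [
      { type: "text", text: note },
      { type: "image", data: result.image_content.data, mimeType: result.image_content.mimeType },
    ],
    forHistory: note,
  };
}
```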

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_018KZo7SELSMLZaB47hJ6nmA
georgi force-pushed the claude/fervent-allen-x7hmqo branch from 3e75d6f to 2cb7aba on June 25, 2026 09:26
georgi merged commit abea04e into main on Jun 26, 2026
15 checks passed
georgi deleted the claude/fervent-allen-x7hmqo branch on June 26, 2026 12:29

2 participants