4 changes: 3 additions & 1 deletion .github/workflows/ci.yml
@@ -38,6 +38,9 @@ jobs:
- name: Unit tests
run: build\lvt_unit_tests.exe --gtest_output=xml:build\unit_test_results.xml

- name: Chromium plugin tests
run: build\lvt_chromium_tests.exe --gtest_output=xml:build\chromium_test_results.xml

- name: Integration tests
run: build\lvt_integration_tests.exe --gtest_output=xml:build\integration_test_results.xml

@@ -47,7 +50,6 @@ jobs:
with:
name: test-results
path: build\*_test_results.xml

- name: Upload build artifacts
uses: actions/upload-artifact@v4
if: always()
4 changes: 4 additions & 0 deletions .github/workflows/release.yml
@@ -65,6 +65,10 @@ jobs:
if: matrix.arch == 'x64'
run: ${{ matrix.build_dir }}\lvt_unit_tests.exe

- name: Chromium plugin tests
if: matrix.arch == 'x64'
run: ${{ matrix.build_dir }}\lvt_chromium_tests.exe

- name: Integration tests
if: matrix.arch == 'x64'
run: ${{ matrix.build_dir }}\lvt_integration_tests.exe
42 changes: 42 additions & 0 deletions CMakeLists.txt
@@ -132,6 +132,36 @@ add_custom_command(TARGET lvt_avalonia_tap POST_BUILD
COMMENT "Publishing managed Avalonia tree walker assembly"
)

# Chromium plugin DLL — runtime-loaded plugin for Chrome/Edge DOM tree support
add_library(lvt_chromium_plugin SHARED
    src/plugin_chromium/lvt_chromium_plugin.cpp
)
target_compile_definitions(lvt_chromium_plugin PRIVATE WIN32_LEAN_AND_MEAN NOMINMAX)
target_include_directories(lvt_chromium_plugin PRIVATE src)
target_link_libraries(lvt_chromium_plugin PRIVATE nlohmann_json::nlohmann_json ole32 version)
set_target_properties(lvt_chromium_plugin PROPERTIES
    OUTPUT_NAME "lvt_chromium_plugin"
    RUNTIME_OUTPUT_DIRECTORY $<TARGET_FILE_DIR:lvt>/plugins
)

# Chromium native messaging host — relay between Chrome extension and lvt
add_executable(lvt_chromium_host
    src/plugin_chromium/chromium_host.cpp
)
target_compile_definitions(lvt_chromium_host PRIVATE WIN32_LEAN_AND_MEAN NOMINMAX)
target_link_libraries(lvt_chromium_host PRIVATE advapi32)
set_target_properties(lvt_chromium_host PROPERTIES
    RUNTIME_OUTPUT_DIRECTORY $<TARGET_FILE_DIR:lvt>/plugins/chromium
)

# Copy Chromium extension files to output
add_custom_command(TARGET lvt_chromium_plugin POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E copy_directory
        "${CMAKE_SOURCE_DIR}/src/plugin_chromium/extension"
        "$<TARGET_FILE_DIR:lvt>/plugins/chromium/extension"
    COMMENT "Copying Chromium extension files"
)

# --- Tests ---
enable_testing()
find_package(GTest CONFIG REQUIRED)
@@ -172,3 +202,15 @@ target_link_libraries(lvt_integration_tests PRIVATE
WIL::WIL nlohmann_json::nlohmann_json
)
add_test(NAME integration_tests COMMAND lvt_integration_tests)

# Chromium plugin tests — DOM JSON format and native messaging protocol
add_executable(lvt_chromium_tests
    tests/chromium_tests.cpp
)
target_include_directories(lvt_chromium_tests PRIVATE src)
target_compile_definitions(lvt_chromium_tests PRIVATE WIN32_LEAN_AND_MEAN NOMINMAX)
target_link_libraries(lvt_chromium_tests PRIVATE
    GTest::gtest GTest::gtest_main
    nlohmann_json::nlohmann_json
)
add_test(NAME chromium_tests COMMAND lvt_chromium_tests)
3 changes: 2 additions & 1 deletion README.md
@@ -9,7 +9,7 @@ A Windows CLI tool that inspects the visual tree of running applications. Design
## What it does

- Targets any running Windows app by HWND, PID, process name, or window title
- Detects UI frameworks in use: Win32, ComCtl, Windows XAML (UWP), WinUI 3, WPF, [Avalonia](docs/avalonia-plugin.md)
- Detects UI frameworks in use: Win32, ComCtl, Windows XAML (UWP), WinUI 3, WPF, [Avalonia](docs/avalonia-plugin.md), [Chrome/Edge](docs/chromium-plugin.md)
- Outputs a unified element tree as JSON or XML markup
- Captures annotated PNG screenshots with element IDs overlaid
- Elements get stable IDs (`e0`, `e1`, …) so AI agents can reference specific parts of the UI
@@ -180,6 +180,7 @@ See [src/plugin.h](src/plugin.h) for the plugin interface.
| Plugin | Framework | Docs |
|--------|-----------|------|
| **Avalonia** | [Avalonia UI](https://avaloniaui.net/) desktop apps | [docs/avalonia-plugin.md](docs/avalonia-plugin.md) |
| **Chromium** | Chrome/Edge browser DOM trees | [docs/chromium-plugin.md](docs/chromium-plugin.md) |

These plugins are built from source alongside lvt and deployed to `%USERPROFILE%\.lvt\plugins\`. See each plugin's documentation for installation and usage details.

176 changes: 176 additions & 0 deletions docs/chromium-plugin.md
@@ -0,0 +1,176 @@
# Chromium Plugin — Chrome/Edge DOM Inspection

The Chromium plugin lets lvt inspect the DOM tree of web pages in Google Chrome and Microsoft Edge. It works via a browser extension that communicates with lvt through Chrome's [Native Messaging](https://developer.chrome.com/docs/extensions/develop/concepts/native-messaging) protocol.

## How it works

```
lvt.exe → plugin DLL → named pipe → native host → Chrome extension → chrome.debugger (CDP) → DOM
```

1. **lvt** detects Chrome/Edge by checking for `chrome.dll` or `msedge.dll` in the target process
2. The **plugin** connects to a named pipe served by the native messaging host
3. The **native messaging host** relays the request to the browser extension
4. The **extension** uses the `chrome.debugger` API (Chrome DevTools Protocol) to walk the DOM tree of the active tab
5. The DOM tree is returned as an lvt element tree with bounds, properties, and text content
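
Step 3 relies on Chrome's native messaging wire format: each message is UTF-8 JSON preceded by a 32-bit length in native byte order. A minimal sketch of that framing follows. The function names `encode_frame` and `decode_frame` are illustrative assumptions, not names taken from `chromium_host.cpp`, and whether the named pipe reuses the same framing is not stated here.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>

// Wrap a UTF-8 JSON payload in a native messaging frame:
// a 32-bit length in native byte order, followed by the payload bytes.
// (Hypothetical helper; not lvt's actual API.)
std::string encode_frame(const std::string& json) {
    const std::uint32_t len = static_cast<std::uint32_t>(json.size());
    std::string frame(sizeof(len), '\0');
    std::memcpy(frame.data(), &len, sizeof(len));
    frame += json;
    return frame;
}

// Try to take one complete frame from the front of `buffer`.
// On success the frame is removed from the buffer and its payload returned;
// if the buffer does not yet hold a full frame, nothing is consumed.
std::optional<std::string> decode_frame(std::string& buffer) {
    std::uint32_t len = 0;
    if (buffer.size() < sizeof(len)) return std::nullopt;
    std::memcpy(&len, buffer.data(), sizeof(len));
    if (buffer.size() - sizeof(len) < len) return std::nullopt;
    std::string payload = buffer.substr(sizeof(len), len);
    buffer.erase(0, sizeof(len) + len);
    return payload;
}
```

A relay would append bytes read from stdin to a buffer and call `decode_frame` in a loop until it returns `std::nullopt`, which handles both partial reads and several messages arriving together.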

## Prerequisites

- Google Chrome 110+ or Microsoft Edge 110+
- lvt built with the chromium plugin (included by default)

## Installation

### 1. Register the native messaging host

```powershell
build\plugins\chromium\lvt_chromium_host.exe --register
```

This creates registry entries for both Chrome and Edge and writes a `com.lvt.chromium.json` manifest file.
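
For orientation, the manifest written by `--register` probably resembles a standard native messaging host manifest. The sketch below builds one as a string. The `name` value is inferred from the `com.lvt.chromium.json` file name; the description text, the function names, and the extension ID parameter are placeholders rather than values from lvt's sources. Chrome and Edge look the manifest up through registry keys under `Software\Google\Chrome\NativeMessagingHosts\<name>` and `Software\Microsoft\Edge\NativeMessagingHosts\<name>`, whose default value is the manifest path; which registry hive lvt writes to is not stated here.

```cpp
#include <string>

// Escape backslashes and double quotes for use inside a JSON string literal.
// This is enough for ordinary Windows paths; control characters are not handled.
std::string json_escape(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '\\' || c == '"') out += '\\';
        out += c;
    }
    return out;
}

// Build a native messaging host manifest (hypothetical helper).
// `host_path` is the absolute path to lvt_chromium_host.exe and
// `extension_id` is the ID the browser assigned to the unpacked extension.
std::string make_manifest(const std::string& host_path,
                          const std::string& extension_id) {
    return "{\n"
           "  \"name\": \"com.lvt.chromium\",\n"
           "  \"description\": \"lvt Chromium native messaging host\",\n"
           "  \"path\": \"" + json_escape(host_path) + "\",\n"
           "  \"type\": \"stdio\",\n"
           "  \"allowed_origins\": [\"chrome-extension://" +
           json_escape(extension_id) + "/\"]\n"
           "}\n";
}
```

The `type` must be `stdio`, and each entry in `allowed_origins` must end with a trailing slash; wildcards are not accepted there.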

### 2. Load the browser extension

1. Open `chrome://extensions` (Chrome) or `edge://extensions` (Edge)
2. Enable **Developer mode** (toggle in top-right)
3. Click **Load unpacked**
4. Select the `build/plugins/chromium/extension/` directory

The extension icon should appear in the toolbar. The extension will automatically connect to the native messaging host.

## Usage

```powershell
# Inspect Chrome
lvt --name chrome

# Inspect Edge
lvt --name msedge

# Output as XML
lvt --name chrome --format xml

# Capture screenshot with element annotations
lvt --name chrome --screenshot page.png
```

## What you get

The DOM tree is mapped to lvt elements:

| DOM concept | lvt element field |
|-------------|-------------------|
| Tag name (`DIV`, `SPAN`) | `type` |
| Tag name (lowercase) | `className` |
| Text content | `text` |
| HTML attributes | `properties` |
| `getBoundingClientRect()` | `bounds` |
| Child elements | `children` |

Framework name is reported as `"chromium (Chrome)"` or `"chromium (Edge)"`.
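
The mapping in the table can be sketched as a recursive conversion. Both structs below are simplified, hypothetical stand-ins (the real lvt element type and the extension's DOM payload are not shown in this document), and `bounds` is omitted for brevity.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical DOM node as it might arrive from the extension.
struct DomNode {
    std::string tag;
    std::string text;
    std::map<std::string, std::string> attributes;
    std::vector<DomNode> children;
};

// Hypothetical, simplified lvt element.
struct Element {
    std::string type;
    std::string className;
    std::string text;
    std::map<std::string, std::string> properties;
    std::vector<Element> children;
};

// Return a copy of `s` converted to upper or lower case (ASCII only).
static std::string with_case(std::string s, bool upper) {
    std::transform(s.begin(), s.end(), s.begin(), [upper](unsigned char c) {
        return static_cast<char>(upper ? std::toupper(c) : std::tolower(c));
    });
    return s;
}

// Apply the table: tag -> type (upper) and className (lower),
// text -> text, attributes -> properties, children -> children.
Element to_element(const DomNode& node) {
    Element e;
    e.type = with_case(node.tag, true);
    e.className = with_case(node.tag, false);
    e.text = node.text;
    e.properties = node.attributes;
    for (const DomNode& child : node.children) {
        e.children.push_back(to_element(child));
    }
    return e;
}
```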

### Example output (JSON)

```json
{
  "id": "e0",
  "type": "Window",
  "framework": "win32",
  "children": [
    {
      "id": "e1",
      "type": "HTML",
      "framework": "chromium (Chrome)",
      "children": [
        {
          "id": "e2",
          "type": "BODY",
          "framework": "chromium (Chrome)",
          "bounds": { "x": 0, "y": 0, "width": 1920, "height": 3000 },
          "properties": { "class": "main-content" },
          "children": [
            {
              "id": "e3",
              "type": "DIV",
              "properties": { "id": "app", "class": "container" },
              "text": "Hello World"
            }
          ]
        }
      ]
    }
  ]
}
```

### Example output (XML)

```xml
<Window id="e0" framework="win32">
  <HTML id="e1" framework="chromium (Chrome)">
    <BODY id="e2" bounds="0,0,1920,3000" class="main-content">
      <DIV id="e3" html-id="app" class="container" text="Hello World" />
    </BODY>
  </HTML>
</Window>
```

## Architecture

### Browser Extension (Manifest V3)

- **Service worker** (`service-worker.js`): Connects to the native messaging host, dispatches DOM requests, uses `chrome.debugger` API for DOM walking
- Works on both Chrome and Edge (same Chromium extension format)
- Uses `chrome.debugger.sendCommand(target, "DOM.getDocument", {depth: -1, pierce: true})`, where `target` identifies the attached tab, to fetch the full DOM including shadow DOM
- Gets element bounding boxes via `DOM.getBoxModel`
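
`DOM.getBoxModel` reports each box (content, padding, border, margin) as a quad of eight numbers, `x1, y1, x2, y2, x3, y3, x4, y4`. A quad can be reduced to an lvt-style `{x, y, width, height}` rectangle by taking its axis-aligned bounding box, as in the sketch below. The `Bounds` struct and function name are assumptions for illustration; which quad the extension actually uses, and how it converts viewport coordinates to screen coordinates, is not described here.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical rectangle type mirroring the `bounds` field.
struct Bounds {
    double x;
    double y;
    double width;
    double height;
};

// Reduce a CDP quad (x1, y1, ..., x4, y4) to its axis-aligned bounding box.
// Taking min/max over all four corners also covers rotated or skewed elements.
Bounds quad_to_bounds(const std::array<double, 8>& quad) {
    double min_x = quad[0], max_x = quad[0];
    double min_y = quad[1], max_y = quad[1];
    for (std::size_t i = 2; i < quad.size(); i += 2) {
        min_x = std::min(min_x, quad[i]);
        max_x = std::max(max_x, quad[i]);
        min_y = std::min(min_y, quad[i + 1]);
        max_y = std::max(max_y, quad[i + 1]);
    }
    return {min_x, min_y, max_x - min_x, max_y - min_y};
}
```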

### Native Messaging Host (`lvt_chromium_host.exe`)

- Tiny C++ relay process launched by the browser (Chrome or Edge) when the extension connects
- Bridges Chrome's stdin/stdout native messaging protocol with a Win32 named pipe (`\\.\pipe\lvt_chromium`)
- Supports `--register` to set up Windows registry entries

### Plugin DLL (`lvt_chromium_plugin.dll`)

- Implements the standard lvt plugin interface ([plugin.h](../src/plugin.h))
- Detection: checks for `chrome.dll` or `msedge.dll` loaded in the target process
- Enrichment: connects to the named pipe, sends a `getDOM` request, and parses the response
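
The detection rule can be illustrated without any Win32 calls by working on a list of module paths that has already been enumerated. How the plugin actually obtains that list is not shown in this document, and `detect_browser` is a hypothetical name; the sketch only demonstrates the case-insensitive file name match on `chrome.dll` and `msedge.dll`.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Return the lowercased file name component of a Windows or POSIX style path.
static std::string lower_basename(const std::string& path) {
    const std::size_t slash = path.find_last_of("\\/");
    std::string name =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    std::transform(name.begin(), name.end(), name.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return name;
}

// Return "Chrome" or "Edge" if a matching browser DLL is among the loaded
// modules, or an empty string if neither is present.
std::string detect_browser(const std::vector<std::string>& module_paths) {
    for (const std::string& path : module_paths) {
        const std::string name = lower_basename(path);
        if (name == "chrome.dll") return "Chrome";
        if (name == "msedge.dll") return "Edge";
    }
    return "";
}
```

The returned label matches the suffix used in the reported framework name, for example `"chromium (" + label + ")"`.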

## Troubleshooting

### "Cannot connect to browser extension"

- Ensure the extension is loaded and active in Chrome/Edge (`chrome://extensions` or `edge://extensions`)
- Run `lvt_chromium_host.exe --register` to (re-)register the native messaging host
- Check that the extension shows "Service worker: active" in the extensions page
- Try reloading the extension

### Empty DOM tree

- The tab must have finished loading (no spinner in the tab)
- Some pages may block debugger attachment (e.g., `chrome://` pages)
- Check `chrome://extensions` for extension errors

### Debug logging

Set the `LVT_DEBUG` environment variable to `1` for verbose plugin logging:

```powershell
$env:LVT_DEBUG = "1"
lvt --name chrome
```

## Limitations

- Only inspects the **active tab** (tab selection by URL/title is planned)
- `chrome://` and `edge://` internal pages cannot be inspected
- The browser extension must be installed and the native host registered
- Shadow DOM content is included only because the extension calls `DOM.getDocument` with `pierce: true` (the extension's default; CDP's own default is `false`)

## Future work

- Tab selection by URL or title pattern
- iframe support (separate DOM walks per frame)
- WebView2 support (Chromium embedded in Win32 apps)
- Lazy loading for very large DOM trees
- Chrome Web Store / Edge Add-ons publication
23 changes: 21 additions & 2 deletions skills/lvt/SKILL.md
@@ -16,7 +16,7 @@ Use `lvt` whenever you need to understand the visual content or structure of a r
- **UI verification** — confirm that a UI change was applied correctly (e.g. a button label changed, a dialog appeared)
- **Finding UI elements** — locate a specific control, menu item, or text field in an app's visual tree
- **Screenshot capture** — take an annotated screenshot of an app with element IDs overlaid
- **Framework detection** — determine which UI frameworks an app uses (Win32, ComCtl, XAML, WinUI 3, WPF)
- **Framework detection** — determine which UI frameworks an app uses (Win32, ComCtl, XAML, WinUI 3, WPF, Chromium)
- **Automated UI interaction planning** — get element IDs and bounds to plan mouse clicks or keyboard input

## Prerequisites
@@ -118,7 +118,7 @@ Every element gets a stable ID like `e0`, `e1`, `e2`, etc., assigned in depth-fi
|----------|-------------|
| `id` | Stable element ID (e.g. `e0`) |
| `type` | Element type name (e.g. `Window`, `Button`, `TextBlock`) |
| `framework` | Which framework owns this element (`win32`, `comctl`, `xaml`, `winui3`, `wpf`) |
| `framework` | Which framework owns this element (`win32`, `comctl`, `xaml`, `winui3`, `wpf`, `chromium`) |
| `className` | Win32 window class name (Win32/ComCtl elements) |
| `text` | Visible text content or window title |
| `bounds` | Screen-relative bounding rectangle `{x, y, width, height}` |
@@ -169,3 +169,22 @@ Every element gets a stable ID like `e0`, `e1`, `e2`, etc., assigned in depth-fi
- For XAML/WinUI 3 apps, lvt injects a helper DLL into the target — this is safe and non-destructive but means `lvt_tap_{arch}.dll` must be next to `lvt.exe`
- For WPF apps, lvt injects `lvt_wpf_tap_{arch}.dll` and the managed `LvtWpfTap.dll` — both must be next to `lvt.exe`
- lvt.exe must match the target process architecture (x64, x86, or ARM64) — a clear error is shown on mismatch. Use `lvt-x86.exe` for 32-bit WPF apps.

## Chrome/Edge DOM inspection (optional one-time setup)

lvt can dump the DOM tree of web pages in Chrome and Edge. This requires a one-time setup:

```powershell
$lvtDir = "$env:USERPROFILE\.lvt"

# 1. Register the native messaging host for Chrome and Edge
& "$lvtDir\plugins\chromium\lvt_chromium_host.exe" --register

# 2. Load the browser extension in Chrome or Edge:
# - Open chrome://extensions (or edge://extensions)
# - Enable "Developer mode"
# - Click "Load unpacked" → select the extension folder:
Start-Process "$lvtDir\plugins\chromium\extension"
```

After setup, `lvt --name chrome` or `lvt --name msedge` will include the DOM tree of the active tab. If the extension is not installed, lvt still works for all other frameworks — it just won't show web content.