feat(google-cua): add screenshot pruning to prevent memory growth #1477
GoogleCUAClient now prunes old screenshots from conversation history, keeping only the most recent maxImages (default: 3) screenshots. This matches the behavior of MicrosoftCUAClient and prevents unbounded memory growth during long agent sessions, especially on image-heavy websites. The maxImages option can be configured via clientOptions.maxImages.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
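As a rough illustration of how an optional maxImages setting with a default of 3 might be resolved, here is a minimal sketch. The `ClientOptionsSketch` type, the `resolveMaxImages` helper, and the validation rule are all hypothetical; only the option name and its default come from the commit message, and the real client may handle this differently.

```typescript
// Hypothetical option shape: only `maxImages` and its default (3) come from the PR text.
interface ClientOptionsSketch {
  maxImages?: number;
}

const DEFAULT_MAX_IMAGES = 3;

// Returns the configured limit, or the default when the option is unset.
// Rejecting negative or non-integer values is an assumption of this sketch.
function resolveMaxImages(clientOptions?: ClientOptionsSketch): number {
  const value = clientOptions?.maxImages;
  if (value === undefined || !Number.isInteger(value) || value < 0) {
    return DEFAULT_MAX_IMAGES;
  }
  return value;
}
```

Falling back to the default on invalid input keeps a misconfigured client from silently disabling pruning; throwing an error instead would be an equally reasonable choice.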
🦋 Changeset detected
Latest commit: 3b6588b
The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages.
Greptile Summary
Adds screenshot pruning to GoogleCUAClient.
Confidence Score: 4/5
Sequence Diagram
sequenceDiagram
    participant Agent as GoogleCUAClient
    participant API as Google GenAI API
    participant History as Conversation History
    Note over Agent: Step execution starts
    Agent->>API: generateContent(history)
    API-->>Agent: Response with function calls
    Agent->>Agent: Process response & execute actions
    Agent->>Agent: Capture screenshot
    Agent->>History: Push function response with inlineData (screenshot)
    Note over Agent,History: Screenshot pruning logic
    Agent->>Agent: maybeRemoveOldScreenshots()
    Agent->>History: Traverse from newest to oldest
    loop For each history entry
        Agent->>History: Check if entry has inlineData
        alt Screenshot count > maxImages
            Agent->>History: Filter out inlineData from entry.parts
            Note over Agent,History: Keep structure, remove only screenshot data
        else Screenshot count <= maxImages
            Note over Agent,History: Keep entry unchanged
        end
    end
    Note over Agent: Continue to next step with pruned history
GoogleCUAClient accumulates screenshots in this.history without pruning. On image-heavy sites, each screenshot is ~2-3 MB base64, causing OOM errors on memory-constrained environments like AWS Lambda during long sessions. Add maybeRemoveOldScreenshots() that prunes inlineData from older history entries, keeping only the most recent maxImages (default: 3) screenshots. Configurable via clientOptions.maxImages. Matches the existing MicrosoftCUAClient behavior. Based on browserbase#1477 by @pkiv.
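To put the ~2-3 MB per screenshot figure in perspective, the sketch below estimates how much base64 screenshot data stays referenced from history with and without a cap. It assumes one screenshot per step and a 2.5 MB midpoint, and it counts only the screenshot strings, so it is not a prediction of process RSS. The `retainedScreenshotBytes` helper is hypothetical.

```typescript
const MB = 1024 * 1024;

// Estimated bytes of base64 screenshot data still referenced from history
// after `steps` steps. Assumes exactly one screenshot per step (an assumption).
// When `maxImages` is omitted, nothing is pruned.
function retainedScreenshotBytes(
  steps: number,
  bytesPerScreenshot: number,
  maxImages?: number,
): number {
  const kept = maxImages === undefined ? steps : Math.min(steps, maxImages);
  return kept * bytesPerScreenshot;
}

// 30 steps at 2.5 MB each, no pruning: 75 MB of screenshot data retained.
const unpruned = retainedScreenshotBytes(30, 2.5 * MB);
// Same session capped at 3 screenshots: 7.5 MB retained.
const pruned = retainedScreenshotBytes(30, 2.5 * MB, 3);

console.log(`${unpruned / MB} MB vs ${pruned / MB} MB`);
```

Without a cap the retained screenshot data grows linearly with session length; with a cap it is bounded by maxImages regardless of how many steps the session runs.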
Hello! Saw this was pretty close to being finished and wanted to revive it. Went ahead and opened a fresh PR with the linter issues fixed.
moved to #2009
Summary
- Add maxImages property to GoogleCUAClient (default: 3) to limit screenshots kept in history
- Add maybeRemoveOldScreenshots() method that prunes old screenshots after each step
- Match MicrosoftCUAClient, which already has this feature

Problem
The GoogleCUAClient accumulates screenshots in this.history without any pruning mechanism. On image-heavy websites like ncl.com, each screenshot is ~2-3 MB base64. During a 30-step session, this can cause memory growth from ~170 MB to ~400+ MB RSS, leading to OOM errors on memory-constrained environments like AWS Lambda.

Solution
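After pushing function responses to history, call maybeRemoveOldScreenshots(), which:
- Traverses the history from newest to oldest, counting entries that contain screenshot inlineData
- Once the count exceeds the maxImages limit, removes the inlineData while preserving the rest of the history structure

This reduces memory delta by ~40% in testing (from 232 MB to 138 MB over a typical session).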
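The pruning pass described above can be sketched roughly as follows. This is not the PR's actual code: the `Part` and `HistoryEntry` shapes are simplified stand-ins for the history structure, `pruneOldScreenshots` is a hypothetical free function rather than the class method, and counting per entry (rather than per part) is an assumption.

```typescript
// Simplified, assumed shapes; the real history entries may nest data differently.
interface InlineData {
  mimeType: string;
  data: string; // base64-encoded screenshot
}

interface Part {
  text?: string;
  inlineData?: InlineData;
}

interface HistoryEntry {
  role: string;
  parts: Part[];
}

// Walks history from newest to oldest and strips inlineData parts from every
// screenshot-bearing entry beyond the newest `maxImages`. Entries themselves
// and their non-image parts are left in place.
function pruneOldScreenshots(history: HistoryEntry[], maxImages: number): void {
  let screenshotsSeen = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const entry = history[i];
    if (!entry) continue;
    const hasScreenshot = entry.parts.some((p) => p.inlineData !== undefined);
    if (!hasScreenshot) continue;
    screenshotsSeen++;
    if (screenshotsSeen > maxImages) {
      entry.parts = entry.parts.filter((p) => p.inlineData === undefined);
    }
  }
}
```

Iterating from the end means the newest screenshots are counted first, so only older entries lose their image data. Keeping the entry and its remaining parts leaves the turn structure of the conversation unchanged.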
Test plan
clientOptions.maxImages
🤖 Generated with Claude Code
Summary by cubic
Adds screenshot pruning to GoogleCUAClient to cap screenshots in history and prevent memory growth during long sessions. Keeps only the most recent maxImages screenshots (default 3), removes inlineData from older entries, is configurable via clientOptions.maxImages, and matches MicrosoftCUAClient behavior.
Written for commit 3b6588b. Summary will update automatically on new commits.