Test Coverage & Performance Improvements #222

Merged
asdek merged 28 commits into master from feature/next_release
Mar 22, 2026

@asdek commented Mar 22, 2026

Description of the Change

Problem

The PentAGI codebase had several quality and performance issues that needed attention:

Test Coverage Gaps:

  • Multiple core packages lacked unit tests (config, terminal, server/response, server/context, graph/context, embeddings)
  • Existing community-contributed tests had inconsistent style and incomplete coverage
  • Tests didn't follow project conventions (missing edge cases, no t.Parallel(), hardcoded assertions)

HTTP Client Timeout Missing:

  • No timeout configured for external API calls, causing indefinite hangs when APIs stopped responding
  • Affected all 17 GetHTTPClient call sites across the LLM provider integrations and search tools
  • Goroutine leaks from hung connections

Tool Call ID Detection Inefficiency:

  • Full detection algorithm ran on every PentAGI restart (6-10 LLM calls per provider)
  • Wasted tokens and time validating known templates already verified by developers
  • Slow startup experience for users

AWS Bedrock Compatibility:

  • Converse API returned ValidationException when conversation history contained tool blocks but current turn provided no tools
  • Required toolConfig field missing in certain scenarios

Solution

Comprehensive Test Coverage:

  • Added 200+ unit tests across 7 core packages (the 6 listed above plus pkg/version) with a consistent table-driven approach
  • Standardized test style: t.Parallel() for independent tests, require.* for critical assertions, comprehensive edge cases
  • Brought community tests to project standards while maintaining their value
  • Packages now have hermetic, production-ready test suites

HTTP Client Timeout Configuration:

  • Added HTTP_CLIENT_TIMEOUT environment variable (default: 600 seconds)
  • Applied to all external API calls (LLM providers, search engines, embeddings)
  • Prevents indefinite hangs, improves reliability
  • Integrated into installer wizard UI with validation

Tool Call ID Template Optimization:

  • Implemented testTemplate() fast-path validation
  • Single LLM call validates known templates vs 6-10 calls for full detection
  • 83-90% reduction in startup time and token usage for 8 major providers
  • Automatic fallback to full algorithm if validation fails

Bedrock toolConfig Fix:

  • Reconstruct minimal tool definitions from conversation history when needed
  • Ensures toolConfig always present when required by Converse API
  • Fixes ValidationException for tool-heavy conversations
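
As a rough illustration of the reconstruction idea, the sketch below collects the tool names referenced by earlier toolUse blocks whenever the current turn supplies no tools. The `contentBlock` and `toolSpec` types and the `toolsFromHistory` function are simplified, hypothetical stand-ins; they are not the Bedrock SDK types or the functions added in this PR.

```go
package main

import "fmt"

// contentBlock is a simplified, hypothetical view of one message block.
type contentBlock struct {
	Kind     string // "text", "toolUse", or "toolResult"
	ToolName string // tool referenced by a toolUse block
}

// toolSpec is a hypothetical minimal tool definition.
type toolSpec struct {
	Name        string
	Description string
}

// toolsFromHistory returns the current tools unchanged when any are provided.
// Otherwise it rebuilds minimal definitions from the toolUse blocks found in
// the history, keeping each tool once in first-seen order.
func toolsFromHistory(current []toolSpec, history []contentBlock) []toolSpec {
	if len(current) > 0 {
		return current
	}
	seen := make(map[string]bool)
	var restored []toolSpec
	for _, b := range history {
		if b.Kind != "toolUse" || b.ToolName == "" || seen[b.ToolName] {
			continue
		}
		seen[b.ToolName] = true
		restored = append(restored, toolSpec{
			Name:        b.ToolName,
			Description: "restored from conversation history",
		})
	}
	return restored
}

func main() {
	history := []contentBlock{
		{Kind: "text"},
		{Kind: "toolUse", ToolName: "terminal"},
		{Kind: "toolResult", ToolName: "terminal"},
		{Kind: "toolUse", ToolName: "search"},
		{Kind: "toolUse", ToolName: "terminal"},
	}
	for _, t := range toolsFromHistory(nil, history) {
		fmt.Println(t.Name)
	}
}
```

With the sample history this prints `terminal` and then `search`; a turn that already carries tool definitions passes through untouched.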

Dependency Updates:

  • Updated OpenTelemetry SDK to v1.39.0 (rollback from v1.40.0 for compatibility)
  • Updated golang.org/x/crypto to v0.46.0
  • Updated google.golang.org/grpc to v1.79.3
  • Frontend: dompurify 3.3.3, immutable 5.1.5

Closes #160

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🔧 Configuration change
  • 🧪 Test update
  • 🛡️ Security update

Areas Affected

  • Core Services (Frontend UI/Backend API)
  • AI Agents (Researcher/Developer/Executor)
  • Security Tools Integration
  • Memory System (Vector Store/Knowledge Base)
  • Monitoring Stack (Grafana/OpenTelemetry)
  • Analytics Platform (Langfuse)
  • External Integrations (LLM/Search APIs)
  • Documentation
  • Infrastructure/DevOps

Testing and Verification

Test Configuration

PentAGI Version: Latest development (feature/next_release)
Docker Version: 24.0.x+
Host OS: Linux/macOS
LLM Provider: OpenAI, Anthropic, AWS Bedrock
Go Version: 1.24.1

Test Steps

  1. Run all backend unit tests: cd backend && go test ./...
  2. Test HTTP timeout with slow external API (verify 600s default, custom values, zero=unlimited)
  3. Restart PentAGI and measure tool call ID template detection time (should be <1s for known providers)
  4. Test AWS Bedrock with multi-turn tool conversations (no ValidationException)
  5. Verify all installer wizard fields work correctly
  6. Run hermetic config tests with ambient environment variables set

Test Results

  • ✅ All 200+ new unit tests pass
  • ✅ Existing tests continue to pass
  • ✅ HTTP timeout prevents indefinite hangs (verified with mock slow server)
  • ✅ Tool call ID detection: <1s vs ~5-10s on startup for Anthropic/OpenAI
  • ✅ Bedrock toolConfig issue #160 ("[Bug]: When using Bedrock, toolConfig is not defined causing failures.") resolved
  • ✅ Config tests hermetic (pass regardless of shell environment)
  • ✅ Backend: go fmt, go vet, golangci-lint clean
  • ✅ Frontend: npm run lint passed

Security Considerations

No Negative Security Impact:

  • Test additions don't affect runtime behavior
  • HTTP timeout improves DoS resistance (prevents resource exhaustion from hung connections)
  • All changes maintain existing security model
  • Community test contributions reviewed and sanitized

Performance Impact

Improvements:

  • Tool Call ID Detection: 83-90% faster startup for major providers (1 LLM call vs 6-10)
  • HTTP Timeout: Prevents goroutine leaks from hung connections, improves resource management
  • Test Suite: No impact on runtime (development-time only)

Token Savings:

  • ~500-1000 tokens saved per provider on PentAGI restart (tool call ID validation)
  • Multiplied across 8 providers = 4000-8000 tokens saved per restart
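
The figures above follow directly from the call counts and per-provider token estimates. A small worked check, using only numbers quoted in this description (the helper name `reductionPercent` is just for illustration):

```go
package main

import "fmt"

// reductionPercent returns the relative saving, in percent, when `after`
// units of work replace `before` units.
func reductionPercent(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// One validation call replaces 6-10 detection calls.
	fmt.Printf("6 calls -> 1: %.1f%% fewer\n", reductionPercent(6, 1))   // 83.3% fewer
	fmt.Printf("10 calls -> 1: %.1f%% fewer\n", reductionPercent(10, 1)) // 90.0% fewer

	// ~500-1000 tokens saved per provider, across 8 optimized providers.
	const providers = 8
	fmt.Printf("tokens saved per restart: %d-%d\n", 500*providers, 1000*providers) // 4000-8000
}
```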

Documentation Updates

  • README.md updates - HTTP_CLIENT_TIMEOUT variable documented
  • API documentation updates
  • Configuration documentation updates - backend/docs/config.md enhanced
  • GraphQL schema updates
  • Other: Installer wizard localization strings, .env.example

Deployment Notes

New Environment Variable (Optional):

# HTTP client timeout for external APIs (default: 600 seconds, 0 = unlimited)
HTTP_CLIENT_TIMEOUT=600

Configuration Notes:

  • Variable can be set via .env file or installer wizard
  • Default 600s suitable for most LLM operations
  • Set to 0 to disable timeout (not recommended in production)
  • Applied to: LLM providers, search engines, embedding APIs, external tools

Compatibility:

  • ✅ Fully backward compatible
  • ✅ No breaking changes
  • ✅ Existing deployments work without modification
  • ✅ New tests don't affect runtime

Deployment Steps:

  1. Pull latest changes
  2. Update .env if custom timeout needed
  3. Rebuild: docker compose build (dependency updates)
  4. Restart: docker compose up -d

Checklist

Code Quality

  • My code follows the project's coding standards
  • I have added/updated necessary documentation
  • I have added tests to cover my changes
  • All new and existing tests pass
  • I have run go fmt and go vet (for Go code)
  • I have run npm run lint (for TypeScript/JavaScript code)

Security

  • I have considered security implications
  • Changes maintain or improve the security model
  • Sensitive information has been properly handled

Compatibility

  • Changes are backward compatible
  • Breaking changes are clearly marked and documented
  • Dependencies are properly updated

Documentation

  • Documentation is clear and complete
  • Comments are added for non-obvious code
  • API changes are documented

Additional Notes

Key Changes by Category

Test Coverage Improvements

Added 200+ Unit Tests (PRs #198, #199, #200, #201, #202, #213, #214):

  • pkg/version - Binary version, develop mode detection, binary name
  • pkg/config - Configuration loading, defaults, env overrides, Installation ID handling
  • pkg/terminal - Markdown detection, interactive prompts, context cancellation, JSON/markdown output
  • pkg/server/response - HttpError type, 78 predefined errors, dev/production mode response handling
  • pkg/providers/embeddings - All 7 providers (OpenAI, Ollama, Mistral, Jina, Huggingface, GoogleAI, VoyageAI)
  • pkg/graph/context - User ID/type/permissions helpers, validation functions, admin regex
  • pkg/server/context - Gin context helpers (GetInt64, GetUint64, GetString, GetStringArray, GetStringFromSession)

Test Quality Standards Applied:

  • Table-driven tests with comprehensive edge cases
  • t.Parallel() for independent test execution
  • require.* for critical assertions (prevents panics)
  • Hermetic tests (clearConfigEnv helper prevents ambient environment interference)
  • Proper context cancellation testing (io.Pipe vs strings.NewReader)
  • Global state restoration (version.PackageVer save/restore pattern)

Contributors: @mason5052

Performance Optimizations

Tool Call ID Template Fast-Path (Commit 55e0ac5):

  • File: backend/pkg/providers/provider/agents.go
  • New testTemplate() function validates known templates with single LLM call
  • Integrated after cache lookup, before full detection algorithm
  • Providers optimized: OpenAI, Anthropic, Bedrock, Gemini, DeepSeek, Kimi, Qwen, GLM (8/10)
  • Result: 83-90% reduction in startup time and token usage

HTTP Client Timeout (PR #205, Commit 47ae6fe):

  • Files: backend/pkg/system/utils.go, backend/pkg/config/config.go
  • Added HTTP_CLIENT_TIMEOUT env var (default 600s, 0=unlimited)
  • Applied to all http.Client instances via GetHTTPClient()
  • Installer wizard integration with validation (must be >= 0)
  • Documentation in README.md, backend/docs/config.md
  • Result: Prevents indefinite hangs, improves reliability
  • Contributor: @liri-ha (original), enhanced by maintainers

Bug Fixes

Bedrock toolConfig Requirement (PR #196):

  • File: backend/pkg/providers/bedrock/bedrock.go
  • Functions restoreMissedToolsFromChain(), collectToolUsageFromChain(), inferSchemaFromArguments()
  • Reconstruct minimal tool definitions when conversation contains toolUse/toolResult blocks
  • Ensures Converse API requirement met: toolConfig must be defined when using tool blocks
  • Result: Fixes ValidationException in multi-turn tool conversations
  • Contributor: @manusjs
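
To give a feel for what inferring a schema from recorded arguments can involve, here is a minimal sketch that derives a flat JSON Schema from the JSON arguments of an earlier tool call. It is an assumption about the general technique, not the PR's `inferSchemaFromArguments()` implementation; `inferSchema`, `jsonType`, and the sample arguments are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonType maps a value decoded by encoding/json into `any` to the matching
// JSON Schema type name.
func jsonType(v any) string {
	switch v.(type) {
	case string:
		return "string"
	case float64: // encoding/json decodes every JSON number as float64
		return "number"
	case bool:
		return "boolean"
	case []any:
		return "array"
	case map[string]any:
		return "object"
	default: // JSON null decodes to nil
		return "null"
	}
}

// inferSchema builds a minimal object schema with one typed property per
// top-level argument. Nested structure is deliberately not described.
func inferSchema(rawArgs string) (map[string]any, error) {
	var args map[string]any
	if err := json.Unmarshal([]byte(rawArgs), &args); err != nil {
		return nil, fmt.Errorf("decode tool arguments: %w", err)
	}
	props := make(map[string]any, len(args))
	for name, value := range args {
		props[name] = map[string]any{"type": jsonType(value)}
	}
	return map[string]any{"type": "object", "properties": props}, nil
}

func main() {
	schema, err := inferSchema(`{"command": "ls -la", "timeout": 30, "detach": false}`)
	if err != nil {
		fmt.Println(err)
		return
	}
	out, _ := json.MarshalIndent(schema, "", "  ")
	fmt.Println(string(out))
}
```

A schema rebuilt this way is only as detailed as the arguments that were actually observed, which is why the description calls the reconstructed definitions minimal.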

Dependencies

Go Modules (PR #215, Commit 47ae6fe):

  • go.opentelemetry.io/otel/sdk v1.36.0 → v1.39.0 (rolled back from v1.40.0 for compatibility)
  • golang.org/x/crypto v0.44.0 → v0.46.0
  • google.golang.org/grpc v1.73.0 → v1.79.3

Frontend Dependencies (PR #207):

  • dompurify 3.3.1 → 3.3.3 (security patches)
  • immutable 3.7.6 → 5.1.5 (major version update)

Contributors

This release includes contributions from:

  • @mason5052 - Comprehensive test coverage (7 packages, 200+ tests), test quality standards
  • @liri-ha - HTTP client timeout implementation
  • @manusjs - Bedrock toolConfig fix
  • @asdek (Dmitry Ng) - Test review and refinement, tool call ID optimization, integration

Special thanks to community contributors for improving PentAGI's code quality!

Merged Pull Requests

Issues Addressed

manusjs and others added 28 commits March 11, 2026 16:18
…/toolResult blocks

Ensure WithTools option is applied last in CallWithTools (consistent with
CallEx) so provider config options cannot accidentally overwrite restored
tool definitions. Add integration test covering the exact issue #160
scenario: chain with toolUse/toolResult blocks but no explicit tools in the
current turn.

Fixes #160

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive unit tests for GetBinaryVersion, IsDevelopMode, and
GetBinaryName functions covering all code paths including default values,
custom values, and combined version-revision formatting.
Add unit tests for IsMarkdownContent detection (headers, code blocks,
bold, links, lists, plain text), InteractivePromptContext (input reading,
whitespace trimming, context cancellation), GetYesNoInputContext (yes/no
variants, case insensitivity), PrintJSON (valid/invalid data), and
RenderMarkdown/PrintResult output functions.
Add unit tests for HttpError type (constructor, accessors, error interface),
predefined error variables (HTTP codes, error codes for 12 categories),
Success/Error response functions with gin test context including dev mode
vs production mode behavior for error detail exposure.
Add unit tests for the embedding provider factory function covering all
7 supported providers (OpenAI, Ollama, Mistral, Jina, Huggingface,
GoogleAI, VoyageAI), the none provider, unsupported provider error
handling, custom URL/key/model configuration, and IsAvailable behavior.
Use io.Pipe() instead of strings.NewReader("") so the reader
genuinely blocks, forcing the select to take the ctx.Done() branch.
Assert require.ErrorIs(err, context.Canceled) instead of generic
assert.Error. Pipe writer is closed in defer to prevent goroutine
leaks.
Add context_test.go with 20 tests covering GetUserID, SetUserID,
GetUserType, SetUserType, GetUserPermissions, SetUserPermissions,
validateUserType, and validatePermission. Includes table-driven tests
for validation functions and wrong-type assertion checks.
GetHTTPClient creates http.Client without a Timeout field, causing
goroutines to hang indefinitely when an external API (LLM provider,
search tool, etc.) stops responding. This affects all 17 call sites
across every provider (OpenAI, Anthropic, Gemini, DeepSeek, Kimi,
Qwen, GLM, Ollama, custom) and all search tools (Tavily, DuckDuckGo,
Sploitus, Perplexity, Google, Traversaal, SearxNG).

Changes:
- Add HTTP_CLIENT_TIMEOUT env var (default: 600s / 10 minutes)
- Set Timeout on all http.Client instances returned by GetHTTPClient
- When cfg is nil, return a client with the default timeout instead
  of http.DefaultClient (which has no timeout)
- Add 5 unit tests covering default, custom, zero, nil, and proxy
  timeout scenarios
- Document the new env var in .env.example

Relates to #176 (context canceled on agent delegation), which
identified the missing HTTP client timeout as a contributing factor.
…dates

Bumps the npm_and_yarn group with 2 updates in the /frontend directory: [dompurify](https://github.com/cure53/DOMPurify) and [immutable](https://github.com/immutable-js/immutable-js).


Updates `dompurify` from 3.3.1 to 3.3.3
- [Release notes](https://github.com/cure53/DOMPurify/releases)
- [Commits](cure53/DOMPurify@3.3.1...3.3.3)

Updates `immutable` from 3.7.6 to 5.1.5
- [Release notes](https://github.com/immutable-js/immutable-js/releases)
- [Changelog](https://github.com/immutable-js/immutable-js/blob/main/CHANGELOG.md)
- [Commits](immutable-js/immutable-js@3.7.6...v5.1.5)

---
updated-dependencies:
- dependency-name: dompurify
  dependency-version: 3.3.3
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: immutable
  dependency-version: 5.1.5
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Add context_test.go with 15 tests covering GetInt64, GetUint64,
GetString, GetStringArray, and GetStringFromSession. Each function
has table-driven subtests for found, missing, and wrong-type cases.
Session tests use gin-contrib/sessions/cookie with httptest.
Bumps the go_modules group with 3 updates in the /backend directory: [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go), [golang.org/x/crypto](https://github.com/golang/crypto) and [google.golang.org/grpc](https://github.com/grpc/grpc-go).


Updates `go.opentelemetry.io/otel/sdk` from 1.36.0 to 1.40.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.36.0...v1.40.0)

Updates `golang.org/x/crypto` from 0.44.0 to 0.45.0
- [Commits](golang/crypto@v0.44.0...v0.45.0)

Updates `google.golang.org/grpc` from 1.73.0 to 1.79.3
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](grpc/grpc-go@v1.73.0...v1.79.3)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-version: 1.40.0
  dependency-type: direct:production
  dependency-group: go_modules
- dependency-name: golang.org/x/crypto
  dependency-version: 0.45.0
  dependency-type: direct:production
  dependency-group: go_modules
- dependency-name: google.golang.org/grpc
  dependency-version: 1.79.3
  dependency-type: direct:production
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <support@github.com>
Add comprehensive unit tests for NewConfig defaults, environment variable
overrides, URL parsing, provider server URL defaults, summarizer defaults,
search engine defaults, and ensureInstallationID logic including UUID
generation, file persistence, and invalid value handling.
Add clearConfigEnv helper that clears all Config struct env vars via
t.Setenv, and use t.Chdir(t.TempDir()) to isolate filesystem side
effects from godotenv.Load() and ensureInstallationID(). Tests now
pass regardless of environment variables set in the calling shell.
Signed-off-by: Dmitry Ng <19asdek91@gmail.com>
fix(bedrock): always include toolConfig when messages contain toolUse/toolResult blocks
test: add unit tests for pkg/version package
test: add unit tests for pkg/config package
test: add unit tests for pkg/terminal package
test: add unit tests for pkg/server/response package
test: add unit tests for pkg/providers/embeddings package
test: add unit tests for pkg/graph context helpers
test: add unit tests for pkg/server/context helpers
…go_modules-a62524d0e0

chore(deps): bump the go_modules group across 1 directory with 3 updates
…nd/npm_and_yarn-fa6563b207

chore(deps): bump the npm_and_yarn group across 1 directory with 2 updates
fix(system): add configurable timeout to HTTP client
- Updated relevant files including `.env.example`, `docker-compose.yml`, and documentation to reflect this new setting.
- Enhanced server settings and locale files to support the new timeout configuration.
- Adjusted HTTP client initialization to utilize the configured timeout value.
…tarting

- Modified prompt templates to enforce completion requirements for function calls.
…x not working

- Introduced tests for HTTP client timeout configuration, validating default, custom, and zero timeout scenarios.
- Added tests for agent supervision settings, ensuring correct defaults and overrides based on environment variables.
- Enhanced existing tests for context and response handling to improve coverage and reliability.
@asdek asdek merged commit e05062e into master Mar 22, 2026
6 checks passed
@asdek asdek deleted the feature/next_release branch March 22, 2026 16:16
@asdek asdek changed the title from "Feature/next release" to "Test Coverage & Performance Improvements" Mar 22, 2026
Development

Successfully merging this pull request may close these issues.

[Bug]: When using Bedrock, toolConfig is not defined causing failures.

4 participants