LLMTracker is the single primitive that captures LLM calls — by token usage, latency, model, and request/response — across every agent shape AgentOpt supports (in-process SDKs, subprocess CLIs, container agents). Selection and routing both build on it; you can also use it standalone for tracking and caching.
from agentopt import LLMTrackerThe same code runs against the in-process proxy (default) or a long-lived gateway daemon (AGENTOPT_GATEWAY_URL=…). The env var is the entire deployment switch — no API change.
with LLMTracker(combo_id="run-1") as tracker:
agent.run(prompt)
print(tracker.get_usage())with LLMTracker() as tracker:
for i, q in enumerate(questions, 1):
with tracker.track(data_id=f"q{i}"):
agent.run(q)
tracker.print_summary() # grouped by data_idfrom agentopt import LLMTracker, RandomRouter
router = RandomRouter(candidates=["gpt-4o", "gpt-4o-mini"], seed=0)
with LLMTracker(router=router) as tracker:
for i, q in enumerate(questions, 1):
with tracker.track(data_id=f"q{i}"):
agent.run(q)
tracker.print_summary()router= alone is not single-session sugar — it sets a default that nested tracker.track() calls inherit, so the host pattern above works.
While a track() scope is active, LLMTracker patches subprocess.Popen so every child spawned inside it inherits HTTPS_PROXY + the merged CA bundle path automatically — no env-var plumbing in the agent's run():
import subprocess
with LLMTracker(combo_id="subproc") as tracker:
subprocess.run(["claude", "-p", prompt])Merge policy — explicit beats implicit:
env=None(caller said nothing — inheritos.environ):{**os.environ, **session_env}. Session wins over anyHTTPS_PROXYthe parent shell happened to set.env=<dict>(caller wrote an explicit env):{**session_env, **user_env}. Caller's keys win on conflicts. Setting e.g.env={"HTTPS_PROXY": "http://custom-proxy:8080"}inside atrack()scope is respected — agentopt never silently overrides an explicit env. The common caseenv={"PATH": ...}(set for non-LLM reasons) still gets tracking because the user didn't write the session keys, sosession_envfills them in.
Limitations of the ContextVar-based activation (same as the httpx patch):
- A new
threading.Threaddoesn't propagate_active_session_varautomatically — subprocess calls inside the thread won't be intercepted. Workaround: wrap the thread target withcontextvars.copy_context().run(...). multiprocessing.Process(spawn mode) starts a fresh Python interpreter that never calledtracker.start(), so the patch isn't installed there. Fork mode inherits the patch and works.
For agents that ignore HTTPS_PROXY and instead need the proxy URL / CA cert injected into a config file (OpenClaw is the canonical case), agentopt.get_current_session_proxy() is the escape hatch — see the helper below. See proxy.md for the full subprocess flow.
LLMTracker(
*,
data_id: str | None = None,
combo_id: str | None = None,
agent_id: str | None = None,
router: Router | None = None,
cache: bool = True,
cache_dir: str | Path | None = ".agentopt_cache",
)| Param | Description |
|---|---|
data_id / combo_id / agent_id |
If any is set, __enter__ auto-opens a single tracking session with those IDs and __exit__ closes it. The "single-session sugar" path. |
router |
A Router to apply at the proxy layer. On its own it sets a default that nested track() calls inherit; combined with one of the IDs above it also routes the auto-opened session. |
cache |
Enable response caching. Hits short-circuit before any network round-trip. |
cache_dir |
Persist cache to disk at this path. Pass None to keep it in memory only. |
AGENTOPT_GATEWAY_URL is read in __init__; when set, the tracker delegates to RemoteBackend and cache_dir is honored by the daemon, not the client.
| Method | Description |
|---|---|
start() |
Install the httpx redirect and prepare the backend. Idempotent. |
stop() |
Tear down live sessions, restore httpx, flush cache. Record queries remain valid after stop() — so tracker.print_summary() works right after a with block exits. |
close() |
Final teardown. For RemoteBackend this drops the long-lived control-plane HTTP client; for LocalBackend it's equivalent to stop(). Idempotent. |
__exit__ calls stop(), not close() — so post-with queries against the daemon don't fail with "client closed". Call tracker.close() explicitly when you want to release the HTTP client (or let __del__ do it).
@contextmanager
tracker.track(
data_id: str | None = None,
combo_id: str | None = None,
agent_id: str | None = None,
router: Router | None = None,
) -> SessionInfoOpen a tracking session. In local mode this eagerly spins up a per-session mitmproxy SessionMaster on an ephemeral port and sets a ContextVar so in-process httpx calls are attributed correctly. In daemon mode it POSTs /sessions and reuses the daemon's port.
All four params are optional. router= falls back to the router passed to LLMTracker(...) if you don't supply one here.
| Helper | Description |
|---|---|
get_session_env(session) |
{HTTPS_PROXY, SSL_CERT_FILE, REQUESTS_CA_BUNDLE, NODE_EXTRA_CA_CERTS} for subprocess agents. Identical shape in both modes. |
agentopt.get_current_session_proxy() |
Module-level convenience — reads the active session from the ContextVar and returns a SessionProxy (url, port, ca_pem, ca_bundle_path, env_dict()). Returns None outside a track() scope. |
| Method | Returns | Description |
|---|---|---|
records (property) |
List[CallRecord] |
All records captured so far. |
get_records(data_id=None, combo_id=None, agent_id=None) |
List[CallRecord] |
Filtered records. |
get_usage(...) |
Dict[str, Tuple[int, int]] |
{model: (input_tokens, output_tokens)} aggregated over matching records. |
get_cached_latency(...) |
float |
Total latency (seconds) of cache-hit records — useful for "how much wall time did caching save?" |
print_summary(data_id=None, combo_id=None, agent_id=None) |
None |
Model sequence, per-model tokens, and total latency. Grouped by data_id when records span multiple distinct values; flat otherwise. Thin wrapper over agentopt.routing.print_routing_summary. |
| Method | Description |
|---|---|
flush_cache() |
Flush dirty cache entries to disk. |
clear_cache() |
Drop all cached responses (memory + disk). |
clear() |
Clear locally archived records. |
register_provider(name, base_url, path_patterns) |
Add or replace an LLM provider so its hostnames are MITM-intercepted and its paths recognized by the in-process patch. In daemon mode this also POSTs /providers to keep the daemon in sync. |
Cache keys hash the routed model (not the requested one), so a router swapping gpt-4o → gpt-4o-mini produces a distinct cache entry. See router.md.
from agentopt import CallRecord| Field | Type | Description |
|---|---|---|
data_id |
str? |
Datapoint identifier. |
combo_id |
str? |
Model-combination identifier. |
agent_id |
str? |
Agent role identifier. |
model |
str |
Model name (the one actually used after any routing). |
prompt_tokens |
int |
Input tokens. |
completion_tokens |
int |
Output tokens. |
latency_seconds |
float |
API call duration. |
request_url |
str |
Upstream URL. |
request_body |
dict |
Parsed request payload. |
response_body |
dict |
Parsed response payload. |
timestamp |
str |
ISO 8601. |
cached |
bool |
Whether this was a cache hit. |
error |
str? |
Set when the upstream failed or token extraction couldn't parse a successful response. The model name is "<parse-failed>" in the latter case so the failure surfaces in summaries. |
ModelSelector instances accept a tracker= kwarg. By default they construct one internally and call start() in the constructor / stop() when select_best() returns. Pass your own when you want to share a cache across runs, point at a daemon, or post-process the records after selection completes:
tracker = LLMTracker(cache_dir="./shared_cache")
selector = ModelSelector(agent=..., models=..., tracker=tracker, ...)
results = selector.select_best()
# tracker has been stopped, but records remain queryable:
print(tracker.get_usage())See selectors.md for the full constructor surface.
from agentopt.proxy import ResponseCacheUsually owned by LLMTracker. Exposed for tests and advanced setups:
| Method | Description |
|---|---|
get(key) |
Look up a cached entry. |
put(key, entry) |
Store an entry (dirty until flush). |
flush() |
Write dirty entries to SQLite. |
clear() |
Drop memory + disk. |
close() |
Flush and stop the background flush thread. |