Conversation
Summary of Changes

Hello @hanakannzashi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the system's ability to track and bill for cached tokens. It integrates a new …

Highlights
Code Review

Two critical issues need to be addressed before merge.

1. Missing Database Migration (Critical - Deployment Blocker)

The PR adds … The highest existing migration is … A prior migration (…):

```sql
-- V0046__add_cache_read_billing.sql
ALTER TABLE models ADD COLUMN cache_read_cost_per_token BIGINT NOT NULL DEFAULT 0;
ALTER TABLE model_history ADD COLUMN cache_read_cost_per_token BIGINT NOT NULL DEFAULT 0;
ALTER TABLE organization_usage_log ADD COLUMN cache_read_tokens INTEGER NOT NULL DEFAULT 0;
```

2. Breaking API Change: alias to rename on Usage Struct (High)

File: … Changing …

Additionally, … The safest fix: …

Non-Issues (Confirmed Correct)
Code Review
This pull request introduces functionality to record and account for cached tokens in token usage and cost calculations. The changes span API models, database schemas, repositories, and services to support a new `cache_read_cost_per_token` pricing model. While the implementation is largely consistent, I've found a critical issue in a database query that could lead to runtime errors, and a medium-severity issue related to missing validation. My review includes suggestions to fix these issues.
```sql
    context_length, verifiable, is_active, owned_by,
    provider_type, provider_config, attestation_supported,
    input_modalities, output_modalities
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16)
```
The VALUES clause is missing a parameter for the cache_read_cost_per_token column. There are 17 columns in the INSERT list but only 16 placeholders here. This will cause a runtime error. You also need to update the corresponding bind parameters in the create_model function to include cache_read_cost_per_token.
```diff
-) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16)
+) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)
```
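This class of column-list/placeholder drift can be guarded by a small unit test that compares the highest `$N` placeholder in the statement against the expected column count. The helper below is an illustrative sketch, not code from this repository:

```rust
// Returns the highest $N bind placeholder found in a SQL string.
// Illustrative helper for a unit test; not taken from this PR.
fn max_placeholder(sql: &str) -> usize {
    let mut max = 0;
    let bytes = sql.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'$' {
            let start = i + 1;
            let mut end = start;
            // Consume the digits following '$'.
            while end < bytes.len() && bytes[end].is_ascii_digit() {
                end += 1;
            }
            if end > start {
                max = max.max(sql[start..end].parse::<usize>().unwrap());
            }
            i = end.max(start);
        } else {
            i += 1;
        }
    }
    max
}

fn main() {
    // 3 columns in the INSERT list must mean placeholders up to $3.
    let sql = "INSERT INTO models (a, b, c) VALUES ($1, $2, $3)";
    assert_eq!(max_placeholder(sql), 3);
    // A 17-column list paired with only $1..$16 would fail such a check.
    assert_ne!(max_placeholder("VALUES ($1, $2, $16)"), 17);
}
```

A test like this would have flagged the 17-column/16-placeholder mismatch at `cargo test` time instead of at runtime.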
```rust
let input = input_tokens.unwrap_or(0);
let output = output_tokens.unwrap_or(0);
if input < 0 || output < 0 {
    return Err(UsageError::ValidationError(
        "token counts must be non-negative".into(),
    ));
}

RecordUsageApiRequest::ImageGeneration {
    model,
    image_count,
    id,
} => {
    if id.trim().is_empty() {
        return Err(UsageError::ValidationError(
            "id must be a non-empty string".into(),
        ));
    }
    if *image_count <= 0 {
        return Err(UsageError::ValidationError(
            "image_count must be positive".into(),
        ));
    }
    (
        model.clone(),
        0,
        0,
        Some(*image_count),
        InferenceType::ImageGeneration,
        id.clone(),
    )

if input == 0 && output == 0 {
    return Err(UsageError::ValidationError(
        "at least one of input_tokens or output_tokens must be positive".into(),
    ));
}
};

(
    model.clone(),
    input,
    output,
    cached_tokens.unwrap_or(0),
    None,
    InferenceType::ChatCompletion,
    id.clone(),
)
```
For consistency and robustness, you should also validate that cached_tokens is non-negative, similar to how input_tokens and output_tokens are validated. While the downstream compute_token_cost function handles negative values safely, validating at the API boundary is a good practice.
```rust
let input = input_tokens.unwrap_or(0);
let output = output_tokens.unwrap_or(0);
let cached = cached_tokens.unwrap_or(0);
if input < 0 || output < 0 || cached < 0 {
    return Err(UsageError::ValidationError(
        "token counts must be non-negative".into(),
    ));
}
if input == 0 && output == 0 {
    return Err(UsageError::ValidationError(
        "at least one of input_tokens or output_tokens must be positive".into(),
    ));
}
(
    model.clone(),
    input,
    output,
    cached,
    None,
    InferenceType::ChatCompletion,
    id.clone(),
)
```
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```sql
    output_modalities = COALESCE(EXCLUDED.output_modalities, models.output_modalities),
    updated_at = NOW()
RETURNING id, model_name, model_display_name, model_description, model_icon,
    input_cost_per_token, output_cost_per_token, cost_per_image,
```
Wrong SQL parameter index in ON CONFLICT clause
High Severity
The `CASE WHEN $11 IS NULL` in the ON CONFLICT clause was not renumbered after `cache_read_cost_per_token` was inserted as `$5`. Before this change, `$11` referred to `owned_by`; now `$11` refers to `is_active` (which is always non-null due to `unwrap_or(true)`). This means `owned_by` will always be unconditionally overwritten on conflict, breaking the conditional-preserve logic. The reference needs to be `$12`.
Additional Locations (1)
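For illustration, the renumbered clause might look like the sketch below; the conflict target and the exact SET list are assumptions based on the comment above, not the repository's actual statement:

```sql
-- Hypothetical reconstruction: preserve the stored owned_by when the new
-- value is NULL. $12 is owned_by's bind position after
-- cache_read_cost_per_token was inserted as $5 and shifted later parameters.
ON CONFLICT (model_name) DO UPDATE SET
    owned_by = CASE WHEN $12 IS NULL THEN models.owned_by ELSE $12 END,
    updated_at = NOW()
```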
```diff
 ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16)
 RETURNING id, model_name, model_display_name, model_description, model_icon,
-    input_cost_per_token, output_cost_per_token, cost_per_image,
+    input_cost_per_token, output_cost_per_token, cost_per_image, cache_read_cost_per_token,
```
SQL column count mismatches VALUES parameter count
High Severity
In `create_model`, the column list now includes `cache_read_cost_per_token` (17 columns total), but the VALUES clause still has only 16 placeholders ($1–$16), and the params array also has only 16 entries (missing `model.cache_read_cost_per_token`). This will cause a PostgreSQL runtime error on every call to `create_model`.


Note
High Risk
Touches billing-critical logic by changing cost calculation to account for cached prompt tokens and by adding new pricing/usage columns via a DB migration. API/DB contract changes could affect downstream clients and historical usage reporting if not rolled out carefully.
Overview

Adds end-to-end support for cached prompt tokens: providers now expose `TokenUsage::cached_tokens()`, API `Usage` includes `prompt_tokens_details.cached_tokens`, and streaming/non-streaming response aggregation carries `total_cached_tokens`.

Introduces cache-aware billing by adding `cache_read_cost_per_token` to model pricing (including admin update/list responses and model history) and computing input cost as (non-cached prompt tokens * input rate) + (cached prompt tokens * cache-read rate).

Persists and surfaces this data by migrating the DB (`models`, `model_history`, `organization_usage_log`) and extending usage recording/history APIs to store/return `cache_read_tokens` (defaulting to 0 where unavailable).

Written by Cursor Bugbot for commit a5d2ea4. This will update automatically on new commits. Configure here.