Optimize total_surplus query #4116

Open
jmg-duarte wants to merge 7 commits into main from jmgd/surplus

Conversation

@jmg-duarte
Contributor

@jmg-duarte jmg-duarte commented Feb 3, 2026

Description

We've been experiencing latency spikes on several endpoints, and we've pinned them down to the time it takes to acquire DB connections from the pool. When checking RDS monitoring, the surplus query always shows up at the top.

The current theory is that the query is slower than it could be. As more users request the main swap page, their surplus is loaded (even if they never explicitly request it, i.e. never open the wallet modal), and users with a large number of orders hold connections for so long that other endpoints can't get them.

I looked into the user distribution; here are the results:

| log_bucket | bucket_start | num_users | pct | cumulative_users | cumulative_pct |
|---|---|---|---|---|---|
| 0 | 1 | 295329 | 83.86 | 295329 | 83.86 |
| 1 | 10 | 51334 | 14.58 | 346663 | 98.44 |
| 2 | 100 | 5071 | 1.44 | 351734 | 99.88 |
| 3 | 1000 | 306 | 0.09 | 352040 | 99.96 |
| 4 | 10000 | 106 | 0.03 | 352146 | 99.99 |
| 5 | 100000 | 28 | 0.01 | 352174 | 100.00 |
Distribution Query

WITH buckets AS (
    SELECT 
        floor(log(order_count))::int as log_bucket,
        power(10, floor(log(order_count)))::int as bucket_start,
        count(*) as num_users
    FROM (
        SELECT owner, count(*) as order_count 
        FROM orders 
        GROUP BY owner
    ) sub
    GROUP BY 1, 2
)
SELECT 
    log_bucket,
    bucket_start,
    num_users,
    round(100.0 * num_users / sum(num_users) over(), 2) as pct,
    sum(num_users) over(order by log_bucket) as cumulative_users,
    round(100.0 * sum(num_users) over(order by log_bucket) / sum(num_users) over(), 2) as cumulative_pct
FROM buckets
ORDER BY log_bucket;

However, it isn't quite that simple. Depending on the user, they might have no regular orders and only onchain-placed orders (note that the max number of onchain orders per user is around ~10k), and some have skewed data distributions across the tables, which makes analyzing and optimizing this query tricky.

There are two crucial changes to the query: removing ARRAY_AGG and adding indexes. The first means the DB no longer has to materialize a potentially large array in memory, which also led to bad row estimates; the second provides better access paths to some of the information the query requires.
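
In condensed form (the full before/after queries are in the validation script below; joins and output columns trimmed here for brevity), the structural change looks like this:

-- Before (abridged): aggregate every matching uid into an array, then probe it.
-- For the heaviest users this array holds over 100k uids, and the planner's
-- row estimates for uid = ANY(...) end up badly off.
WITH regular_orders AS (
    SELECT ARRAY_AGG(uid) AS ids FROM orders WHERE owner = $1
)
SELECT count(*)
FROM orders o
JOIN trades t ON t.order_uid = o.uid
WHERE o.uid = ANY((SELECT ids FROM regular_orders));

-- After (abridged): filter directly on the indexed owner column; no array.
SELECT count(*)
FROM orders o
JOIN trades t ON t.order_uid = o.uid
WHERE o.owner = $1;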

RDS Stats (1h)

Before

[Screenshot: CloudWatch eu-central-1, 2026-02-05 11:12:43]

After

[Screenshot: CloudWatch eu-central-1, 2026-02-05 11:14:53]

Changes

  • Replace the query with the optimized one
  • Create indexes (already done to avoid issues during migration; a DDL sketch follows below)
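
For reference, a sketch of the index side of the change. The index names (trades_covering, orders_owner_covering) come from the query plans below; the column lists are my reconstruction from what the index-only scans touch, not a copy of the actual migration:

-- Reconstructed DDL; column lists are assumptions, verify against the migration.
-- CONCURRENTLY avoids blocking writes on these hot tables while building.
CREATE INDEX CONCURRENTLY trades_covering
    ON trades (order_uid)
    INCLUDE (buy_amount, sell_amount, fee_amount);

CREATE INDEX CONCURRENTLY orders_owner_covering
    ON orders (owner)
    INCLUDE (uid, kind, sell_token, buy_token, sell_amount, buy_amount);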

Query Plans

Before

Aggregate  (cost=1033006.55..1033006.56 rows=1 width=8) (actual time=130139.025..130143.456 rows=1 loops=1)
  ->  Append  (cost=1032650.26..1033006.51 rows=2 width=100) (actual time=31175.499..130090.776 rows=10998 loops=1)
        ->  Subquery Scan on "*SELECT* 1"  (cost=1032650.26..1032856.24 rows=1 width=100) (actual time=31175.498..130085.109 rows=10998 loops=1)
              ->  Nested Loop  (cost=1032650.26..1032856.23 rows=1 width=100) (actual time=31175.497..130079.454 rows=10998 loops=1)
                    Join Filter: (t.order_uid = o.uid)
                    InitPlan 3 (returns $5)
                      ->  Finalize Aggregate  (cost=1032601.57..1032601.58 rows=1 width=32) (actual time=30761.210..30765.632 rows=1 loops=1)
                            ->  Gather  (cost=1032601.35..1032601.56 rows=2 width=32) (actual time=30678.267..30689.666 rows=3 loops=1)
                                  Workers Planned: 2
                                  Workers Launched: 2
                                  ->  Partial Aggregate  (cost=1031601.35..1031601.36 rows=1 width=32) (actual time=30675.862..30675.863 rows=1 loops=3)
                                        ->  Parallel Bitmap Heap Scan on orders  (cost=24591.99..1030861.56 rows=295913 width=57) (actual time=278.099..30619.932 rows=226281 loops=3)
                                              Recheck Cond: (owner = '\x10dad59905d93ca37cd25a35f25349cb5956ba8e'::bytea)
                                              Rows Removed by Index Recheck: 737927
                                              Heap Blocks: exact=15162 lossy=55571
                                              ->  Bitmap Index Scan on order_owner  (cost=0.00..24414.44 rows=710192 width=0) (actual time=271.994..271.994 rows=749143 loops=1)
                                                    Index Cond: (owner = '\x10dad59905d93ca37cd25a35f25349cb5956ba8e'::bytea)
                    InitPlan 4 (returns $6)
                      ->  Aggregate  (cost=47.00..47.01 rows=1 width=32) (actual time=0.021..0.022 rows=1 loops=1)
                            ->  Bitmap Heap Scan on onchain_placed_orders  (cost=4.09..46.97 rows=11 width=57) (actual time=0.017..0.018 rows=0 loops=1)
                                  Recheck Cond: (sender = '\x10dad59905d93ca37cd25a35f25349cb5956ba8e'::bytea)
                                  ->  Bitmap Index Scan on order_sender  (cost=0.00..4.08 rows=11 width=0) (actual time=0.013..0.013 rows=0 loops=1)
                                        Index Cond: (sender = '\x10dad59905d93ca37cd25a35f25349cb5956ba8e'::bytea)
                    ->  Nested Loop  (cost=1.12..136.08 rows=2 width=188) (actual time=31172.849..109529.707 rows=10914 loops=1)
                          ->  Index Scan using orders_pkey on orders o  (cost=0.56..85.71 rows=10 width=123) (actual time=31172.113..101635.999 rows=678843 loops=1)
                                Index Cond: (uid = ANY (array_cat($5, $6)))
                          ->  Index Only Scan using order_rewards_pkey on order_execution oe  (cost=0.56..5.01 rows=3 width=65) (actual time=0.011..0.011 rows=0 loops=678843)
                                Index Cond: (order_uid = o.uid)
                                Heap Fetches: 170
                    ->  Index Scan using trade_order_uid on trades t  (cost=0.56..0.74 rows=2 width=81) (actual time=0.416..0.416 rows=1 loops=10914)
                          Index Cond: (order_uid = oe.order_uid)
                    SubPlan 1
                      ->  Index Scan using auction_prices_pkey on auction_prices ap  (cost=0.58..35.00 rows=16 width=11) (actual time=1.428..1.428 rows=1 loops=9675)
                            Index Cond: ((auction_id = oe.auction_id) AND (token = o.buy_token))
                    SubPlan 2
                      ->  Index Scan using auction_prices_pkey on auction_prices ap_1  (cost=0.58..35.00 rows=16 width=11) (actual time=1.596..1.596 rows=1 loops=1323)
                            Index Cond: ((auction_id = oe.auction_id) AND (token = o.sell_token))
        ->  Subquery Scan on "*SELECT* 2"  (cost=2.09..150.26 rows=1 width=100) (actual time=0.010..0.013 rows=0 loops=1)
              ->  Nested Loop  (cost=2.09..150.25 rows=1 width=100) (actual time=0.009..0.012 rows=0 loops=1)
                    Join Filter: (j.uid = t_1.order_uid)
                    ->  Nested Loop  (cost=1.54..79.45 rows=1 width=192) (actual time=0.009..0.010 rows=0 loops=1)
                          ->  Nested Loop Anti Join  (cost=0.98..74.32 rows=1 width=127) (actual time=0.009..0.010 rows=0 loops=1)
                                ->  Index Scan using jit_user_order_creation_timestamp on jit_orders j  (cost=0.42..33.62 rows=8 width=127) (actual time=0.008..0.009 rows=0 loops=1)
                                      Index Cond: (owner = '\x10dad59905d93ca37cd25a35f25349cb5956ba8e'::bytea)
                                ->  Index Only Scan using orders_pkey on orders o_1  (cost=0.56..5.08 rows=1 width=57) (never executed)
                                      Index Cond: (uid = j.uid)
                                      Heap Fetches: 0
                          ->  Index Only Scan using order_rewards_pkey on order_execution oe_1  (cost=0.56..5.11 rows=3 width=65) (never executed)
                                Index Cond: (order_uid = j.uid)
                                Heap Fetches: 0
                    ->  Index Scan using trade_order_uid on trades t_1  (cost=0.56..0.74 rows=2 width=81) (never executed)
                          Index Cond: (order_uid = oe_1.order_uid)
                    SubPlan 5
                      ->  Index Scan using auction_prices_pkey on auction_prices ap_2  (cost=0.58..35.00 rows=16 width=11) (never executed)
                            Index Cond: ((auction_id = oe_1.auction_id) AND (token = j.buy_token))
                    SubPlan 6
                      ->  Index Scan using auction_prices_pkey on auction_prices ap_3  (cost=0.58..35.00 rows=16 width=11) (never executed)
                            Index Cond: ((auction_id = oe_1.auction_id) AND (token = j.sell_token))
Planning Time: 4.788 ms
Execution Time: 130157.850 ms

After

Aggregate  (cost=14614.00..14614.01 rows=1 width=8) (actual time=1901.439..1902.539 rows=1 loops=1)
  Buffers: shared hit=11838 read=4075
  I/O Timings: shared read=2240.469
  ->  Gather Merge  (cost=14557.86..14604.91 rows=404 width=136) (actual time=1900.725..1902.018 rows=917 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=11838 read=4075
        I/O Timings: shared read=2240.469
        ->  Sort  (cost=13557.83..13558.25 rows=168 width=136) (actual time=762.591..762.628 rows=306 loops=3)
              Sort Key: "*SELECT* 2".uid
              Sort Method: quicksort  Memory: 49kB
              Buffers: shared hit=11838 read=4075
              I/O Timings: shared read=2240.469
              Worker 0:  Sort Method: quicksort  Memory: 115kB
              Worker 1:  Sort Method: quicksort  Memory: 25kB
              ->  Parallel Append  (cost=2.25..13551.63 rows=168 width=136) (actual time=2.247..762.040 rows=306 loops=3)
                    Buffers: shared hit=11824 read=4075
                    I/O Timings: shared read=2240.469
                    ->  Subquery Scan on "*SELECT* 2"  (cost=24.42..13550.79 rows=53 width=136) (actual time=3.496..1892.483 rows=709 loops=1)
                          Buffers: shared hit=9702 read=3415
                          I/O Timings: shared read=1854.511
                          ->  Nested Loop Left Join  (cost=24.42..13550.26 rows=53 width=136) (actual time=3.495..1892.175 rows=709 loops=1)
                                Buffers: shared hit=9702 read=3415
                                I/O Timings: shared read=1854.511
                                ->  Nested Loop  (cost=23.84..13414.32 rows=53 width=155) (actual time=1.455..762.309 rows=709 loops=1)
                                      Buffers: shared hit=8050 read=1521
                                      I/O Timings: shared read=736.402
                                      ->  Nested Loop  (cost=23.28..12573.83 rows=195 width=245) (actual time=1.418..667.512 rows=709 loops=1)
                                            Buffers: shared hit=5387 read=1337
                                            I/O Timings: shared read=647.346
                                            ->  Nested Loop  (cost=22.73..9169.44 rows=796 width=180) (actual time=1.397..584.403 rows=711 loops=1)
                                                  Buffers: shared hit=2649 read=1211
                                                  I/O Timings: shared read=570.801
                                                  ->  Bitmap Heap Scan on onchain_placed_orders opo  (cost=22.17..2343.76 rows=796 width=57) (actual time=0.699..6.255 rows=711 loops=1)
                                                        Recheck Cond: (sender = '\x8ef4fb956d0cb06ca9e3db76040f08154e8d0122'::bytea)
                                                        Buffers: shared hit=56 read=248
                                                        I/O Timings: shared read=2.028
                                                        ->  Bitmap Index Scan on order_sender  (cost=0.00..21.97 rows=796 width=0) (actual time=0.044..0.044 rows=711 loops=1)
                                                              Index Cond: (sender = '\x8ef4fb956d0cb06ca9e3db76040f08154e8d0122'::bytea)
                                                              Buffers: shared hit=4
                                                  ->  Index Scan using orders_pkey on orders o  (cost=0.56..8.57 rows=1 width=123) (actual time=0.812..0.812 rows=1 loops=711)
                                                        Index Cond: (uid = opo.uid)
                                                        Filter: (owner <> '\x8ef4fb956d0cb06ca9e3db76040f08154e8d0122'::bytea)
                                                        Buffers: shared hit=2593 read=963
                                                        I/O Timings: shared read=568.773
                                            ->  Index Only Scan using order_rewards_pkey on order_execution oe  (cost=0.56..4.25 rows=3 width=65) (actual time=0.115..0.115 rows=1 loops=711)
                                                  Index Cond: (order_uid = o.uid)
                                                  Heap Fetches: 16
                                                  Buffers: shared hit=2738 read=126
                                                  I/O Timings: shared read=76.545
                                      ->  Index Only Scan using trades_covering on trades t  (cost=0.56..4.29 rows=2 width=81) (actual time=0.132..0.133 rows=1 loops=709)
                                            Index Cond: (order_uid = o.uid)
                                            Heap Fetches: 0
                                            Buffers: shared hit=2663 read=184
                                            I/O Timings: shared read=89.056
                                ->  Index Scan using auction_prices_pkey on auction_prices ap  (cost=0.58..31.94 rows=16 width=40) (actual time=1.590..1.590 rows=1 loops=709)
                                      Index Cond: ((auction_id = oe.auction_id) AND (token = CASE o.kind WHEN 'sell'::orderkind THEN o.buy_token ELSE o.sell_token END))
                                      Buffers: shared hit=1652 read=1894
                                      I/O Timings: shared read=1118.109
                    ->  Subquery Scan on "*SELECT* 1"  (cost=2.25..4428.09 rows=350 width=136) (actual time=3.233..393.256 rows=208 loops=1)
                          Buffers: shared hit=2119 read=660
                          I/O Timings: shared read=385.958
                          ->  Nested Loop  (cost=2.25..4424.59 rows=350 width=136) (actual time=3.233..393.181 rows=208 loops=1)
                                Join Filter: (t_1.order_uid = o_1.uid)
                                Buffers: shared hit=2119 read=660
                                I/O Timings: shared read=385.958
                                ->  Nested Loop Left Join  (cost=1.70..3742.63 rows=156 width=149) (actual time=3.212..369.737 rows=208 loops=1)
                                      Buffers: shared hit=1320 read=618
                                      I/O Timings: shared read=364.992
                                      ->  Nested Loop  (cost=1.12..3316.62 rows=156 width=188) (actual time=1.266..28.471 rows=208 loops=1)
                                            Buffers: shared hit=849 read=49
                                            I/O Timings: shared read=26.775
                                            ->  Index Only Scan using orders_owner_covering on orders o_1  (cost=0.56..187.36 rows=638 width=123) (actual time=1.247..2.194 rows=210 loops=1)
                                                  Index Cond: (owner = '\x8ef4fb956d0cb06ca9e3db76040f08154e8d0122'::bytea)
                                                  Heap Fetches: 1
                                                  Buffers: shared hit=24 read=7
                                                  I/O Timings: shared read=2.038
                                            ->  Index Only Scan using order_rewards_pkey on order_execution oe_1  (cost=0.56..4.87 rows=3 width=65) (actual time=0.124..0.124 rows=1 loops=210)
                                                  Index Cond: (order_uid = o_1.uid)
                                                  Heap Fetches: 4
                                                  Buffers: shared hit=825 read=42
                                                  I/O Timings: shared read=24.736
                                      ->  Index Scan using auction_prices_pkey on auction_prices ap_1  (cost=0.58..34.99 rows=16 width=40) (actual time=1.639..1.639 rows=1 loops=208)
                                            Index Cond: ((auction_id = oe_1.auction_id) AND (token = CASE o_1.kind WHEN 'sell'::orderkind THEN o_1.buy_token ELSE o_1.sell_token END))
                                            Buffers: shared hit=471 read=569
                                            I/O Timings: shared read=338.217
                                ->  Index Only Scan using trades_covering on trades t_1  (cost=0.56..4.29 rows=2 width=81) (actual time=0.109..0.109 rows=1 loops=208)
                                      Index Cond: (order_uid = oe_1.order_uid)
                                      Heap Fetches: 0
                                      Buffers: shared hit=799 read=42
                                      I/O Timings: shared read=20.966
                    ->  Subquery Scan on "*SELECT* 3"  (cost=2.67..94.60 rows=1 width=136) (actual time=0.012..0.014 rows=0 loops=1)
                          Buffers: shared hit=3
                          ->  Nested Loop Left Join  (cost=2.67..94.59 rows=1 width=136) (actual time=0.011..0.013 rows=0 loops=1)
                                Buffers: shared hit=3
                                ->  Nested Loop  (cost=2.09..91.83 rows=1 width=159) (actual time=0.011..0.013 rows=0 loops=1)
                                      Buffers: shared hit=3
                                      ->  Nested Loop  (cost=1.54..87.55 rows=1 width=208) (actual time=0.011..0.012 rows=0 loops=1)
                                            Buffers: shared hit=3
                                            ->  Nested Loop Anti Join  (cost=0.98..82.50 rows=1 width=127) (actual time=0.011..0.011 rows=0 loops=1)
                                                  Buffers: shared hit=3
                                                  ->  Index Scan using jit_user_order_creation_timestamp on jit_orders j  (cost=0.42..37.21 rows=9 width=127) (actual time=0.010..0.010 rows=0 loops=1)
                                                        Index Cond: (owner = '\x8ef4fb956d0cb06ca9e3db76040f08154e8d0122'::bytea)
                                                        Buffers: shared hit=3
                                                  ->  Index Only Scan using orders_pkey on orders o_2  (cost=0.56..5.02 rows=1 width=57) (never executed)
                                                        Index Cond: (uid = j.uid)
                                                        Heap Fetches: 0
                                            ->  Index Only Scan using trades_covering on trades t_2  (cost=0.56..5.03 rows=2 width=81) (never executed)
                                                  Index Cond: (order_uid = j.uid)
                                                  Heap Fetches: 0
                                      ->  Index Only Scan using order_rewards_pkey on order_execution oe_2  (cost=0.56..4.25 rows=3 width=65) (never executed)
                                            Index Cond: (order_uid = t_2.order_uid)
                                            Heap Fetches: 0
                                ->  Index Scan using auction_prices_pkey on auction_prices ap_2  (cost=0.58..35.01 rows=16 width=40) (never executed)
                                      Index Cond: ((auction_id = oe_2.auction_id) AND (token = CASE j.kind WHEN 'sell'::orderkind THEN j.buy_token ELSE j.sell_token END))
Planning:
  Buffers: shared hit=892 read=21
  I/O Timings: shared read=14.023
Planning Time: 18.375 ms
Execution Time: 1902.681 ms

How to test

Because floating-point addition is not associative and the summation order in the old query is not deterministic (the ORDER BY uid is merely an approximation that happens to match), the validation script leaves some room for differences, 1e-9 to be precise.
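
A quick illustration of the non-associativity directly in Postgres:

-- Same three doubles, different grouping, different result.
SELECT (0.1::float8 + 0.2::float8) + 0.3::float8 AS left_assoc,   -- 0.6000000000000001
       0.1::float8 + (0.2::float8 + 0.3::float8) AS right_assoc;  -- 0.6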

Validation script

#!/usr/bin/env python3
"""
Compare original and optimized surplus queries for correctness.
Picks random addresses and verifies both queries agree within a small tolerance.
"""

import os
import sys
import psycopg2

# Connection settings - override with environment variables
DB_CONFIG = {
    "host": os.getenv("DB_HOST", "localhost"),
    "port": os.getenv("DB_PORT", "5432"),
    "dbname": os.getenv("DB_NAME", "your_database"),
    "user": os.getenv("DB_USER", "your_user"),
    "password": os.getenv("DB_PASSWORD", ""),
}

ORIGINAL_QUERY = """
WITH regular_orders AS (
    SELECT ARRAY_AGG(uid) AS ids FROM orders WHERE owner = $1
),
onchain_orders AS (
    SELECT ARRAY_AGG(uid) AS ids FROM onchain_placed_orders WHERE sender = $1
),
trade_components AS (
    SELECT
       CASE kind
          WHEN 'sell' THEN t.buy_amount
          WHEN 'buy' THEN t.sell_amount - t.fee_amount
       END AS trade_amount,
       CASE kind
          WHEN 'sell' THEN (t.sell_amount - t.fee_amount) * o.buy_amount / o.sell_amount
          WHEN 'buy' THEN t.buy_amount * o.sell_amount / o.buy_amount
       END AS limit_amount,
       o.kind,
       CASE kind
          WHEN 'sell' THEN (SELECT price FROM auction_prices ap WHERE ap.token = o.buy_token AND ap.auction_id = oe.auction_id)
          WHEN 'buy' THEN (SELECT price FROM auction_prices ap WHERE ap.token = o.sell_token AND ap.auction_id = oe.auction_id)
       END AS surplus_token_native_price
    FROM orders o
    JOIN trades t ON o.uid = t.order_uid
    JOIN order_execution oe ON o.uid = oe.order_uid
    WHERE o.uid = ANY(ARRAY_CAT((SELECT ids FROM regular_orders), (SELECT ids FROM onchain_orders)))
    UNION ALL
    SELECT
       CASE j.kind
          WHEN 'sell' THEN t.buy_amount
          WHEN 'buy' THEN t.sell_amount - t.fee_amount
       END AS trade_amount,
       CASE j.kind
          WHEN 'sell' THEN (t.sell_amount - t.fee_amount) * j.buy_amount / j.sell_amount
          WHEN 'buy' THEN t.buy_amount * j.sell_amount / j.buy_amount
       END AS limit_amount,
       j.kind,
       CASE j.kind
          WHEN 'sell' THEN (SELECT price FROM auction_prices ap WHERE ap.token = j.buy_token AND ap.auction_id = oe.auction_id)
          WHEN 'buy' THEN (SELECT price FROM auction_prices ap WHERE ap.token = j.sell_token AND ap.auction_id = oe.auction_id)
       END AS surplus_token_native_price
    FROM jit_orders j
    JOIN trades t ON j.uid = t.order_uid
    JOIN order_execution oe ON j.uid = oe.order_uid
    WHERE j.owner = $1 AND NOT EXISTS (
        SELECT 1
        FROM orders o
        WHERE o.uid = j.uid
    )
),
trade_surplus AS (
    SELECT
        CASE kind
            WHEN 'sell' THEN (trade_amount - limit_amount) * surplus_token_native_price
            WHEN 'buy' THEN (limit_amount - trade_amount) * surplus_token_native_price
        END / POWER(10, 18) AS surplus_in_wei
    FROM trade_components
)
SELECT
   COALESCE(SUM(surplus_in_wei), 0) AS total_surplus_in_wei
FROM trade_surplus;
"""

OPTIMIZED_QUERY = """
WITH trade_components AS (
    -- Regular orders: join trades first, then order_execution
    SELECT
        o.uid,
        CASE o.kind
            WHEN 'sell' THEN t.buy_amount
            WHEN 'buy' THEN t.sell_amount - t.fee_amount
        END AS trade_amount,
        CASE o.kind
            WHEN 'sell' THEN (t.sell_amount - t.fee_amount) * o.buy_amount / o.sell_amount
            WHEN 'buy' THEN t.buy_amount * o.sell_amount / o.buy_amount
        END AS limit_amount,
        o.kind,
        ap.price AS surplus_token_native_price
    FROM orders o
    JOIN trades t ON t.order_uid = o.uid
    JOIN order_execution oe ON oe.order_uid = t.order_uid
    LEFT JOIN auction_prices ap
        ON ap.auction_id = oe.auction_id
        AND ap.token = CASE o.kind WHEN 'sell' THEN o.buy_token ELSE o.sell_token END
    WHERE o.owner = $1

    UNION ALL

    -- Onchain placed orders (if sender differs from owner)
    SELECT
        o.uid,
        CASE o.kind
            WHEN 'sell' THEN t.buy_amount
            WHEN 'buy' THEN t.sell_amount - t.fee_amount
        END AS trade_amount,
        CASE o.kind
            WHEN 'sell' THEN (t.sell_amount - t.fee_amount) * o.buy_amount / o.sell_amount
            WHEN 'buy' THEN t.buy_amount * o.sell_amount / o.buy_amount
        END AS limit_amount,
        o.kind,
        ap.price AS surplus_token_native_price
    FROM onchain_placed_orders opo
    JOIN orders o ON o.uid = opo.uid AND o.owner != $1
    JOIN trades t ON t.order_uid = o.uid
    JOIN order_execution oe ON oe.order_uid = t.order_uid
    LEFT JOIN auction_prices ap
        ON ap.auction_id = oe.auction_id
        AND ap.token = CASE o.kind WHEN 'sell' THEN o.buy_token ELSE o.sell_token END
    WHERE opo.sender = $1

    UNION ALL

    -- JIT orders
    SELECT
        j.uid,
        CASE j.kind
            WHEN 'sell' THEN t.buy_amount
            WHEN 'buy' THEN t.sell_amount - t.fee_amount
        END AS trade_amount,
        CASE j.kind
            WHEN 'sell' THEN (t.sell_amount - t.fee_amount) * j.buy_amount / j.sell_amount
            WHEN 'buy' THEN t.buy_amount * j.sell_amount / j.buy_amount
        END AS limit_amount,
        j.kind,
        ap.price AS surplus_token_native_price
    FROM jit_orders j
    JOIN trades t ON t.order_uid = j.uid
    JOIN order_execution oe ON oe.order_uid = t.order_uid
    LEFT JOIN auction_prices ap
        ON ap.auction_id = oe.auction_id
        AND ap.token = CASE j.kind WHEN 'sell' THEN j.buy_token ELSE j.sell_token END
    WHERE j.owner = $1
      AND NOT EXISTS (SELECT 1 FROM orders o WHERE o.uid = j.uid)
)
SELECT
    COALESCE(SUM(surplus_in_wei ORDER BY uid), 0) AS total_surplus_in_wei
FROM (
    SELECT
        uid,
        CASE kind
            WHEN 'sell' THEN (trade_amount - limit_amount) * surplus_token_native_price
            WHEN 'buy' THEN (limit_amount - trade_amount) * surplus_token_native_price
        END / POWER(10, 18) AS surplus_in_wei
    FROM trade_components
) ts;
"""

# Query to get random addresses with varying order counts
SAMPLE_ADDRESSES_QUERY = """
WITH address_order_counts AS (
    SELECT owner AS address, COUNT(*) AS order_count
    FROM orders
    GROUP BY owner
),
bucketed AS (
    SELECT
        address,
        order_count,
        CASE
            WHEN order_count < 10 THEN '0: 1-9'
            WHEN order_count < 100 THEN '1: 10-99'
            WHEN order_count < 1000 THEN '2: 100-999'
            WHEN order_count < 10000 THEN '3: 1000-9999'
            ELSE '4: 10000+'
        END AS bucket
    FROM address_order_counts
)
SELECT address, order_count, bucket
FROM (
    SELECT address, order_count, bucket,
           ROW_NUMBER() OVER (PARTITION BY bucket ORDER BY RANDOM()) AS rn
    FROM bucketed
) ranked
WHERE rn <= %s
ORDER BY bucket, order_count;
"""


def get_connection():
    return psycopg2.connect(**DB_CONFIG)


def fetch_sample_addresses(conn, samples_per_bucket=5):
    """Get random addresses from each order count bucket."""
    with conn.cursor() as cur:
        cur.execute(SAMPLE_ADDRESSES_QUERY, (samples_per_bucket,))
        return cur.fetchall()


def run_query(conn, query, address):
    """Run a query with the given address parameter."""
    # Convert $1 placeholder to %s for psycopg2
    pg_query = query.replace("$1", "%s")
    # Count how many parameters we need
    param_count = pg_query.count("%s")
    with conn.cursor() as cur:
        cur.execute(pg_query, tuple([address] * param_count))
        result = cur.fetchone()
        return result[0] if result else None


def compare_results(original, optimized, tolerance=1e-9):
    """Compare two numeric results with floating point tolerance."""
    if original is None and optimized is None:
        return True, "both NULL"
    if original is None or optimized is None:
        return False, f"NULL mismatch: original={original}, optimized={optimized}"

    # Convert to float for comparison
    orig_float = float(original)
    opt_float = float(optimized)

    if orig_float == opt_float:
        return True, "exact match"

    # Check relative difference for non-zero values
    if orig_float != 0:
        rel_diff = abs(orig_float - opt_float) / abs(orig_float)
        if rel_diff < tolerance:
            return True, f"within tolerance (rel_diff={rel_diff:.2e})"

    # Check absolute difference for values near zero
    abs_diff = abs(orig_float - opt_float)
    if abs_diff < tolerance:
        return True, f"within tolerance (abs_diff={abs_diff:.2e})"

    return False, f"MISMATCH: original={orig_float}, optimized={opt_float}, diff={abs_diff}"


def main():
    samples_per_bucket = int(sys.argv[1]) if len(sys.argv) > 1 else 5

    print(f"Comparing original vs optimized query ({samples_per_bucket} samples per bucket)")
    print("=" * 80)

    conn = get_connection()

    # Get sample addresses
    print("\nFetching sample addresses...")
    addresses = fetch_sample_addresses(conn, samples_per_bucket)
    print(f"Found {len(addresses)} addresses across buckets\n")

    passed = 0
    failed = 0
    current_bucket = None

    for address, order_count, bucket in addresses:
        if bucket != current_bucket:
            current_bucket = bucket
            print(f"\n--- Bucket: {bucket} ---")

        # Run both queries
        original_result = run_query(conn, ORIGINAL_QUERY, address)
        optimized_result = run_query(conn, OPTIMIZED_QUERY, address)

        # Compare
        match, reason = compare_results(original_result, optimized_result)

        addr_hex = "0x" + address.hex() if isinstance(address, (bytes, memoryview)) else str(address)
        status = "✓" if match else "✗"

        print(f"  {status} {addr_hex[:18]}... ({order_count:>5} orders): {reason}")

        if match:
            passed += 1
        else:
            failed += 1
            # Print detailed values on failure
            print(f"      Original:  {original_result}")
            print(f"      Optimized: {optimized_result}")

    # Summary
    print("\n" + "=" * 80)
    print(f"SUMMARY: {passed} passed, {failed} failed out of {passed + failed} tests")

    conn.close()

    return 0 if failed == 0 else 1


if __name__ == "__main__":
    sys.exit(main())

To validate the performance, I think it's best we give it a run in prod for anywhere from 30 minutes to 2 hours.
Even though it requires new indexes, the new query should be faster.

@jmg-duarte jmg-duarte marked this pull request as ready for review February 3, 2026 11:50
@jmg-duarte jmg-duarte requested a review from a team as a code owner February 3, 2026 11:50
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request optimizes the total_surplus query by replacing an inefficient ARRAY_AGG approach with a UNION ALL of separate CTEs, which yields significant performance improvements. The logic appears correct, and the performance gains are well-documented. The earlier comment about significant code duplication in the CTEs, and the suggestion to refactor for maintainability, still stands.

@jmg-duarte jmg-duarte added the hotfix label (PRs that should be applied into production right away) Feb 3, 2026
Contributor

@squadgazzz squadgazzz left a comment


Nice! A few comments.

Also, for such PRs, I'd expect to see EXPLAIN ANALYZE before and after the change.
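
For anyone reproducing these numbers later: the buffer-annotated plans that ended up in the description look like output of Postgres's EXPLAIN (ANALYZE, BUFFERS). A minimal sketch, with a trivial probe query standing in for the full surplus query and an example address taken from the plans in the description:

-- PREPARE keeps the application's $1 placeholder usable as-is.
-- surplus_probe is just an illustrative name.
PREPARE surplus_probe(bytea) AS
    SELECT count(*) FROM orders WHERE owner = $1;

EXPLAIN (ANALYZE, BUFFERS)
    EXECUTE surplus_probe('\x8ef4fb956d0cb06ca9e3db76040f08154e8d0122');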


-- Additional query for jit_orders
-- deduplicate orders
WHERE o.owner != $1
Contributor


If I didn't miss anything, the old query included orders where owner = $1 OR sender = $1 (via ARRAY_CAT). The new onchain_order_trades CTE explicitly excludes orders where owner = $1.
This means if an order has both owner = $1 and sender = $1, it will now be counted only once (in regular_order_trades), whereas before it may have been processed differently through the array concatenation.

While deduplication seems reasonable, I just want to ensure that doesn't break any business logic.
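
A quick ad-hoc probe to check whether that overlap case even exists in the data, using only the tables already involved in this query (add AND o.owner = $1 to scope it to a single user):

-- Orders reachable through both the owner path and the sender path.
SELECT count(*)
FROM orders o
JOIN onchain_placed_orders opo ON opo.uid = o.uid
WHERE o.owner = opo.sender;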

Contributor

@m-sz m-sz left a comment


Thanks for providing the data and script. LGTM.

@squadgazzz
Contributor

I got a benchmark test + result checker script
SQL Query Benchmark

I am not sure this approach is correct. Postgres heavily uses caching. With the first query, you already warmed up a lot of things, so the CTE one runs much faster. For proper comparison, this requires running it locally on a meaningful data snapshot in an isolated way.

WHERE o.owner = $1
),
trade_components AS (
onchain_order_trades AS (
Contributor


I believe we could drop the onchain_order_trades altogether, no?
Every onchain placed order also has an entry in the regular orders table, so we would already count those trades there if we do:

WHERE COALESCE(onchain_placed_order.sender, o.owner) = $1

Would have to be tested whether this is faster, though (and leads to the same result).
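
An untested sketch of that variant; whether the planner can still drive this off the owner index through the COALESCE (and whether results match) is exactly what would need measuring:

-- One branch instead of two: the onchain sender wins when present.
SELECT count(*)
FROM orders o
LEFT JOIN onchain_placed_orders opo ON opo.uid = o.uid
JOIN trades t ON t.order_uid = o.uid
WHERE COALESCE(opo.sender, o.owner) = $1;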

@jmg-duarte
Contributor Author

jmg-duarte commented Feb 3, 2026

I am not sure this approach is correct. Postgres heavily uses caching. With the first query, you already warmed up a lot of things, so the CTE one runs much faster. For proper comparison, this requires running it locally on a meaningful data snapshot in an isolated way.

For both benchmarks I queried for different addresses, which is closer to what we would experience in prod. The user queries once and the subsequent queries would take advantage of that cache.

I'm open to ideas to improve them!

@fafk
Contributor

fafk commented Feb 4, 2026

I am not sure this approach is correct. Postgres heavily uses caching. With the first query, you already warmed up a lot of things, so the CTE one runs much faster. For proper comparison, this requires running it locally on a meaningful data snapshot in an isolated way.

For both benchmarks I queried for different addresses, which is closer to what we would experience in prod. The user queries once and the subsequent queries would take advantage of that cache.

I'm open to ideas to improve them!

There is a way to avoid caching but it's mighty unpleasant: pull the DB data, run a local postgres instance in docker, load the data. Then run query 1. Turn off docker. Turn on docker. Run query 2 & compare results. More details here: #3542 (comment) + https://pastebin.com/ijpEmVDs

To be fair I think as long as you are sure the queries are functionally equivalent (give the same results) I'd just deploy it and observe the speed diff and revert if it doesn't work.

@jmg-duarte
Contributor Author

There is a way to avoid caching but it's mighty unpleasant

Understatement for +1TB of data. Performing the same setup in AWS will also take a bunch of time 😭

@MartinquaXD
Contributor

To be fair I think as long as you are sure the queries are functionally equivalent (give the same results) I'd just deploy it and observe the speed diff and revert if it doesn't work.

Yeah, when no additional indexes are needed and confidence is reasonably high we can just temporarily deploy to prod for a bit.

@jmg-duarte jmg-duarte marked this pull request as draft February 4, 2026 10:23
@fafk
Contributor

fafk commented Feb 4, 2026

There is a way to avoid caching but it's mighty unpleasant

Understatement for +1TB of data. Performing the same setup in AWS will also take a bunch 😭

The trick is to only pull the tables you need. The biggest table (e.g. prices) you wouldn't need. Last time I did this I ended up with 10 gigs.

@github-actions

github-actions bot commented Feb 5, 2026

Reminder: Please update the DB Readme and comment whether migrations are reversible (include rollback scripts if applicable).

  • If creating new tables, update the tables list.
  • When adding a new index, consider using CREATE INDEX CONCURRENTLY for tables involved in the critical execution path.
  • For breaking changes, remember that during rollout k8s starts the new autopilot, runs the Flyway migration, and only then shuts down the old pod. That overlap means the previous version can still be processing requests on the migrated schema, so make it compatible first and ship the breaking DB change in the following release.


@jmg-duarte jmg-duarte marked this pull request as ready for review February 5, 2026 11:39
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request optimizes the total_surplus query by restructuring it to avoid ARRAY_AGG and adding covering indexes, which significantly improves performance as shown by the query plans. However, I've identified a critical pre-existing issue in the query logic where trades and auction executions are joined. This can lead to incorrect surplus calculations due to a Cartesian product. While this PR is an optimization, the underlying logic bug should be addressed.

Contributor

@squadgazzz squadgazzz left a comment


LGTM. The execution plan already shows some improvements. I assume the validation script is meant to only test the business logic, not performance, right? Since it warms up the cache before executing the optimized query.

[CoWSwapEthFlow](https://github.com/cowprotocol/ethflowcontract/blob/main/src/CoWSwapEthFlow.sol) we actually deployed twice so events related to the staging environment should only show up in the staging DB and likewise for production.
It's also important to note that we only index events from blocks that we are certain will not get reorged. That means specifically that events will be indexed with a block delay of at least 64.

> For lessons learned on migrations, refer to the respective [Notion page](https://www.notion.so/cownation/Database-migration-learnings-2fb8da5f04ca8076bf05c433e461a139?utm_content=2fb8da5f-04ca-8076-bf05-c433e461a139&utm_campaign=T035UKY5NUB&pvs=6)
Contributor


This page is private, which doesn't look right for an open-source repo.
