pgbench can be used with PGAdapter, but with some limitations.
Follow these steps to initialize and run benchmarks with pgbench through PGAdapter:
The default data model that is generated by pgbench does not include primary keys for the tables,
but Cloud Spanner requires all tables to have a primary key. Execute the following command to
manually create the data model for pgbench:
```shell
psql -h /tmp -p 5432 -d my-database \
  -c "START BATCH DDL;
  CREATE TABLE pgbench_accounts (
    aid integer primary key NOT NULL,
    bid integer NULL,
    abalance integer NULL,
    filler varchar(84) NULL
  );
  CREATE TABLE pgbench_branches (
    bid integer primary key NOT NULL,
    bbalance integer NULL,
    filler varchar(88) NULL
  );
  CREATE TABLE pgbench_history (
    tid integer NOT NULL DEFAULT -1,
    bid integer NOT NULL DEFAULT -1,
    aid integer NOT NULL DEFAULT -1,
    delta integer NULL,
    mtime timestamptz NULL,
    filler varchar(22) NULL,
    primary key (tid, bid, aid)
  );
  CREATE TABLE pgbench_tellers (
    tid integer primary key NOT NULL,
    bid integer NULL,
    tbalance integer NULL,
    filler varchar(84) NULL
  );
  RUN BATCH;"
```

pgbench deletes and inserts data into PostgreSQL using a combination of truncate, insert, and
copy statements. These statements all run in a single transaction. The amount of data that is
modified during this transaction exceeds the transaction mutation limits of Cloud Spanner. This
can be worked around by adding the following options to the pgbench initialization command:
```shell
pgbench "host=/tmp port=5432 dbname=my-database \
         options='-c spanner.force_autocommit=on -c spanner.autocommit_dml_mode=\'partitioned_non_atomic\''" \
  -i -Ig \
  --scale=100
```

These additional options do the following:

- `spanner.force_autocommit=on`: Instructs PGAdapter to ignore any transaction statements and execute all statements in autocommit mode. This prevents the initialization from being executed as a single, large transaction.
- `spanner.autocommit_dml_mode='partitioned_non_atomic'`: Instructs PGAdapter to use Partitioned DML for (large) update statements. This ensures that a single statement can succeed even if it would otherwise exceed the transaction limits of Cloud Spanner, including large `copy` operations.
- `-i`: Activates the initialization mode of pgbench.
- `-Ig`: Instructs pgbench to generate the test data client-side. PGAdapter does not support generating test data server-side.
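To get a feel for why the default initialization exceeds the per-commit limit, the sketch below estimates the mutation count for a given scale factor. The row counts per scale unit (100,000 accounts, 10 tellers, 1 branch) are pgbench's documented defaults; the 80,000-mutations-per-commit limit and the one-mutation-per-column accounting are assumptions about current Cloud Spanner behavior, so treat the result as an order-of-magnitude estimate only.

```python
# Illustrative estimate of why pgbench's default initialization exceeds
# Cloud Spanner's per-transaction mutation limit. The row counts per scale
# unit are pgbench's defaults; the mutation limit and the per-column
# accounting are assumptions, so treat the numbers as rough estimates.
SPANNER_MUTATION_LIMIT = 80_000  # assumed per-commit limit

ROWS_PER_SCALE_UNIT = {
    "pgbench_accounts": 100_000,
    "pgbench_tellers": 10,
    "pgbench_branches": 1,
}
# Column counts taken from the data model above.
COLUMNS = {
    "pgbench_accounts": 4,
    "pgbench_tellers": 4,
    "pgbench_branches": 3,
}

def estimated_mutations(scale: int) -> int:
    """Estimate initialization mutations, counting one mutation per column value."""
    return sum(rows * scale * COLUMNS[table]
               for table, rows in ROWS_PER_SCALE_UNIT.items())

scale = 100
total = estimated_mutations(scale)
print(f"--scale={scale}: ~{total:,} mutations, "
      f"~{total // SPANNER_MUTATION_LIMIT}x the assumed per-commit limit")
```

Even at a modest scale factor, the single initialization transaction is hundreds of times over the limit, which is why autocommit and Partitioned DML are needed.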
You can run different benchmarks after finishing the steps above.
Run a default benchmark to verify that everything works as expected.
It is recommended to always use either --protocol=extended or --protocol=prepared for Cloud
Spanner. The default --protocol=simple mode that is used by pgbench includes all query
parameters as literals in the query string, which causes additional query parsing time on Cloud Spanner.

```shell
pgbench "host=/tmp port=5432 dbname=my-database" --protocol=extended
```

Increase the number of clients and threads to increase the number of parallel transactions.
```shell
pgbench "host=/tmp port=5432 dbname=my-database" \
  --client=100 --jobs=100 \
  --progress=10 \
  --protocol=extended
```

Note that PGAdapter by default uses a gRPC channel pool with 4 channels.
Running pgbench with more than 400 clients efficiently requires more channels.
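As a rough sizing sketch (not a formula from PGAdapter itself): a single gRPC channel typically serves up to about 100 concurrent requests, which is where the default pool of 4 channels and the 400-client figure above come from. The helper below estimates a `numChannels` value from the planned client count under that assumption.

```python
import math

# Rough channel-pool sizing. The ~100 concurrent requests per gRPC channel
# is a common gRPC/Cloud Spanner rule of thumb, not a figure taken from
# PGAdapter itself, so the default pool of 4 channels saturates at
# roughly 400 parallel clients.
MAX_REQUESTS_PER_CHANNEL = 100  # assumption
DEFAULT_NUM_CHANNELS = 4        # PGAdapter default

def suggested_num_channels(clients: int) -> int:
    """Smallest pool size (never below the default) that fits the client count."""
    return max(DEFAULT_NUM_CHANNELS,
               math.ceil(clients / MAX_REQUESTS_PER_CHANNEL))

print(suggested_num_channels(100))  # the default pool of 4 suffices
print(suggested_num_channels(800))  # e.g. -r="numChannels=8", or more for headroom
```

Rounding up with some headroom (for example 16 channels for 800 clients, as in the commands below) avoids queueing when all clients are active at once.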
Starting PGAdapter using Java with a larger gRPC channel pool:

```shell
java -jar pgadapter.jar -p my-project -i my-instance -d my-database \
  -r="numChannels=16"
```

Starting PGAdapter using Docker with a larger gRPC channel pool:
```shell
docker run -p 5432:5432 \
  gcr.io/cloud-spanner-pg-adapter/pgadapter \
  -p my-project -i my-instance -x \
  -r="numChannels=16"
```

```shell
pgbench "host=/tmp port=5432 dbname=my-database" \
  --client=800 --jobs=100 \
  --progress=10 \
  --protocol=extended
```

Execute the following command to remove the pgbench tables from your database if you no longer
need them.
```shell
psql -h /tmp -p 5432 -d my-database \
  -c "START BATCH DDL;
  DROP TABLE pgbench_history;
  DROP TABLE pgbench_tellers;
  DROP TABLE pgbench_branches;
  DROP TABLE pgbench_accounts;
  RUN BATCH;"
```