parquet2postgres-go

CLI tool to load Parquet files from S3-compatible storage into a PostgreSQL table. It lists objects under a prefix, infers schema from the first file, optionally creates or truncates the target table, and loads data in batches with parallelism limited by the DB pool size.

Documentation

https://nryanov.com/parquet2postgres-go/

Quick example

go build -o parquet2postgres-go .

export P2PG_DB_PASSWORD=postgres P2PG_DB_USER=postgres
export P2PG_S3_ACCESS_KEY=admin P2PG_S3_SECRET_ACCESS_KEY=password

./parquet2postgres-go dataloader \
  --schema public --table my_table --bucket warehouse \
  --path my/prefix/ --db-host localhost --s3-endpoint localhost:9000

See the docs for flags, environment variables, and a full local Docker setup.
