
feat(query): Support Geometry aggregate functions#19620

Open
b41sh wants to merge 2 commits into databendlabs:main from b41sh:feat-geo-agg-funcs

Conversation

Member

@b41sh b41sh commented Mar 26, 2026

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR introduces geometry scalar and aggregate functions and implements a consistent overlay pipeline for union, intersection, and difference.

New Functions

Scalar Functions

  • st_centroid — returns the centroid of a geometry/geography.
  • st_envelope — returns the bounding rectangle as a polygon.
  • st_union — returns the union of two geometries/geographies.
  • st_intersection — returns the intersection of two geometries/geographies.
  • st_difference — returns the portion of the first geometry/geography not in the second.
  • st_symdifference — returns the symmetric difference of two geometries/geographies.
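As a rough illustration of what st_envelope computes, here is a minimal std-only sketch that turns a set of coordinates into the closed ring of their bounding rectangle. The `Coord` alias and `envelope_ring` function are hypothetical names for this example and are not taken from the PR; degenerate inputs (a single point, a horizontal or vertical line) are not special-cased here.

```rust
/// A 2D coordinate (x, y). Hypothetical type used only in this sketch.
type Coord = (f64, f64);

/// Returns the closed exterior ring (5 coordinates, first == last) of the
/// axis-aligned bounding rectangle of `coords`, or `None` for empty input.
fn envelope_ring(coords: &[Coord]) -> Option<Vec<Coord>> {
    // An empty input has no bounding rectangle.
    let (&first, rest) = coords.split_first()?;
    let (mut min_x, mut min_y) = first;
    let (mut max_x, mut max_y) = first;

    // Widen the rectangle to cover every remaining coordinate.
    for &(x, y) in rest {
        min_x = min_x.min(x);
        min_y = min_y.min(y);
        max_x = max_x.max(x);
        max_y = max_y.max(y);
    }

    // Counter-clockwise ring, closed by repeating the first corner.
    Some(vec![
        (min_x, min_y),
        (max_x, min_y),
        (max_x, max_y),
        (min_x, max_y),
        (min_x, min_y),
    ])
}
```

For the coordinates (1, 2), (3, 0), (2, 5), this yields the ring (1, 0), (3, 0), (3, 5), (1, 5), (1, 0).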

Aggregate Functions

  • st_union_agg — unions all input geometries/geographies into one.
  • st_intersection_agg — intersects all input geometries/geographies into one.
  • st_envelope_agg — returns a bounding polygon for all inputs.
  • st_collect — collects inputs into a Multi* or GeometryCollection.
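An aggregate such as st_envelope_agg needs a state that can be updated row by row and merged across partial results. The sketch below shows one way such a state could look, assuming only that the bounding box is tracked as min/max extents; `EnvelopeState`, `accumulate`, and `merge` are hypothetical names and do not reflect the PR's actual aggregate implementation.

```rust
/// Hypothetical aggregate state: the running bounding box as
/// [min_x, min_y, max_x, max_y], or `None` before any input is seen.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
struct EnvelopeState {
    bbox: Option<[f64; 4]>,
}

impl EnvelopeState {
    /// Extends the bounding box to cover one more coordinate.
    fn accumulate(&mut self, x: f64, y: f64) {
        self.bbox = Some(match self.bbox {
            None => [x, y, x, y],
            Some([min_x, min_y, max_x, max_y]) => {
                [min_x.min(x), min_y.min(y), max_x.max(x), max_y.max(y)]
            }
        });
    }

    /// Merges a partial state (for example, from another group or thread).
    /// Covering the other box's two extreme corners covers the whole box.
    fn merge(&mut self, other: &EnvelopeState) {
        if let Some([min_x, min_y, max_x, max_y]) = other.bbox {
            self.accumulate(min_x, min_y);
            self.accumulate(max_x, max_y);
        }
    }
}
```

Because taking min and max is associative and commutative, merging partial states in any order gives the same box as a single pass over all inputs.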

Overlay operation details (union, intersection, and difference)

  • Geometry overlay operations normalize inputs by dimension (points/lines/polygons), perform union/intersection/difference/symdifference, and assemble the result as a unified geometry (Multi* when homogeneous, GeometryCollection when mixed).
  • For union-like cases with disjoint inputs, the pipeline reduces work by directly combining elements, while still preserving correct output types and ordering.
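The assembly step described above (Multi* when all parts share a dimension, GeometryCollection otherwise) can be sketched as follows. This is a simplified model, not the PR's code: the `Geom` and `Assembled` enums and the `assemble` function are hypothetical, polygons carry only an exterior ring, and the points-then-lines-then-polygons ordering is an assumption made for the example.

```rust
/// Simplified geometry parts; hypothetical types for this sketch only.
#[derive(Debug, Clone, PartialEq)]
enum Geom {
    Point((f64, f64)),
    Line(Vec<(f64, f64)>),
    Polygon(Vec<(f64, f64)>), // exterior ring only
}

/// Result wrapper chosen from the dimensions of the parts.
#[derive(Debug, PartialEq)]
enum Assembled {
    Empty,
    MultiPoint(Vec<Geom>),
    MultiLineString(Vec<Geom>),
    MultiPolygon(Vec<Geom>),
    GeometryCollection(Vec<Geom>),
}

/// Topological dimension: 0 for points, 1 for lines, 2 for polygons.
fn dimension(g: &Geom) -> u8 {
    match g {
        Geom::Point(_) => 0,
        Geom::Line(_) => 1,
        Geom::Polygon(_) => 2,
    }
}

/// Groups parts by dimension and picks the output container.
fn assemble(mut parts: Vec<Geom>) -> Assembled {
    if parts.is_empty() {
        return Assembled::Empty;
    }
    // Stable sort: groups by dimension while keeping the original
    // relative order inside each group.
    parts.sort_by_key(dimension);
    let lowest = dimension(&parts[0]);
    let highest = dimension(&parts[parts.len() - 1]);
    if lowest != highest {
        // Mixed dimensions cannot be expressed as a single Multi* type.
        return Assembled::GeometryCollection(parts);
    }
    match lowest {
        0 => Assembled::MultiPoint(parts),
        1 => Assembled::MultiLineString(parts),
        _ => Assembled::MultiPolygon(parts),
    }
}
```

A real implementation would likely also unwrap single-element results and handle empty geometries per type; those cases are left out to keep the dimension-based dispatch visible.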

fixes: #[Link the issue here]

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


github-actions bot added the pr-feature label (this PR introduces a new feature to the codebase) Mar 26, 2026
Contributor

github-actions bot commented Mar 26, 2026

🤖 CI Job Analysis (Retry 1)

Workflow: 23950646773

📊 Summary

  • Total Jobs: 87
  • Failed Jobs: 1
  • Retryable: 0
  • Code Issues: 1

NO RETRY NEEDED

All failures appear to be code/test issues requiring manual fixes.

🔍 Job Details

  • linux / test_compat_client_cluster: Not retryable (Code/Test)

🤖 About

Automated analysis using job annotations to distinguish infrastructure issues (auto-retried) from code/test issues (manual fixes needed).

Contributor

Copilot AI left a comment


Pull request overview

This PR adds new spatial (Geometry/Geography) scalar and aggregate functions and introduces a shared overlay/aggregation implementation in the expression layer, with updated output formatting and extensive test coverage updates.

Changes:

  • Add Geometry/Geography overlay-based scalar functions (centroid/envelope/union/intersection/difference/symdifference and boolean predicates) and new aggregates (union/intersection/envelope/collect).
  • Introduce a geography projection model (best-SRID selection + project/overlay/unproject with coordinate rounding) and a unified aggregate implementation for spatial types.
  • Update result formatting/serialization (notably Geography EWKT/EWKB) and refresh SQL logic + golden test outputs accordingly.

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| THIRD-PARTY-NOTICES.txt | Adds third-party license notices/text for adapted GEOS/PostGIS-derived components. |
| tests/sqllogictests/suites/query/functions/02_0076_function_geography.test | Updates Geography outputs (EWKT includes SRID) and adds new scalar/aggregate test cases. |
| tests/sqllogictests/suites/query/functions/02_0060_function_geometry.test | Adds new Geometry scalar/aggregate test cases. |
| src/query/service/src/servers/http/v1/query/blocks_serializer.rs | Adjusts HTTP result serialization for Geography, forcing SRID=4326 for EWKT/EWKB. |
| src/query/functions/tests/it/scalars/testdata/geometry.txt | Updates golden outputs and adds coverage for new Geometry scalar functions. |
| src/query/functions/tests/it/scalars/testdata/function_list.txt | Registers new function overloads in the golden function list. |
| src/query/functions/tests/it/scalars/geometry.rs | Adds scalar integration tests for new Geometry functions. |
| src/query/functions/tests/it/scalars/geography.rs | Adds scalar integration tests for new Geography functions/predicates. |
| src/query/functions/tests/it/aggregates/testdata/agg.txt | Adds aggregate golden outputs for new spatial aggregates (incl. Geography). |
| src/query/functions/tests/it/aggregates/testdata/agg_group_by.txt | Adds group-by aggregate golden outputs for new spatial aggregates. |
| src/query/functions/tests/it/aggregates/agg.rs | Adds aggregate test cases and new sample datasets for spatial aggregates. |
| src/query/functions/src/scalars/geographic/src/register.rs | Introduces shared registration helpers for Geometry/Geography scalar functions and SRID checks. |
| src/query/functions/src/scalars/geographic/src/lib.rs | Wires in the new scalar registration module. |
| src/query/functions/src/aggregates/mod.rs | Switches spatial aggregates to the new unified aggregate module. |
| src/query/functions/src/aggregates/aggregator.rs | Registers new spatial aggregate functions in the aggregate factory. |
| src/query/functions/src/aggregates/aggregate_st_collect.rs | Removes the old Geometry-only st_collect aggregate implementation. |
| src/query/functions/src/aggregates/aggregate_geographic_agg.rs | Adds unified spatial aggregates: st_collect, st_union_agg, st_intersection_agg, st_envelope_agg. |
| src/query/formats/src/field_encoder/values.rs | Updates Geography field encoding for text/CSV/etc output formats (SRID handling for EWKT/EWKB). |
| src/query/expression/src/utils/display.rs | Adjusts internal scalar display formatting for Geography (via ToGeo + WKT). |
| src/query/expression/src/lib.rs | Exposes the new geographic module from the expression crate. |
| src/query/expression/src/geographic/mod.rs | Adds the expression-layer geographic module wiring. |
| src/query/expression/src/geographic/srid.rs | Adds best-SRID selection and projection/unprojection with rounding for Geography overlay execution. |
| src/query/expression/src/geographic/gbox.rs | Adds geocentric bounds (gbox) computations used for best-SRID selection. |
| src/query/expression/src/geographic/aggregate.rs | Adds overlay-based aggregate ops used by both scalar and aggregate functions. |
| src/query/expression/Cargo.toml | Adds proj4rs dependency for projection support. |
| src/common/io/src/lib.rs | Re-exports new geography formatting helper(s). |
| src/common/io/src/geometry.rs | Adds rect_to_polygon helper used by envelope-related ops. |
| src/common/io/src/geography.rs | Adds geography_format helper for consistent Geography formatting across output types. |
| Cargo.toml | Bumps geo-index and proj4rs workspace dependency versions. |
| Cargo.lock | Updates lockfile entries for the bumped/added dependencies. |


b41sh force-pushed the feat-geo-agg-funcs branch from be3f973 to a8d32fe April 2, 2026 06:18
b41sh changed the title from "feat(query): Support Geometry and Geography aggregate functions" to "feat(query): Support Geometry aggregate functions" Apr 2, 2026
b41sh marked this pull request as ready for review April 2, 2026 06:23
b41sh requested review from Copilot and sundy-li April 2, 2026 06:23

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8d32fe170

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 28 out of 29 changed files in this pull request and generated 6 comments.



b41sh force-pushed the feat-geo-agg-funcs branch from a8d32fe to d376bba April 3, 2026 07:37

3 participants