Skip to content

HAVING COUNT(*) fails when COUNT(*) has no alias #50

Description

@fupelaqu

Description

When a GROUP BY query uses COUNT(*) without an alias and references it in a HAVING clause, Elasticsearch rejects the query with:

[action_request_validation_exception] Validation Failed: 1: No aggregation found for path [*]

The bucket_selector aggregation generated for HAVING uses * as its buckets_path, which is not a valid aggregation name in Elasticsearch.

Reproduction

SELECT profile.city AS city,
       COUNT(*),
       AVG(age) AS avg_age
FROM dql_users
GROUP BY profile.city
HAVING COUNT(*) >= 1

Without alias, COUNT(*) is translated to an aggregation named *, and the HAVING clause generates:

"having_filter": {
  "bucket_selector": {
    "buckets_path": { "*": "*" },
    "script": { "source": "params.* >= 1" }
  }
}

Elasticsearch cannot resolve * as an aggregation path.

Workaround

Add an alias to COUNT(*):

SELECT profile.city AS city,
       COUNT(*) AS cnt,
       AVG(age) AS avg_age
FROM dql_users
GROUP BY profile.city
HAVING COUNT(*) >= 1

Expected Behaviour

COUNT(*) without alias should generate a valid aggregation name (e.g., count_*) and the HAVING clause should reference that name correctly.

Code Path Analysis

Aggregation naming

  1. Identifier.metricName (sql/…/package.scala:924-941): when COUNT(*) has no alias, aliasOrName returns "*", then metricName maps it to "count_all"
  2. ElasticAggregation.aggName (bridge/…/ElasticAggregation.scala:131-145): without alias, builds "count__index" via s"${aggType}_${sourceField}" (sourceField is "_index" for COUNT(*))

HAVING → bucket_selector

  1. Having.script (sql/…/query/Having.scala:36-44): generates painless script via MetricSelectorScript.metricSelector(criteria)
  2. Criteria.extractAllMetricsPath (sql/…/query/Where.scala:66-74): recursively extracts metric names from HAVING expressions — returns identifier.allMetricsPath
  3. extractMetricsPathForBucket (bridge/…/ElasticAggregation.scala:737-753): looks up metric name via agg.aggName == metricName || agg.field == metricName

Root cause

The metric name extracted from the HAVING expression ("*" from the identifier) does not match the actual ES aggregation name ("count__index"). When an alias is provided (e.g., AS cnt), both the aggregation name and the HAVING reference use "cnt", so the lookup succeeds.

Context

Related to issue #41 (computed alias for unnamed columns). The auto-alias count_* is already generated for output column naming, but it is not used as the aggregation name in the Elasticsearch query.

Key Files

File Purpose
sql/…/package.scala (lines 906, 924-941) Identifier.aliasOrName, metricName
bridge/…/ElasticAggregation.scala (lines 131-145) aggName computation
bridge/…/ElasticAggregation.scala (lines 513-531) bucket_selector generation for HAVING
bridge/…/ElasticAggregation.scala (lines 737-753) extractMetricsPathForBucket lookup
sql/…/query/Having.scala (lines 36-44) Having.script painless generation
sql/…/query/Where.scala (lines 66-74, 294-299) extractAllMetricsPath from criteria
sql/…/transform/TransformBucketSelectorConfig.scala bucket_selector JSON structure

SBT Modules

Module sbt project Purpose
core core ElasticConversion, response parsing
sql sql SQL parser, Having, Identifier, transform layer
bridge (shared) softclient4es-sql-bridge ElasticAggregation (shared code)
es8/bridge softclient4es8-sql-bridge ES8-specific bridge
testkit softclient4es-core-testkit Shared test specs (ReplGatewayIntegrationSpec)
es8/java es8java ES8 Java client + tests (JavaClient8ReplGatewayIntegrationSpec)

Quick test command

sbt "es8java/testOnly *JavaClient8ReplGateway*"

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions