Skip to content

fix(ebean-dao): do not cache empty schema loads (SharedSchemaCache poisoning, #618 regression)#625

Open
NatalliaUlashchick wants to merge 1 commit into
masterfrom
naulashc/fix-sharedschemacache-empty-column-poisoning
Open

fix(ebean-dao): do not cache empty schema loads (SharedSchemaCache poisoning, #618 regression)#625
NatalliaUlashchick wants to merge 1 commit into
masterfrom
naulashc/fix-sharedschemacache-empty-column-poisoning

Conversation

@NatalliaUlashchick

Copy link
Copy Markdown
Contributor

Summary

PR #618 (gma 0.6.185) moved the SharedSchemaCache pre-warm into the EbeanLocalAccess constructor so it always runs at startup. When the DAO is constructed before schema migration creates the entity table — integration tests, or deploys that run migrations as a separate job (LinkedIn's prod pattern) — loadColumns() returns an empty set, and caching it pinned a stale "table has no columns" view for the full 10-minute TTL. Every later columnExists() then returned false, so EbeanLocalAccess.batchGetUnion dropped the aspect from its query and EbeanLocalRelationshipQueryDAO mis-resolved tables.

Reported by the AIM ai-metadata-gms team: bumping datahub-gma to 0.6.184–0.6.189 broke 7 MlProject integ tests — 404s on aspect reads (Projects.getDeprecation), wrong autofill defaults, silently dropped writes, and wrong snapshot / member-update counts. Their bisect: 0.6.183 PASS → 0.6.186 FAIL (0.6.184/0.6.185 were unpublished to artifactory); testMode=false, NEW_SCHEMA_ONLY. Root-caused to #618.

Fix: never cache an empty column/index load — an empty result means the table is not present yet, so leave it uncached and let a later access re-resolve against the migrated schema. Applied in both:

  • SharedSchemaCache (shared path used by EbeanLocalAccess reads), and
  • SchemaValidatorUtil's local cache (used by EbeanLocalRelationshipQueryDAO).

Index-expression results may be legitimately empty and are left unchanged. The cold-start pre-warm perf intent of #618 is preserved.

Testing Done

  • Added 2 SharedSchemaCacheTest regression tests (pre-warm and lazy-read before table creation). They fail on the pre-fix behavior (4 failures across both nonDollarVirtualColumnsEnabled factory variants) and pass with the fix.
  • Full ./gradlew :dao-impl:ebean-dao:test suite passes (BUILD SUCCESSFUL, 40m) — no collateral.
  • AIM to verify end-to-end on ai-metadata-gms once this lands as a published release.

Checklist

🤖 Generated with Claude Code

…isoning, #618 regression)

PR #618 moved the SharedSchemaCache pre-warm into the EbeanLocalAccess constructor
so it always runs at startup. When the DAO is constructed before schema migration
creates the entity table -- integration tests, or deploys that run migrations as a
separate job (LinkedIn's prod pattern) -- loadColumns() returns an empty set, and
caching it pinned a stale "table has no columns" view for the full 10-minute TTL.
Every later columnExists() then returned false, so EbeanLocalAccess.batchGetUnion
dropped the aspect from its query and the relationship query DAO mis-resolved
tables. Consumers (AIM ai-metadata-gms) saw 404s on aspect reads
(Projects.getDeprecation), wrong autofill defaults, silently dropped writes, and
wrong snapshot/member-update counts.

Never cache an empty column/index load -- an empty result means the table is not
present yet, so leave it uncached and let a later access re-resolve against the
migrated schema. Applied in both SharedSchemaCache (shared path, used by
EbeanLocalAccess reads) and SchemaValidatorUtil's local cache (used by
EbeanLocalRelationshipQueryDAO). Index-expression results may be legitimately
empty and are unchanged.

Adds SharedSchemaCacheTest regression coverage (pre-warm and lazy-read before
table creation) that fails on the pre-fix behavior and passes with the fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov-commenter

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.88%. Comparing base (e528785) to head (5e12922).

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #625      +/-   ##
============================================
+ Coverage     66.86%   66.88%   +0.01%     
- Complexity     1886     1888       +2     
============================================
  Files           148      148              
  Lines          7274     7277       +3     
  Branches        881      885       +4     
============================================
+ Hits           4864     4867       +3     
  Misses         2025     2025              
  Partials        385      385              

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants