ENG-2635: Reconcile set_schema with namespace_meta for Postgres connectors #7510
Draft
Add PostgresNamespaceMeta schema and wire namespace support through PostgresQueryConfig, PostgreSQLConnector, and RDSPostgresConnector. Remove stubbed retrieve_data/mask_data from RDSPostgresConnector so it inherits SQLConnector's implementations, enabling DSR execution. Update RDS Postgres admin UI tags to include "DSR Automation". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Guard against None secrets in NamespaceMetaValidationStep. Update test_validate_unsupported_connection_type to use mariadb (postgres is now a supported namespace type). Add Postgres-specific validation tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Postgres defaults to the public schema when neither namespace_meta nor db_schema is configured. Return empty fallback fields so validation doesn't reject existing postgres connections that lack both. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove extra parentheses around bind parameters in expected query strings to match current SQLAlchemy output format. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The dry-run query test passes db=None to TaskResources, which flows into query_config() -> get_namespace_meta(). Return None early when no db session is available. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a dataset's namespace_meta has no overlapping fields with the connection type's namespace schema (e.g. BigQuery namespace_meta on a Postgres connection), skip validation rather than raising an error. This happens when datasets of mixed types are bulk-linked to a single connection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add GoogleCloudSQLPostgresNamespaceMeta with database_name (optional) and schema (required), following the same pattern as Snowflake/BigQuery. Update GoogleCloudSQLPostgresQueryConfig with generate_table_name() for schema-qualified SQL, and GoogleCloudSQLPostgresConnector to fetch namespace_meta from DB and pass it to the query config. Also includes shared fixes from PR #7500:
- Guard against None db session in get_namespace_meta
- Guard against None secrets in namespace validation
- Skip namespace validation for mismatched connection types
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ort' into ENG-2635-set-schema-reconciliation
When namespace_meta is present, set_schema() is now skipped because table names are already schema-qualified in the generated SQL. The namespace_meta state is stored on the connector instance via query_config() and checked in set_schema(). Also deduplicates get_qualified_table_name into the base SQLConnector. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ticket ENG-2635
Description Of Changes
Reconcile set_schema() with namespace_meta for Postgres connectors (PR 3 of 3 for ENG-2635).

When namespace_meta is present, table names are already schema-qualified in the generated SQL (e.g. "billing"."customer"). In that case, set_schema(), which sets PostgreSQL's search_path, should be skipped to avoid conflicts. Previously both mechanisms could run simultaneously, which could cause unexpected behavior if db_schema was also configured.

This PR also consolidates the duplicated get_qualified_table_name() override from three Postgres connectors into the base SQLConnector, since all three had identical logic. The base implementation uses getattr to check for schema and database_name on the parsed namespace_meta, falling back to the plain collection name when no namespace is configured.

Stacks on top of:
Code Changes
- src/fides/api/service/connectors/sql_connector.py - Added _current_namespace_meta instance variable in __init__; lifted get_qualified_table_name() from subclasses into the base with namespace-aware logic
- src/fides/api/service/connectors/postgres_connector.py - set_schema() skips when _current_namespace_meta is set; query_config() stores namespace_meta on the instance; removed duplicate get_qualified_table_name()
- src/fides/api/service/connectors/google_cloud_postgres_connector.py - Same set_schema reconciliation pattern; removed duplicate get_qualified_table_name()
- src/fides/api/service/connectors/rds_postgres_connector.py - Removed duplicate get_qualified_table_name() (inherits from base)
- tests/ops/service/connectors/test_postgres_connector.py - Added TestPostgreSQLConnectorSetSchema with 5 unit tests
- tests/ops/service/connectors/test_google_cloud_postgres_connector.py - Added TestGoogleCloudSQLPostgresConnectorSetSchema with 5 unit tests

Steps to Confirm
These scenarios verify the three-way logic in set_schema() and the call flow through query_config() → set_schema() during DSR execution.

Scenario 1: namespace_meta present → set_schema skipped, SQL uses qualified table names

Dataset with a namespace configured:

    {
      "fides_key": "my_postgres_dataset",
      "fides_meta": { "namespace": { "schema": "billing" } },
      "collections": [...]
    }

Generated SQL: SELECT ... FROM "billing"."customer" WHERE ...
The SET search_path statement is NOT executed (check logs: there should be no "Setting PostgreSQL search_path before retrieving data" message). This is because query_config() stores the namespace_meta on the connector instance, and set_schema() sees it and returns early.

Scenario 2: No namespace_meta, db_schema in connection secrets → set_schema runs

Dataset without fides_meta.namespace, and db_schema set in secrets:

    { "host": "...", "dbname": "...", "db_schema": "billing", ... }

Generated SQL: SELECT ... FROM "customer" WHERE ...
The SET search_path to 'billing' statement IS executed before the query (check logs for "Setting PostgreSQL search_path before retrieving data"). This is the legacy path: _current_namespace_meta is None, so set_schema() proceeds to check db_schema.

Scenario 3: Neither namespace_meta nor db_schema → no-op (backward compatible)

Dataset without fides_meta.namespace, and no db_schema in secrets.

Generated SQL: SELECT ... FROM "customer" WHERE ...
No SET search_path is executed. Queries run against the default public schema. This confirms backward compatibility: existing Postgres connections without any schema configuration continue to work unchanged.

Code path reference (for reviewers reading the code):

In sql_connector.py, retrieve_data() calls self.query_config(node) (line 184), which stores _current_namespace_meta on the connector instance, then calls self.set_schema(connection) (line 197), which checks the stored state. The same sequence appears in mask_data() (lines 229, 243) and execute_standalone_retrieval_query() (lines 160, 171).

Manual verification completed against fidesplus-slim container with branch code loaded:
- namespace_meta present → set_schema skipped
- db_schema set → set_schema runs
- no db_schema → no-op
- namespace_meta present → set_schema skipped
- get_qualified_table_name no namespace → plain name
- get_qualified_table_name schema only → schema.table
- get_qualified_table_name schema+db → db.schema.table

Pre-Merge Checklist
- CHANGELOG.md updated

🤖 Generated with Claude Code
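For reviewers who want the reconciliation in one place, the three-way set_schema() logic and the getattr-based get_qualified_table_name() fallback described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the fides implementation: NamespaceMetaSketch and ConnectorSketch are hypothetical stand-ins, the executed list replaces a real database connection, and the method signatures are simplified assumptions.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class NamespaceMetaSketch:
    """Hypothetical stand-in for the parsed namespace_meta."""

    schema: str
    database_name: Optional[str] = None


class ConnectorSketch:
    """Hypothetical stand-in for a Postgres connector (not the fides class)."""

    def __init__(self, db_schema: Optional[str] = None) -> None:
        self.db_schema = db_schema  # legacy setting from connection secrets
        self._current_namespace_meta: Optional[Any] = None
        self.executed: List[str] = []  # records statements instead of running them

    def query_config(self, namespace_meta: Optional[Any]) -> None:
        # Store namespace state on the instance so set_schema() can see it.
        self._current_namespace_meta = namespace_meta

    def set_schema(self) -> None:
        # Case 1: namespace_meta present, table names are already qualified.
        if self._current_namespace_meta is not None:
            return
        # Case 2: legacy path, db_schema configured in secrets.
        if self.db_schema:
            self.executed.append(f"SET search_path to '{self.db_schema}'")
        # Case 3: neither configured, nothing to do (default schema applies).

    def get_qualified_table_name(self, collection: str) -> str:
        meta = self._current_namespace_meta
        schema = getattr(meta, "schema", None)
        if not schema:
            return collection  # no namespace: plain collection name
        database = getattr(meta, "database_name", None)
        parts = [database, schema, collection] if database else [schema, collection]
        return ".".join(parts)
```

Calling query_config() before set_schema() mirrors the order described in the code path reference; the sketch leaves identifier quoting out entirely.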