Skip to content

Conversation

@Meldiron
Copy link
Contributor

@Meldiron Meldiron commented Jan 15, 2026

Appending during migration happens multiple times when re-running. Changes here prevent that, ensuring only one attribute exist in metadata of same name

Summary by CodeRabbit

  • Bug Fixes

    • Consolidated attribute writes into a single update, deduplicating entries for both single and bulk creations and relaxing duplicate validation during migrations with shared tables.
  • Tests

    • Added end-to-end tests validating single and batch attribute creation while migrating to ensure no duplicates and correct final attribute state.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 15, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Attribute creation now reads existing metadata, deduplicates attributes by ID, appends new attribute documents in-memory, and writes the full attributes array once for both single and bulk creation paths. Two e2e tests were added to validate idempotent behavior during migrating/shared-table scenarios.

Changes

Cohort / File(s) Summary
Attribute Duplicate Prevention
src/Database/Database.php
Reworked single (createAttribute) and bulk (createAttributes) creation to: read metadata → filter existing attributes by ID (dedupe) → aggregate new attributes → single metadata write. validateAttribute updated to relax strict duplicate checks during shared-table migration (treat DuplicateException as no-op in that path).
Attribute Migration Tests
tests/e2e/Adapter/Scopes/AttributeTests.php
Added testCreateAttributeWhileMigrating() and testCreateAttributesWhileMigrating() to assert idempotent single and batch attribute creation while migration/shared-tables mode is enabled; includes setup/teardown toggling migrating and collection lifecycle.

Sequence Diagram(s)

(omitted — changes are localized to DB logic and tests and do not involve a multi-actor sequential flow that requires visualization)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • abnegate

Poem

🐇
I nosed the metadata, found a twin,
Read the list, nudged duplicates to the bin.
Gathered new friends, wrote them all tight,
Migrating mornings now cozy and bright,
Hop — attributes tucked in for the night.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main fix: preventing double metadata entries during migration by deduplicating attributes. It directly reflects the core changes in the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af14b29 and 64bd5f3.

📒 Files selected for processing (1)
  • tests/e2e/Adapter/Scopes/AttributeTests.php
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/e2e/Adapter/Scopes/AttributeTests.php
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (MongoDB)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/MongoDB)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (Schemaless/MongoDB)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Unit Test

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug where duplicate attributes were being added to collection metadata during migration scenarios when attribute creation was re-run multiple times. The fix ensures that only one attribute of each name exists in metadata.

Changes:

  • Modified createAttribute and createAttributes methods to filter out duplicate attributes before appending to collection metadata
  • Added comprehensive test coverage for the singular createAttribute method during migration
  • Updated open-telemetry/exporter-otlp dependency from version 1.3.3 to 1.3.4

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/Database/Database.php Implemented deduplication logic in createAttribute and createAttributes methods to prevent duplicate attributes in metadata
tests/e2e/Adapter/Scopes/AttributeTests.php Added test case verifying that repeated attribute creation during migration doesn't create duplicates
composer.lock Updated open-telemetry/exporter-otlp dependency version and checksums

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/Database/Database.php`:
- Around line 2050-2054: The dedupe filter compares against the raw $id but
stored attributes use normalized IDs (ID::custom($id)), so compute the
normalized id the same way as validateAttribute (e.g. $normalizedId =
ID::custom($id)) and use that in the array_filter comparison (and null-check
$attr['$id'] before comparing) so duplicates are removed using the
normalized/case-folded ID rather than the raw $id.
🧹 Nitpick comments (1)
src/Database/Database.php (1)

2176-2186: Bulk dedupe: use strict membership + avoid shadowing $attributes param (and consider intra-batch uniqueness)

  • \in_array() should be strict (true) to avoid loose comparisons.
  • Reusing $attributes as both the method parameter and the “existing metadata attributes” array is confusing.
  • If the input $attributes contains duplicate '$id' values, $attributeDocuments can still contain duplicates and get appended twice.
Proposed fix (minimal + strict + clearer names)
-        // Append attribute, removing any duplicates
-        $attributes = $collection->getAttribute('attributes', []);
-        $newAttributeIds = \array_map(fn ($attr) => $attr['$id'], $attributeDocuments);
-        $attributes = \array_filter($attributes, fn ($attr) => !\in_array($attr['$id'], $newAttributeIds));
+        // Append attributes, removing any duplicates
+        $existingAttributes = $collection->getAttribute('attributes', []);
+        $newAttributeIds = \array_map(fn ($attr) => $attr['$id'], $attributeDocuments);
+        $existingAttributes = \array_filter(
+            $existingAttributes,
+            fn ($attr) => !\in_array($attr['$id'], $newAttributeIds, true)
+        );

         foreach ($attributeDocuments as $attributeDocument) {
-            $attributes[] = $attributeDocument;
+            $existingAttributes[] = $attributeDocument;
         }

-        $collection->setAttribute('attributes', \array_values($attributes));
+        $collection->setAttribute('attributes', \array_values($existingAttributes));
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e6a033 and 3f91272.

⛔ Files ignored due to path filters (1)
  • composer.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • src/Database/Database.php
  • tests/e2e/Adapter/Scopes/AttributeTests.php
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/AttributeTests.php
📚 Learning: 2025-10-29T12:27:57.071Z
Learnt from: ArnabChatterjee20k
Repo: utopia-php/database PR: 747
File: src/Database/Adapter/Mongo.php:1449-1453
Timestamp: 2025-10-29T12:27:57.071Z
Learning: In src/Database/Adapter/Mongo.php, when getSupportForAttributes() returns false (schemaless mode), the updateDocument method intentionally uses a raw document without $set operator for replacement-style updates, as confirmed by the repository maintainer ArnabChatterjee20k.

Applied to files:

  • src/Database/Database.php
📚 Learning: 2025-11-25T10:46:37.666Z
Learnt from: fogelito
Repo: utopia-php/database PR: 763
File: src/Database/Database.php:8671-8751
Timestamp: 2025-11-25T10:46:37.666Z
Learning: In Utopia\Database\Database::processRelationshipQueries, ensure the system select for '$id' is added only when the collection has relationship attributes (i.e., !empty($relationships)), and avoid capturing the unused '$idAdded' from Query::addSelect to satisfy static analysis.

Applied to files:

  • src/Database/Database.php
🧬 Code graph analysis (1)
src/Database/Database.php (1)
src/Database/Document.php (2)
  • getAttribute (224-231)
  • setAttribute (244-261)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Setup & Build Docker Image
🔇 Additional comments (1)
tests/e2e/Adapter/Scopes/AttributeTests.php (1)

2199-2243: LGTM!

The test properly validates the duplicate-prevention logic during migrations. Good coverage with three consecutive attribute creation attempts, verifying that only one attribute entry exists after each call. The test correctly:

  • Guards for shared tables support
  • Enables/disables migration mode appropriately
  • Cleans up resources after completion

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (2)

src/Database/Database.php:4177

  • Similar to the attribute creation issue that was fixed, index creation during migrations could also result in duplicate indexes in metadata. The same fix should be applied here: filter out existing indexes by ID before appending to prevent duplicates during migration re-runs.
        $collection->setAttribute('indexes', $index, Document::SET_TYPE_APPEND);

src/Database/Database.php:3498

  • Relationship attributes are appended using SET_TYPE_APPEND without checking for duplicates. During migrations, this could lead to duplicate relationship attributes in metadata. Consider applying the same duplicate-prevention pattern used in createAttribute().
        $collection->setAttribute('attributes', $relationship, Document::SET_TYPE_APPEND);
        $relatedCollection->setAttribute('attributes', $twoWayRelationship, Document::SET_TYPE_APPEND);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/Database/Database.php (1)

2249-2266: Post-create metadata dedupe is case-sensitive but validateAttribute uses case-insensitive comparison—migration reruns can accumulate case-variant duplicates.

When migration skips duplicate checks via $isLoose = true, the post-create dedupe at lines 2050–2053 and 2176–2186 must handle all ID-equivalence rules. However, post-create dedupe uses exact string comparison ($attribute['$id'] !== $id), while validateAttribute uses case-insensitive comparison (\strtolower()). On reruns, old metadata with "MyAttr" won't be removed if the new attribute is "myattr", leaving duplicates. Additionally, schema validation applies adapter.filter() but metadata validation doesn't, creating inconsistent normalization rules.

Update post-create dedupe to use case-insensitive comparison and apply adapter.filter() consistently with the validation checks.

♻️ Duplicate comments (2)
src/Database/Database.php (2)

2050-2054: Fix attribute metadata dedupe to use normalized, case-insensitive IDs (current compare can still allow duplicates).

Right now Line 2052 compares '$id' !== $id (raw, case-sensitive) while metadata IDs are treated case-insensitively (Line 2246) and the new attribute ID is created via ID::custom($id) (Line 2280). This can miss duplicates and defeats the PR objective in edge cases.

Proposed fix
         // Append attribute, removing any duplicates
         $metadataAttributes = $collection->getAttribute('attributes', []);
-        $metadataAttributes = \array_filter($metadataAttributes, fn ($attribute) => $attribute['$id'] !== $id);
+        $normalizedId = $attribute->getId(); // already ID::custom($id)
+        $metadataAttributes = \array_filter($metadataAttributes, function ($existing) use ($normalizedId) {
+            $existingId = $existing instanceof Document
+                ? $existing->getId()
+                : ($existing['$id'] ?? null);
+
+            return $existingId === null
+                || \strtolower($existingId) !== \strtolower($normalizedId);
+        });
         $metadataAttributes[] = $attribute;
         $collection->setAttribute('attributes', \array_values($metadataAttributes));

2176-2186: Batch metadata dedupe: avoid O(n*m) + apply case-insensitive/strict matching.

Line 2179 uses in_array() inside array_filter() (O(n*m)) and is case-sensitive/non-strict. Since attribute IDs are case-insensitive (Line 2246), this can also miss duplicates in migration reruns.

Proposed fix
         // Append attribute, removing any duplicates
-        $newAttributeIds = \array_map(fn ($attr) => $attr['$id'], $attributeDocuments);
+        $newAttributeIds = \array_map(fn (Document $attr) => $attr->getId(), $attributeDocuments);
+        $newAttributeIdsMap = \array_fill_keys(\array_map('strtolower', $newAttributeIds), true);
         $metadataAttributes = $collection->getAttribute('attributes', []);
-        $metadataAttributes = \array_filter($metadataAttributes, fn ($attr) => !\in_array($attr['$id'], $newAttributeIds));
+        $metadataAttributes = \array_filter($metadataAttributes, function ($existing) use ($newAttributeIdsMap) {
+            $existingId = $existing instanceof Document
+                ? $existing->getId()
+                : ($existing['$id'] ?? null);
+
+            return $existingId === null
+                || !isset($newAttributeIdsMap[\strtolower($existingId)]);
+        });
 
-        foreach ($attributeDocuments as $attributeDocument) {
-            $metadataAttributes[] = $attributeDocument;
-        }
-
-        $collection->setAttribute('attributes', \array_values($metadataAttributes));
+        $metadataAttributes = \array_merge(\array_values($metadataAttributes), $attributeDocuments);
+        $collection->setAttribute('attributes', $metadataAttributes);
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3923367 and af14b29.

📒 Files selected for processing (1)
  • src/Database/Database.php
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • src/Database/Database.php
📚 Learning: 2025-07-30T19:17:53.630Z
Learnt from: ArnabChatterjee20k
Repo: utopia-php/database PR: 642
File: src/Database/Validator/PartialStructure.php:43-52
Timestamp: 2025-07-30T19:17:53.630Z
Learning: In PartialStructure validator, when filtering for required attributes validation using the $requiredAttributes parameter, $this->attributes should be used instead of the merged $attributes array because this validation is specifically for internal attributes like $createdAt and $updatedAt that are defined in the base Structure class, not collection-specific attributes.

Applied to files:

  • src/Database/Database.php
📚 Learning: 2025-11-25T10:46:37.666Z
Learnt from: fogelito
Repo: utopia-php/database PR: 763
File: src/Database/Database.php:8671-8751
Timestamp: 2025-11-25T10:46:37.666Z
Learning: In Utopia\Database\Database::processRelationshipQueries, ensure the system select for '$id' is added only when the collection has relationship attributes (i.e., !empty($relationships)), and avoid capturing the unused '$idAdded' from Query::addSelect to satisfy static analysis.

Applied to files:

  • src/Database/Database.php
📚 Learning: 2025-10-16T09:37:33.531Z
Learnt from: fogelito
Repo: utopia-php/database PR: 733
File: src/Database/Adapter/MariaDB.php:1801-1806
Timestamp: 2025-10-16T09:37:33.531Z
Learning: In the MariaDB adapter (src/Database/Adapter/MariaDB.php), only duplicate `_uid` violations should throw `DuplicateException`. All other unique constraint violations, including `PRIMARY` key collisions on the internal `_id` field, should throw `UniqueException`. This is the intended design to distinguish between user-facing document duplicates and internal/user-defined unique constraint violations.

Applied to files:

  • src/Database/Database.php
🧬 Code graph analysis (1)
src/Database/Database.php (2)
src/Database/Document.php (2)
  • getAttribute (224-231)
  • setAttribute (244-261)
src/Database/Adapter.php (1)
  • getSharedTables (212-215)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (Schemaless/MongoDB)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (SharedTables/MongoDB)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (MongoDB)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Unit Test
  • GitHub Check: Agent

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@Meldiron Meldiron marked this pull request as draft January 15, 2026 14:04
@Meldiron
Copy link
Contributor Author

Do not merge, our understanding of why condition is for shared tables only isnt full, so we are avoiding this approach

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants