-
Notifications
You must be signed in to change notification settings - Fork 54
Fix: Double metadata while migrating #787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAttribute creation now reads existing metadata, deduplicates attributes by ID, appends new attribute documents in-memory, and writes the full attributes array once for both single and bulk creation paths. Two e2e tests were added to validate idempotent behavior during migrating/shared-table scenarios. Changes
Sequence Diagram(s)(omitted — changes are localized to DB logic and tests and do not involve a multi-actor sequential flow that requires visualization) Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
✏️ Tip: You can disable this entire section by setting Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a bug where duplicate attributes were being added to collection metadata during migration scenarios when attribute creation was re-run multiple times. The fix ensures that only one attribute of each name exists in metadata.
Changes:
- Modified
createAttributeandcreateAttributesmethods to filter out duplicate attributes before appending to collection metadata - Added comprehensive test coverage for the singular
createAttributemethod during migration - Updated
open-telemetry/exporter-otlpdependency from version 1.3.3 to 1.3.4
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/Database/Database.php | Implemented deduplication logic in createAttribute and createAttributes methods to prevent duplicate attributes in metadata |
| tests/e2e/Adapter/Scopes/AttributeTests.php | Added test case verifying that repeated attribute creation during migration doesn't create duplicates |
| composer.lock | Updated open-telemetry/exporter-otlp dependency version and checksums |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/Database/Database.php`:
- Around line 2050-2054: The dedupe filter compares against the raw $id but
stored attributes use normalized IDs (ID::custom($id)), so compute the
normalized id the same way as validateAttribute (e.g. $normalizedId =
ID::custom($id)) and use that in the array_filter comparison (and null-check
$attr['$id'] before comparing) so duplicates are removed using the
normalized/case-folded ID rather than the raw $id.
🧹 Nitpick comments (1)
src/Database/Database.php (1)
2176-2186: Bulk dedupe: use strict membership + avoid shadowing$attributesparam (and consider intra-batch uniqueness)
\in_array()should be strict (true) to avoid loose comparisons.- Reusing
$attributesas both the method parameter and the “existing metadata attributes” array is confusing.- If the input
$attributescontains duplicate'$id'values,$attributeDocumentscan still contain duplicates and get appended twice.Proposed fix (minimal + strict + clearer names)
- // Append attribute, removing any duplicates - $attributes = $collection->getAttribute('attributes', []); - $newAttributeIds = \array_map(fn ($attr) => $attr['$id'], $attributeDocuments); - $attributes = \array_filter($attributes, fn ($attr) => !\in_array($attr['$id'], $newAttributeIds)); + // Append attributes, removing any duplicates + $existingAttributes = $collection->getAttribute('attributes', []); + $newAttributeIds = \array_map(fn ($attr) => $attr['$id'], $attributeDocuments); + $existingAttributes = \array_filter( + $existingAttributes, + fn ($attr) => !\in_array($attr['$id'], $newAttributeIds, true) + ); foreach ($attributeDocuments as $attributeDocument) { - $attributes[] = $attributeDocument; + $existingAttributes[] = $attributeDocument; } - $collection->setAttribute('attributes', \array_values($attributes)); + $collection->setAttribute('attributes', \array_values($existingAttributes));
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
composer.lockis excluded by!**/*.lock
📒 Files selected for processing (2)
src/Database/Database.phptests/e2e/Adapter/Scopes/AttributeTests.php
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.
Applied to files:
tests/e2e/Adapter/Scopes/AttributeTests.php
📚 Learning: 2025-10-29T12:27:57.071Z
Learnt from: ArnabChatterjee20k
Repo: utopia-php/database PR: 747
File: src/Database/Adapter/Mongo.php:1449-1453
Timestamp: 2025-10-29T12:27:57.071Z
Learning: In src/Database/Adapter/Mongo.php, when getSupportForAttributes() returns false (schemaless mode), the updateDocument method intentionally uses a raw document without $set operator for replacement-style updates, as confirmed by the repository maintainer ArnabChatterjee20k.
Applied to files:
src/Database/Database.php
📚 Learning: 2025-11-25T10:46:37.666Z
Learnt from: fogelito
Repo: utopia-php/database PR: 763
File: src/Database/Database.php:8671-8751
Timestamp: 2025-11-25T10:46:37.666Z
Learning: In Utopia\Database\Database::processRelationshipQueries, ensure the system select for '$id' is added only when the collection has relationship attributes (i.e., !empty($relationships)), and avoid capturing the unused '$idAdded' from Query::addSelect to satisfy static analysis.
Applied to files:
src/Database/Database.php
🧬 Code graph analysis (1)
src/Database/Database.php (1)
src/Database/Document.php (2)
getAttribute(224-231)setAttribute(244-261)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Setup & Build Docker Image
🔇 Additional comments (1)
tests/e2e/Adapter/Scopes/AttributeTests.php (1)
2199-2243: LGTM!The test properly validates the duplicate-prevention logic during migrations. Good coverage with three consecutive attribute creation attempts, verifying that only one attribute entry exists after each call. The test correctly:
- Guards for shared tables support
- Enables/disables migration mode appropriately
- Cleans up resources after completion
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (2)
src/Database/Database.php:4177
- Similar to the attribute creation issue that was fixed, index creation during migrations could also result in duplicate indexes in metadata. The same fix should be applied here: filter out existing indexes by ID before appending to prevent duplicates during migration re-runs.
$collection->setAttribute('indexes', $index, Document::SET_TYPE_APPEND);
src/Database/Database.php:3498
- Relationship attributes are appended using SET_TYPE_APPEND without checking for duplicates. During migrations, this could lead to duplicate relationship attributes in metadata. Consider applying the same duplicate-prevention pattern used in createAttribute().
$collection->setAttribute('attributes', $relationship, Document::SET_TYPE_APPEND);
$relatedCollection->setAttribute('attributes', $twoWayRelationship, Document::SET_TYPE_APPEND);
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This reverts commit 3923367.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 3 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/Database/Database.php (1)
2249-2266: Post-create metadata dedupe is case-sensitive but validateAttribute uses case-insensitive comparison—migration reruns can accumulate case-variant duplicates.When migration skips duplicate checks via
$isLoose = true, the post-create dedupe at lines 2050–2053 and 2176–2186 must handle all ID-equivalence rules. However, post-create dedupe uses exact string comparison ($attribute['$id'] !== $id), whilevalidateAttributeuses case-insensitive comparison (\strtolower()). On reruns, old metadata with "MyAttr" won't be removed if the new attribute is "myattr", leaving duplicates. Additionally, schema validation appliesadapter.filter()but metadata validation doesn't, creating inconsistent normalization rules.Update post-create dedupe to use case-insensitive comparison and apply
adapter.filter()consistently with the validation checks.
♻️ Duplicate comments (2)
src/Database/Database.php (2)
2050-2054: Fix attribute metadata dedupe to use normalized, case-insensitive IDs (current compare can still allow duplicates).Right now Line 2052 compares
'$id' !== $id(raw, case-sensitive) while metadata IDs are treated case-insensitively (Line 2246) and the new attribute ID is created viaID::custom($id)(Line 2280). This can miss duplicates and defeats the PR objective in edge cases.Proposed fix
// Append attribute, removing any duplicates $metadataAttributes = $collection->getAttribute('attributes', []); - $metadataAttributes = \array_filter($metadataAttributes, fn ($attribute) => $attribute['$id'] !== $id); + $normalizedId = $attribute->getId(); // already ID::custom($id) + $metadataAttributes = \array_filter($metadataAttributes, function ($existing) use ($normalizedId) { + $existingId = $existing instanceof Document + ? $existing->getId() + : ($existing['$id'] ?? null); + + return $existingId === null + || \strtolower($existingId) !== \strtolower($normalizedId); + }); $metadataAttributes[] = $attribute; $collection->setAttribute('attributes', \array_values($metadataAttributes));
2176-2186: Batch metadata dedupe: avoid O(n*m) + apply case-insensitive/strict matching.Line 2179 uses
in_array()insidearray_filter()(O(n*m)) and is case-sensitive/non-strict. Since attribute IDs are case-insensitive (Line 2246), this can also miss duplicates in migration reruns.Proposed fix
// Append attribute, removing any duplicates - $newAttributeIds = \array_map(fn ($attr) => $attr['$id'], $attributeDocuments); + $newAttributeIds = \array_map(fn (Document $attr) => $attr->getId(), $attributeDocuments); + $newAttributeIdsMap = \array_fill_keys(\array_map('strtolower', $newAttributeIds), true); $metadataAttributes = $collection->getAttribute('attributes', []); - $metadataAttributes = \array_filter($metadataAttributes, fn ($attr) => !\in_array($attr['$id'], $newAttributeIds)); + $metadataAttributes = \array_filter($metadataAttributes, function ($existing) use ($newAttributeIdsMap) { + $existingId = $existing instanceof Document + ? $existing->getId() + : ($existing['$id'] ?? null); + + return $existingId === null + || !isset($newAttributeIdsMap[\strtolower($existingId)]); + }); - foreach ($attributeDocuments as $attributeDocument) { - $metadataAttributes[] = $attributeDocument; - } - - $collection->setAttribute('attributes', \array_values($metadataAttributes)); + $metadataAttributes = \array_merge(\array_values($metadataAttributes), $attributeDocuments); + $collection->setAttribute('attributes', $metadataAttributes);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Database/Database.php
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.
Applied to files:
src/Database/Database.php
📚 Learning: 2025-07-30T19:17:53.630Z
Learnt from: ArnabChatterjee20k
Repo: utopia-php/database PR: 642
File: src/Database/Validator/PartialStructure.php:43-52
Timestamp: 2025-07-30T19:17:53.630Z
Learning: In PartialStructure validator, when filtering for required attributes validation using the $requiredAttributes parameter, $this->attributes should be used instead of the merged $attributes array because this validation is specifically for internal attributes like $createdAt and $updatedAt that are defined in the base Structure class, not collection-specific attributes.
Applied to files:
src/Database/Database.php
📚 Learning: 2025-11-25T10:46:37.666Z
Learnt from: fogelito
Repo: utopia-php/database PR: 763
File: src/Database/Database.php:8671-8751
Timestamp: 2025-11-25T10:46:37.666Z
Learning: In Utopia\Database\Database::processRelationshipQueries, ensure the system select for '$id' is added only when the collection has relationship attributes (i.e., !empty($relationships)), and avoid capturing the unused '$idAdded' from Query::addSelect to satisfy static analysis.
Applied to files:
src/Database/Database.php
📚 Learning: 2025-10-16T09:37:33.531Z
Learnt from: fogelito
Repo: utopia-php/database PR: 733
File: src/Database/Adapter/MariaDB.php:1801-1806
Timestamp: 2025-10-16T09:37:33.531Z
Learning: In the MariaDB adapter (src/Database/Adapter/MariaDB.php), only duplicate `_uid` violations should throw `DuplicateException`. All other unique constraint violations, including `PRIMARY` key collisions on the internal `_id` field, should throw `UniqueException`. This is the intended design to distinguish between user-facing document duplicates and internal/user-defined unique constraint violations.
Applied to files:
src/Database/Database.php
🧬 Code graph analysis (1)
src/Database/Database.php (2)
src/Database/Document.php (2)
getAttribute(224-231)setAttribute(244-261)src/Database/Adapter.php (1)
getSharedTables(212-215)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: Adapter Tests (SharedTables/SQLite)
- GitHub Check: Adapter Tests (SharedTables/Postgres)
- GitHub Check: Adapter Tests (Schemaless/MongoDB)
- GitHub Check: Adapter Tests (Postgres)
- GitHub Check: Adapter Tests (SharedTables/MongoDB)
- GitHub Check: Adapter Tests (SharedTables/MariaDB)
- GitHub Check: Adapter Tests (Pool)
- GitHub Check: Adapter Tests (SharedTables/MySQL)
- GitHub Check: Adapter Tests (MariaDB)
- GitHub Check: Adapter Tests (Mirror)
- GitHub Check: Adapter Tests (MongoDB)
- GitHub Check: Adapter Tests (SQLite)
- GitHub Check: Adapter Tests (MySQL)
- GitHub Check: Unit Test
- GitHub Check: Agent
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
|
Do not merge, our understanding of why condition is for shared tables only isnt full, so we are avoiding this approach |
Appending during migration happens multiple times when re-running. Changes here prevent that, ensuring only one attribute exist in metadata of same name
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.