Fixed drop column issue. #3112
Pull request overview
This PR fixes a critical bug where dropping a column from a deeplake table caused subsequent queries to fail with "cache lookup failed for type 0". The fix filters out dropped columns when building table metadata, ensuring only active columns are processed.
Changes:
- Modified table_data to track only active (non-dropped) columns via an active_column_indices_ vector that maps logical column indices to TupleDesc indices
- Added comprehensive test coverage for DROP COLUMN scenarios including multiple drops, reconnection after drop, and array type columns
- Updated process_utility to reload table metadata after DROP COLUMN operations
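
The index mapping described in the first item can be sketched as follows. This is a minimal, standalone illustration and not the PR's actual code: `MockAttribute` and `build_active_column_indices` are hypothetical names, and the struct models only two fields of a PostgreSQL attribute entry (a dropped flag mirroring `attisdropped`, and a type OID, which PostgreSQL resets to 0 for dropped columns, consistent with the "type 0" in the error message).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one attribute entry of a tuple descriptor.
// Only the fields relevant to this sketch are modeled.
struct MockAttribute {
    std::string name;
    unsigned int type_oid;  // 0 once the column has been dropped
    bool is_dropped;        // mirrors PostgreSQL's attisdropped flag
};

// Builds the logical-to-physical mapping: element k of the result is the
// physical index (position in attrs) of the k-th active column.
// Dropped entries are skipped, so their type OID of 0 is never looked up.
std::vector<std::size_t> build_active_column_indices(
    const std::vector<MockAttribute>& attrs)
{
    std::vector<std::size_t> active;
    active.reserve(attrs.size());
    for (std::size_t i = 0; i < attrs.size(); ++i) {
        if (!attrs[i].is_dropped) {
            active.push_back(i);
        }
    }
    return active;
}
```

With three physical attributes where the middle one has been dropped, the resulting mapping holds the physical indices 0 and 2, so logical column 1 resolves to physical index 2.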
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| postgres/tests/py_tests/test_drop_column_type_error.py | New test suite validating DROP COLUMN functionality across various scenarios |
| cpp/deeplake_pg/table_data_impl.hpp | Core fix: filters dropped columns during initialization and maps logical indices to physical TupleDesc indices |
| cpp/deeplake_pg/table_data.hpp | Added is_column_dropped() method and active_column_indices_ member variable |
| cpp/deeplake_pg/table_am.cpp | Reordered includes alphabetically and applied code formatting |
| cpp/deeplake_pg/extension_init.cpp | Added DROP COLUMN handler to reload table metadata and applied code formatting |
| cpp/deeplake_pg/duckdb_deeplake_scan.cpp | Applied code formatting and added comment about active columns |
    pg::table_storage::instance().erase_table(table_name);
    pg::table_storage::instance().force_load_table_metadata();
    elog(INFO, "Reloaded table_data after DROP COLUMN for table '%s'", table_name.c_str());
    return; // Exit early after reloading table
Copilot
AI
Jan 14, 2026
The comment 'Exit early after reloading table' is misleading: the function returns at line 997, but the code after this block (lines 998-1000) still executes in other ALTER TABLE cases. Consider revising it to 'Exit processing for DROP COLUMN after reloading table' or similar, to clarify that this return only exits the DROP COLUMN case.
Suggested change:

    - return; // Exit early after reloading table
    + return; // Exit processing for DROP COLUMN after reloading table
    inline bool table_data::is_column_dropped(AttrNumber attr_num) const noexcept
    {
        // Active columns are never dropped (dropped columns are filtered out)
Copilot
AI
Jan 14, 2026
The is_column_dropped() function unconditionally returns false and ignores its parameter. This creates a confusing API - callers might expect this to check if a column is dropped, but it always returns false. Consider either removing this function if it's not needed, or documenting why it exists (e.g., if it's required to satisfy an interface) and renaming it to something like is_active_column_dropped() to clarify it only operates on active columns.
Suggested change:

    - // Active columns are never dropped (dropped columns are filtered out)
    + // Active columns are never dropped: table_data only exposes non-dropped columns and
    + // filters out dropped ones when building active_column_indices_. Therefore, for any
    + // attr_num that reaches this function, the corresponding column is guaranteed to be
    + // active and this function will always return false.
    + //
    + // The attr_num parameter is retained to keep the interface consistent with callers
    + // and related helpers, but it is not used in this implementation.
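
As an alternative to documenting an always-false function, the check could be answered from the stored mapping itself. The sketch below is only an illustration of that idea, not the PR's code or the reviewer's suggestion: `MockTableData` and `is_physical_column_active` are hypothetical names, and it assumes that `attr_num` is a 1-based physical attribute number and that the stored indices are sorted in ascending order (which holds if they are collected in a single pass over the descriptor).

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for table_data. It stores only the
// physical indices of active (non-dropped) columns, in ascending order.
class MockTableData {
public:
    explicit MockTableData(std::vector<std::size_t> active_physical_indices)
        : active_column_indices_(std::move(active_physical_indices)) {}

    // Returns true if the 1-based physical attribute number refers to a
    // column that survived filtering. The parameter is actually used, so
    // the function's name and behavior agree.
    bool is_physical_column_active(int attr_num) const noexcept
    {
        if (attr_num < 1) {
            return false;  // system or invalid attribute numbers
        }
        const auto physical = static_cast<std::size_t>(attr_num - 1);
        // Binary search is valid because the indices are sorted ascending.
        return std::binary_search(active_column_indices_.begin(),
                                  active_column_indices_.end(),
                                  physical);
    }

private:
    std::vector<std::size_t> active_column_indices_;
};
```

Whether such a lookup is worth having depends on whether any caller ever passes a physical attribute number; if every caller works with logical indices only, removing the function, as the review suggests, is the simpler option.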