feat: Support COPY FROM without prior DDL #134
shirly121 wants to merge 14 commits into alibaba:main
Resolve vertex and edge NameOrId against the live schema in BatchInsertVertexOpr and BatchInsertEdgeOpr Eval instead of fixing IDs at plan build time, so COPY can run when labels are resolved later. Extend bind_copy_from, gopt conversion, and plan_copy plumbing to carry physical EdgeType / NameOrId through the pipeline. Add Python tests for import/export paths that exercise the behavior.
Committed-by: Xiaoli Zhou from Dev container
Committed-by: Xiaoli Zhou from Dev container
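The deferred resolution described in the summary above can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in: `Schema`, `LabelId`, the `NameOrId` alias, and `resolve_vertex_label` are assumptions for illustration, not the actual NeuG types or the real `Eval` code. The only point is that a label carried by name is looked up when the operator runs, so it can refer to a label created earlier in the same plan.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <variant>

// Hypothetical stand-ins; the real NeuG types are not shown on this page.
using LabelId = int;
using NameOrId = std::variant<std::string, LabelId>;

struct Schema {
  std::unordered_map<std::string, LabelId> vertex_labels;

  // Registers a label and returns its id (existing id if already present).
  LabelId add_vertex_label(const std::string& name) {
    auto next_id = static_cast<LabelId>(vertex_labels.size());
    auto [it, inserted] = vertex_labels.emplace(name, next_id);
    (void)inserted;
    return it->second;
  }
};

// Resolve at execution time against the schema as it exists right now,
// instead of baking an id into the plan when it is built.
LabelId resolve_vertex_label(const Schema& schema, const NameOrId& label) {
  if (const auto* id = std::get_if<LabelId>(&label)) {
    return *id;  // already an id, nothing to look up
  }
  const auto& name = std::get<std::string>(label);
  auto it = schema.vertex_labels.find(name);
  if (it == schema.vertex_labels.end()) {
    throw std::runtime_error("unknown vertex label: " + name);
  }
  return it->second;
}
```

In this sketch a plan can hold `NameOrId{std::string("person")}` before the label exists; resolution only succeeds once a schema-creating step has run, which is the ordering the no-schema COPY flow relies on.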
…ring
- import_data.md: document auto_detect, node/edge inference rules, import order, limitations; anchor from intro; extend troubleshooting.
- spec.md: link Module 5 to import_data.md; fix P2 module/FM numbering (M6–M13, FR-601–1305) and dependency blurbs to match Priority Overview.
Committed-by: Xiaoli Zhou from Dev container
Review Summary by Qodo
Support COPY FROM without prior DDL via deferred label resolution
Walkthroughs
Description
• Defer vertex/edge label resolution to execution time for schema-less COPY
• Support COPY FROM without prior DDL via auto-detect and schema inference
• Add DDLVertexInfo/DDLEdgeInfo structures for inferred table metadata
• Extend physical plan with NameOrId for deferred label resolution
• Add comprehensive Python tests for no-schema import/export workflows
Diagram
flowchart LR
A["COPY Statement<br/>No Schema"] -->|Binder| B["DDLVertexInfo/<br/>DDLEdgeInfo"]
B -->|Planner| C["LogicalCreateTable<br/>+ LogicalCopyFrom"]
C -->|GOPT| D["CreateVertexSchema/<br/>CreateEdgeSchema<br/>+ BatchInsert*"]
D -->|Execution| E["Resolve NameOrId<br/>at Runtime<br/>+ Insert Data"]
File Changes
1. src/compiler/binder/bind/copy/bind_copy_from.cpp
Code Review by Qodo
struct NEUG_API BoundCopyFromInfo {
  // Table entry to copy into.
  catalog::TableCatalogEntry* tableEntry;
1. BoundCopyFromInfo lacks ddl_required 📎 Requirement gap ✓ Correctness
BoundCopyFromInfo does not include the required ddl_required flag, so downstream planning/execution cannot explicitly determine when schema-creating DDL must be prepended for inferred COPY. This violates the binder contract required for the no-schema COPY flow.
Agent Prompt
## Issue description
`BoundCopyFromInfo` is required to expose an explicit `ddl_required` flag for the no-schema COPY flow, but the current implementation only adds `ddlTableInfo` and uses its presence as an implicit signal.
## Issue Context
Compliance requires `ddl_required` to be present and set appropriately so planner/execution can reliably decide when to prepend Create*Schema before BatchInsert*.
## Fix Focus Areas
- include/neug/compiler/binder/copy/bound_copy_from.h[95-156]
- src/compiler/binder/bind/copy/bind_copy_from.cpp[168-186]
- src/compiler/binder/bind/copy/bind_copy_from.cpp[293-313]
- src/compiler/binder/bind/copy/bind_copy_from.cpp[382-477]
- src/compiler/planner/plan/plan_copy.cpp[71-155]
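One way to read the request above is sketched below. This is not the PR's code: `BoundCopyFromInfoSketch`, `DDLTableInfo`, `planCopy`, and the operator names as strings are hypothetical simplifications. It only contrasts an explicit `ddl_required` flag with relying on whether `ddlTableInfo` happens to be non-null.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the binder and planner types.
struct DDLTableInfo {
  std::string tableName;
};

struct BoundCopyFromInfoSketch {
  std::string tableName;
  std::shared_ptr<DDLTableInfo> ddlTableInfo;  // inferred metadata, may be null
  bool ddl_required = false;                   // explicit signal for the planner
};

// Returns operator names in execution order. When DDL is required, the
// schema-creating step is placed before the batch insert.
std::vector<std::string> planCopy(const BoundCopyFromInfoSketch& info) {
  std::vector<std::string> ops;
  if (info.ddl_required) {
    ops.push_back("CreateVertexSchema");
  }
  ops.push_back("BatchInsertVertex");
  return ops;
}
```

With an explicit flag the planner's decision no longer depends on a pointer being set as a side effect; whether that is worth the extra field is the judgment call this review item raises.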
…ogging
- Change brace-init {} to parentheses () for std::vector<ColumnEvaluateType>
in bindCopyRelFrom and bindCopyRelFromNoSchema to avoid compile error
(brace-init selects initializer_list ctor, size_t cannot convert to enum class).
- Remove leftover LOG(INFO) debug statements in gopt_planner.cc and
g_query_converter.cpp; restore VLOG(10) for physical plan output.
- Remove commented-out code in DDLVertexInfo constructor.
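For readers unfamiliar with the brace-versus-parentheses distinction mentioned in the first item, here is a small sketch. The enumerators of `ColumnEvaluateType` and the helper names are assumptions; the real definitions are not shown on this page. The sketch does not try to reproduce the compile error the commit reports. It only shows the parentheses form, which always selects the (count, value) constructor, and the well-known `std::vector<int>` case where braces and parentheses give different results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical enumerators; the real ColumnEvaluateType is not shown here.
enum class ColumnEvaluateType { REFERENCE, DEFAULT, CAST };

// Parentheses select the (count, value) constructor:
// numColumns copies of REFERENCE.
std::vector<ColumnEvaluateType> makeEvaluateTypes(std::size_t numColumns) {
  return std::vector<ColumnEvaluateType>(numColumns,
                                         ColumnEvaluateType::REFERENCE);
}

// Braces prefer the initializer_list constructor when it is viable:
// this is a two-element vector holding 3 and 1.
std::vector<int> bracedInts() { return std::vector<int>{3, 1}; }

// Parentheses: three elements, each equal to 1.
std::vector<int> parenInts() { return std::vector<int>(3, 1); }
```

Using parentheses for "n copies of a value" keeps the intent unambiguous regardless of the element type.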
doc/source/data_io/import_data.md
Outdated
```cypher
COPY LegacyUser FROM "users.csv" (header=true, auto_detect=false); -- require table to exist
```

Parser option names are matched case-insensitively; `AUTO_DETECT` is accepted as well.
This should be stated once in a general place. There is no need to mention it separately here.
doc/source/data_io/import_data.md
Outdated
```cypher
COPY LegacyUser FROM "users.csv" (header=true, auto_detect=true);
COPY LegacyUser FROM "users.csv" (header=true, auto_detect=false); -- require table to exist
```
Why does this example use the name LegacyUser?
specs/001-ap-temp-graph/spec.md
Outdated
Example:

```cypher
COPY person FROM 'person.csv' (auto_detect=true);
```
CSV, JSON, and Parquet are all supported. I will add some more tests and documentation.
test(python): add no-schema extension tests for JSON, JSONL, and Parquet
Committed-by: Xiaoli Zhou from Dev container
What do these changes do?
Related issue number
Fixes #120