Conversation
Committed-by: Xiaoli Zhou from Dev container
|
|
||
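| ### 1.3 Graph semantics | ||
| NeuG stores the graph as a **directed graph** (CSR for outgoing edges, CSC for incoming edges). The algorithm layer wraps this storage with the semantics each algorithm requires: | ||
| | Graph semantics | Implementation | Applicable algorithms | |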
AI/GraphRAG algorithm count mismatch
The section header reads "AI/GraphRAG must-have algorithms (3)" but the table that follows contains only 2 algorithms: Leiden and Label Propagation. The header's count conflicts with both the table and the overall claim of "8 core algorithms" in §1.2 (6 classic + 2 AI = 8, so the AI section count should be 2, not 3).
Either a third algorithm is missing from the table, or the count in the header should be corrected to (2).
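| #### AI/GraphRAG must-have algorithms (2) | |
| | Algorithm | Graph semantics | Description | Output | Parallelization | | |
| | --- | --- | --- | --- | --- | | |
| | **Leiden** | Undirected | High-quality community detection (improves on Louvain) | `(node, community_id)` | Supported | | |
| | **Label Propagation** | Undirected | Fast community detection based on label propagation | `(node, label)` | Supported | |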
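The storage note in §1.3 (directed storage with CSR for outgoing edges and CSC for incoming edges, wrapped by the algorithm layer into directed or undirected semantics) can be illustrated with a minimal Python sketch. The names `build_csr` and `DirectedStore` are hypothetical and are not taken from the NeuG code base; the sketch only shows how an undirected view, as used by Leiden and Label Propagation, could be derived from the two directed indexes without copying the graph into a separate structure.

```python
from typing import List, Tuple


def build_csr(num_vertices: int, edges: List[Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Build a CSR index (offsets, targets) keyed by the first element of each edge."""
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    # Prefix sum turns per-vertex counts into start offsets.
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot for each vertex
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


class DirectedStore:
    """Hypothetical stand-in for a directed store with outgoing and incoming indexes."""

    def __init__(self, num_vertices: int, edges: List[Tuple[int, int]]) -> None:
        # Outgoing index: CSR over (src, dst).
        self.out_offsets, self.out_targets = build_csr(num_vertices, edges)
        # Incoming index: CSC of the adjacency matrix, i.e. CSR over reversed edges.
        reversed_edges = [(dst, src) for src, dst in edges]
        self.in_offsets, self.in_targets = build_csr(num_vertices, reversed_edges)

    def out_neighbors(self, v: int) -> List[int]:
        return self.out_targets[self.out_offsets[v]:self.out_offsets[v + 1]]

    def in_neighbors(self, v: int) -> List[int]:
        return self.in_targets[self.in_offsets[v]:self.in_offsets[v + 1]]

    def undirected_neighbors(self, v: int) -> List[int]:
        # Undirected semantics: the union of both directions, computed on demand.
        return self.out_neighbors(v) + self.in_neighbors(v)


store = DirectedStore(3, [(0, 1), (0, 2), (2, 1)])
```

A directed algorithm such as PageRank would read only `out_neighbors` (or `in_neighbors`), while an undirected one would iterate `undirected_neighbors`.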
| ```cypher | ||
| CALL project_graph( | ||
| <GRAPH_NAME>, |
Draft design note left in published spec
This line reads as an unfinished internal design note written in the first person ("I need to propose a change to how the algorithm is represented here. I think it should still be changed to..."), in which the author reasons through a design decision mid-document. It is not appropriate for a published specification and should be removed before merging.
| > **Note**: Windows is not supported. | ||
| > | ||
| ### 2.3 Extension lifecycle | ||
| ```plain |
Malformed platform support table
The table header declares only 2 columns (Platform | Architecture) but the separator row declares 3 (|------|------|--------|), making this an invalid Markdown table. Most renderers will either misrender it or collapse the columns incorrectly.
| | Platform | Architecture | | |
| |------|------| | |
| | Linux | x86_64 | | |
| | Linux | aarch64 (ARM64) | | |
| | macOS | arm64 (Apple Silicon) | | |
| | macOS | x86_64 | |
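The mismatch described in this comment (a 2-column header with a 3-column separator row) is easy to detect mechanically. The following is a minimal Python sketch of such a check; `count_cells` and `table_is_consistent` are hypothetical helper names, and the sketch assumes simple pipe tables with no escaped `\|` inside cells.

```python
from typing import List


def count_cells(row: str) -> int:
    """Count the cells in one pipe-delimited Markdown table row."""
    stripped = row.strip()
    # Leading and trailing pipes are optional delimiters, not extra cells.
    if stripped.startswith("|"):
        stripped = stripped[1:]
    if stripped.endswith("|"):
        stripped = stripped[:-1]
    return len(stripped.split("|"))


def table_is_consistent(lines: List[str]) -> bool:
    """Return True when every non-blank row has the same number of cells."""
    counts = [count_cells(line) for line in lines if line.strip()]
    return len(set(counts)) <= 1


broken_table = [
    "| Platform | Architecture |",
    "|------|------|--------|",
    "| Linux | x86_64 |",
]
fixed_table = [
    "| Platform | Architecture |",
    "|------|------|",
    "| Linux | x86_64 |",
]
```

Running `table_is_consistent` over each table in a spec before merging would flag the broken form and accept the corrected one.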
specs/004-gds/spec.md
Outdated
| | Column | Type | Description | | ||
| | --- | --- | --- | | ||
| | `node` | Any | Target vertex identifier | | ||
| | `distance` | Float | Shortest distance from the source to this node | | ||
| | `path` | List | List of nodes on the shortest path (optional) | | ||
| **Cypher example**: | ||
| ```cypher | ||
| CALL project_graph('station_graph', {'Station': 'true'}, {'CONNECTED': 'true'}); |
BFS and weightless Shortest Path are conflated
In §1.4.3 the spec says `weight_property: null` makes `shortest_path` "equivalent to BFS". However, §1.4.5 defines `bfs` as a fully independent procedure with its own `max_depth` parameter and different semantics (hop count vs. distance). This creates ambiguity:
- Are `bfs` and `shortest_path` with `weight_property: null` truly interchangeable?
- If so, is `bfs` just a convenience alias, or does it add capability (`max_depth`) not present in `shortest_path`?
The spec should explicitly clarify the relationship: for example, whether `shortest_path(..., {weight_property: null})` also supports `max_depth`, or whether the two procedures remain distinct despite the note.
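To make the ambiguity concrete, here is a minimal Python sketch of the two behaviours. It is an illustration under stated assumptions, not the spec's implementation: the function names mirror the procedure names, `use_weights=False` is a hypothetical stand-in for `weight_property: null`, and the example graph is invented. With weights ignored, both functions return the same hop distances, but only `bfs` can stop at `max_depth`.

```python
import heapq
from collections import deque
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # vertex -> [(neighbor, weight)]


def bfs(graph: Graph, source: str, max_depth: Optional[int] = None) -> Dict[str, int]:
    """Hop counts from source, optionally limited to max_depth hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if max_depth is not None and dist[u] >= max_depth:
            continue  # do not expand past the depth limit
        for v, _weight in graph.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def shortest_path(graph: Graph, source: str, use_weights: bool = True) -> Dict[str, float]:
    """Dijkstra distances; with use_weights=False every edge costs 1."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, weight in graph.get(u, []):
            candidate = d + (weight if use_weights else 1.0)
            if v not in dist or candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


example_graph: Graph = {
    "a": [("b", 5.0), ("c", 1.0)],
    "c": [("b", 1.0)],
    "b": [("d", 1.0)],
}
```

On this graph the unweighted `shortest_path` and the unbounded `bfs` agree on every vertex, while `bfs(example_graph, "a", max_depth=1)` omits `d`. That difference is exactly what the spec would need to state if the two procedures are meant to stay distinct.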
| | Performance optimization | Algorithm-level optimization | | ||
| --- No newline at end of file |
Missing newline at end of file
The file is missing a trailing newline, as indicated by \ No newline at end of file in the diff. This is a POSIX requirement for text files and can cause issues with certain tools.
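A check for this condition is small enough to run in a pre-commit step. The sketch below is one possible way to do it in Python; the helper names are hypothetical, and it operates on raw bytes so that it makes no assumption about the file's encoding.

```python
def ends_with_newline(data: bytes) -> bool:
    """Return True if the data is empty or its last byte is a newline."""
    return len(data) == 0 or data.endswith(b"\n")


def ensure_trailing_newline(data: bytes) -> bytes:
    """Append a single newline only when one is missing."""
    return data if ends_with_newline(data) else data + b"\n"
```

For a file on disk, the same functions can be applied to the result of `open(path, "rb").read()`.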
specs/004-gds/spec.md
Outdated
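| ### 1.2 V1 algorithm list | ||
| The first version supports **8 core algorithms** in two categories (aliases such as BFS and LCC are listed in the detailed descriptions): | ||
| #### Classic graph algorithms (6) |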
Please also spell out how each algorithm maps to the corresponding LDBC Graphalytics algorithm.
The spec should also add a requirement that, once the implementation is complete, we benchmark it against competitors including Kuzu, Ladybug DB, and Neo4j on an LDBC Graphalytics dataset, and that it outperforms them.
specs/004-gds/spec.md
Outdated
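| | **PageRank** | Directed | Computes an importance score for each node | `(node, rank)` | Supported | | ||
| | **Shortest Path (Dijkstra)** | Directed | Single-source shortest path | `(node, distance, path)` | Not supported | | ||
| | **Connected Components** | Undirected | Weakly connected component detection (alias WCC) | `(node, component_id)` | Supported | | ||
| | **Breadth-First Search (BFS)** | Directed | Breadth-first traversal from a source vertex, expanding level by level | `(node, distance)` | Not supported | |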
Why is parallelization marked as not supported for shortest path and BFS? We don't actually need to assume that shortest path must be Dijkstra's algorithm, do we? In principle we should pick whichever algorithm performs best.
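As an illustration of the reviewer's point, BFS has a well-known level-synchronous formulation in which all vertices of the current frontier can be expanded independently. The Python sketch below shows that structure only; `parallel_bfs` is a hypothetical name, the thread pool is used to mark which step is parallelizable (Python threads will not give a real speedup here), and nothing in it is taken from the NeuG implementation. For weighted graphs, delta-stepping is one known parallel alternative to Dijkstra; whether it suits NeuG is not something this PR establishes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def parallel_bfs(graph: Dict[int, List[int]], source: int, workers: int = 4) -> Dict[int, int]:
    """Level-synchronous BFS: expand the whole frontier, then merge, level by level."""
    dist = {source: 0}
    frontier = [source]
    level = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            level += 1
            # Parallelizable step: each frontier vertex reads its own adjacency list.
            neighbor_lists = list(pool.map(lambda u: graph.get(u, []), frontier))
            # Merge step (sequential here): claim unvisited vertices for this level.
            next_frontier = []
            for neighbors in neighbor_lists:
                for v in neighbors:
                    if v not in dist:
                        dist[v] = level
                        next_frontier.append(v)
            frontier = next_frontier
    return dist


diamond = {0: [1, 2], 1: [3], 2: [3], 3: []}
```

In a native implementation the merge step would typically use atomic updates or per-thread buffers instead of a sequential loop, which is where most of the engineering effort lies.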
Committed-by: Xiaoli Zhou from Dev container
specs/004-gds/spec.md
Outdated
| ```cypher | ||
| -- Project the subgraph first, then run the algorithm | ||
| CALL project_graph('my_graph', {'Person': 'n.name <> "Ira"'}, {'KNOWS': 'r.id < 3'}); |
Committed-by: Xiaoli Zhou from Dev container
Committed-by: Xiaoli Zhou from Dev container

Committed-by: Xiaoli Zhou from Dev container
What do these changes do?
as titled.
Related issue number
Fixes
Greptile Summary
This PR adds a new Graph Data Science (GDS) design specification (`specs/004-gds/spec.md`) that covers the full stack from product requirements (8 core graph algorithms), user-facing Cypher API (`INSTALL`/`LOAD EXTENSION`, `project_graph`, `CALL algo`), C++ implementation structures (`ProjectedSubgraph`, `GDSGraph`, `GDSAlgo` physical plan), developer extension API, and a prioritised roadmap.

Key issues found during review:
- (2) or a missing third algorithm should be added.
- `shortest_path` with `weight_property: null` is "equivalent to BFS", yet §1.4.5 defines `bfs` as a separate procedure with an additional `max_depth` parameter; the spec should clarify whether these are aliases or truly distinct procedures.

Confidence Score: 3/5
Important Files Changed
`bfs` and `shortest_path` with no weight.

Sequence Diagram
sequenceDiagram
    participant User
    participant NeuG as NeuG (Cypher Engine)
    participant ExtReg as Extension Registry
    participant OSS as OSS / Local FS
    participant SubgraphCtx as Session Subgraph Context
    participant GDSAlgo as GDS Algorithm
    User->>NeuG: INSTALL EXTENSION 'gds'
    NeuG->>OSS: Download libjson.neug_extension for platform
    OSS-->>NeuG: .so file
    NeuG->>ExtReg: Register extension metadata
    User->>NeuG: LOAD EXTENSION 'gds'
    NeuG->>ExtReg: dlopen() → call Init()
    ExtReg-->>NeuG: Functions registered in catalog
    User->>NeuG: CALL project_graph('g', {Person:'true'}, {KNOWS:'true'})
    NeuG->>SubgraphCtx: Store ProjectedSubgraph (labels + predicates, no data copy)
    SubgraphCtx-->>NeuG: OK
    User->>NeuG: CALL k_core('g', {min_k:3}) YIELD node, core_number
    NeuG->>SubgraphCtx: Lookup ProjectedSubgraph by name 'g'
    SubgraphCtx-->>NeuG: VertexEntries + EdgeEntries
    NeuG->>NeuG: Compile to GDSAlgo physical plan (bind label_ids + Expression predicates)
    NeuG->>GDSAlgo: Execute with GDSGraph (scan full graph, apply predicates at runtime)
    GDSAlgo-->>NeuG: (node, core_number) tuples
    NeuG-->>User: Result set

Last reviewed commit: be1533e