Embedding Tool-Use Intent into Biomedical Knowledge Graphs for Reliable Agentic Reasoning
Biomedical agents retrieve medical knowledge and external tools through separate mechanisms. When a clinical question does not share vocabulary with tool API descriptions, embedding-based tool search fails to surface relevant tools, even when they exist in the catalog. This project proposes Action-Augmented Knowledge Graphs, a unified representation that embeds biomedical tools as first-class nodes in a medical knowledge graph, linked to disease, medication, and procedure entities through similarity edges. Using an adapted TxAgent harness with general-purpose models, we compare three tool retrieval strategies across 200 biomedical questions: ToolSearch (embedding-only baseline), ToolGraph (graph-traversal-based), and CypherGraph (LLM-driven Cypher exploration).
LLMs face a "Tool Selection Gap" in complex ecosystems like ToolUniverse (211 biomedical APIs): semantic search often fails to identify the correct diagnostic tool when the user's query does not textually match a tool's description, even if a direct biological relationship exists. Current systems maintain separate silos for data retrieval and tool discovery, preventing agents from seeing valid actions available in the local neighborhood of a disease or gene.
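To make the idea of tools as first-class graph nodes concrete, here is a minimal sketch of how similarity edges between entity nodes and tool nodes could be derived. It is not the project's actual pipeline: the toy vectors, the tool names (`drug_label_lookup`, `surgical_procedure_codes`), the `link_tools` helper, and the 0.8 threshold are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_tools(entity_vecs, tool_vecs, threshold=0.8):
    """Return {entity: [(tool, score), ...]} for pairs at or above the threshold.

    The threshold is a hypothetical value, not one taken from the project.
    """
    edges = defaultdict(list)
    for tool, tool_vec in tool_vecs.items():
        for entity, entity_vec in entity_vecs.items():
            score = cosine(tool_vec, entity_vec)
            if score >= threshold:
                edges[entity].append((tool, score))
    # Strongest edges first, so traversal can take the top candidates.
    for entity in edges:
        edges[entity].sort(key=lambda pair: pair[1], reverse=True)
    return dict(edges)


# Toy 3-dimensional embeddings (illustrative only).
entity_vecs = {
    "type 2 diabetes": [1.0, 0.0, 0.2],
    "metformin": [0.9, 0.1, 0.3],
    "appendectomy": [0.0, 1.0, 0.0],
}
tool_vecs = {
    "drug_label_lookup": [0.95, 0.05, 0.25],
    "surgical_procedure_codes": [0.05, 0.98, 0.0],
}

edges = link_tools(entity_vecs, tool_vecs)
for entity, linked in edges.items():
    print(entity, "->", [tool for tool, _ in linked])
```

Once such edges exist, a tool becomes reachable from any disease, medication, or procedure node it is linked to, regardless of how the user phrases the question.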
We compare three tool retrieval strategies:
- ToolSearch — Baseline embedding similarity search over tool descriptions via pgvector
- ToolGraph — Graph-traversal approach: question entities are matched to graph nodes, then tools are discovered by traversing entity-to-tool edges in the knowledge graph
- CypherGraph — The LLM generates and executes Cypher queries against the knowledge graph to discover relevant tools
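The difference between the first two strategies can be illustrated with a toy example. The sketch below is not the TxAgent implementation: token overlap is used as a deliberately crude stand-in for embedding similarity in order to reproduce the vocabulary-mismatch failure mode, and the tool names, descriptions, edges, and the `Entity`/`Tool`/`SIMILAR_TO` schema in the example Cypher string are hypothetical.

```python
import re

# Hypothetical tool catalog: name -> description.
TOOLS = {
    "fda_drug_label_warnings": "Retrieve label warnings and contraindications for a drug",
    "gene_expression_atlas": "Look up tissue expression levels for a gene",
}

# Hypothetical entity-to-tool edges, as they might exist in the graph.
ENTITY_TOOL_EDGES = {
    "metformin": ["fda_drug_label_warnings"],
    "brca1": ["gene_expression_atlas"],
}

# A query in the spirit of what an LLM might write in CypherGraph mode
# (labels and relationship type are assumptions, not the project's schema).
EXAMPLE_CYPHER = (
    "MATCH (e:Entity {name: $name})-[:SIMILAR_TO]->(t:Tool) "
    "RETURN t.name LIMIT 5"
)


def tokens(text):
    """Lowercased alphanumeric tokens longer than three characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def tool_search(question, tools=TOOLS):
    """Rank tools by Jaccard overlap between question and description."""
    q = tokens(question)
    scored = []
    for name, description in tools.items():
        d = tokens(description)
        union = q | d
        score = len(q & d) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, name))
    return [name for _, name in sorted(scored, reverse=True)]


def tool_graph(question, edges=ENTITY_TOOL_EDGES):
    """Match question tokens to entity nodes, then follow entity-to-tool edges."""
    q = tokens(question)
    found = []
    for entity, tool_names in edges.items():
        if entity in q:
            for tool in tool_names:
                if tool not in found:
                    found.append(tool)
    return found


question = "What should I monitor in a patient taking metformin with declining kidney function?"
print("ToolSearch:", tool_search(question))
print("ToolGraph: ", tool_graph(question))
```

The question shares no vocabulary with either description, so the similarity stand-in returns nothing, while the entity match on `metformin` still reaches the drug-label tool. Matching single tokens is a simplification; multi-word entities would need a real entity-linking step.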
| Metric | ToolSearch | ToolGraph | CypherGraph |
|---|---|---|---|
| Composite quality | 4.63 | 4.80 | 3.19 |
| Correctness | 4.63 | 4.80 | 3.19 |
| Completeness | 4.52 | 4.68 | 3.13 |
| Relevance | 4.72 | 4.89 | 3.25 |
| Avg. tool calls | 2.32 | 2.85 | 1.68 |
| Total duration (s) | 12.5 | 23.8 | 23.0 |
ToolGraph improves composite answer quality over ToolSearch (4.80 vs. 4.63) by acting as a semantic bridge (question -> medical entity -> tool). On roughly 45% of the questions where ToolSearch fails, ToolGraph still provides domain-appropriate candidates through entity-to-tool graph edges.
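The trade-off behind these numbers is easier to see as relative changes against the ToolSearch baseline. The short sketch below only re-computes figures from the results table; the dictionary layout and the `relative_change` helper are illustrative.

```python
# Values copied from the results table above.
RESULTS = {
    "ToolSearch": {"composite": 4.63, "tool_calls": 2.32, "duration_s": 12.5},
    "ToolGraph": {"composite": 4.80, "tool_calls": 2.85, "duration_s": 23.8},
    "CypherGraph": {"composite": 3.19, "tool_calls": 1.68, "duration_s": 23.0},
}


def relative_change(new, base):
    """Fractional change of `new` relative to `base`."""
    return (new - base) / base


baseline = RESULTS["ToolSearch"]
for mode in ("ToolGraph", "CypherGraph"):
    row = RESULTS[mode]
    quality = relative_change(row["composite"], baseline["composite"])
    slowdown = row["duration_s"] / baseline["duration_s"]
    print(f"{mode}: composite {quality:+.1%}, duration x{slowdown:.2f}")
```

By this calculation ToolGraph gains about 3.7% in composite quality while taking about 1.9 times as long as ToolSearch, and CypherGraph is both slower and about 31% lower in composite quality.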

                  +-------------------+
                  |    Next.js 16     |
                  |  (Dashboard UI +  |
                  |    API Routes)    |
                  +--------+----------+
                           |
            +--------------+--------------+
            |                             |
    +-------v--------+          +---------v--------+
    |  Neo4j Graph   |          |     Supabase     |
    |  (Knowledge +  |          |  (pgvector for   |
    |  Tool Nodes)   |          |   embeddings)    |
    +----------------+          +------------------+
            |                             |
            +--------------+--------------+
                           |
                  +--------v----------+
                  |  TxAgent Server   |
                  |   (FastAPI on     |
                  |    port 8000)     |
                  +--------+----------+
                           |
                  +--------v----------+
                  |    OpenRouter     |
                  |  (LLM Provider)   |
                  +-------------------+

| Component | Technology | Purpose |
|---|---|---|
| Frontend | Next.js 16, React 19, Tailwind CSS 4 | Dashboard UI and API routes |
| AI SDK | Vercel AI SDK 6, OpenRouter | LLM abstraction layer |
| Graph DB | Neo4j | Knowledge graph with tool nodes |
| Vector DB | Supabase (pgvector) | Tool embedding storage and similarity search |
| Agent | TxAgent (Python, FastAPI) | Biomedical reasoning with tool retrieval |
| Validation | Zod 4, TypeScript 5 (strict) | Schema validation and type safety |
| Visualization | Vega-Lite | Interactive charts on dashboard |
- Bun runtime
- Neo4j database instance (local or Aura)
- Supabase project with pgvector enabled
- Python 3.11+ for TxAgent server
- OpenRouter API key
- Clone the repository

      git clone https://github.com/nikitawagner/tool-enabled-knowledge-graphs.git
      cd tool-enabled-knowledge-graphs

- Install dependencies

      bun install

- Configure environment

      cp sample.env .env.local

  Fill in the required values:

  - OPENROUTER_API_KEY: OpenRouter API key
  - SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY: Supabase project credentials
  - NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD, NEO4J_DATABASE: Neo4j connection

- Set up TxAgent (Python)

      cd TxAgent
      pip install -r requirements.txt

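Before starting the servers, it can help to confirm that every required variable is actually filled in. The following is a small optional sketch, not part of the repository: the `parse_env_file` and `missing_env_vars` helpers are hypothetical, and the parser handles only plain `KEY=VALUE` lines.

```python
REQUIRED_VARS = [
    "OPENROUTER_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_SERVICE_ROLE_KEY",
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "NEO4J_DATABASE",
]


def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_env_vars(env):
    """Return the required variable names that are absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with inline text; in practice, read the contents of .env.local.
sample = """
# local settings
OPENROUTER_API_KEY=sk-example
SUPABASE_URL="https://example.supabase.co"
NEO4J_URI=bolt://localhost:7687
NEO4J_PASSWORD=
"""
print("Missing:", missing_env_vars(parse_env_file(sample)))
```

To check a real file, pass the text of `.env.local` to `parse_env_file` and make sure `missing_env_vars` returns an empty list.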
Build the knowledge graph:

    bun run medgraph:select        # Select representative titles from corpus
    bun run medgraph:extract       # Entity extraction from titles/content
    bun run medgraph:build         # Construct graph from entities
    bun run medgraph:embed         # Generate embeddings
    bun run medgraph:import        # Import to Neo4j
    bun run medgraph:import-tools  # Inject ToolUniverse tools into graph
    bun run medgraph:merge         # Deduplicate similar entities

Generate the experiment questions:

    bun run experiment:questions   # Generate 200 biomedical questions from graph entities

Start the servers:

    bun run dev:all                # Start Next.js + TxAgent servers

Run the experiment in each retrieval mode:

    bun run experiment:run -- --mode=toolsearch
    bun run experiment:run -- --mode=tool-graph
    bun run experiment:run -- --mode=cyphergraph

Evaluate the results for each mode:

    bun run experiment:evaluate -- --mode=toolsearch
    bun run experiment:evaluate -- --mode=tool-graph
    bun run experiment:evaluate -- --mode=cyphergraph

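The run and evaluate commands follow one pattern across the three modes, so they can be scripted. Below is a minimal sketch of such a wrapper; it is not part of the repository, the `build_commands` and `run_all` helpers are hypothetical, and it assumes the servers are already running in another terminal.

```python
import subprocess

# Mode names as used by the --mode flag in the commands above.
MODES = ["toolsearch", "tool-graph", "cyphergraph"]


def build_commands(modes=MODES):
    """Build argv lists: all experiment runs first, then all evaluations."""
    commands = []
    for step in ("experiment:run", "experiment:evaluate"):
        for mode in modes:
            commands.append(["bun", "run", step, "--", f"--mode={mode}"])
    return commands


def run_all(commands, runner=subprocess.run):
    """Execute each command in order, stopping at the first failure."""
    for argv in commands:
        runner(argv, check=True)


# Print the plan without executing anything.
for argv in build_commands():
    print(" ".join(argv))
```

Calling `run_all(build_commands())` from the repository root would execute the six commands in sequence; the injectable `runner` parameter exists so the plan can be tested without invoking `bun`.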
The repository is organized as follows:

    .
    ├── src/
    │   ├── ai/            # Provider-agnostic LLM abstraction (Vercel AI SDK)
    │   ├── app/           # Next.js App Router (dashboard, API routes)
    │   ├── database/      # Supabase client and tool queries
    │   ├── lib/           # Neo4j driver and utilities
    │   └── scripts/       # Data processing and graph construction
    ├── experiment/
    │   ├── questions/     # 200 generated biomedical questions
    │   ├── results/       # Raw API response JSON files
    │   └── analysis/      # Evaluation results
    ├── TxAgent/           # Adapted TxAgent (FastAPI server with retrieval modes)
    ├── data/medgraph/     # Knowledge graph node and edge data
    ├── neo4j/             # Neo4j setup and import scripts
    └── report/            # Final LaTeX report and figures

The full research report is available in the report/ directory.
- Nikita Wagner — TxAgent integration, experiment pipeline, analysis and visualizations
- Alexandros Pechlivanidis — Graph construction pipeline, Neo4j schema and API, tool injection and entity linking
- TxAgent (Gao et al., 2025) — Multi-step medical reasoning with ToolUniverse
- MedGraphRAG (Wu et al., 2024) — Graph-centric medical QA
- Hetionet (Himmelstein et al., 2017) — Heterogeneous biomedical network