
Netalchemy

Transform raw network traffic into golden policies

Netalchemy orchestrates the network policy lifecycle for Kubernetes: init -> scan -> define -> build -> verify

It generates least-privilege NetworkPolicies from cluster topology and observed traffic, then continuously detects drift between your intended policy and what's actually enforced in the cluster.

It integrates with Inspektor Gadget for traffic observation and NCA for policy verification.

Installation

From GitHub Releases (recommended)

Download the latest binary for your platform from Releases, extract it, and move it to a directory on your $PATH:

tar xzf netalchemy_*.tar.gz
sudo mv netalchemy /usr/local/bin/

From source

go install github.com/fenio/netalchemy/cmd/netalchemy@latest

Prerequisites

Quick Start

Option A: Topology-based (quick start)

Scan your cluster and generate policies based on its structure:

# 1. Initialize policy from cluster topology
netalchemy init -o policy.yaml

# 2. Review and visualize
netalchemy graph -p policy.yaml --terminal

# 3. Generate Kubernetes NetworkPolicy manifests
netalchemy build -p policy.yaml -o ./policies/

# 4. Apply to cluster
kubectl apply -f ./policies/

# 5. Detect drift at any time
netalchemy verify -p policy.yaml

Option B: Traffic-based (more accurate)

Observe actual network traffic to generate policies:

# 1. Observe traffic in your cluster (requires Inspektor Gadget)
netalchemy scan -n o11y,db -d 1h -o observed.yaml

# 2. Define your policy from observed traffic
netalchemy define --from observed.yaml -o policy.yaml

# 3. Generate and apply
netalchemy build -p policy.yaml -o ./policies/
kubectl apply -f ./policies/

# 4. Detect drift at any time
netalchemy verify -p policy.yaml

Option C: Best of Both (recommended)

Combine topology scanning with traffic observation:

# 1. Quick baseline from cluster structure
netalchemy init -o baseline.yaml

# 2. Observe actual traffic
netalchemy scan -d 1h -o observed.yaml

# 3. Smart merge: combine baseline with observed traffic
netalchemy define --base baseline.yaml --from observed.yaml -o policy.yaml

# 4. Review, generate, apply
netalchemy graph -p policy.yaml --terminal
netalchemy build -p policy.yaml -o ./policies/
kubectl apply -f ./policies/

# 5. Detect drift at any time
netalchemy verify -p policy.yaml

Commands

netalchemy init

Scan your cluster and generate a starter policy based on topology.

# Initialize with interactive prompts
netalchemy init

# Initialize for specific namespaces
netalchemy init -n app1,app2,monitoring

# Output to custom file, skip prompts
netalchemy init -o my-policy.yaml --yes

# Export built-in heuristics as YAML
netalchemy init --export-heuristics

# Use custom heuristics config
netalchemy init --config ./my-heuristics.yaml

Auto-detects:

  • Ingress controllers and exposed services
  • Databases (PostgreSQL, MySQL, Redis, MongoDB)
  • Observability stack (Prometheus, Grafana, Loki, VictoriaMetrics)
  • Namespace-internal communication patterns

Detection patterns and port mappings are configurable. See Customizing Heuristics.

netalchemy scan

Capture network traffic and generate a policy from actual connections.

# Observe all non-system namespaces for 1 hour
netalchemy scan -d 1h -o observed.yaml

# Observe specific namespaces
netalchemy scan -n o11y,db -d 5m -o observed.yaml

# Convert existing Inspektor Gadget trace
netalchemy scan --from-trace ./networktrace.log -o observed.yaml

# Debug: show raw gadget output
netalchemy scan -d 30s -v

The --from-trace flag accepts raw output from kubectl gadget run trace_tcp:latest. Multiple trace formats are supported: gadget's p/s format, tabular, connection-based, and IP-based.

Important: The scan command captures new TCP connections only (connect/accept syscalls). On stable clusters with long-lived connections, you may see little or no traffic during short observation windows.

Tips for effective observation:

  • Observe during deployments - pod restarts establish new connections
  • Use longer durations - periodic connections (health checks, metrics) will appear over time
  • Trigger activity - restart deployments while observing: kubectl rollout restart deployment -n <ns> <name>
  • Combine with init - use netalchemy init for baseline, then refine with observed traffic
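
The "trigger activity" tip can be scripted. The following is a minimal sketch, not part of netalchemy: `scan_with_restart` and `SCAN_WARMUP_SECONDS` are hypothetical names, and the warm-up delay is an arbitrary assumption rather than a documented requirement. It uses only the `scan` flags and the `kubectl rollout restart` command shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a scan in the background and restart one
# deployment while the capture is running, so new connections get observed.
# Usage: scan_with_restart NAMESPACE DEPLOYMENT DURATION OUTPUT_FILE
scan_with_restart() {
  local ns="$1" deploy="$2" duration="$3" out="$4"

  # Start the observation in the background and remember its PID.
  netalchemy scan -n "$ns" -d "$duration" -o "$out" &
  local scan_pid=$!

  # Assumed warm-up delay so the capture is running before the restart.
  sleep "${SCAN_WARMUP_SECONDS:-5}"

  # Restarted pods open new connections, which the scan can then record.
  kubectl rollout restart deployment -n "$ns" "$deploy"

  # Block until the scan finishes; its exit status becomes the function's.
  wait "$scan_pid"
}
```

For example, `scan_with_restart o11y grafana 5m observed.yaml` would observe the o11y namespace for five minutes while restarting a deployment named grafana.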

netalchemy define

Create, validate, or merge policy files.

# Create from observed traffic
netalchemy define --from observed.yaml -o policy.yaml

# Smart merge: base policy + observed traffic
netalchemy define --base baseline.yaml --from observed.yaml -o policy.yaml

# Interactive review mode
netalchemy define --from observed.yaml -o policy.yaml -i

# Interactive policy creation from scratch
netalchemy define --interactive -o policy.yaml

# Validate policy file
netalchemy define --validate policy.yaml

# Simple merge of multiple files
netalchemy define --merge policy1.yaml,policy2.yaml -o merged.yaml

netalchemy build

Generate Kubernetes NetworkPolicy manifests.

# Generate NetworkPolicies with default-deny
netalchemy build -p policy.yaml -o ./policies/

# Dry-run (show what would be generated)
netalchemy build -p policy.yaml --dry-run

# Without default-deny (additive mode)
netalchemy build -p policy.yaml --default-deny=false -o ./policies/

# Check selectors against cluster before building
netalchemy build -p policy.yaml --check-selectors -o ./policies/

netalchemy verify

Detect drift between your intended policy and the NetworkPolicies actually enforced in the cluster. This catches deleted policies, manual edits, partial rollouts, and any other divergence from your declared intent.

# Verify live cluster against policy
netalchemy verify -p policy.yaml

# Verify against exported manifests
netalchemy verify -p policy.yaml --manifests ./cluster-export/

# Output in JSON format
netalchemy verify -p policy.yaml --output json

# Auto-suggest fixes for violations
netalchemy verify -p policy.yaml --fix
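
The --manifests example above expects a directory of exported manifests. One possible way to produce it is sketched below. `export_network_policies` and the output file name are hypothetical, and it is an assumption that verify accepts a single file containing a Kubernetes List; if it does not, export one file per policy instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: dump every NetworkPolicy in the cluster into DIR.
# Usage: export_network_policies DIR
export_network_policies() {
  local dir="$1"
  mkdir -p "$dir"
  # --all-namespaces and -o yaml are standard kubectl flags;
  # the file name networkpolicies.yaml is an arbitrary choice.
  kubectl get networkpolicies --all-namespaces -o yaml > "$dir/networkpolicies.yaml"
}
```

After `export_network_policies ./cluster-export`, the directory can be passed to `netalchemy verify -p policy.yaml --manifests ./cluster-export/`.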

Use verify in CI pipelines to fail builds when cluster state drifts from the source-of-truth policy file. A non-zero exit code means drift was detected.
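
A CI step built on that exit code could look like the sketch below. `run_drift_check` is a hypothetical wrapper, not a netalchemy feature; it relies only on the documented behavior that a non-zero exit code signals drift, and it skips the check when the binary is not installed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the given command and translate its exit
# status into a readable message for the CI log.
run_drift_check() {
  if "$@"; then
    echo "no drift detected"
    return 0
  else
    echo "drift detected: cluster state differs from the policy file" >&2
    return 1
  fi
}

# Only call the real CLI when it is available on this runner.
if command -v netalchemy >/dev/null 2>&1; then
  run_drift_check netalchemy verify -p policy.yaml
fi
```

Because the wrapper returns the failure status, a pipeline that runs this script as its last command fails when drift is reported.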

netalchemy diff

Compare two policy files.

# Compare two policy files
netalchemy diff policy1.yaml policy2.yaml

# With custom labels
netalchemy diff --label1 "Before" --label2 "After" old.yaml new.yaml

# JSON output
netalchemy diff --json policy1.yaml policy2.yaml

netalchemy graph

Visualize the connectivity graph.

# Show in terminal (ASCII)
netalchemy graph -p policy.yaml --terminal

# Generate SVG graph (requires Graphviz)
netalchemy graph -p policy.yaml -o graph.svg

# Generate PNG
netalchemy graph -p policy.yaml -o graph.png

# Generate DOT file for custom rendering
netalchemy graph -p policy.yaml -o graph.dot

Global Flags

--timeout

Set a timeout for operations. Applies to init, build --check-selectors, and graph (SVG/PNG rendering). Each command has a sensible default if --timeout is not specified.

# Set a 2-minute timeout for init
netalchemy init --timeout 2m

# Set a 5-minute timeout for graph rendering
netalchemy graph -p policy.yaml -o graph.svg --timeout 5m

Policy File Format

Netalchemy uses an extended baseline-rules format compatible with NCA.

Basic Format (NCA-compatible)

- name: grafana-to-postgres
  description: Allow Grafana to connect to PostgreSQL
  from: app=grafana
  to: app=postgres
  protocol: TCP
  port_min: 5432
  action: allow

Extended Format (with default action)

# Explicitly declare default-deny semantics
default: deny

rules:
- name: ingress-to-grafana
  description: Ingress to Grafana dashboard
  from: app.kubernetes.io/name=ingress-nginx
  to: app.kubernetes.io/name=grafana
  protocol: TCP
  port_min: 3000
  action: allow

- name: grafana-to-postgres
  from: app.kubernetes.io/name=grafana
  to: app.kubernetes.io/name=postgresql
  protocol: TCP
  port_min: 5432
  action: allow

- name: o11y-internal
  description: Internal communication within o11y namespace
  from: namespace=o11y
  to: namespace=o11y
  action: allow

Field Reference

| Field | Description | Example |
| --- | --- | --- |
| default | Default action: deny or allow | deny |
| name | Rule identifier (required) | grafana-http |
| description | Human-readable description | Allow HTTP access |
| from | Source selector | app=grafana, namespace=o11y, 10.0.0.0/8 |
| to | Destination selector | app=postgres |
| protocol | TCP, UDP, or SCTP | TCP |
| port_min | Start of port range | 80 |
| port_max | End of port range (optional) | 443 |
| action | allow or deny | allow |

Selector Types

| Type | Example | Description |
| --- | --- | --- |
| Label | app=grafana | Pod label selector |
| K8s Label | app.kubernetes.io/name=grafana | Standard Kubernetes label |
| Namespace | namespace=o11y | All pods in namespace |
| Namespace + Label | namespace=app,app.kubernetes.io/name=api | Specific pods in specific namespace |
| CIDR | 10.0.0.0/8 | IP range |
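
To make the selector types concrete, here is a rough sketch of how a selector string could be classified by shape. `selector_type` is a hypothetical helper and does not reflect how netalchemy parses selectors; it covers only the forms listed in the table and handles IPv4 CIDRs only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a selector string by its shape.
# Usage: selector_type SELECTOR
selector_type() {
  case "$1" in
    # namespace=<ns> followed by a comma and a key=value label
    namespace=*,*=*) echo "namespace+label" ;;
    # namespace=<ns> on its own
    namespace=*) echo "namespace" ;;
    # any other key=value pair, including prefixed keys such as
    # app.kubernetes.io/name
    *=*) echo "label" ;;
    # dotted quad followed by /prefix (IPv4 only in this sketch)
    [0-9]*.[0-9]*.[0-9]*.[0-9]*/[0-9]*) echo "cidr" ;;
    *) echo "unknown" ;;
  esac
}
```

The order of the patterns matters: the namespace forms are tested before the generic label form, and the label form is tested before the CIDR form so that a label key containing a slash is not mistaken for an IP range.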

Cross-Namespace Rules

For policies that span namespaces, use namespace-qualified selectors:

# API in 'app' namespace can access PostgreSQL in 'data' namespace
- name: api-to-postgres
  from: namespace=app,app.kubernetes.io/name=api
  to: namespace=data,app.kubernetes.io/name=postgres
  protocol: TCP
  port_min: 5432
  action: allow

This generates NetworkPolicies with proper namespaceSelector and podSelector combinations.
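
For orientation, the sketch below prints a hand-written ingress NetworkPolicy that would express the api-to-postgres rule using the standard Kubernetes API. It is an illustration, not captured netalchemy output: the function name is hypothetical, the manifest name and layout that build actually emits may differ, and matching the namespace through the kubernetes.io/metadata.name label is an assumption. Only the ingress side in the data namespace is shown.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an example manifest for the api-to-postgres rule.
print_example_ingress_policy() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-to-postgres
  namespace: data
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: postgres
  policyTypes:
    - Ingress
  ingress:
    - from:
        # namespaceSelector and podSelector in the same list item are ANDed:
        # pods labelled api that live in the namespace named app.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: app
          podSelector:
            matchLabels:
              app.kubernetes.io/name: api
      ports:
        - protocol: TCP
          port: 5432
EOF
}
```

Placing both selectors in one `from` entry is what restricts the rule to specific pods in a specific namespace; listing them as two separate entries would instead allow either match.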

Customizing Heuristics

The init command uses name-based heuristics to classify workloads (databases, ingress controllers, observability tools, etc.) and assign default ports. These are fully configurable.

Exporting defaults

netalchemy init --export-heuristics > heuristics.yaml

Overriding

Create a YAML file with only the patterns or ports you want to add or change. Lists are appended (deduplicated) to defaults; port maps are merged (your values win).

# ~/.config/netalchemy/heuristics.yaml
detection:
  databases:
    - scylladb
    - clickhouse
ports:
  database:
    scylladb: 9042
    clickhouse: 9000
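
The merge semantics described above (lists appended and deduplicated, port maps merged with your values winning) can be illustrated on plain text files. This is a sketch of the semantics only, not netalchemy's implementation: `merge_patterns` and `merge_ports` are hypothetical helpers, and the one-entry-per-line input format is an assumption made for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: concatenate pattern lists (defaults first) and
# keep only the first occurrence of each non-empty line.
# Usage: merge_patterns DEFAULTS_FILE OVERRIDES_FILE
merge_patterns() {
  cat "$@" | awk 'NF && !seen[$0]++'
}

# Hypothetical helper: merge "name port" lines; later files override
# earlier ones, and the original ordering of names is preserved.
# Usage: merge_ports DEFAULTS_FILE OVERRIDES_FILE
merge_ports() {
  cat "$@" | awk 'NF == 2 {
    if (!($1 in port)) order[++n] = $1
    port[$1] = $2
  }
  END { for (i = 1; i <= n; i++) print order[i], port[order[i]] }'
}
```

With defaults listing `redis 6379` and an override file listing `redis 6380` and `scylladb 9042`, `merge_ports` would print redis with port 6380 and add scylladb, which mirrors the "your values win" rule.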

Resolution order

  1. Explicit --config flag path
  2. ~/.config/netalchemy/heuristics.yaml (if it exists)
  3. Built-in defaults only
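
The resolution order can be expressed as a small lookup. The sketch below mirrors the three documented steps; `resolve_heuristics_path` is a hypothetical helper, and printing the literal string built-in for the fallback case is an arbitrary choice for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide which heuristics source applies.
# Usage: resolve_heuristics_path [EXPLICIT_CONFIG_PATH]
resolve_heuristics_path() {
  local explicit="${1:-}"
  local user_file="${HOME}/.config/netalchemy/heuristics.yaml"

  if [ -n "$explicit" ]; then
    # 1. A path given via --config takes precedence.
    echo "$explicit"
  elif [ -f "$user_file" ]; then
    # 2. Otherwise the per-user file is used when it exists.
    echo "$user_file"
  else
    # 3. Otherwise only the built-in defaults apply.
    echo "built-in"
  fi
}
```

Running `resolve_heuristics_path ./my-heuristics.yaml` prints the explicit path regardless of whether the per-user file exists.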

Output

Netalchemy uses colored output for better readability:

  • Green: Success
  • Red: Errors
  • Yellow: Warnings
  • Cyan: Progress

Tutorial: Hands-on Labs

Two labs demonstrate the complete workflow — from policy generation to drift detection — with a local cluster:

Lab 1: Init Approach (Heuristic-based)

cd examples/kind-lab
./run-lab.sh

Uses netalchemy init to scan cluster topology and generate policies based on detected patterns (databases, ingress, etc.). Good for quick starts.

See examples/kind-lab/README.md

Lab 2: Scan Approach (Traffic-based)

cd examples/kind-lab-observe
./run-lab.sh

Uses netalchemy scan with Inspektor Gadget to capture actual network traffic and generate policies from observed connections. More accurate but requires traffic.

See examples/kind-lab-observe/README.md

Comparison

| Aspect | Init (Lab 1) | Scan (Lab 2) |
| --- | --- | --- |
| Method | Heuristics | Traffic capture |
| Requirements | kubectl | Inspektor Gadget |
| Speed | Instant | Needs observation time |
| Accuracy | May be over-permissive | Exact (only observed traffic) |
| Best for | Quick start, known architecture | Dynamic workloads, during deployments |
| Limitation | Guesses based on labels/ports | Only captures new connections, not existing |

License

GPL-3.0 — see LICENSE for details.
