Helios is a modern, lightweight workload orchestrator written in Rust, inspired by the principles of simplicity, flexibility, and performance found in systems like HashiCorp Nomad. This project aims to provide a simple yet powerful platform for managing diverse workloads across a cluster of machines.
Helios now implements a full Raft-based cluster for distributed state, leader election, and fault tolerance. The core workflow includes:
- gRPC server (helios-server) handling job submissions.
- Raft consensus cluster providing replicated metadata and automatic failover.
- Persistent state manager (Redb) ensuring durable job and allocation storage.
- Background scheduler proposing task allocations via Raft.
- Exec driver launching tasks on client nodes and streaming logs.
- Command-line client (helios-cli) for job control and monitoring.
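To make the scheduling step in the list above concrete, here is a minimal, std-only sketch of how a scheduler might choose a node before proposing an allocation through Raft. It is not taken from the Helios source: `NodeInfo`, `Placement`, `place_job`, the heartbeat TTL, and the least-loaded policy are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical view of a client node as a scheduler might track it.
#[derive(Debug, Clone)]
pub struct NodeInfo {
    pub node_id: u64,
    pub last_heartbeat: Instant,
    pub running_allocations: usize,
}

/// Hypothetical placement decision that would then be proposed via Raft.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Placement {
    pub job_name: String,
    pub node_id: u64,
}

/// Pick the least-loaded node whose last heartbeat is no older than `ttl`.
/// Ties are broken by the lowest node id so the choice is deterministic.
/// Returns `None` when no node is considered alive.
pub fn place_job(
    job_name: &str,
    nodes: &HashMap<u64, NodeInfo>,
    now: Instant,
    ttl: Duration,
) -> Option<Placement> {
    nodes
        .values()
        .filter(|n| now.saturating_duration_since(n.last_heartbeat) <= ttl)
        .min_by_key(|n| (n.running_allocations, n.node_id))
        .map(|n| Placement {
            job_name: job_name.to_string(),
            node_id: n.node_id,
        })
}
```

In this sketch, a node that stops sending heartbeats simply drops out of the candidate set once its last heartbeat is older than the TTL. That is one plausible reason for an orchestrator to have agents register and heartbeat before scheduling work onto them.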
Helios is organized into a modular Cargo workspace with clear component boundaries:
- helios-core: Shared domain models (Job, Task, Allocation, Status).
- helios-api: Protobuf definitions and generated gRPC stubs for both public and internal APIs.
- helios-server: Orchestrator engine integrating Raft consensus, state persistence, scheduling, and execution drivers.
- helios-cli: CLI client for submitting and monitoring jobs over gRPC.
- helios-client: Embeddable Rust client library (in development).
For a detailed architecture breakdown, refer to docs/architecture.md.
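As a rough illustration of the shared domain models named for helios-core, the sketch below shows one way Job, Task, Allocation, and Status could relate to each other. Only the four type names come from this README. Every field, every enum variant, and the `Job::new` and `is_terminal` helpers are hypothetical, and the real definitions may differ.

```rust
/// Lifecycle state shared by jobs and allocations (variant names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pending,
    Running,
    Completed,
    Failed,
}

impl Status {
    /// A terminal status is never expected to change again.
    pub fn is_terminal(self) -> bool {
        matches!(self, Status::Completed | Status::Failed)
    }
}

/// One command to execute, e.g. `echo` with its arguments (assumed fields).
#[derive(Debug, Clone)]
pub struct Task {
    pub command: String,
    pub args: Vec<String>,
}

/// A named unit of work made of one or more tasks (assumed fields).
#[derive(Debug, Clone)]
pub struct Job {
    pub name: String,
    pub tasks: Vec<Task>,
    pub status: Status,
}

/// The binding of a job to the node that runs it (assumed fields).
#[derive(Debug, Clone)]
pub struct Allocation {
    pub job_name: String,
    pub node_id: u64,
    pub status: Status,
}

impl Job {
    /// Build a single-task job in the `Pending` state.
    pub fn new(name: &str, command: &str, args: &[&str]) -> Self {
        let task = Task {
            command: command.to_string(),
            args: args.iter().map(|a| a.to_string()).collect(),
        };
        Job {
            name: name.to_string(),
            tasks: vec![task],
            status: Status::Pending,
        }
    }
}
```

Under these assumptions, `Job::new("multi-node-test", "echo", &["It replicated!"])` yields a pending job with one task.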
Built on a modern, asynchronous Rust stack:
| Component | Key Crates | Purpose |
|---|---|---|
| Async Runtime | tokio | Foundation for non-blocking tasks and I/O. |
| gRPC API | tonic, prost, tonic-build | Public and internal RPC contract implementation. |
| CLI Tooling | clap | Command parsing and help generation. |
| Serialization | serde, bincode | Data encoding for storage and network transport. |
| Consensus | openraft | Raft protocol implementation for distributed consensus. |
| Persistence | redb | Embedded key-value storage for jobs and allocations. |
| Logging & Errors | log, env_logger, anyhow | Structured logging and ergonomic error handling. |
You must have the Rust toolchain installed. If you don't have it, you can install it via rustup.
Clone the repository and build the entire workspace from the root directory:
git clone <your-repo-url>
cd helios
cargo build
For a release build, use:
cargo build --release
Helios now runs as a Raft cluster. Start three helios-server nodes with a shared peer list, and bootstrap the cluster from a single node. Use separate terminals for the client agents and the CLI.
Note: Use --bootstrap on exactly one node (typically node 1). All nodes must include the full peer list, including themselves.
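The two rules in the note can be checked mechanically. The sketch below is not how helios-server actually parses its flags. It is a std-only illustration, with hypothetical helper names `parse_peer`, `build_peer_map`, and `quorum`, of parsing ID=ADDR peer entries, rejecting a peer list that omits the local node, and computing the Raft majority for the resulting cluster size.

```rust
use std::collections::BTreeMap;
use std::net::SocketAddr;

/// Parse one peer entry of the form `ID=HOST:PORT`, e.g. `1=127.0.0.1:50051`.
pub fn parse_peer(s: &str) -> Result<(u64, SocketAddr), String> {
    let (id, addr) = s
        .split_once('=')
        .ok_or_else(|| format!("expected ID=ADDR, got `{}`", s))?;
    let id: u64 = id
        .parse()
        .map_err(|e| format!("bad node id `{}`: {}", id, e))?;
    let addr: SocketAddr = addr
        .parse()
        .map_err(|e| format!("bad address `{}`: {}", addr, e))?;
    Ok((id, addr))
}

/// Build the peer map and enforce that the local node appears in it.
/// Duplicate ids are rejected as well.
pub fn build_peer_map(
    node_id: u64,
    peers: &[&str],
) -> Result<BTreeMap<u64, SocketAddr>, String> {
    let mut map = BTreeMap::new();
    for p in peers {
        let (id, addr) = parse_peer(p)?;
        if map.insert(id, addr).is_some() {
            return Err(format!("duplicate peer id {}", id));
        }
    }
    if !map.contains_key(&node_id) {
        return Err(format!("peer list must include this node (id {})", node_id));
    }
    Ok(map)
}

/// Raft needs a strict majority of voters to elect a leader or commit an entry.
pub fn quorum(cluster_size: usize) -> usize {
    cluster_size / 2 + 1
}
```

With the three peers used in the commands that follow, `quorum(3)` is 2, so the cluster keeps electing leaders and committing entries with one node down, but not with two.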
Terminal 1 (bootstrap):
cargo run --package helios-server -- \
--node-id 1 \
--rpc-addr 127.0.0.1:50051 \
--peer 1=127.0.0.1:50051 \
--peer 2=127.0.0.1:50052 \
--peer 3=127.0.0.1:50053 \
--bootstrap
Terminal 2:
cargo run --package helios-server -- \
--node-id 2 \
--rpc-addr 127.0.0.1:50052 \
--peer 1=127.0.0.1:50051 \
--peer 2=127.0.0.1:50052 \
--peer 3=127.0.0.1:50053
Terminal 3:
cargo run --package helios-server -- \
--node-id 3 \
--rpc-addr 127.0.0.1:50053 \
--peer 1=127.0.0.1:50051 \
--peer 2=127.0.0.1:50052 \
--peer 3=127.0.0.1:50053
Run one helios-client agent per node. Each agent registers with the server and sends heartbeats so the scheduler can allocate work to it:
Terminal 4:
cargo run --package helios-client -- --node-id 1 --server-addr http://127.0.0.1:50051 --client-rpc-addr 127.0.0.1:60051
Terminal 5:
cargo run --package helios-client -- --node-id 2 --server-addr http://127.0.0.1:50051 --client-rpc-addr 127.0.0.1:60052
Terminal 6:
cargo run --package helios-client -- --node-id 3 --server-addr http://127.0.0.1:50051 --client-rpc-addr 127.0.0.1:60053
Submit one or more jobs once the cluster and agents are up. The CLI connects to 127.0.0.1:50051 by default; override with --server-addr.
# Example: echo with explicit arg separator for the subcommand
cargo run --package helios-cli -- job run --name multi-node-test echo -- 'It replicated!'
# Another example (custom server address)
cargo run --package helios-cli -- --server-addr http://127.0.0.1:50052 job run --name final-test echo -- "IT LIVES!"
Tips:
- To start fresh, remove any helios-node-*.db files before restarting nodes.
- All nodes print their serving address. Submit CLI requests to any running node.