🧠 Project-Cortex v2.0

The "Gold Medal" AI Wearable for the Visually Impaired

License: MIT · Python 3.11+ · Hardware: RPi 5 · Status: Active Development · Competition: YIA 2026

Democratizing Assistive Technology
Building a <$150 AI wearable to disrupt the $4,000+ premium device market (OrCam, eSight).
Powered by Raspberry Pi 5, Gemini 2.5 Flash, and Adaptive Edge AI.

Explore Architecture · View Roadmap · Read Documentation


📑 Table of Contents

  • 🎯 Mission & Vision
  • ✨ Key Innovation Highlights
  • 🧠 The 5-Layer AI Brain
  • 🏗️ System Architecture
  • 🚀 Quick Start
  • 📊 Performance & Benchmarks
  • 📚 Documentation
  • 🤝 Contributing

🎯 Mission & Vision

Project-Cortex is an open-source assistive wearable designed for the Young Innovators Awards (YIA) 2026. Our goal is to provide real-time scene understanding, object detection, and navigation for the visually impaired using commodity hardware.

Why We Built This

Commercial devices like OrCam MyEye cost $4,000+, making them inaccessible to 90% of the visually impaired population. Cortex achieves comparable (and often superior) performance for <$150.

| Feature | Project-Cortex v2.0 | Commercial Devices |
| --- | --- | --- |
| Cost | <$150 🏆 | $4,000 - $5,500 |
| Learning | Adaptive (Real-Time) | Static (Pre-trained only) |
| Latency | <100ms (Safety) | Variable |
| Audio | Body-Relative 3D Spatial | Mono / Stereo |
| Connectivity | Hybrid Edge + Cloud | Cloud-Dependent or Offline-Only |

✨ Key Innovation Highlights

1. Adaptive Dual-Model Vision (Layer 0 + Layer 1)

Unlike traditional systems that use a single static model, Cortex uses a parallel cascade:

  • Layer 0 (Guardian): Static YOLO11n-NCNN for safety-critical hazards (cars, stairs). Runs 100% offline, 80.7ms latency ✅ (4.8x faster than PyTorch).
  • Layer 1 (Learner): Adaptive YOLOE-11s that learns new objects in real-time from Gemini descriptions and Google Maps POI data.
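
The cascade can be reproduced with off-the-shelf tooling. Below is a minimal sketch, assuming the Ultralytics YOLO API and OpenCV; the model paths and hazard list are illustrative placeholders, not the repository's actual configuration.

```python
# Sketch: run Layer 0 (static safety model) and Layer 1 (adaptive model) on the same frame.
# Assumes the `ultralytics` and `opencv-python` packages; paths/classes are placeholders.
import cv2
from ultralytics import YOLO

HAZARDS = {"car", "bus", "bicycle", "person"}        # illustrative safety-critical classes

# YOLO("yolo11n.pt").export(format="ncnn") produces the NCNN model folder used below.
guardian = YOLO("yolo11n_ncnn_model")                # Layer 0: static, fully offline
learner = YOLO("yoloe-11s-seg.pt")                   # Layer 1: adaptive vocabulary model

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # Layer 0 first: cheap, always-on hazard check that can trigger audio/haptics.
    for box in guardian(frame, verbose=False)[0].boxes:
        label = guardian.names[int(box.cls)]
        if label in HAZARDS:
            print(f"HAZARD: {label} ({float(box.conf):.2f})")

    # Layer 1: richer, adaptive context (in practice run at a lower rate or in a thread).
    context = learner(frame, verbose=False)[0]
```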

2. Native Audio-to-Audio Conversation (Layer 2)

Powered by Gemini 2.5 Flash Live API over WebSocket:

  • <500ms Latency: 83% faster than traditional HTTP pipelines (3s).
  • Full Duplex: Users can interrupt the AI naturally.
  • Multimodal: Streams video + audio continuously for deep context.
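
For reference, a skeletal Live session using the google-genai Python SDK is sketched below; the model ID, config keys, and single text turn are assumptions for illustration (Cortex streams microphone audio and camera frames instead), so check the Live API docs for current signatures.

```python
# Sketch: open a Gemini Live session over WebSocket and receive audio chunks.
# Assumes the `google-genai` SDK; the model ID and config are illustrative.
import asyncio
from google import genai

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")
MODEL = "gemini-2.5-flash"                      # placeholder: use a Live-capable model ID

async def talk() -> None:
    config = {"response_modalities": ["AUDIO"]}
    async with client.aio.live.connect(model=MODEL, config=config) as session:
        # Cortex streams mic audio + video frames; a single text turn keeps the sketch short.
        await session.send_client_content(
            turns={"role": "user", "parts": [{"text": "What is in front of me?"}]},
            turn_complete=True,
        )
        async for message in session.receive():
            if message.data:                    # raw PCM audio from the model
                pass                            # would be written to the audio output device

asyncio.run(talk())
```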

3. Body-Relative Spatial Audio (Layer 3)

  • Chest-Mounted Camera: Navigation cues are relative to your torso, not your head.
  • Audio Beacons: "Follow the sound" to find specific objects.
  • Proximity Alerts: Dynamic warning tones for obstacles.

🧠 The 5-Layer AI Brain

Our architecture is divided into five specialized layers (L0-L4) to balance safety, intelligence, and speed.

| Layer | Name | Function | Technology | Latency |
| --- | --- | --- | --- | --- |
| L0 | The Guardian | Safety-Critical Detection | YOLO11n-NCNN (Local) | 80.7ms |
| L1 | The Learner | Adaptive Context | YOLOE-11s (Local) | ~120ms |
| L2 | The Thinker | Deep Reasoning & QA | Gemini Live (Cloud) | <500ms |
| L3 | The Guide | Navigation & 3D Audio | PyOpenAL + VIO/SLAM | Real-time |
| L4 | The Memory | Persistence | SQLite + Vector DB | <10ms |
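
Layer 4 is only summarised in the table above; as an illustration of the persistence idea, a minimal detection log in plain SQLite could look like the following (the table name and schema are assumptions, and the vector store is omitted).

```python
# Sketch: Layer 4 persistence idea - log detections to SQLite (schema is illustrative).
import sqlite3
import time

db = sqlite3.connect("cortex_memory.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS detections ("
    " ts REAL, layer TEXT, label TEXT, confidence REAL)"
)
db.execute(
    "INSERT INTO detections VALUES (?, ?, ?, ?)",
    (time.time(), "L0", "car", 0.91),
)
db.commit()

# Most recent sightings, e.g. to answer "what did you just see?"
for label, conf in db.execute(
    "SELECT label, confidence FROM detections ORDER BY ts DESC LIMIT 5"
):
    print(label, conf)
```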

🏗️ System Architecture

Hardware Stack (Edge Unit)

  • Compute: Raspberry Pi 5 (4GB RAM)
  • Vision: IMX415 / Camera Module 3 (Wide)
  • Audio: Bluetooth Headphones (OpenAL Spatial Output)
  • Power: 30,000mAh USB-C PD Power Bank (usb_max_current_enable=1 set in /boot/firmware/config.txt)
  • Sensors: BNO055 IMU (Torso Orientation), GPS

Hybrid-Edge Topology

```mermaid
graph TD
    User((User)) <-->|Audio/Haptics| RPi[Raspberry Pi 5]
    RPi <-->|WebSocket| Laptop["Laptop Server (Optional)"]
    RPi <-->|Live API| Gemini[Gemini Cloud]

    subgraph "Raspberry Pi 5 (Wearable)"
        L0["Layer 0: Guardian"]
        L1["Layer 1: Learner"]
        L2["Layer 2: Thinker"]
        L4["Layer 4: Memory"]
    end

    subgraph "Laptop Server (Heavy Compute)"
        L3_SLAM["Layer 3: VIO/SLAM"]
        Dash[Web Dashboard]
    end
```
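
As a reference for the WebSocket leg of this topology, the sketch below uses the websockets package; the port, endpoint, and JSON payload are assumptions for illustration, not the repository's actual protocol.

```python
# Sketch: the optional RPi 5 <-> laptop link as a plain WebSocket (websockets package).
# Port, hostname, and message format are illustrative.
import asyncio
import json
import websockets

async def laptop_server(host: str = "0.0.0.0", port: int = 8765) -> None:
    async def handle(ws):
        async for raw in ws:
            request = json.loads(raw)           # e.g. {"type": "pose_request"}
            await ws.send(json.dumps({"type": "pose", "x": 0.0, "y": 0.0, "yaw": 0.0}))

    async with websockets.serve(handle, host, port):
        await asyncio.Future()                  # serve until cancelled

async def pi_client(uri: str = "ws://laptop.local:8765") -> None:
    async with websockets.connect(uri) as ws:
        await ws.send(json.dumps({"type": "pose_request"}))
        print(json.loads(await ws.recv()))

if __name__ == "__main__":
    asyncio.run(laptop_server())                # run pi_client() on the wearable instead
```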

🚀 Quick Start

Prerequisites

  • Hardware: Raspberry Pi 5 (4GB) OR Windows Laptop (Dev Mode)
  • API Keys: Google Gemini API Key
  • Python: 3.11+

Installation

  1. Clone the Repository

    git clone https://github.com/IRSPlays/ProjectCortexV2.git
    cd ProjectCortexV2
  2. Install Dependencies

    python -m venv venv
    # Windows:
    venv\Scripts\activate
    # Linux/Mac:
    source venv/bin/activate
    
    pip install -r requirements.txt
  3. Configure Environment

    cp .env.example .env
    # Edit .env and add your GEMINI_API_KEY
  4. Run Development GUI

    python src/cortex_gui.py
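
Step 3 above only creates the .env file; at runtime the key is typically pulled into the environment before the Gemini client is created. A common pattern (assuming python-dotenv, which may or may not be what cortex_gui.py actually uses) is:

```python
# Sketch: read GEMINI_API_KEY from .env at startup (assumes the python-dotenv package).
import os
from dotenv import load_dotenv

load_dotenv()                                   # merge .env in the working directory into os.environ
api_key = os.getenv("GEMINI_API_KEY")
if not api_key:
    raise SystemExit("GEMINI_API_KEY missing - copy .env.example to .env and fill it in.")
```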

📊 Performance & Benchmarks

Measured on Raspberry Pi 5 (4GB) running production code:

| Component | Target | Actual | Status |
| --- | --- | --- | --- |
| Safety Detection (L0) | <100ms | 60-80ms | ✅ EXCEEDED |
| Adaptive Detection (L1) | <150ms | 90-130ms | ✅ PASSED |
| Gemini Live Response | <700ms | ~450ms | ✅ EXCEEDED |
| Haptic Trigger | <10ms | 3-5ms | ✅ INSTANT |
| RAM Usage | <4GB | ~3.6GB | ✅ OPTIMIZED |
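
To sanity-check the Layer 0 number on your own hardware, a crude wall-clock measurement around a single inference call (illustrative only, not the project's benchmark harness) is enough:

```python
# Sketch: rough latency check for the Layer 0 path (not the project's benchmark harness).
import time
import cv2
from ultralytics import YOLO

model = YOLO("yolo11n_ncnn_model")              # exported NCNN model directory
frame = cv2.imread("test_frame.jpg")            # any representative camera frame

samples = []
for _ in range(50):
    t0 = time.perf_counter()
    model(frame, verbose=False)
    samples.append((time.perf_counter() - t0) * 1000.0)

print(f"median inference: {sorted(samples)[len(samples) // 2]:.1f} ms")
```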

📚 Documentation

Detailed technical documentation is available in the docs/ directory.


🤝 Contributing

This project is built for the Young Innovators Awards 2026. Contributions are welcome! Please read our Development Workflow.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Built with 💙 for Accessibility.
"Failing with Honour, Pain First, Rest Later"

⬆ Back to Top
