qimcis/sys-intelligence-agent

System Intelligence Benchmark Contributor

Setup

npm install

Configuration

Set your Anthropic API key via environment variable:

export ANTHROPIC_API_KEY=sk-ant-your-api-key-here

For server-side exam processing with Docker, set:

  • SIB_WORKER_IMAGE (Docker image built from docker/worker/Dockerfile)
  • SIB_REPO_URL (clone URL for the base system-intelligence-benchmark repo)
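
A startup check can catch a missing variable before any exam is processed. The sketch below is illustrative only: the variable names come from this README, while `readWorkerConfig`, `WorkerConfig`, and the error format are hypothetical and not taken from the project's code.

```typescript
// Hypothetical startup check; only the variable names come from this README.
type Env = Record<string, string | undefined>;

interface WorkerConfig {
  anthropicApiKey: string;
  workerImage: string;
  repoUrl: string;
}

function readWorkerConfig(env: Env): WorkerConfig {
  const required = ["ANTHROPIC_API_KEY", "SIB_WORKER_IMAGE", "SIB_REPO_URL"];
  // Treat unset and empty values the same way.
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    anthropicApiKey: env["ANTHROPIC_API_KEY"] as string,
    workerImage: env["SIB_WORKER_IMAGE"] as string,
    repoUrl: env["SIB_REPO_URL"] as string,
  };
}
```

In a Node server this would typically be called once at boot as `readWorkerConfig(process.env)`.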

Running

Development:

npm run dev

Production:

npm run build
node ./dist/server/entry.mjs

The server runs on http://localhost:3000 by default.

Vercel (UI) + Docker Host (API)

The intended deployment runs the UI on Vercel and the API on a Docker-capable host.

UI (Vercel):

  1. Set the Vercel project framework to Astro.
  2. Build command: ASTRO_OUTPUT=static npm run build.
  3. Output directory: dist.
  4. Update vercel.json to point /api/* to your API host.
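
For step 4, one common approach is a Vercel rewrite rule that forwards `/api/*` to the external host. The sketch below builds such an object, assuming the API is reachable at a placeholder URL; `buildVercelConfig` is a hypothetical helper, and the repository's actual vercel.json may be structured differently.

```typescript
// Builds a vercel.json-style object that forwards /api/* to an external host.
// The host passed in is a placeholder; the repo's real vercel.json may differ.
interface Rewrite {
  source: string;
  destination: string;
}

function buildVercelConfig(apiHost: string): { rewrites: Rewrite[] } {
  const base = apiHost.replace(/\/+$/, ""); // drop trailing slashes
  return {
    rewrites: [{ source: "/api/:path*", destination: `${base}/api/:path*` }],
  };
}

// JSON text that could be written to vercel.json.
const vercelJson = JSON.stringify(buildVercelConfig("https://api.example.com"), null, 2);
```

Replace `https://api.example.com` with the address of your Docker host.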

API (Docker host):

  1. Build the worker image: docker build -t sib-worker -f docker/worker/Dockerfile .
  2. Set env vars: ANTHROPIC_API_KEY, SIB_WORKER_IMAGE=sib-worker, SIB_REPO_URL.
  3. Build and run the server: npm run build then node ./dist/server/entry.mjs.

Features

Add Exams

  1. Navigate to /exams
  2. Fill in exam metadata (ID, name, course, institution, year)
  3. Upload the exam PDF/TXT file
  4. Upload the solutions PDF/TXT file
  5. Optionally upload reference materials
  6. Click "Process and Add Exam"

A GitHub username and token are also required, because processing creates a draft pull request.

The AI will parse the exam and solutions, generating a structured exam.md file in the courseexam format.
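
The metadata fields from step 2 can be modelled as a small type with a validation pass before upload. This is a sketch under assumptions: the field names, the allowed ID characters, and the year bound are hypothetical and are not taken from the project's form or from the courseexam format.

```typescript
// Hypothetical shape for the exam metadata form; field names are illustrative.
interface ExamMetadata {
  id: string;
  name: string;
  course: string;
  institution: string;
  year: number;
}

// Returns a list of problems; an empty list means the metadata looks usable.
function validateExamMetadata(meta: ExamMetadata): string[] {
  const errors: string[] = [];
  // Assumption: the ID is used in a directory name, so restrict it to safe characters.
  if (!/^[A-Za-z0-9_-]+$/.test(meta.id)) {
    errors.push("id may contain only letters, digits, underscores, and hyphens");
  }
  for (const field of ["name", "course", "institution"] as const) {
    if (meta[field].trim() === "") errors.push(`${field} is required`);
  }
  // Assumption: 1900 is an arbitrary lower bound chosen for this sketch.
  if (!Number.isInteger(meta.year) || meta.year < 1900) {
    errors.push("year must be an integer from 1900 onward");
  }
  return errors;
}
```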

Add Labs (WIP)

  1. Navigate to /labs
  2. Enter the GitHub repository URL
  3. Fill in course metadata
  4. Click "Clone and Analyze Lab"

The AI agent will:

  • Clone the repository
  • Analyze the structure to identify tasks
  • Generate config.json, task.md, compose.yaml, and evaluate.sh for each task
  • Copy starter files
  • Update courses.json

Output Locations

  • Exams: benchmarks/courseexam_bench/data/raw/{exam_id}/
  • Labs: benchmarks/courselab_bench/data/{course_id}/
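
The path templates and per-task file names listed above can be captured in a few helpers. This is a minimal sketch: `examOutputDir`, `labOutputDir`, and `TASK_FILES` are hypothetical names, and how task files are arranged inside a course directory is not specified here, so the sketch does not model it.

```typescript
// File names generated for each lab task, as listed under "Add Labs".
const TASK_FILES = ["config.json", "task.md", "compose.yaml", "evaluate.sh"] as const;

// Rejects IDs that would escape or split the intended directory.
function assertSafeId(id: string): void {
  if (id === "" || id.includes("/") || id.includes("\\") || id.includes("..")) {
    throw new Error(`Unsafe identifier: ${JSON.stringify(id)}`);
  }
}

// Output directory for a processed exam, relative to the benchmark repo root.
function examOutputDir(examId: string): string {
  assertSafeId(examId);
  return `benchmarks/courseexam_bench/data/raw/${examId}/`;
}

// Output directory for an analyzed lab course, relative to the benchmark repo root.
function labOutputDir(courseId: string): string {
  assertSafeId(courseId);
  return `benchmarks/courselab_bench/data/${courseId}/`;
}
```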