OCR platform for Macedonian book pages. Upload scanned images or PDFs, run OCR with preprocessing, review and correct text, and export cleaned results. Deployed on Kubernetes with a fully automated CI/CD pipeline.
- Batch OCR — Upload multiple images or a PDF; pages are OCR'd in background jobs with real-time progress
- Text correction — Tabbed raw/cleaned/corrected views with inline saving
- Bilingual UI — English and Macedonian with light/dark themes
- MongoDB persistence — Job history survives restarts (replaced in-memory storage)
- Delete history — Individual job deletion or clear all history
- Export — Download per-page or book-level text files
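
The batch OCR behaviour listed above (pages processed in a background job that reports progress) can be sketched roughly as follows. This is a minimal illustration, not the project's actual job service: `JobProgress`, `run_job`, and `fake_ocr` are hypothetical names, the job record is kept in memory rather than in MongoDB, and a plain thread stands in for the backend's background task.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class JobProgress:
    """Hypothetical in-memory job record (the real app stores jobs in MongoDB)."""
    total: int
    done: int = 0
    status: str = "queued"
    texts: List[str] = field(default_factory=list)

    @property
    def percent(self) -> int:
        # An empty job counts as complete to avoid dividing by zero.
        return 100 if self.total == 0 else round(100 * self.done / self.total)


def run_job(pages: List[str], ocr_page: Callable[[str], str], job: JobProgress) -> None:
    """OCR pages one at a time, updating progress after each page."""
    job.status = "running"
    for page in pages:
        job.texts.append(ocr_page(page))
        job.done += 1
    job.status = "done"


def fake_ocr(path: str) -> str:
    """Stand-in for a Tesseract call; returns a placeholder string."""
    return f"text of {path}"


pages = ["page_001.jpg", "page_002.jpg", "page_003.jpg"]
job = JobProgress(total=len(pages))

# Run the job off the main thread; a status endpoint could read `job` meanwhile.
worker = threading.Thread(target=run_job, args=(pages, fake_ocr, job))
worker.start()
worker.join()
print(job.status, job.percent)
```

Processing pages one after another keeps the progress counter trivially consistent, at the cost of throughput on multi-core machines.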
Browser ──http://localhost:8080──→ Traefik Ingress
                                          │
                             ┌────────────┼────────────┐
                             │            │            │
                        ┌────┴────┐   ┌───┴────┐   ┌───┴────┐
                        │frontend │   │backend │   │mongodb │
                        │   :80   │   │ :8000  │   │ :27017 │
                        └─────────┘   └───┬────┘   └────────┘
                                          │
                                    Tesseract OCR
                                (inside backend pod)
All API calls pass through the Ingress:
- `/api/*` → backend (prefix stripped)
- `/images/*` → backend (serves uploaded images)
- `/` → frontend (React SPA)
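
To make the routing rules concrete, here is a small Python model of how a request path maps to a service. It is only a sketch of the behaviour described above, not the Traefik configuration itself: the `route` function is hypothetical, and matching on whole path segments (so `/apifoo` falls through to the frontend) is an assumption.

```python
from typing import Tuple


def route(path: str) -> Tuple[str, str]:
    """Return (service, upstream_path) for an incoming request path."""

    def has_prefix(prefix: str) -> bool:
        # Assumption: prefixes match on whole path segments only.
        return path == prefix or path.startswith(prefix + "/")

    if has_prefix("/api"):
        # The /api prefix is stripped before the request reaches the backend.
        return ("backend", path[len("/api"):] or "/")
    if has_prefix("/images"):
        # Assumption: /images is forwarded unchanged (only /api is listed as stripped).
        return ("backend", path)
    return ("frontend", path)


for example in ("/api/jobs/history", "/images/page_001.jpg", "/"):
    print(example, "->", route(example))
```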
| Layer | Technology |
|---|---|
| Backend | FastAPI + Uvicorn + Motor (async MongoDB) |
| Database | MongoDB 8 (StatefulSet in K8s) |
| OCR | Tesseract 5 (mkd language) |
| PDF rendering | PyMuPDF (fitz) |
| Image processing | OpenCV, Pillow |
| Frontend | React 19 + Vite + Axios |
| i18n | i18next (English / Macedonian) |
| Container | Docker + Docker Compose |
| Orchestration | Kubernetes (k3d) |
| CI/CD | GitHub Actions + Argo CD |
| Ingress | Traefik (k3s built-in) |
./
├── backend/ # FastAPI app
│ ├── app/
│ │ ├── api/routes/ # Jobs, OCR, Books, Files, Text
│ │ ├── core/ # Config, database (Motor)
│ │ ├── schemas/ # Pydantic models
│ │ └── services/ # Job service, OCR service, Book service
│ └── Dockerfile
├── frontend/ # React + Vite
│ ├── src/
│ │ ├── api/ # Axios client (prefixes /api)
│ │ ├── components/ # UploadSection, JobHistory, ImageViewer, Editor
│ │ └── context/ # AppContext (book state)
│ ├── Dockerfile
│ └── nginx.conf # Serves static files only (API proxied by Ingress)
├── ocr_pipeline/ # Tesseract preprocessing + text cleanup
├── k8s/ # Kubernetes manifests
│ ├── namespace.yaml
│ ├── ingress.yaml # Traefik Ingress (/api + /images → backend, / → frontend)
│ ├── mongodb/ # StatefulSet + Service + ConfigMap + Secret
│ ├── backend/ # Deployment + Service + ConfigMap + PVC
│ ├── frontend/ # Deployment + Service
│ └── argocd/ # Argo CD Application manifest
├── docker-compose.yml # Dev: mongodb + backend + frontend
├── .github/workflows/ci.yml # CI/CD: build, push to Docker Hub, deploy via Argo CD
├── .env.example # Environment variable template
└── requirements.txt
# 1. Clone
git clone https://github.com/YOUR_USERNAME/mk-ocr-platform.git
cd mk-ocr-platform
# 2. Copy env and edit if needed
cp .env.example .env
# 3. Start all services
docker compose up -d
# 4. Open the app
open http://localhost:5173

- k3d installed
- Docker Hub account
k3d cluster create mk-ocr --api-port 6550 -p "8080:80@loadbalancer"

Docker images are already available on Docker Hub as `martinnq/mk-ocr-*`. If you forked the repo, build and push your own images first, then update `k8s/*/deployment.yaml`.
cp k8s/mongodb/secret.yaml.example k8s/mongodb/secret.yaml
cp k8s/backend/secret.yaml.example k8s/backend/secret.yaml
# Edit the passwords if desired
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/mongodb/
kubectl apply -f k8s/backend/
kubectl apply -f k8s/frontend/
kubectl apply -f k8s/ingress.yaml
kubectl get pods -n mk-ocr-app
# Wait for all pods to be Ready, then:
open http://localhost:8080

The `.github/workflows/ci.yml` workflow implements a full CI/CD pipeline.
Every push to main triggers:
- backend job: builds and pushes `mk-ocr-backend` to Docker Hub (`latest` + `sha-*` tags)
- frontend job: builds and pushes `mk-ocr-frontend` to Docker Hub (`latest` + `sha-*` tags)
- deploy job: updates `k8s/*/deployment.yaml` with the new SHA image tag, then commits and pushes to Git
Argo CD runs in the cluster and watches the Git repo. When the deploy job pushes updated manifests, Argo CD automatically syncs them to the cluster.
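
The deploy job's manifest update amounts to rewriting the image tag in each deployment file. The sketch below shows one way to do that in Python. It is an illustration only: the real workflow may use a different tool, `set_image_tag` is a hypothetical helper, and the tag value `sha-abc1234` is a made-up example of the `sha-*` format.

```python
import re


def set_image_tag(manifest: str, image: str, tag: str) -> str:
    """Replace (or add) the tag on every `image: <image>` line in a manifest."""
    pattern = re.compile(
        rf"^([ \t]*(?:-[ \t]*)?image:[ \t]*{re.escape(image)})(?::\S+)?[ \t]*$",
        re.MULTILINE,
    )
    # Keep the indentation and image name (group 1); swap only the tag.
    return pattern.sub(lambda m: f"{m.group(1)}:{tag}", manifest)


manifest = """\
spec:
  containers:
    - name: backend
      image: martinnq/mk-ocr-backend:latest
"""

updated = set_image_tag(manifest, "martinnq/mk-ocr-backend", "sha-abc1234")
print(updated)
```

Pinning the manifest to an immutable SHA tag, rather than `latest`, is what gives Argo CD a concrete Git change to detect and sync.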
- Add these GitHub Secrets (repo → Settings → Secrets → Actions):
| Secret | Value |
|---|---|
| `DOCKERHUB_USERNAME` | Your Docker Hub username |
| `DOCKERHUB_TOKEN` | Docker Hub access token |
- Install Argo CD:
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.12.6/manifests/install.yaml
- Configure Argo CD:
# Edit k8s/argocd/application.yaml — replace YOUR_GITHUB_USERNAME
kubectl apply -f k8s/argocd/application.yaml
kubectl port-forward svc/argocd-server -n argocd 9090:80
# Open https://localhost:9090
# Username: admin
# Password: kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/health` | Health check |
| `POST` | `/ocr/batch-upload` | Upload multiple images, returns a job |
| `POST` | `/ocr/upload-pdf` | Upload PDF, renders pages, queues OCR |
| `GET` | `/jobs/history` | List all jobs |
| `DELETE` | `/jobs/history` | Clear all job history |
| `GET` | `/jobs/{id}` | Get job status + progress |
| `DELETE` | `/jobs/{id}` | Delete a single job |
| `GET` | `/books/{book}/pages` | List pages for a book |
| `GET` | `/books/{book}/pages/{page}` | Get page detail + text blocks |
| `PUT` | `/books/{book}/pages/{page}/corrected-text` | Save corrected text |
| `GET` | `/books/{book}/export/txt` | Export best-available text |
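
The export endpoint returns the "best-available" text for each page. One plausible reading, given the raw/cleaned/corrected views, is a simple fallback chain. The sketch below assumes the priority corrected > cleaned > raw; that ordering and the `best_available_text` helper are assumptions, not taken from the project's code.

```python
from typing import Optional


def best_available_text(
    raw: Optional[str],
    cleaned: Optional[str] = None,
    corrected: Optional[str] = None,
) -> str:
    """Pick the most refined non-empty version of a page's text."""
    # Assumed priority: human-corrected text, then cleaned OCR, then raw OCR.
    for candidate in (corrected, cleaned, raw):
        if candidate and candidate.strip():
            return candidate
    return ""


print(best_available_text(raw="Tекст", cleaned="Текст"))
```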
Full OpenAPI docs at http://localhost:8080/api/docs (via Ingress) or http://127.0.0.1:8000/docs (direct).
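
Because the Ingress strips the `/api` prefix, the same endpoint has two URLs depending on how the backend is reached. The helper below is a small sketch of that mapping using only the standard library; `api_url` and `get_json` are hypothetical names, and the base URLs are the ones quoted above.

```python
import json
import urllib.request
from typing import Any

INGRESS_BASE = "http://localhost:8080/api"  # via Traefik, /api prefix required
DIRECT_BASE = "http://127.0.0.1:8000"       # straight to Uvicorn, no prefix


def api_url(endpoint: str, via_ingress: bool = True) -> str:
    """Build the full URL for an endpoint from the API table."""
    base = INGRESS_BASE if via_ingress else DIRECT_BASE
    return base + "/" + endpoint.lstrip("/")


def get_json(url: str, timeout: float = 10.0) -> Any:
    """Fetch a URL and decode its JSON body (needs a running backend)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


print(api_url("/jobs/history"))
print(api_url("/jobs/history", via_ingress=False))
```

With the stack running, `get_json(api_url("/health"))` should return whatever the health check reports; the response shape is not documented here, so it is left unspecified.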
| Variable | Default | Description |
|---|---|---|
| `MONGODB_URI` | `mongodb://localhost:27017` | MongoDB connection string |
| `MONGODB_DB_NAME` | `mk_ocr_platform` | Database name |
| `CORS_ORIGINS` | `*` | Allowed CORS origins |
| `OCR_LANGUAGE` | `mkd` | Tesseract language |
| `TESSERACT_CMD` | platform-dependent | Path to Tesseract binary |
| `IMAGES_DIR` | `./images` | Image storage directory |
| `OCR_OUTPUT_DIR` | `./ocr_output` | OCR output directory |
| `TEXT_OUTPUT_DIR` | `./text` | Corrected text directory |
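
For reference, the variables and defaults in this table could be loaded like this. It is a standard-library sketch, not the project's own config module: the `Settings` class and `load_settings` function are hypothetical, treating `CORS_ORIGINS` as a comma-separated list is an assumption, and `TESSERACT_CMD` is left unset when absent because its default is platform-dependent.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    mongodb_uri: str
    mongodb_db_name: str
    cors_origins: List[str]
    ocr_language: str
    tesseract_cmd: Optional[str]
    images_dir: str
    ocr_output_dir: str
    text_output_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    origins = env.get("CORS_ORIGINS", "*")
    return Settings(
        mongodb_uri=env.get("MONGODB_URI", "mongodb://localhost:27017"),
        mongodb_db_name=env.get("MONGODB_DB_NAME", "mk_ocr_platform"),
        # Assumption: multiple origins are separated by commas.
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
        ocr_language=env.get("OCR_LANGUAGE", "mkd"),
        # No portable default exists, so None means "let the OCR library decide".
        tesseract_cmd=env.get("TESSERACT_CMD") or None,
        images_dir=env.get("IMAGES_DIR", "./images"),
        ocr_output_dir=env.get("OCR_OUTPUT_DIR", "./ocr_output"),
        text_output_dir=env.get("TEXT_OUTPUT_DIR", "./text"),
    )


print(load_settings({"OCR_LANGUAGE": "eng"}))
```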
- MongoDB is required. For local dev without Docker, set `MONGODB_URI` in `.env` to your local MongoDB instance.
- Filename format for batch uploads: `page_001.jpg`, `page_002.png`, etc. (3+ digits).
- Tesseract must be installed with the `mkd` language pack.
- Encoding: All text outputs are UTF-8.
- Performance: OCR is CPU-bound; large batches run sequentially in background tasks.
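
The batch-upload filename convention from the notes above can be checked before uploading. The snippet below is a sketch: `page_number` is a hypothetical helper, and the accepted extensions (`jpg`, `jpeg`, `png`) are an assumption, since the notes only list examples followed by "etc.".

```python
import re
from typing import Optional

# "page_" + at least three digits + an image extension (extension set is assumed).
PAGE_NAME = re.compile(r"page_(\d{3,})\.(?:jpg|jpeg|png)", re.IGNORECASE)


def page_number(filename: str) -> Optional[int]:
    """Return the page number encoded in a filename, or None if it is invalid."""
    match = PAGE_NAME.fullmatch(filename)
    return int(match.group(1)) if match else None


names = ["page_010.png", "page_002.jpg", "scan_003.jpg", "page_01.jpg"]
valid = sorted((n for n in names if page_number(n) is not None), key=page_number)
print(valid)
```

Sorting by the parsed number rather than by the raw string keeps pages in order even when the digit counts differ (for example `page_999.jpg` before `page_1000.jpg`).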
MIT