🤖 Resume Analyzer

AI-powered resume analysis service built entirely with the Cursor AI assistant. It uses a local LLM (LM Studio) with intelligent block processing, skills extraction, and experience calculation.

✨ Features

  • 🧠 LLM-Powered Analysis - Uses a local LLM (LM Studio) for intelligent resume parsing
  • 📄 Multi-Format Support - Handles DOCX and PDF resume files
  • ⚡ Parallel Processing - Concurrent block processing for faster analysis
  • 🎯 Smart Skills Extraction - Deduplication and scoring of technical skills
  • 💼 Experience Calculation - Automatic work experience calculation from roles
  • 🌍 Multi-Language Support - Analyzes resumes in various languages
  • 🐳 Docker Ready - Complete containerization with Docker Compose
  • 🧪 Comprehensive Testing - Unit and integration tests with unittest
  • 🏗️ SOLID Architecture - Clean, maintainable code following SOLID principles

🚀 Quick Start

Prerequisites

  • Python 3.11+
  • LM Studio - Download from lmstudio.ai
  • Docker (optional) - For containerized deployment

1. Clone the Repository

git clone https://github.com/ppush/resume-analyzer.git
cd resume-analyzer

2. Install Dependencies

pip install -r requirements.txt

3. Start LM Studio

  1. Download and install LM Studio
  2. Load a model: google/gemma-3-12b or meta-llama-3.1-8b-instruct
  3. Start the local server on port 1234

4. Run the Service

python run.py

5. Access the API

Once the service is running, the API is available at http://localhost:8000. FastAPI also serves interactive documentation at http://localhost:8000/docs.

🐳 Docker Deployment

Quick Start with Docker

# Build and run with Docker Compose
docker-compose up --build

# Or use the provided scripts
./docker-scripts/run.sh    # Linux/macOS
.\docker-scripts\run.ps1   # Windows PowerShell

Development Mode

# Run in development mode with live reload
docker-compose -f docker-compose.dev.yml up --build

# Or use the provided scripts
./docker-scripts/dev.sh    # Linux/macOS
.\docker-scripts\dev.ps1   # Windows PowerShell

📋 API Documentation

POST /analyze

Upload and analyze a resume file with intelligent structure preservation (DOCX/PDF).

Request:

curl -X POST "http://localhost:8000/analyze" \
     -H "Content-Type: multipart/form-data" \
     -F "file=@resume.pdf"
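
The same request can be made from Python. This is a minimal client sketch, assuming the third-party `requests` package is installed and the service is running on localhost:8000 as in the curl example; the `top_skills` helper is illustrative, not part of the project's API.

```python
def analyze_resume(path: str, base_url: str = "http://localhost:8000") -> dict:
    """POST a resume file to /analyze and return the parsed JSON response."""
    import requests  # third-party; assumed installed

    with open(path, "rb") as fh:
        resp = requests.post(f"{base_url}/analyze", files={"file": fh})
    resp.raise_for_status()
    return resp.json()


def top_skills(result: dict, n: int = 3) -> list[str]:
    """Names of the n highest-scoring entries in skills_merged."""
    merged = sorted(result.get("skills_merged", []),
                    key=lambda s: s["score"], reverse=True)
    return [s["name"] for s in merged[:n]]


if __name__ == "__main__":
    print(top_skills(analyze_resume("resume.pdf")))
```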

Response:

{
  "skills_from_resume": [
    {"name": "Java", "score": 85},
    {"name": "Spring Boot", "score": 80},
    {"name": "Microservices", "score": 75}
  ],
  "skills_merged": [
    {"name": "Java", "score": 85, "merged": 2},
    {"name": "Spring Boot", "score": 80, "merged": 1},
    {"name": "Microservices Architecture", "score": 75, "merged": 3}
  ],
  "roles": [
    {
      "title": "Senior Software Architect",
      "project": "Polixis SA",
      "duration": "1 year 9 months",
      "score": 90,
      "category": ["Engineering", "Architecture"]
    }
  ],
  "languages": [
    {"language": "English", "level": "Advanced"},
    {"language": "Armenian", "level": "Native"}
  ],
  "experience": "20+ years",
  "location": "Armenia",
  "ready_to_remote": true,
  "ready_to_trip": true
}
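
The `experience` field is derived from the per-role durations. A sketch of how strings like "1 year 9 months" could be summed into a total; this is illustrative, not the project's actual experience_calculator logic:

```python
import re


def parse_duration(text: str) -> int:
    """Return a duration in whole months, e.g. '1 year 9 months' -> 21."""
    years = re.search(r"(\d+)\s*year", text)
    months = re.search(r"(\d+)\s*month", text)
    return 12 * int(years.group(1) if years else 0) + \
        int(months.group(1) if months else 0)


def total_experience(roles: list[dict]) -> str:
    """Sum role durations and format them like the `experience` field."""
    total = sum(parse_duration(r["duration"]) for r in roles)
    years, months = divmod(total, 12)
    return f"{years} years {months} months"
```

Note that real resumes often have overlapping roles, which a simple sum would double-count; the sketch ignores that.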

πŸ—οΈ Architecture

Core Components

resume-analyzer/
├── 🧠 core/                    # Core business logic
│   ├── resume_parser.py        # LLM-based resume parsing
│   ├── block_processor.py      # Parallel block processing
│   ├── experience_calculator.py # Experience calculation
│   ├── aggregation/            # Result aggregation
│   │   ├── resume_result_aggregator.py
│   │   ├── skill_merger.py
│   │   └── experience_analyzer.py
│   └── prompts/                # LLM prompt templates
│       ├── prompt_base.py
│       ├── parsing_prompts.py
│       ├── project_prompts.py
│       ├── skill_prompts.py
│       └── language_prompts.py
├── 🔌 services/                # External services
│   ├── llm_client.py          # LM Studio integration
│   └── file_loader.py         # DOCX/PDF processing
├── 🧪 tests/                   # Test suite
│   ├── unit/                  # Unit tests
│   └── integration/           # Integration tests
├── 🐳 docker-scripts/         # Docker management
├── 📄 main.py                 # FastAPI application
└── 🚀 run.py                  # Service runner
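
The skills-deduplication idea behind skill_merger.py can be sketched as below. The field names follow the `skills_merged` entries in the /analyze response; the merge logic itself (case-insensitive key, keep the best score, count merges) is an assumption for illustration:

```python
def merge_skills(skills: list[dict]) -> list[dict]:
    """Deduplicate skills case-insensitively, keeping the highest score
    and counting how many raw entries were merged into each result."""
    merged: dict[str, dict] = {}
    for skill in skills:
        key = skill["name"].strip().lower()
        if key in merged:
            entry = merged[key]
            entry["score"] = max(entry["score"], skill["score"])
            entry["merged"] += 1
        else:
            merged[key] = {"name": skill["name"],
                           "score": skill["score"],
                           "merged": 1}
    return sorted(merged.values(), key=lambda s: s["score"], reverse=True)
```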

Processing Pipeline

graph TD
    A[Resume File<br/>PDF/DOCX] --> B[File Loader<br/>PyMuPDF/Mammoth]
    B --> C[HTML Chunker<br/>Split into chunks]
    C --> D[Resume Parser<br/>LLM Block Segmentation]
    D --> E[Block Processor<br/>Parallel LLM Processing]
    E --> F[Result Aggregator<br/>Skills Merging & Experience Calc]
    F --> G[Final JSON<br/>Structured Data]
    
    D --> H[5 Block Types:<br/>projects, skills, education,<br/>languages, summary]
    E --> I[Concurrent Processing<br/>AsyncIO + Semaphore]
    F --> J[Skills Deduplication<br/>Experience Calculation<br/>Job Recommendations]
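The "AsyncIO + Semaphore" concurrency pattern in the pipeline can be sketched as follows; `process_block` is a hypothetical stand-in for a real LLM call, and the concurrency limit is illustrative:

```python
import asyncio

MAX_CONCURRENT = 3  # illustrative cap on in-flight LLM requests


async def process_block(block: str) -> str:
    """Placeholder for an LLM request processing one resume block."""
    await asyncio.sleep(0)
    return f"processed:{block}"


async def process_blocks(blocks: list[str]) -> list[str]:
    """Process all blocks concurrently, at most MAX_CONCURRENT at a time."""
    sem = asyncio.Semaphore(MAX_CONCURRENT)

    async def guarded(block: str) -> str:
        async with sem:  # released automatically when the call finishes
            return await process_block(block)

    return await asyncio.gather(*(guarded(b) for b in blocks))
```

`asyncio.gather` preserves input order, so results line up with the original block list even though the calls overlap.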

Block Types

  1. projects - Work experience and roles
  2. skills - Technical skills and competencies
  3. education - Education and certifications
  4. languages - Language proficiency
  5. summary - General information and overview
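Routing each block type to a prompt module might look like the sketch below. The mapping is an assumption, inferred from the block types above and the files under core/prompts/:

```python
# Hypothetical mapping from block type to the prompt module that handles it.
BLOCK_PROMPTS = {
    "projects": "project_prompts",
    "skills": "skill_prompts",
    "languages": "language_prompts",
    "education": "parsing_prompts",
    "summary": "parsing_prompts",
}


def prompt_module_for(block_type: str) -> str:
    """Resolve a block type to its prompt module name."""
    try:
        return BLOCK_PROMPTS[block_type]
    except KeyError:
        raise ValueError(f"unknown block type: {block_type}") from None
```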

βš™οΈ Configuration

Environment Variables

# LM Studio settings
LM_STUDIO_URL=http://localhost:1234/v1/chat/completions
DEFAULT_MODEL=google/gemma-3-12b
DEFAULT_MAX_TOKENS=4096
DEFAULT_TEMPERATURE=0.0
DEFAULT_SEED=42

# Timeout settings
LLM_TIMEOUT=120

# Logging
LOG_LEVEL=INFO
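
A sketch of reading these settings in config.py; `os.getenv` always returns strings, so numeric values need explicit casting. The variable names and defaults come from the list above:

```python
import os

LM_STUDIO_URL = os.getenv("LM_STUDIO_URL",
                          "http://localhost:1234/v1/chat/completions")
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "google/gemma-3-12b")
DEFAULT_MAX_TOKENS = int(os.getenv("DEFAULT_MAX_TOKENS", "4096"))
DEFAULT_TEMPERATURE = float(os.getenv("DEFAULT_TEMPERATURE", "0.0"))
DEFAULT_SEED = int(os.getenv("DEFAULT_SEED", "42"))
LLM_TIMEOUT = int(os.getenv("LLM_TIMEOUT", "120"))  # seconds
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
```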

Model Configuration

Edit config.py to change the LLM model:

# Available models
DEFAULT_MODEL = "google/gemma-3-12b"           # Recommended
# DEFAULT_MODEL = "meta-llama-3.1-8b-instruct"  # Alternative

🧪 Testing

Run All Tests

# Run complete test suite
python tests/run_all_tests.py

# Run with verbose output
python tests/run_all_tests.py -v

Individual Test Categories

# Unit tests
python -m unittest tests.unit.test_resume_parser -v
python -m unittest tests.unit.test_block_processor -v
python -m unittest tests.unit.test_experience_calculator -v

# Integration tests
python -m unittest tests.integration.test_full_pipeline -v

Test Coverage

# Run tests with coverage
pytest tests/ --cov=core --cov=services --cov-report=html

πŸ› οΈ Development

Code Quality

# Format code
black .
isort .

# Lint code
flake8 .

# Type checking
mypy core/ services/

Using Makefile

make format      # Format code
make lint        # Lint code
make type-check  # Type checking
make test        # Run tests
make clean       # Clean temporary files
make quality-check  # Full quality check

Development Dependencies

pip install -r requirements-dev.txt

📊 Performance

LLM Optimization

  • Fixed Seed: seed=42 for consistent results
  • Smart Temperature Control:
    • Skills & Projects: temperature=0.0 (deterministic)
    • Summary: temperature=0.7 (creative)
    • Education: temperature=1.0 (maximum creativity)
  • Parallel Processing: Concurrent block processing
  • Timeout Management: 120s timeout per LLM request
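
The temperature policy above can be expressed as a lookup; the values are taken from the list, while the function name and the 0.0 fallback are illustrative:

```python
# Per-block-type sampling temperatures (from the policy above).
BLOCK_TEMPERATURES = {
    "skills": 0.0,     # deterministic
    "projects": 0.0,   # deterministic
    "summary": 0.7,    # creative
    "education": 1.0,  # maximum creativity
}


def temperature_for(block_type: str, default: float = 0.0) -> float:
    """Temperature to use when processing a block of the given type."""
    return BLOCK_TEMPERATURES.get(block_type, default)
```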

Benchmarks

  • Processing Time: ~30-60 seconds per resume
  • Memory Usage: ~200-300MB
  • Docker Image Size: ~500MB
  • Startup Time: ~10-15 seconds

🔧 Troubleshooting

Common Issues

LM Studio Connection Error

# Check if LM Studio is running
curl http://localhost:1234/v1/models

# Start LM Studio and load a model

File Format Issues

# Supported formats: DOCX, PDF
# Check file size (max 10MB)
# Ensure file is not corrupted
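
The format and size checks above can be validated client-side before uploading. A sketch, where the function name is illustrative and the 10 MB limit comes from the note above:

```python
from pathlib import Path

MAX_SIZE = 10 * 1024 * 1024  # 10 MB upload limit
ALLOWED = {".docx", ".pdf"}  # supported resume formats


def is_valid_resume(filename: str, size_bytes: int) -> bool:
    """True if the file has a supported extension and is within the limit."""
    ext = Path(filename).suffix.lower()
    return ext in ALLOWED and size_bytes <= MAX_SIZE
```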

Docker Issues

# Check Docker is running
docker version

# Rebuild containers
docker-compose down
docker-compose up --build

Logs

  • Application Logs: resume_analyzer.log
  • Docker Logs: docker-compose logs -f resume-analyzer
  • Log Level: Set via LOG_LEVEL environment variable

📚 Documentation

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Clone your fork
git clone https://github.com/ppush/resume-analyzer.git

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
python tests/run_all_tests.py

# Format code
black .
isort .

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

📞 Support


Made with ❤️ for the developer community

⭐ Star this repo | 🐛 Report Bug | 💡 Request Feature
