Security Scanner

An automated, Web security audit tool that crawls a target website, runs security checks across 20+ scanner modules, enhances findings with Google Gemini AI, and presents everything in a real-time dashboard.

Live Demo: https://url-security-scanner.onrender.com

Overview

Security Scanner is a production-ready, full-stack security auditing platform built on:

Python + FastAPI — modular backend with isolated API router and WebSocket manager
Google Gemini AI (gemini-3.1-flash-lite-preview) — deep AI-powered analysis and remediation advice per finding
Real-time WebSocket streaming — live scan progress pushed to the browser without polling
Jinja2 templated HTML reports — self-contained, standalone scan reports rendered server-side
Scheduled auto-scan loop — continuous monitoring via a background daemon thread

The scanner performs both passive analysis (headers, cookies, DNS, JWT, SRI, cache control) and active probing (XSS, SQLi, SSRF, open redirect, path discovery, GraphQL introspection) against any target URL.

Architecture

Project Structure

Security Scanner/
├── main.py                          # Lean FastAPI bootloader — mounts router & WS
├── config.json                      # Runtime scan settings (managed by dashboard)
├── requirements.txt
├── runtime.txt                      # Pins Python 3.11 for Render deployment
├── .env                             # Your API keys (never committed)
├── .gitignore
│
├── api/
│   └── routes.py                    # All REST endpoints + scan thread logic
│
├── sockets/
│   └── manager.py                   # WebSocket connection pool & broadcast engine
│
├── agent/
│   ├── scanner.py                   # Scan orchestrator — runs all phases in order
│   ├── crawler.py                   # BFS web crawler (same-domain, configurable depth)
│   ├── gemini_analyzer.py           # Gemini AI batch analysis & executive summary
│   ├── reporter.py                  # Jinja2 report renderer (JSON + HTML output)
│   ├── scheduler.py                 # Hourly daemon — runs scans, merges findings
│   └── scanners/
│       ├── ssl_check.py             # SSL/TLS cert validation, cipher & version checks
│       ├── headers.py               # HTTP security header checks (CSP, HSTS, etc.)
│       ├── cookies.py               # Cookie flag checks (HttpOnly, Secure, SameSite)
│       ├── content.py               # HTML/JS content: secrets, libs, CSRF, eval()
│       ├── cors.py                  # CORS misconfiguration checks
│       ├── active.py                # Active probes: XSS, SQLi, sensitive paths, redirects
│       ├── cache_control.py         # Cache-Control & Pragma header analysis
│       ├── sri.py                   # Subresource Integrity check for external scripts
│       ├── http_methods.py          # Dangerous HTTP methods (TRACE, PUT, DELETE)
│       ├── error_disclosure.py      # Verbose error page detection
│       ├── directory_listing.py     # Open directory listing detection
│       ├── tech_fingerprint.py      # Technology stack fingerprinting
│       ├── security_txt.py          # security.txt policy file check (RFC 9116)
│       ├── hsts_preload.py          # HSTS preload list validation
│       ├── ssrf.py                  # Server-Side Request Forgery probe
│       ├── rate_limiting.py         # Rate limiting / brute-force protection check
│       ├── jwt_security.py          # JWT token weakness detection
│       ├── graphql_security.py      # GraphQL introspection & DoS checks
│       ├── dns_security.py          # DNS security (SPF, DMARC, DNSSEC, CAA)
│       ├── subdomains.py            # Subdomain enumeration & takeover detection
│       └── virustotal.py            # VirusTotal domain reputation check
│
├── ui/
│   ├── templates/
│   │   ├── index.html               # Dashboard SPA shell (Jinja2)
│   │   └── report_template.html     # Standalone HTML report template (Jinja2)
│   └── static/
│       ├── style.css                # Dark theme CSS
│       └── js/
│           ├── main.js              # Global state (scanRunning, findings, filters)
│           ├── api.js               # REST fetch wrappers (scan, config, reports)
│           ├── ws.js                # WebSocket client & message dispatcher
│           └── ui.js                # DOM rendering (cards, charts, logs, summary)
│
└── reports/                         # Auto-generated output (gitignored)
    ├── scan_<id>_<timestamp>.json
    └── scan_<id>_<timestamp>.html

Prerequisites

Python 3.10+
Google Gemini API key — free at aistudio.google.com
Internet access to reach the target URL

Installation

# 1. Clone or download the project
cd URL_Security_Scanner

# 2. Create a virtual environment (recommended)
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # macOS / Linux

# 3. Install dependencies
pip install -r requirements.txt

Configuration

1. Create a `.env` file

Create a file named .env in the root directory:

GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_MODEL=gemini-3.1-flash-lite-preview
VIRUSTOTAL_API_KEY=your_virustotal_api_key_here

2. `config.json`

Managed automatically by the dashboard, but can also be edited manually:

{
  "target_url": "https://example.com",
  "max_pages": 30,
  "crawl_delay": 1.0,
  "hourly_scan_enabled": false,
  "last_scan": null
}

Field	Description	Default
`target_url`	Website to scan	`""`
`max_pages`	Max pages to crawl (5–100)	`30`
`crawl_delay`	Seconds between requests (0.3–5.0)	`1.0`
`hourly_scan_enabled`	Enable background auto-scan	`false`
`last_scan`	ISO timestamp of last completed scan	`null`

Running the Scanner

python main.py

Open your browser at:

http://localhost:8000

Using the Dashboard

The dashboard has four pages accessible from the left sidebar:

Dashboard

Live metrics: Overall Risk, Total, Critical, High, Medium, Low, Info
AI-generated Executive Summary and Immediate Actions
Severity doughnut chart (Chart.js)
Live scan log preview (WebSocket streamed)

Findings

All findings from the latest scan or any loaded report
Filter bar — All / Critical / High / Medium / Low / Info
Each card shows: severity badge, title, affected location count, category
Expand (▼) to see: Description, Real-World Impact, Technical Details, numbered affected URLs with evidence snippets, Fix Suggestion, code example, OWASP tag, CWE tag

Reports (History)

Last 20 scans sorted newest first
Shows target URL, time, severity badges, overall risk
Click any row to load findings into the Findings view
View Details ↗ opens the full standalone HTML report in a new tab
Delete removes both JSON + HTML report files

Live Log

Full real-time log of every scanner step, pushed via WebSocket
Persists up to 200 lines · scroll-to-bottom on new entries

Sidebar Controls

Control	Description
Target URL	Website to scan
Max Pages	Crawler page limit
Delay (s)	Crawl delay between requests
Hourly Auto-Scan	Toggle background scan loop
Run Scan Now	Trigger an immediate scan

Scanner Modules

Module	File	What It Checks
SSL / TLS	`ssl_check.py`	Certificate validity, expiry, deprecated TLS, weak ciphers
HTTP Headers	`headers.py`	CSP, HSTS, X-Frame-Options, X-Content-Type-Options, Referrer-Policy
Cookies	`cookies.py`	HttpOnly, Secure, SameSite flags
Content	`content.py`	Hardcoded secrets, outdated JS libs, CSRF tokens, eval(), mixed content
CORS	`cors.py`	Wildcard origins, reflected origins, null origin
Active Probes	`active.py`	XSS reflection, SQLi error detection, sensitive path exposure, open redirects
Cache Control	`cache_control.py`	Missing or misconfigured Cache-Control / Pragma headers
SRI	`sri.py`	Missing Subresource Integrity on external scripts/styles
HTTP Methods	`http_methods.py`	Dangerous methods enabled (TRACE, PUT, DELETE)
Error Disclosure	`error_disclosure.py`	Verbose stack traces or debug info in error responses
Directory Listing	`directory_listing.py`	Open directory index pages
Tech Fingerprint	`tech_fingerprint.py`	Server/framework version disclosure
Security.txt	`security_txt.py`	Presence and validity of security.txt (RFC 9116)
HSTS Preload	`hsts_preload.py`	HSTS preload list inclusion check
SSRF	`ssrf.py`	Server-Side Request Forgery via open redirect parameters
Rate Limiting	`rate_limiting.py`	Missing rate limit / brute-force protection headers
JWT Security	`jwt_security.py`	Weak JWT algorithms (none, HS256 with default secret)
GraphQL	`graphql_security.py`	Introspection enabled, no depth limiting
DNS Security	`dns_security.py`	SPF, DMARC, DNSSEC, CAA record checks
Subdomains	`subdomains.py`	Subdomain enumeration and takeover risk
VirusTotal	`virustotal.py`	Domain reputation and malware flagging

How the AI Analysis Works

After all scanner modules run, raw findings are sent to Google Gemini in batches of 20:

Raw findings → Gemini prompt → AI-enhanced findings

Gemini enriches each finding with:

Field	Description
`enhanced_severity`	Re-assessed severity based on real-world exploitability
`impact`	Plain-English real-world impact statement
`technical_details`	In-depth technical explanation
`fix_suggestion`	Specific, copy-paste-ready remediation
`code_example`	Working code fix (Nginx config, JS, response headers, etc.)
`priority`	Fix order (1 = most urgent)
`references`	CVE / CWE / OWASP references

Gemini also produces an executive summary with:

Overall risk level + risk score (0–100)
Top 3 key findings
Top 3 immediate actions
2–3 sentence executive summary

If Gemini is unavailable or rate-limited, the scanner falls back to raw findings gracefully.

Reports

Every completed scan writes two files to reports/:

`scan_<id>_<timestamp>.json`

Full machine-readable report: all findings, occurrences, Gemini enhancements, severity counts, metadata.

`scan_<id>_<timestamp>.html`

Standalone self-contained HTML file rendered from ui/templates/report_template.html via Jinja2:

Metric cards (Overall Risk, Total, Critical, High, Medium, Low, Info, Pages Scanned)
Executive Summary
Key Findings & Immediate Actions (two-column)
Interactive filterable findings with expandable detail panels

Access directly at:

http://localhost:8000/api/reports/<filename>/html

Hourly Auto-Scan Loop

The scheduler (agent/scheduler.py) runs in a background daemon thread using the schedule library.

Enable via:

Toggle "Hourly Auto-Scan" in the sidebar UI, or
Set "hourly_scan_enabled": true in config.json

Behaviour:

Runs a full scan every 60 minutes
Progress streams live to all connected WebSocket clients
New findings are merged into the existing history report for the same target — only net-new vulnerabilities are appended, avoiding duplicate noise
Auto-starts on server boot if enabled in config.json
Toggle off at any time from the sidebar

Dependencies

Package	Version	Purpose
`fastapi`	0.111.0	Web framework
`uvicorn[standard]`	0.30.0	ASGI server
`google-generativeai`	0.7.2	Gemini AI SDK
`requests`	2.32.3	HTTP crawling & probing
`beautifulsoup4`	4.12.3	HTML parsing
`lxml`	5.2.2	Fast HTML/XML backend
`python-dotenv`	1.0.1	`.env` loading
`schedule`	1.2.2	Hourly scan scheduler
`aiofiles`	23.2.1	Async file I/O
`jinja2`	3.1.4	HTML report templating
`python-multipart`	0.0.9	Form parsing
`websockets`	>=10,<12	WebSocket transport
`pyOpenSSL`	24.1.0	SSL/TLS inspection
`dnspython`	2.6.1	DNS record lookups
`PyJWT`	2.8.0	JWT token analysis

Limitations & Notes

Non-destructive probes — XSS uses a benign marked string (<bhk-xss-test>); SQLi only detects database error messages, never extracts data.
Rate limiting — Configurable crawl delay (default 1s) prevents overwhelming target servers.
Same-domain only — The crawler never follows external links.
No headless browser — JS-rendered SPAs may not be fully crawled; HTTP requests only.
Gemini rate limits — Findings are batched in groups of 20 to stay within free API quotas.
Authorized use only — Only scan websites you own or have explicit written permission to test.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Security Scanner

Table of Contents

Overview

Architecture

Project Structure

Prerequisites

Installation

Configuration

1. Create a `.env` file

2. `config.json`

Running the Scanner

Using the Dashboard

Dashboard

Findings

Reports (History)

Live Log

Sidebar Controls

Scanner Modules

How the AI Analysis Works

Reports

`scan_<id>_<timestamp>.json`

`scan_<id>_<timestamp>.html`

Hourly Auto-Scan Loop

Dependencies

Limitations & Notes

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
agent		agent
api		api
docs		docs
sockets		sockets
ui		ui
.gitignore		.gitignore
README.md		README.md
config.json		config.json
main.py		main.py
requirements.txt		requirements.txt
runtime.txt		runtime.txt

Folders and files

Latest commit

History

Repository files navigation

Security Scanner

Table of Contents

Overview

Architecture

Project Structure

Prerequisites

Installation

Configuration

1. Create a .env file

2. config.json

Running the Scanner

Using the Dashboard

Dashboard

Findings

Reports (History)

Live Log

Sidebar Controls

Scanner Modules

How the AI Analysis Works

Reports

scan_<id>_<timestamp>.json

scan_<id>_<timestamp>.html

Hourly Auto-Scan Loop

Dependencies

Limitations & Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

1. Create a `.env` file

2. `config.json`

`scan_<id>_<timestamp>.json`

`scan_<id>_<timestamp>.html`

Packages