A production-minded, retrieval-augmented generation (RAG) agent that lets users query a personal knowledge base across multiple conversational styles. Built on Next.js 15 App Router, React 19, and Tailwind CSS, this system implements a clean document processing pipeline and offers a complete Portfolio Demo Mode that needs no API keys or external services, for safe public showcase.
🔗 Live Demo: matthew-schramm-codex-agent.vercel.app
🔗 Portfolio: matthew-schramm-portfolio.onrender.com
🔗 LinkedIn: linkedin.com/in/matthew-schramm-476523253
Matthew's Codex bridges the gap between raw document sets (PDFs, Markdown, text files) and contextual chat. Instead of relying on generic LLM queries, it uses semantic retrieval to ground answers strictly in Matthew’s personal profile, experience, work-style guidelines, and academic papers.
- Decoupled Architecture: Features a local-first simulation environment (Demo Mode) that maps exactly to production API boundaries. Recruiters can test the full interface, upload files, simulate vector database injection, and query documents without setting up API keys, Pinecone indexes, or paying for tokens.
- Contextual Persona Prompts: Conversational behavior shifts dynamically across five modes (Interview, Story, TL;DR, Humble Brag, Self-Reflection) using prompt preambles while keeping the under-the-hood vector retrieval logic uniform.
- Change-Aware Ingestion Pipeline: The ingestion script scans the local document folder and uses MD5 hashing to run incremental updates, embedding only changed files to save API costs and execution time.
- Source Attribution & Citation: Chat bubble answers display clickable source chips that map directly back to the source documents retrieved from the vector index, providing auditability for AI-generated answers.
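The change-aware ingestion step described above can be sketched as follows. This is a minimal illustration, not the code in `scripts/ingest.ts`: the `HashManifest` shape, the `FileSnapshot` type, and the `planIngestion` helper are hypothetical names, and the sketch assumes the previous run's digests are stored as a simple path-to-hash map.

```typescript
import { createHash } from "node:crypto";

// Hypothetical manifest shape: file path -> MD5 hex digest from the last run.
type HashManifest = Record<string, string>;

// Hypothetical in-memory view of a file found in the document folder.
interface FileSnapshot {
  path: string;
  content: string | Uint8Array;
}

// Compute the MD5 hex digest of a file's contents.
function md5Hex(content: string | Uint8Array): string {
  return createHash("md5").update(content).digest("hex");
}

// Split files into those that need (re-)embedding and those that can be
// skipped, and return the manifest to persist for the next run.
function planIngestion(files: FileSnapshot[], previous: HashManifest) {
  const changed: string[] = [];
  const skipped: string[] = [];
  const next: HashManifest = {};

  for (const file of files) {
    const digest = md5Hex(file.content);
    next[file.path] = digest;
    if (previous[file.path] === digest) {
      skipped.push(file.path); // unchanged since last run: no embedding call
    } else {
      changed.push(file.path); // new or modified: embed and upsert
    }
  }
  return { changed, skipped, next };
}
```

MD5 is used here only as a cheap change detector, not for security; any stable content hash would serve the same purpose.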
The following diagram illustrates how documents are processed in the ingestion pipeline and how queries are dynamically routed based on the environment configuration (Production vs. Demo Mode).
flowchart TD
subgraph IngestionPipeline ["Ingestion Pipeline (scripts/ingest.ts)"]
Doc[Raw Files: PDF, MD, TXT] --> Hash{Has file changed?\nMD5 Hash Check}
Hash -- No --> Skip[Skip File]
Hash -- Yes --> Chunk[Unified Parser & Chunking\n1200 chars / 200 overlap]
Chunk --> Embed[OpenAI Embedding\ntext-embedding-3-small]
Embed --> Upsert[Pinecone Upsert\n1536-dim vector]
end
subgraph ChatRetrieval ["Chat Retrieval & Interface (src/app/api/chat)"]
UserQuery[User Query + Selected Mode] --> DemoCheck{isDemoMode?}
DemoCheck -- Yes (Demo Mode) --> LocalFixture[Local Seeds\ndemo-data.ts]
LocalFixture --> OutputDemo[Simulated Response\n+ Seeded Sources]
DemoCheck -- No (Production) --> EmbedQuery[Query Embedded]
EmbedQuery --> VectorQuery[Pinecone Vector Search]
VectorQuery --> Context[Context Construction\nTop-5 Chunks]
Context --> SystemPrompt[System Prompt + Mode Preamble]
SystemPrompt --> LLM[OpenAI GPT-4o-mini]
LLM --> OutputProd[Response with Source Citations]
end
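The diagram lists a chunking step of 1200 characters with a 200-character overlap. A minimal sketch of fixed-size chunking with overlap is shown below. It assumes plain character windows; the actual parser may split on sentence or paragraph boundaries instead, and `chunkText` is a hypothetical name.

```typescript
// Split text into fixed-size character windows where each window repeats
// the last `overlap` characters of the previous one.
function chunkText(text: string, chunkSize = 1200, overlap = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const step = chunkSize - overlap; // how far the window advances each time
  const chunks: string[] = [];

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window already reaches the end of the text,
    // so no trailing chunk made purely of overlap is emitted.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of embedding some text twice.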
- Multi-Mode Conversations: Swap styles seamlessly (e.g. professional Interview answers, introspective Self-Reflection, narrative Story logs, or quick TL;DR lists).
- Source Attribution: Visual chips showing which documents (resumes, academic transcripts, work-style documents) were referenced.
- Administrative Interface (`/admin`): A fully functional admin dashboard to upload documents, review the current file repository, and run ingestion updates.
- Robust CLI Tools: Development tools for clearing and force-reprocessing (`npm run ingest:clear`), dry runs (`npm run ingest:dry`), and status checks (`npm run dataset:status`).
- Clean UI/UX: Custom Tailwind CSS animations, Radix UI layout elements, and fully responsive layouts.
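The multi-mode behavior can be pictured as a lookup of a mode-specific preamble that is prepended to an otherwise identical, context-grounded prompt. The sketch below is illustrative only: the mode identifiers, the preamble wording, and `buildSystemPrompt` are assumptions, not the project's actual prompts.

```typescript
// Hypothetical identifiers for the five conversational modes.
type ChatMode = "interview" | "story" | "tldr" | "humble-brag" | "self-reflection";

// Placeholder preambles; the real wording lives in the project's prompt code.
const MODE_PREAMBLES: Record<ChatMode, string> = {
  "interview": "Answer as a concise, professional interview response.",
  "story": "Answer as a short first-person narrative.",
  "tldr": "Answer as a brief bulleted summary.",
  "humble-brag": "Answer confidently, highlighting achievements.",
  "self-reflection": "Answer introspectively, noting lessons learned.",
};

// Build the system prompt: only the preamble varies by mode, while the
// retrieved context section is assembled the same way for every mode.
function buildSystemPrompt(mode: ChatMode, contextChunks: string[]): string {
  const context = contextChunks
    .map((chunk, i) => `[${i + 1}] ${chunk}`)
    .join("\n\n");
  return [
    MODE_PREAMBLES[mode],
    "Answer using only the context below.",
    `Context:\n${context}`,
  ].join("\n\n");
}
```

Keeping retrieval independent of the mode means a new style only requires a new preamble entry.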
You can run the application in either Portfolio Demo Mode (zero setup required) or Production Mode (connected to your own OpenAI and Pinecone accounts).
- Node.js v20+
- npm
To run the app immediately with local seed data and mock API boundaries:
- Setup Environment:

      cp .env.example .env.local

  (Ensure `DEMO_MODE=true` and `NEXT_PUBLIC_DEMO_MODE=true` are set inside `.env.local`.)
- Install & Run (using `make`):

      make setup
      make demo

  Alternatively, run `npm install && npm run dev`.
- Open http://localhost:3000 to chat, or visit http://localhost:3000/admin to explore the simulated dataset manager.
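To illustrate how a `DEMO_MODE` flag can switch between seeded and live behavior behind one interface, here is a minimal sketch. The `ChatHandler` type and `createChatHandler` helper are hypothetical, and the sketch assumes demo mode is active only when the variable is exactly `"true"`; the project's actual `isDemoMode` check may differ.

```typescript
// Hypothetical response shape shared by both modes.
type ChatResult = { answer: string; sources: string[] };
type ChatHandler = (query: string) => Promise<ChatResult>;

// Assumption: demo mode is on only when DEMO_MODE is the string "true".
function isDemoMode(env: Record<string, string | undefined>): boolean {
  return env.DEMO_MODE === "true";
}

// Pick the implementation once; callers see the same signature either way.
function createChatHandler(
  env: Record<string, string | undefined>,
  demo: ChatHandler,
  production: ChatHandler
): ChatHandler {
  return isDemoMode(env) ? demo : production;
}
```

In a Next.js route handler, `env` would typically be `process.env`, with the demo handler returning seeded data and the production handler calling the embedding, vector search, and chat APIs.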
To connect the application to real vector stores and AI models:
- Configure API Keys: Edit `.env.local` to disable demo mode and add your production keys:

      DEMO_MODE=false
      NEXT_PUBLIC_DEMO_MODE=false
      OPENAI_API_KEY=your_openai_api_key
      PINECONE_API_KEY=your_pinecone_api_key
      PINECONE_INDEX=your_index_name

- Add Your Documents: Place your personal documents (PDF, Markdown, or TXT) inside `src/data/`.
- Ingest the Knowledge Base: Run the ingestion script to parse, embed, and upsert vectors into Pinecone:

      make ingest

  (Use `make ingest-dry` to preview the chunks that would be created before calling OpenAI or Pinecone.)
- Launch Application:

      make dev
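Because Production Mode depends on three keys being present, a small startup check can surface a misconfigured `.env.local` early. The sketch below is an optional illustration, not part of the documented setup; `missingProductionVars` is a hypothetical helper, and it assumes the variable names listed in the configuration step.

```typescript
// Variables the Production Mode setup step asks for.
const REQUIRED_PRODUCTION_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_INDEX",
] as const;

// Return the names of required variables that are unset or blank.
// Demo mode needs none of them, so it always reports an empty list.
function missingProductionVars(
  env: Record<string, string | undefined>
): string[] {
  if (env.DEMO_MODE === "true") return [];
  return REQUIRED_PRODUCTION_VARS.filter((name) => !env[name]?.trim());
}
```

A caller could pass `process.env` and fail fast with a message naming the missing variables before any ingestion or chat request is attempted.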
The included Makefile provides short, standard targets for development and administration:
| Command | Action |
|---|---|
| `make setup` | Installs project dependencies and copies the `.env.example` template |
| `make dev` | Starts the Next.js development server in Production Mode (connected to OpenAI and Pinecone) |
| `make demo` | Starts the development server in offline, seed-backed Portfolio Demo Mode |
| `make ingest` | Runs the document ingestion script to process and embed new files |
| `make ingest-dry` | Previews document chunking and metadata generation without uploading to Pinecone |
| `make ingest-clear` | Clears the existing Pinecone index vectors for files being re-ingested |
| `make build` | Builds the production bundle |
| `make lint` | Validates TypeScript and ESLint standards |
- Legacy PDF Parsing in Node: Node.js environments often struggle with native client-side PDF readers due to canvas and DOM dependencies. We resolved this by employing the `pdfjs-dist/legacy/build/pdf.mjs` loader directly in `scripts/ingest.ts`, allowing unified local PDF parsing without requiring OS-level binaries.
- Permissive Relevance Thresholds: The retrieval threshold in `src/app/api/chat/route.ts` is tuned to `0.2` rather than `0.7` to ensure conversational flow remains warm and informative for resumes and cover letters, falling back gracefully to general conversational modes rather than failing abruptly on minor semantic mismatches.
- Decoupled API Contract: By implementing the `isDemoMode` check directly inside the API handlers (`/api/chat`, `/api/upload`, `/api/ingest`, `/api/dataset`), we preserve the React client-side async fetch states exactly as they would operate on a live site, proving UI integrity and API layout compliance.
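The effect of the permissive relevance threshold can be sketched as a filter over scored matches followed by a top-k cut. This is an illustration under assumptions: `ScoredChunk`, `selectContext`, and `uniqueSources` are hypothetical names, the comparison is assumed to be inclusive, and the top-5 limit is taken from the architecture diagram.

```typescript
// Hypothetical shape of one vector-search match.
interface ScoredChunk {
  text: string;
  source: string; // document the chunk came from
  score: number;  // similarity score from the vector index
}

// Keep matches at or above the threshold, highest score first,
// limited to the top `topK`. Does not mutate the input array.
function selectContext(
  matches: ScoredChunk[],
  threshold = 0.2,
  topK = 5
): ScoredChunk[] {
  return matches
    .filter((match) => match.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Distinct source documents, in ranked order, for display as source chips.
function uniqueSources(chunks: ScoredChunk[]): string[] {
  return Array.from(new Set(chunks.map((chunk) => chunk.source)));
}
```

With a threshold of `0.7`, loosely related questions would often return no context at all; at `0.2`, weaker matches still reach the prompt, and an empty result is the signal to fall back to a general conversational answer.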
This project is an Active Showcase / Portfolio Demo. It represents modern engineering practices in full-stack Next.js design, RAG implementation, security compliance, and developer convenience.
For comments, feedback, or networking, please contact Matthew at mattschramm1235@gmail.com or visit the Portfolio.