AskBit: A Bit-Based Semantic RAG FAQ Assistant

AskBit is a FAQ and knowledge assistant that encodes semantic sentence embeddings as bit vectors for high-speed retrieval and adds generative RAG through a local Llama 3 model served by Ollama. It combines fast semantic search with fully offline, context-grounded generative answers in an easy-to-use CLI.

🚀 Quickstart

1. Install dependencies (via uv)

uv pip install -r requirements.txt

2. Run AskBit CLI

make dev

This runs:

uv run python cli.py

This launches the interactive CLI:

Welcome to AskBit CLI. Type help or ? to list commands.
(askbit)

🛠️ CLI Commands

🧠 Train the model

(askbit) train data/faq.json

Train on a JSON file containing an array of Q&A pairs:

[
  {"question": "...", "answer": "..."},
  ...
]
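
As a rough illustration of the expected file shape, the sketch below loads and validates such a JSON file. The function name `load_faq` and its error messages are hypothetical and are not taken from AskBit's code; only the `question` and `answer` keys come from the format above.

```python
import json
from pathlib import Path


def load_faq(path):
    """Load Q&A pairs from a JSON file and return a list of (question, answer) tuples."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of Q&A objects")
    pairs = []
    for i, rec in enumerate(records):
        # Every entry needs both keys, matching the format shown above.
        if not isinstance(rec, dict) or "question" not in rec or "answer" not in rec:
            raise ValueError(f"entry {i} must have 'question' and 'answer' keys")
        pairs.append((str(rec["question"]).strip(), str(rec["answer"]).strip()))
    return pairs
```

Validating up front means a malformed entry is reported with its index before any embedding work starts.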

💬 Retrieval-Augmented Generation (RAG) Answer

(askbit) ask "How do I reset my password?"

Returns a generated answer:

- Finds the most relevant FAQs using fast semantic search.
- Builds a context prompt and sends it to Llama 3 (via Ollama).
- Streams the generated answer live in your terminal.

Requires Ollama and the Llama 3 model installed locally (ollama run llama3 must work in your terminal).

🔎 Pure Retrieval

(askbit) match "How do I reset my password?"

Returns the best-matched answer based on semantic similarity, with no generation.

🗞 Debug / Inspect

(askbit) vector "How do I reset my password?"           # Show bit vector for a query
(askbit) topk "How do I reset my password?" --topk 3    # Show top 3 FAQ matches
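
One plausible way to parse a command line such as the `topk` example is to combine `shlex` (for the quoted query) with `argparse` (for the `--topk` flag). This is only a sketch: the helper name `parse_topk` and the default of 3 are assumptions, and AskBit's own parsing may differ.

```python
import argparse
import shlex


def parse_topk(line):
    """Split the argument part of a `topk` command into (query, k)."""
    parser = argparse.ArgumentParser(prog="topk", add_help=False)
    parser.add_argument("query")
    parser.add_argument("--topk", type=int, default=3)  # default is an assumption
    # shlex.split keeps the quoted question together as one argument.
    args = parser.parse_args(shlex.split(line))
    return args.query, args.topk


print(parse_topk('"How do I reset my password?" --topk 3'))
# ('How do I reset my password?', 3)
```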

📦 Architecture Overview

1. Input: FAQ Dataset

Example input:

faq = [
    ("How to reset my password?", "Click 'Forgot Password' on the login page."),
    ("What is the refund policy?", "Refunds are processed within 5 business days."),
    # ... more pairs
]

2. Semantic Bit Encoding

Each question-answer pair is encoded as a dense semantic vector using SBERT (sentence-transformers).
Embeddings are binarized: each float dimension > 0 becomes 1, otherwise 0.
The result is a compact, efficient semantic bit vector per item.

Example:

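from sentence_transformers import SentenceTransformer
sbert_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: model name is illustrative
dense_vec = sbert_model.encode("example text", normalize_embeddings=True)
bit_vector = (dense_vec > 0).astype(int)  # 1 where the dimension is positive, else 0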
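
To make the thresholding rule concrete without loading a model, here is a dependency-free sketch that binarizes a made-up dense vector and packs the bits into bytes. The helper names `binarize` and `pack_bits` and the input values are hypothetical; AskBit itself works on NumPy arrays as shown above.

```python
def binarize(dense_vec):
    """Map each float to 1 if it is > 0, else 0."""
    return [1 if x > 0 else 0 for x in dense_vec]


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, 8 dimensions per byte (MSB first)."""
    packed = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


dense = [0.12, -0.40, 0.0, 0.73, -0.05, 0.31, 0.08, -0.22]  # made-up values
bits = binarize(dense)
print(bits)                    # [1, 0, 0, 1, 0, 1, 1, 0]
print(pack_bits(bits).hex())   # 96
```

With NumPy, `np.packbits` performs the same MSB-first packing on an array of 0/1 values. Note that a value of exactly 0.0 maps to 0 under this rule.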

3. Binary KNN Retrieval

Uses sklearn KNeighborsClassifier with Hamming distance for bitwise similarity.
Finds the closest entry (or the top-K entries) for any query, matching on semantics rather than keywords.
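
The retrieval idea can be sketched as a brute-force search: compute the Hamming distance from the query to every stored bit vector and keep the k smallest. This is a dependency-free stand-in for illustration, not the sklearn-based implementation; the function names and the toy 8-bit vectors are made up.

```python
import heapq


def hamming(a, b):
    """Number of positions where two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def top_k(query_bits, index, k=3):
    """Return the k (distance, answer) pairs closest to query_bits.

    `index` is a list of (bit_vector, answer) tuples.
    """
    scored = ((hamming(query_bits, bits), answer) for bits, answer in index)
    return heapq.nsmallest(k, scored, key=lambda pair: pair[0])


# Toy 8-bit vectors; real embeddings have hundreds of dimensions.
index = [
    ((1, 0, 0, 1, 0, 1, 1, 0), "Click 'Forgot Password' on the login page."),
    ((0, 1, 1, 0, 1, 0, 0, 1), "Refunds are processed within 5 business days."),
    ((1, 0, 0, 1, 0, 1, 0, 0), "Use the reset password link or contact support."),
]
query = (1, 0, 0, 1, 0, 1, 1, 1)
for distance, answer in top_k(query, index, k=2):
    print(distance, answer)
```

This counts differing bits directly. sklearn's `hamming` metric reports the fraction of differing positions instead, which yields the same ranking for vectors of equal length.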

4. Retrieval-Augmented Generation (RAG) with Ollama Llama 3

The best-matched FAQ entries (top-K context) are assembled as a prompt.
This context and the user's query are sent to Llama 3 served locally by Ollama.
Llama 3 generates a fluent, context-aware answer, which is streamed live back to the user in the terminal.

Example RAG prompt

Context:
Q1: How do I reset my password?
A1: Click 'Forgot Password' on the login page.

Q2: I can't log in, what should I do?
A2: Use the reset password link or contact support.

User question: How do I access my account if I forgot my password?
Answer (only with the answer, using the above context):
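
A prompt in this layout could be assembled with a small helper like the one below. The name `build_prompt` is hypothetical; only the resulting text format follows the example above.

```python
def build_prompt(matches, question):
    """Assemble retrieved (question, answer) pairs and the user query into one prompt."""
    blocks = [
        f"Q{i}: {q}\nA{i}: {a}"
        for i, (q, a) in enumerate(matches, start=1)
    ]
    context = "\n\n".join(blocks)
    return (
        f"Context:\n{context}\n\n"
        f"User question: {question}\n"
        "Answer (only with the answer, using the above context):"
    )


matches = [
    ("How do I reset my password?", "Click 'Forgot Password' on the login page."),
    ("I can't log in, what should I do?", "Use the reset password link or contact support."),
]
print(build_prompt(matches, "How do I access my account if I forgot my password?"))
```

Numbering the pairs (Q1/A1, Q2/A2) keeps each retrieved answer tied to its question when several entries are included.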

💡 How AskBit Works

Training:
- Prepare FAQ Q&A pairs.
- Encode each as binary semantic vectors.
- Train KNN (Hamming) on these bits.
Retrieval:
- Encode user query as bits.
- Retrieve best FAQ matches with bitwise KNN.
RAG/Generation (ask):
- Build a prompt of top-K FAQ Q&A.
- Pass as context to Llama 3 via Ollama.
- Stream the LLM's answer back in real time.

🧠 Why Bit Vectors + RAG?

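- Semantic search: Paraphrased and fuzzy queries work out of the box.
- Lightning-fast: Bit operations (XOR plus popcount) can be up to about 50× faster than float vector math, depending on hardware and vector length.
- Tiny memory footprint: 1M FAQs ≈ 48 MB of bit vectors when packed at 1 bit per dimension (48 bytes, or 384 bits, per entry).
- Grounded RAG: The LLM answers from your actual company knowledge, which reduces hallucination.
- Private and offline: No API calls; everything runs on your own machine.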
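
The 48 MB figure can be checked with quick arithmetic. The sketch below assumes 384-dimensional embeddings, a value inferred from 48 MB / 1M entries = 48 bytes = 384 bits rather than stated in this README, and compares packed bits against float32 storage.

```python
N_FAQS = 1_000_000
DIMS = 384  # assumption: inferred from 48 MB / 1M entries = 48 bytes = 384 bits

bit_bytes = N_FAQS * DIMS // 8     # packed: 1 bit per dimension
float_bytes = N_FAQS * DIMS * 4    # float32: 4 bytes per dimension

print(f"bit vectors:     {bit_bytes / 1e6:.0f} MB")    # 48 MB
print(f"float32 vectors: {float_bytes / 1e6:.0f} MB")  # 1536 MB
print(f"ratio: {float_bytes // bit_bytes}x")           # 32x
```

The saving only materializes when bits are actually packed (for example with `np.packbits`); an unpacked integer array with one element per dimension takes at least as much memory as the original floats.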

🌐 Ollama Integration

Ollama makes it trivial to run Llama 3 and other powerful models locally with streaming output.
You must have Ollama and the llama3 model installed:
- brew install ollama (macOS, via Homebrew)
- ollama run llama3 (to test that the model runs)
AskBit streams answers directly from Ollama, so responses appear in the terminal as they are generated.
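
For reference, streaming from a local Ollama server can be done over its HTTP API, which returns newline-delimited JSON chunks. The sketch below uses only the standard library. It assumes Ollama's default endpoint on port 11434, and the helper names are hypothetical; AskBit may talk to Ollama differently (for example via a client library).

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def iter_tokens(lines):
    """Yield text fragments from Ollama's newline-delimited JSON stream."""
    for raw in lines:
        if not raw.strip():
            continue
        chunk = json.loads(raw)
        if chunk.get("response"):
            yield chunk["response"]
        if chunk.get("done"):
            break


def stream_answer(prompt, model="llama3"):
    """Send the prompt to a local Ollama server and print the answer as it arrives."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # Iterating an HTTP response yields one line (one JSON chunk) at a time.
        for token in iter_tokens(response):
            print(token, end="", flush=True)
    print()
```

Calling `stream_answer("...")` requires the Ollama server to be running with the `llama3` model pulled. Keeping the chunk parsing in `iter_tokens` separate from the network call makes it testable without a server.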

⚙️ Development Notes

Dependencies in requirements.txt.
Environment managed via uv.
Run with:

make dev

Extendable: swap out the LLM, tweak how context is constructed, or add more models.

📝 Summary

AskBit is a blazing-fast, local RAG assistant that combines semantic bit vector search with offline generative power. It supports robust FAQ retrieval and real conversational Q&A (generation), and it is hackable, explainable, and private: designed for your own data.
