2026 · Coursework + extension · 1 min read
Document QA Assistant
Retrieval-augmented question answering over a document corpus: an embedding store, a retriever, a generator, and an evaluation harness. The foundation for the security-flavoured RAG project.
- Python
- Transformers
- Vector search
- RAG
What it does
Answers natural-language questions against a static document corpus by retrieving the most relevant passages and grounding the model's answer in those passages.
Pipeline
- Ingest: chunk documents into ~500-token windows with overlap.
- Embed: encode with a sentence-transformer model.
- Retrieve: cosine top-k against the query embedding.
- Generate: prompt the LLM with question + retrieved passages, ask for an answer with explicit source citations.
- Evaluate: grade answers by hand against a held-out question set, and compute automatic metrics (faithfulness, answer relevance).
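The retrieval half of the pipeline above fits in a few lines. This is a toy sketch, not the project's implementation: the hashed bag-of-words `embed` stands in for the sentence-transformer encoder, and word counts stand in for tokens.

```python
import hashlib
import math


def chunk(text, size=500, overlap=50):
    """Split text into overlapping windows (word count stands in for tokens)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text, dim=64):
    """Toy hashed bag-of-words embedding -- a stand-in for a sentence-transformer."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a, b):
    # Vectors are unit-normalised, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def top_k(query, chunks, k=3):
    """Rank chunks by cosine similarity to the query embedding."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

Swapping in a real encoder only changes `embed`; the chunking and top-k logic is the same shape regardless of the model.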
Why I'm extending it
The Phase-2 plan turns this into a public-facing Security Copilot RAG — the same architecture, but the corpus becomes CVE feeds and a target codebase, and the eval harness becomes the centrepiece. Most RAG demos online skip evaluation; that's exactly where the interesting engineering lives.
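The automatic metrics in the harness don't need to be exotic to be useful. As an illustration of the kind of check involved, here is a crude faithfulness proxy -- the fraction of answer sentences whose content words are mostly covered by some retrieved passage. This is a hypothetical sketch, not the project's metric; production faithfulness scoring typically uses an LLM or NLI model as the judge rather than token overlap.

```python
import re


def sentences(text):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def supported(sentence, passages, threshold=0.5):
    """A sentence counts as supported if enough of its content words
    (here: words longer than 3 characters) appear in some passage."""
    words = {w for w in re.findall(r"[a-z']+", sentence.lower()) if len(w) > 3}
    if not words:
        return True
    return any(
        len(words & set(re.findall(r"[a-z']+", p.lower()))) / len(words) >= threshold
        for p in passages
    )


def faithfulness(answer, passages):
    """Fraction of answer sentences grounded in the retrieved passages."""
    sents = sentences(answer)
    return sum(supported(s, passages) for s in sents) / len(sents)
```

Even a proxy this blunt catches the common failure mode: a fluent answer sentence that none of the retrieved passages actually support.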
Lessons so far
- Chunk boundaries matter more than chunk size. Overlap helps.
- Cosine similarity is fine until it isn't; switching to a small reranker on the top-50 made a measurable jump in answer quality.
- An evaluation harness you actually run is worth ten you talk about.
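The retrieve-then-rerank pattern from the second lesson is a small amount of code: keep the cheap cosine search for recall, then rescore a short candidate list with a stronger pairwise scorer. The sketch below leaves the scorer pluggable; `overlap_score` is a placeholder for a real cross-encoder model, not the one used in the project.

```python
def rerank(query, candidates, score_fn, final_k=5):
    """Rescore the cheap retriever's candidates (e.g. cosine top-50) with a
    stronger pairwise scorer and keep the best final_k."""
    return sorted(candidates,
                  key=lambda passage: score_fn(query, passage),
                  reverse=True)[:final_k]


def overlap_score(query, passage):
    """Placeholder pairwise scorer -- stands in for a cross-encoder that
    would jointly encode (query, passage) and output a relevance score."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / (len(q) or 1)
```

The design point is that the expensive scorer only ever sees the top-50, so the latency cost is bounded while the final ordering improves.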