
LeDat98 / NexusRAG

Hybrid RAG system combining vector search, knowledge graph (LightRAG), and cross-encoder reranking — with Docling document parsing, visual intelligence (image/table captioning), agentic streaming chat, and inline citations. Powered by Gemini or local Ollama models.

164 stars
40 forks
4 issues
Python · TypeScript · CSS

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing LeDat98/NexusRAG in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.
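The on-demand loading described above can be sketched as a simple tool the agent calls to pull whole source files into its context. This is a minimal illustration of the idea, not RepoMind's actual implementation; the function names and file contents are hypothetical:

```python
# Hypothetical in-memory repo snapshot; a real system would read from a checkout.
REPO_FILES = {
    "rag/retriever.py": "def retrieve(query): ...",
    "rag/reranker.py": "def rerank(query, chunks): ...",
}

def load_file(path: str) -> str:
    """Tool the agent invokes to bring a FULL source file into context on demand."""
    try:
        return REPO_FILES[path]
    except KeyError:
        return f"ERROR: {path} not found"

def build_context(requested_paths: list[str]) -> str:
    """Assemble whole files (not fragments) into one prompt context block."""
    sections = [f"### {path}\n{load_file(path)}" for path in requested_paths]
    return "\n\n".join(sections)

context = build_context(["rag/retriever.py"])
```

Because the agent receives complete files rather than retrieved fragments, cross-file questions (imports, call chains) can be answered without the chunk-boundary blind spots of embedding-based retrieval.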

Source files are only loaded when you start an analysis to optimize performance.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/LeDat98/NexusRAG)

Repository Overview (README excerpt)

Crawler view

# NexusRAG — Hybrid Knowledge Base with Agentic Chat, Citations & Knowledge Graph

**Upload documents. Ask questions. Get cited answers.**

NexusRAG combines vector search, knowledge graph, and cross-encoder reranking into one seamless RAG pipeline — powered by Gemini, local Ollama, or fully offline sentence-transformers.

Features · Quick Start · Model Recommendations · Tech Stack

---

## Architecture Showcase

---

## Beyond Traditional RAG

Most RAG systems follow a simple pipeline: split text → embed → retrieve → generate. NexusRAG goes further at every stage:

| Aspect | Traditional RAG | NexusRAG |
|---|---|---|
| **Document Parsing** | Plain text extraction, structure lost | Docling: preserves headings, page boundaries, formulas, layout |
| **Images & Tables** | Ignored entirely | Extracted, captioned by vision LLM, embedded as searchable vectors |
| **Chunking** | Fixed-size splits, breaks mid-sentence | Hybrid semantic + structural chunking (respects headings, tables) |
| **Embeddings** | Single model for everything | Dual-model: BAAI/bge-m3 (1024d, search) + KG embedding (Gemini 3072d / Ollama / sentence-transformers) |
| **Retrieval** | Vector similarity only | 3-way parallel: vector over-fetch + KG entity lookup + cross-encoder rerank |
| **Knowledge** | No entity awareness | LightRAG graph: entity extraction, relationship mapping, multi-hop traversal |
| **Context** | Raw chunks dumped to LLM | Structured assembly: KG insights → cited chunks → related images/tables |
| **Citations** | None or manual | Auto-generated 4-char IDs with page number and heading path |
| **Page awareness** | Lost after chunking | Preserved end-to-end: chunk → citation → document viewer navigation |

---

## Features

### Deep Document Parsing (Docling)

NexusRAG uses Docling for structural document understanding — not just text extraction:

- **Structural preservation** — heading hierarchy, page boundaries, paragraph grouping
- **Formula enrichment** — LaTeX math notation preserved during conversion
- **Multi-format** — PDF, DOCX, PPTX, HTML, TXT with consistent output
- **Hybrid chunking** — respects semantic AND structural boundaries; never splits mid-heading or mid-table
- **Page-aware metadata** — every chunk carries its page number, heading path, and references to images/tables on the same page

### Hybrid Retrieval Pipeline

| Stage | Technology | Details |
|---|---|---|
| **Vector Embedding** | BAAI/bge-m3 | 1024-dim multilingual bi-encoder (100+ languages) |
| **KG Embedding** | Gemini / Ollama / sentence-transformers | Configurable: Gemini (3072d), Ollama, or local sentence-transformers (e.g. bge-m3, 1024d) |
| **Vector Search** | ChromaDB | Cosine similarity, over-fetch top-20 candidates |
| **Knowledge Graph** | LightRAG | Entity/relationship extraction, keyword-to-entity matching |
| **Reranking** | BAAI/bge-reranker-v2-m3 | Cross-encoder joint scoring — encodes (query, chunk) pairs together |
| **Generation** | Gemini / Ollama | Agentic streaming chat with function calling |

**Why two embedding models?** Vector search needs speed (local bge-m3, 1024-dim). Knowledge graph extraction needs semantic richness for entity recognition — choose Gemini Embedding (3072-dim, cloud), Ollama, or sentence-transformers (fully local, no API needed). Each model is optimized for its role.

**Retrieval flow:**

- **Parallel retrieval** — vector over-fetch (top-20) and KG entity lookup run simultaneously
- **Cross-encoder reranking** — all 20 candidates are scored jointly with the query through a transformer (far more precise than cosine similarity alone)
- **Filtering** — keep the top 8 above the relevance threshold (0.15), with a fallback to the top 3 if all fall below it
- **Media discovery** — find images and tables on the same pages as retrieved chunks

### Visual Document Intelligence

Images and tables are **embedded into chunk vectors** — not stored separately. When Docling extracts an image on page 5, its LLM-generated caption is appended to the text chunks on that page before embedding.
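The filtering step of the retrieval flow can be expressed compactly. This is a sketch, not the project's code: `scored` stands in for the cross-encoder's (chunk, score) output, and the threshold and cutoffs mirror the values quoted above (0.15, top-8, fallback top-3):

```python
def filter_candidates(scored, threshold=0.15, top_k=8, fallback_k=3):
    """scored: list of (chunk_id, cross_encoder_score) pairs, in any order."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    above = [pair for pair in ranked if pair[1] >= threshold]
    if above:
        return above[:top_k]      # keep at most 8 chunks above the threshold
    return ranked[:fallback_k]    # fallback: best 3 even if all score below it

# 20 over-fetched candidates with mock reranker scores
scored = [(f"chunk{i}", s) for i, s in enumerate(
    [0.9, 0.4, 0.3, 0.25, 0.2, 0.18, 0.16, 0.155, 0.14, 0.1] + [0.05] * 10)]
kept = filter_candidates(scored)   # the 8 chunks scoring >= 0.15
```

The fallback guarantees the generator always receives some context, even for queries where no chunk clears the relevance bar.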
This means searching for "revenue chart" finds chunks that contain the chart description, without needing a separate image search index.

**Image Pipeline**

- Docling extracts images from PDF/DOCX/PPTX (up to 50 per document, 2x resolution)
- Vision LLM (Gemini Vision or Ollama multimodal) generates captions: specific numbers, labels, trends
- Captions are appended to page chunks
- The chunk is embedded → **the image becomes vector-searchable** through its description
- During retrieval, images on matched pages are surfaced as references

**Table Pipeline**

- Docling exports tables as structured Markdown (preserving rows, columns, dimensions)
- Text LLM summarizes each table: purpose, key columns, notable values (max 500 chars)
- Summaries are appended to page chunks
- Table summaries are injected back into the document Markdown as blockquotes for the document viewer

### Citation System

Every answer is grounded in source documents with **4-character citation IDs**:

- **Inline citations** — clickable badges embedded directly in the answer text
- **Source cards** — each citation shows filename, page number, heading path, and relevance score
- **Cross-navigation** — click a citation to jump to the exact section in the document viewer
- **Image references** — visual content is cited separately, with page tracking
- **Strict grounding** — the LLM is instructed to cite only sources that directly support claims, max 3 per sentence

### Knowledge Graph Visualization

Interactive force-directed graph built from extracted entities and relationships:

- **Entity types** — Person, Organization, Product, Location, Event, Technology, Financial Metric, Date, Regulation (configurable)
- **Force simulation** — repulsion + spring forces + center gravity with real-time physics
- **Pan & zoom** — mouse drag, scroll wheel (0.3x–3x), keyboard reset
- **Node interaction** — click to select, hover to highlight connected edges, drag to reposition
- **Entity scaling** — node radius proportional to co…
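One common way to mint short, stable citation IDs like the 4-character ones described above is hashing the chunk's identity. This is an illustrative scheme only; the README does not specify how NexusRAG actually derives its IDs, and the function and field names here are hypothetical:

```python
import hashlib

def citation_id(doc: str, page: int, chunk_index: int) -> str:
    """Derive a stable 4-character ID from chunk identity (illustrative scheme)."""
    key = f"{doc}:{page}:{chunk_index}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()[:4]

cid = citation_id("report.pdf", 5, 2)
```

A hash-derived ID stays identical across re-indexing runs as long as the document, page, and chunk position are unchanged, which keeps citations in previously generated answers valid.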