Best Open Source RAG Libraries
A curated list of the most popular GitHub repositories tagged with rag. Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.
#1langgenius/dify
Production-ready platform for agentic workflow development.
#2langchain-ai/langchain
The agent engineering platform
#3open-webui/open-webui
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
#4Shubhamsaboo/awesome-llm-apps
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
#5infiniflow/ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
#6PaddlePaddle/PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
#7dair-ai/Prompt-Engineering-Guide
🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
#8Mintplex-Labs/anything-llm
The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.
#9pathwaycom/llm-app
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳 Docker-friendly. ⚡ Always in sync with SharePoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
#10FlowiseAI/Flowise
Build AI Agents, Visually
#11mem0ai/mem0
Universal memory layer for AI Agents
#12run-llama/llama_index
LlamaIndex is the leading document agent and OCR platform
#13jeecgboot/JeecgBoot
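JeecgBoot is an AI-driven low-code development platform offering two modes, "zero-code" and "code generation": zero-code mode builds a system from a single sentence, while code-generation mode automatically outputs front-end and back-end code plus table-creation SQL that runs as generated. The platform has a built-in AI chat assistant, large AI models, a knowledge base, AI workflow orchestration, MCP, and a plugin system. It is compatible with mainstream large models and supports generating flowcharts from a single sentence, designing forms, and chat-style business operations, eliminating 80% of the repetitive work in Java projects while staying efficient and flexible.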
#14milvus-io/milvus
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
#15QuivrHQ/quivr
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.
#16mindsdb/mindsdb
Query Engine for AI Analytics: Build self-reasoning agents across all your live data
#17chatchat-space/Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent app built with Langchain and language models such as ChatGLM, Qwen, and Llama.
#18thedotmack/claude-mem
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.
#19khoj-ai/khoj
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
#20ItzCrazyKns/Vane
Vane is an AI-powered answering engine.
#21patchy631/ai-engineering-hub
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
#22microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
#23HKUDS/LightRAG
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
#24datawhalechina/hello-agents
📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice
#25datawhalechina/happy-llm
📚 Building large models from scratch
#26labring/FastGPT
FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without the need for extensive setup or configuration.
#27simstudioai/sim
Build, deploy, and orchestrate AI agents. Sim is the central intelligence layer for your AI workforce.
#28chroma-core/chroma
Open-source search and retrieval database for AI applications.
#29langchain-ai/langgraph
Build resilient language agents as graphs.
#30NirDiamant/RAG_Techniques
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses.
#31Cinnamon/kotaemon
An open-source RAG-based tool for chatting with your documents.
#32humanlayer/12-factor-agents
What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
#33AccumulateMore/CV
✔ (Completed) Extremely comprehensive deep learning notes [Tudui: PyTorch] [Li Mu: Dive into Deep Learning] [Andrew Ng: Deep Learning] [Dafei: LLM Agents]
#34eosphoros-ai/DB-GPT
Open-source agentic AI data assistant for the next generation of AI + Data products.
#35onyx-dot-app/onyx
Open Source AI Platform - AI Chat with advanced features that works with every LLM
#36elizaOS/eliza
Autonomous agents for everyone
#37arc53/DocsGPT
Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.
#38promptfoo/promptfoo
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
#39topoteretes/cognee
Knowledge Engine for AI Agent Memory in 6 lines of code
#40langbot-app/LangBot
Production-grade, multi-platform development platform for building agentic IM bots. Provides agents, knowledge-base orchestration, and a plugin system. Bots for Discord / Slack / LINE / Telegram / WeChat (WeCom, WeCom intelligent bots, Official Accounts) / Feishu / DingTalk / QQ / Satori. Integrates with, e.g., ChatGPT (GPT), DeepSeek, Dify, n8n, Langflow, Coze, Claude, Gemini, MiniMax, Ollama, SiliconFlow, Moonshot, GLM, clawdbot / openclaw
#41volcengine/OpenViking
OpenViking is an open-source context database designed specifically for AI Agents (such as openclaw). OpenViking unifies the management of context (memory, resources, and skills) that Agents need through a file system paradigm, enabling hierarchical context delivery and self-evolution.
#42liyupi/ai-guide
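Programmer Yupi's complete AI resource collection plus a beginner-level Vibe Coding tutorial, sharing a large-model selection guide (DeepSeek / GPT / Gemini / Claude), the latest AI news, a prompt collection, an AI knowledge encyclopedia (RAG / MCP / A2A), AI programming tutorials, AI tool usage guides (Cursor / Claude Code / OpenClaw / TRAE / Lovable / Agent Skills), AI development framework tutorials (Spring AI / LangChain), and a guide to monetizing AI products, to help you quickly master AI technology and stay at the forefront. This project is the open-source documentation version and has been upgraded to the Yupi AI navigation website.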
#43sigoden/aichat
All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
#44alibaba/zvec
A lightweight, lightning-fast, in-process vector database
#45vespa-engine/vespa
AI + Data, online. https://vespa.ai
#46dataease/SQLBot
🔥 An intelligent data-query system based on large models and RAG; a conversational data analysis tool. Text-to-SQL Generation via LLMs using RAG.
#47ageerle/ruoyi-ai
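A one-stop AI application development framework for the enterprise market. It supports unified access to and management of large models from multiple vendors, offers a secure and controllable enterprise knowledge base with high-precision retrieval optimization, and provides visual workflow orchestration, autonomous decision-making agents, and multi-agent collaborative scheduling. It is compatible with mainstream Agent Skill protocols and supports WeChat ecosystem extensions, helping enterprises and developers quickly build secure, efficient, production-ready AI agent applications and industry solutions with no barrier to entry.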
#48FellouAI/eko
Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai
#49datawhalechina/all-in-rag
🔍 LLM Application Development in Practice, Part 1: a full-stack guide to RAG technology. Read online: https://datawhalechina.github.io/all-in-rag/
#50PacktPublishing/LLM-Engineers-Handbook
The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices