mayooear / ai-pdf-chatbot-langchain
AI PDF chatbot agent built with LangChain & LangGraph
Repository Overview (README excerpt)
## AI PDF Chatbot & Agent Powered by LangChain and LangGraph

This monorepo is a customizable template example of an AI chatbot agent that "ingests" PDF documents, stores embeddings in a vector database (Supabase), and answers user queries using OpenAI (or another LLM provider), utilising LangChain and LangGraph as orchestration frameworks. This template is also an accompanying example to the book Learning LangChain (O'Reilly): Building AI and LLM applications with LangChain and LangGraph.

**Here's what the Chatbot UI looks like:**

### Table of Contents

- Features
- Architecture Overview
- Prerequisites
- Installation
- Environment Variables
  - Frontend Variables
  - Backend Variables
- Local Development
  - Running the Backend
  - Running the Frontend
- Usage
  - Uploading/Ingesting PDFs
  - Asking Questions
  - Viewing Chat History
- Production Build & Deployment
- Customizing the Agent
- Troubleshooting
- Next Steps

---

### Features

- **Document Ingestion Graph**: Upload and parse PDFs into document objects, then store vector embeddings in a vector database (we use Supabase in this example).
- **Retrieval Graph**: Handle user questions, decide whether to retrieve documents or give a direct answer, then generate concise responses with references to the retrieved documents.
- **Streaming Responses**: Real-time streaming of partial responses from the server to the client UI.
- **LangGraph Integration**: Built using LangGraph's state-machine approach to orchestrate ingestion and retrieval, visualise your agentic workflow, and debug each step of the graph.
- **Next.js Frontend**: Allows file uploads, real-time chat, and easy extension with React components and Tailwind.

---

### Architecture Overview

- **Supabase** is used as the vector store to store and retrieve relevant documents at query time.
- **OpenAI** (or another LLM provider) is used for language modeling.
- **LangGraph** orchestrates the "graph" steps for ingestion, routing, and generating responses.
- **Next.js** (React) powers the user interface for uploading PDFs and real-time chat.

The system consists of:

- **Backend**: A Node.js/TypeScript service that contains LangGraph agent "graphs" for:
  - **Ingestion** ( ) - handles indexing/ingesting documents
  - **Retrieval** ( ) - question-answering over the ingested documents
  - **Configuration** ( ) - handles configuration for the backend API, including model providers and vector stores
- **Frontend**: A Next.js/React app that provides a web UI for users to upload PDFs and chat with the AI.

---

### Prerequisites

- **Node.js v18+** (we recommend Node v20).
- **Yarn** (or npm, but this monorepo is pre-configured with Yarn).
- **Supabase project** (if you plan to store embeddings in Supabase; see Setting up Supabase). You will need:
  - A table named and a function named for vector similarity search (see the LangChain documentation for guidance on setting up the tables).
- **OpenAI API Key** (or another LLM provider's key supported by LangChain).
- **LangChain API Key** (free and optional, but highly recommended for debugging and tracing your LangChain and LangGraph applications). Learn more here.

---

### Installation

1. **Clone** the repository.
2. Install dependencies (from the monorepo root): `yarn install`
3. Configure environment variables in both backend and frontend. See the .env.example files for details.

### Environment Variables

The project relies on environment variables to configure keys and endpoints. Each sub-project (backend and frontend) has its own .env.example. Copy these to .env and fill in your details.

#### Frontend Variables

Create a .env file in frontend.

#### Backend Variables

Create a .env file in backend.

**Explanation of Environment Variables:**

- : The URL where your LangGraph backend server is running. Defaults to for local development.
- : Your LangSmith API key. This is optional, but highly recommended for debugging and tracing your LangChain and LangGraph applications.
- : The ID of the LangGraph assistant for document ingestion.
  Default is .
- : The ID of the LangGraph assistant for question answering. Default is .
- : Enable tracing to debug your application on the LangSmith platform. Set to to enable.
- : The name of your LangSmith project.
- : Your OpenAI API key.
- : Your Supabase URL.
- : Your Supabase service role key.

---

### Local Development

This monorepo uses Turborepo to manage both backend and frontend projects. You can run them separately for development.

#### Running the Backend

1. Navigate to backend.
2. Install dependencies (already done if you ran `yarn install` at the root).
3. Start LangGraph in dev mode.

This will launch a local LangGraph server on port 2024 by default. It should redirect you to a UI for interacting with the LangGraph server. See the LangGraph Studio guide.

#### Running the Frontend

1. Navigate to frontend.
2. Start the Next.js development server.

This will start a local Next.js development server (by default on port 3000). Access the UI in your browser at http://localhost:3000.

### Usage

Once both services are running:

1. Use the LangGraph Studio UI to interact with the LangGraph server and ensure the workflow is working as expected.
2. Navigate to http://localhost:3000 to use the chatbot UI.
3. Upload a small PDF document via the file upload button at the bottom of the page. This triggers the ingestion graph to extract the text and store the embeddings in Supabase via the frontend route.
4. After ingestion is complete, ask questions in the chat input.
5. The chatbot triggers the retrieval graph via the route to retrieve the most relevant documents from the vector database and uses the relevant PDF context (if needed) to answer.

#### Uploading/Ingesting PDFs

Click the paperclip icon in the chat input area. Select one or more PDF files to upload (at most 5 files in total, each under 10MB; you can change these thresholds in the route). The backend processes the PDFs, extracts text, and stores embeddings in Supabase (or your chosen vector store).
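The upload limits just described (at most 5 files, each under 10MB) are easy to mirror in a pre-flight check before calling the upload route. A minimal sketch in TypeScript, assuming a plain `{ name, size }` file shape; `validateUpload` and both constants are hypothetical names for illustration, not identifiers from this repository:

```typescript
// Hypothetical pre-flight check mirroring the documented limits
// (max 5 files, each under 10MB). Not taken from the repo's actual route.
const MAX_FILES = 5;
const MAX_BYTES = 10 * 1024 * 1024; // 10MB

interface UploadFile {
  name: string;
  size: number; // bytes
}

// Returns an error message, or null when the selection is acceptable.
function validateUpload(files: UploadFile[]): string | null {
  if (files.length === 0) return "Select at least one PDF";
  if (files.length > MAX_FILES) return `Too many files (max ${MAX_FILES})`;
  const oversized = files.find((f) => f.size > MAX_BYTES);
  if (oversized) return `${oversized.name} exceeds the 10MB limit`;
  return null;
}
```

A check like this only improves feedback in the UI; the backend route remains the authoritative enforcement point for the thresholds.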
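Zooming out, the ingest-then-answer flow walked through above follows LangGraph's state-machine pattern: nodes transform a shared state, and a conditional edge decides whether to retrieve documents before answering. The sketch below is a deliberately simplified stand-in in plain TypeScript; the real backend uses @langchain/langgraph's StateGraph, and the node names (`retrieve`, `answer`) and keyword-based routing heuristic here are illustrative assumptions only:

```typescript
// Simplified stand-in for a LangGraph retrieval graph. Each node takes the
// shared state and returns an updated copy, as LangGraph nodes do.
interface GraphState {
  question: string;
  docs: string[];
  answer: string;
}

// "retrieve" would query the vector store (Supabase) in the real graph;
// here it just stubs in a placeholder chunk.
function retrieve(state: GraphState): GraphState {
  return { ...state, docs: ["<chunk fetched from the vector store>"] };
}

// "answer" generates the final response, grounded in docs when present.
function answer(state: GraphState): GraphState {
  const text =
    state.docs.length > 0
      ? `Answer grounded in ${state.docs.length} document chunk(s)`
      : "Direct answer without retrieval";
  return { ...state, answer: text };
}

// Conditional edge: a toy heuristic standing in for the LLM-based
// routing decision ("retrieve documents or give a direct answer").
function route(state: GraphState): "retrieve" | "answer" {
  return /pdf|document|report/i.test(state.question) ? "retrieve" : "answer";
}

function runGraph(question: string): GraphState {
  let state: GraphState = { question, docs: [], answer: "" };
  if (route(state) === "retrieve") {
    state = retrieve(state);
  }
  return answer(state);
}
```

The value of the real StateGraph version over a hand-rolled function like this is that each node and edge becomes inspectable and debuggable in LangGraph Studio, as the Local Development section describes.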
#### Asking Questions

- Type your question in the ch…