
xr843 / fojin

Buddhist Digital Text Platform — 9,200+ texts, 500+ sources, 8 UI languages, AI Q&A (RAG), knowledge graph, full-text search

143 stars
26 forks
10 issues

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing xr843/fojin in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

To optimize performance, source files are loaded only when you start an analysis.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/xr843/fojin)

Repository Overview (README excerpt)


FoJin 佛津

The World's Encyclopedic Buddhist Digital Text Platform

**500+ sources. 30 languages. 30 countries. One search.**

Aggregating the world's Buddhist digital heritage: 9,200+ texts in Pali, Classical Chinese, Tibetan, and Sanskrit from 504 data sources, with full-text reading, AI-powered Q&A, knowledge graph, collections, citations, annotations, bookmarks, and multi-language parallel reading.

Live Demo · Chinese Docs · Discord · Report Bug

Why FoJin?

Buddhist texts are scattered across hundreds of databases worldwide: CBETA, SuttaCentral, BDRC, SAT, 84000, GRETIL, and many more. Each has different interfaces, languages, and data formats. Researchers spend more time *finding* texts than *reading* them.

**FoJin solves this.** It aggregates 504 sources into a single, searchable platform with features no other tool provides:

| What you need | How FoJin helps |
|---|---|
| Find a sutra across databases | **Multi-dimensional search** across 9,200+ texts from 504 sources |
| Read the full text online | **7,600+ texts** with full content available for online reading |
| Compare translations | **Parallel reading** in 30 languages side by side |
| Look up Buddhist terms | **6 dictionaries**, 237K entries (Chinese/Sanskrit/Pali/English) |
| Explore relationships | **Knowledge graph** with 9,600+ entities and 3,800+ relations |
| View original manuscripts | **IIIF manuscript viewer** connected to BDRC and more |
| Ask questions about texts | **AI Q&A** ("XiaoJin") grounded in 11M characters of canonical text |
| Save and organize | **Collections, bookmarks, annotations** for personal study |
| Cite in research | **Citation export** (BibTeX, RIS, APA) for academic use |

Quick Start

Then visit: **http://localhost:3000**

> API docs at http://localhost:8000/docs

Features

Multi-Dimensional Search

Search across Buddhist canons by title, translator, catalog number, or full-text keyword. Powered by Elasticsearch with ICU tokenizer for multi-language support.
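As a rough illustration of the search setup described above, the sketch below builds Elasticsearch index settings around the `icu_tokenizer` (provided by the `analysis-icu` plugin) and a `multi_match` query over the four search dimensions. The analyzer name `multilang`, the field names, and the boosts are assumptions made for this example; the README does not show FoJin's actual mapping.

```python
# Hypothetical field names and boosts; FoJin's real index mapping is not shown in the README.
SEARCH_FIELDS = ["title^3", "translator^2", "catalog_number^2", "content"]


def build_index_settings() -> dict:
    """Index body with a custom analyzer based on the ICU tokenizer.

    Assumes the Elasticsearch analysis-icu plugin is installed, which
    provides both `icu_tokenizer` and the `icu_folding` token filter.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "multilang": {  # assumed analyzer name
                        "type": "custom",
                        "tokenizer": "icu_tokenizer",
                        "filter": ["icu_folding"],
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                "title": {"type": "text", "analyzer": "multilang"},
                "translator": {"type": "text", "analyzer": "multilang"},
                "catalog_number": {"type": "keyword"},
                "content": {"type": "text", "analyzer": "multilang"},
            }
        },
    }


def build_search_query(keyword: str, size: int = 20) -> dict:
    """Search body matching one keyword against title, translator,
    catalog number, and full text in a single request."""
    return {
        "size": size,
        "query": {"multi_match": {"query": keyword, "fields": SEARCH_FIELDS}},
    }
```

Both functions return plain dictionaries, so they can be passed to whichever Elasticsearch client the backend uses without tying the sketch to a specific client version.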
Full-Text Reading

Read 7,600+ Buddhist texts with full content online. Navigate by volume, scroll through content, and jump between related texts.

Parallel Reading (30 Languages)

Compare translations side by side: Classical Chinese, Sanskrit, Pali, Tibetan, English, Japanese, Korean, Gandhari, and 21 more languages.

Dictionary Lookup

6 authoritative dictionaries with 237,593 entries:

• **DDB** (Digital Dictionary of Buddhism)
• **SuttaCentral Glossary** (Pali)
• **NCPED** (New Concise Pali-English Dictionary)
• **NTI** (Nan Tien Institute Buddhist Dictionary)
• **Edgerton BHS** (Buddhist Hybrid Sanskrit Dictionary)
• **Monier-Williams** (Sanskrit-English Dictionary)

Knowledge Graph

9,600+ entities (persons, monasteries, texts, schools) and 3,800+ relationships, visualized as an interactive force-directed graph. Click any node to explore connections.

AI Q&A: "XiaoJin"

Ask questions in natural language. XiaoJin answers based on canonical Buddhist texts (38 core sutras, ~11M characters) using RAG (Retrieval-Augmented Generation). Every answer includes citations to the source text.

Collections, Bookmarks & Annotations

Save texts to personal collections, bookmark specific passages, and add annotations for study and research.

Citation Export

Export citations in BibTeX, RIS, and APA formats for academic papers and reference managers.

Manuscript Viewer

Browse digitized manuscripts and rare editions from BDRC and other institutions via IIIF protocol.

Multi-Language UI

Available in 8 languages: Chinese, English, Japanese, Korean, Thai, Vietnamese, Sinhala, and Burmese.
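The retrieval step behind the AI Q&A feature described above can be sketched roughly as follows: rank stored passages by cosine similarity to the question's embedding, then build a prompt that numbers each passage so the answer can cite it. This is a minimal in-memory sketch, not FoJin's implementation. The `Passage` type, the toy two-dimensional embeddings, and the prompt wording are all assumptions. In a pgvector-backed setup the ranking would typically happen in SQL using the cosine distance operator `<=>`.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Passage:
    source: str             # citation label (hypothetical placeholder values below)
    text: str
    embedding: List[float]  # produced by some embedding model; toy vectors here


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding: List[float], passages: List[Passage], k: int = 2) -> List[Passage]:
    """Return the k passages most similar to the query embedding."""
    ranked = sorted(
        passages,
        key=lambda p: cosine_similarity(query_embedding, p.embedding),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, retrieved: List[Passage]) -> str:
    """Number each retrieved passage so the model can cite it as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the passages below and cite them by number.\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    corpus = [
        Passage("source-A", "passage about emptiness", [1.0, 0.0]),
        Passage("source-B", "passage about monastic rules", [0.0, 1.0]),
        Passage("source-C", "another passage about emptiness", [0.9, 0.1]),
    ]
    top = retrieve([1.0, 0.0], corpus, k=2)
    print(build_prompt("What is emptiness?", top))
```

Keeping the source label next to each passage in the prompt is what allows every answer to carry citations back to the source text.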
Data Sources

FoJin aggregates data from major Buddhist digital projects worldwide:

| Source | Content | Languages |
|--------|---------|-----------|
| CBETA | Chinese Buddhist Canon | Classical Chinese |
| SuttaCentral | Early Buddhist Texts | Pali, Chinese, English |
| 84000 | Tibetan Buddhist Canon | Tibetan, English, Sanskrit |
| BDRC | Tibetan manuscripts (IIIF) | Tibetan |
| SAT | Taisho Tripitaka | Chinese, Japanese |
| GRETIL | Sanskrit e-texts | Sanskrit |
| DSBC | Digital Sanskrit Buddhist Canon | Sanskrit |
| Gandhari.org | Gandhari manuscripts | Gandhari |
| VRI Tipitaka | Pali Canon (Chattha Sangayana) | Pali |
| Korean Tripitaka | Goryeo Tripitaka | Chinese, Korean |
| + 494 more... | | |

Tech Stack

| Layer | Technology |
|-------|-----------|
| Frontend | React 18, TypeScript, Vite, Ant Design 5, Zustand, TanStack Query |
| Backend | FastAPI, SQLAlchemy (async), Pydantic v2 |
| Database | PostgreSQL 15 + pgvector + pg_trgm |
| Search | Elasticsearch 8 (ICU tokenizer) |
| Cache | Redis 7 |
| AI | RAG (pgvector semantic search) + multi-provider LLM |
| Deploy | Docker Compose, Nginx (gzip_static, security headers) |
| CI | GitHub Actions |

Architecture

Development

Security

• Non-root containers (backend: , frontend: )
• Multi-stage Docker builds (no build tools in production)
• Internal services bound to only
• Memory/CPU limits per container
• CSP, X-Frame-Options, X-Content-Type-Options headers
• Query length limits on all search parameters
• JWT with 8h expiry, production requires strong secret

Contributing

Contributions are welcome! Whether it's adding a new data source, improving search, fixing bugs, or translating the UI, we'd love your help.

• Fork the repository
• Create your feature branch ( )
• Commit your changes ( )
• Push to the branch ( )
• Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.
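To make the JWT item in the Security list above more concrete, here is a small standard-library sketch of an HS256 token with an 8-hour expiry and a startup check that rejects weak secrets in production. It is illustrative only: the README does not say how FoJin issues or verifies tokens, the 32-character minimum is an assumed threshold, and all function names are hypothetical. A real deployment would normally use a maintained JWT library instead of hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

TOKEN_TTL_SECONDS = 8 * 60 * 60  # 8h expiry, as stated in the Security list
MIN_SECRET_LENGTH = 32           # assumed threshold for a "strong" secret


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def check_secret(secret: str, production: bool) -> None:
    """Refuse to start in production with a short secret."""
    if production and len(secret) < MIN_SECRET_LENGTH:
        raise ValueError("production requires a strong JWT secret")


def issue_token(user_id: str, secret: str, now: Optional[float] = None) -> str:
    """Create a signed token that expires TOKEN_TTL_SECONDS after issue."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": issued_at, "exp": issued_at + TOKEN_TTL_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Check the signature first, then the expiry; return the payload."""
    signing_input, _, signature = token.rpartition(".")
    expected = _sign(signing_input, secret)
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        raise ValueError("invalid signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    current = time.time() if now is None else now
    if current >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

The optional `now` parameter exists only so the expiry boundary can be exercised without waiting eight hours.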
Roadmap

• [x] ~~Citation export (BibTeX, RIS, APA)~~
• [x] ~~Mobile-responsive reader~~
• [x] ~~Public REST API with rate limiting~~
• [x] ~~User annotations~~
• [x] ~~Community-contributed data sources~~
• [x] ~~Internationalization (i18n): 8 UI languages (Chinese, English, Japanese, Korean, Thai, Vietnamese, Sinhala, Burmese)~~
• [ ] OCR pipeline for scanned texts
• [ ] Embedding-based semantic search across all texts
• [ ] Collaborative annotation sharing
• [ ]…