Best Open Source Language Model Libraries

A curated list of the most popular GitHub repositories tagged "language model".

#1 microsoft/generative-ai-for-beginners

21 Lessons, Get Started Building with Generative AI

106,699 · Jupyter Notebook

#2 rasbt/LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

85,633 · Jupyter Notebook

#3 dair-ai/Prompt-Engineering-Guide

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

70,629 · MDX

#4 xtekky/gpt4free

The official gpt4free repository | various collection of powerful language models | opus 4.6 gpt 5.3 kimi 2.5 deepseek v3.2 gemini 3

65,721 · Python

#5 LAION-AI/Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

37,449 · Python

#6 tatsu-lab/stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

30,266 · Python

#7 mlc-ai/mlc-llm

Universal LLM Deployment Engine with ML Compilation

22,059 · Python

#8 yamadashy/repomix

📦 Repomix is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or other AI tools like Claude, ChatGPT, DeepSeek, Perplexity, Gemini, Gemma, Llama, Grok, and more.

21,983 · TypeScript

#9 vercel/ai

The AI Toolkit for TypeScript. From the creators of Next.js, the AI SDK is a free open-source library for building AI-powered applications and agents

21,928 · TypeScript

#10 arc53/DocsGPT

Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.

17,716 · Python

#11 BlinkDL/RWKV-LM

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.

14,357 · Python

#12 mlfoundations/open_clip

An open source implementation of CLIP.

13,397 · Python

#13 microsoft/LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

13,264 · Python

#14 neuml/txtai

💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

12,192 · Python

#15 brightmart/nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

9,854

#16 BlinkDL/ChatRWKV

ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

9,512 · Python

#17 OpenNMT/OpenNMT-py

Open Source Neural Machine Translation and (Large) Language Models in PyTorch

6,990 · Python

#18 zai-org/CogVLM

A state-of-the-art-level open visual language model | multimodal pre-trained model

6,724 · Python

#19 codertimo/BERT-pytorch

Google AI 2018 BERT pytorch implementation

6,518 · Python