Best Open Source Natural Language Processing Libraries

A curated list of the most popular GitHub repositories tagged with natural language processing.

#1 huggingface/transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.

156,780 stars · Python

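#2 d2l-ai/d2l-zh

Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.

75,722 stars · Python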

#3 GokuMohandas/Made-With-ML

Learn how to design, develop, deploy and iterate on production-grade ML applications.

46,391 stars · Jupyter Notebook

#4 google-research/bert

TensorFlow code and pre-trained models for BERT

39,871 stars · Python

#5 hankcs/HanLP

Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification

36,138 stars · Python

#6 explosion/spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

33,228 stars · Python

#7 eugeneyan/applied-ml

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

28,694 stars

#8 d2l-ai/d2l-en

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

28,194 stars · Python

#9 srbhr/Resume-Matcher

Improve your resumes with Resume Matcher. Get insights, keyword suggestions and tune your resumes to job descriptions.

26,002 stars · TypeScript

#10 sebastianruder/NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

22,979 stars · Python

#11 huggingface/datasets

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

21,200 stars · Python

#12 RasaHQ/rasa

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

21,057 stars · Python

#13 bee-san/Ciphey

⚡ Automatically decrypt encryptions without knowing the key or cipher, decode encodings, and crack hashes ⚡

21,039 stars · Python

#14 QwenLM/Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

20,428 stars · Python

#15 ShusenTang/Dive-into-DL-PyTorch

This project converts the MXNet implementations in the original book Dive into Deep Learning (《动手学深度学习》) to PyTorch implementations.

19,316 stars · Jupyter Notebook

#16 keon/awesome-nlp

📖 A curated list of resources dedicated to Natural Language Processing (NLP)

18,207 stars

#17 arc53/DocsGPT

Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.

17,716 stars · Python

#18 graykode/nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers

14,855 stars · Jupyter Notebook

#19 nltk/nltk

NLTK Source

14,519 stars · Python

#20 flairNLP/flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

14,359 stars · Python

#21 languagetool-org/languagetool

Style and Grammar Checker for 25+ Languages

14,098 stars · Java

#22 Unstructured-IO/unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.

14,014 stars · HTML

#23 kmario23/deep-learning-drizzle

Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!

12,796 stars · HTML

#24 openvinotoolkit/openvino

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

9,732 stars · C++

#25 jadore801120/attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".

9,629 stars · Python

#26 sloria/TextBlob

Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

9,515 stars · Python

#27 clips/pattern

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

8,854 stars · Python

#28 lazyprogrammer/machine_learning_examples

A collection of machine learning examples and tutorials.

8,823 stars · Python

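#29 amusi/Deep-Learning-Interview-Book

Deep Learning Interview Handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and other areas)

8,704 stars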

#30 PaddlePaddle/models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

6,948 stars · Python

#31 pliang279/awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

6,818 stars

#32 MycroftAI/mycroft-core

Mycroft Core, the Mycroft Artificial Intelligence platform.

6,618 stars · Python

#33 axa-group/nlp.js

An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more

6,553 stars · JavaScript