
Best Open Source Inference Libraries

A curated list of the most popular GitHub repositories tagged with inference. Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.

#1 vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

70,857 ★ · Python
Analyze Code

#2 ggml-org/whisper.cpp

Port of OpenAI's Whisper model in C/C++

46,889 ★ · C++
Analyze Code

#3 deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

41,643 ★ · Python
Analyze Code

#4 hpcaitech/ColossalAI

Making large AI models cheaper, faster, and more accessible

41,349 ★ · Python
Analyze Code

#5 google-ai-edge/mediapipe

Cross-platform, customizable ML solutions for live and streaming media.

33,837 ★ · C++
Analyze Code

#6 sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

23,633 ★ · Python
Analyze Code

#7 Tencent/ncnn

ncnn is a high-performance neural network inference framework optimized for mobile platforms

22,809 ★ · C++
Analyze Code

#8 SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

21,060 ★ · Python
Analyze Code

#9 gvergnaud/ts-pattern

🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.

14,784 ★ · TypeScript
Analyze Code

#10 NVIDIA/TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

12,703 ★ · C++
Analyze Code

#11 openvinotoolkit/openvino

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

9,732 ★ · C++
Analyze Code

#12 RunanywhereAI/runanywhere-sdks

Production-ready toolkit to run AI locally

9,151 ★ · C++
Analyze Code

#13 xorbitsai/inference

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

9,058 ★ · Python
Analyze Code

#14 oumi-ai/oumi

Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!

8,860 ★ · Python
Analyze Code

#15 dusty-nv/jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

8,733 ★ · C++
Analyze Code

#16 LMCache/LMCache

Supercharge Your LLM with the Fastest KV Cache Layer

6,910 ★ · Python
Analyze Code

#17 gcanti/io-ts

Runtime type system for IO decoding/encoding

6,822 ★ · TypeScript
Analyze Code