Best Open Source Distributed Libraries
A curated list of the most popular GitHub repositories tagged "distributed". Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.
#1 tensorflow/tensorflow
An Open Source Machine Learning Framework for Everyone
#2 ClickHouse/ClickHouse
ClickHouse® is a real-time analytics database management system
#3 mudler/LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more. Features: Generate Text, MCP, Audio, Video, Images, Voice Cloning, Distributed, P2P and decentralized inference
#4 milvus-io/milvus
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
#5 ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
#6 nextcloud/server
☁️ Nextcloud server, a safe home for all your data
#7 surrealdb/surrealdb
A scalable, distributed, collaborative, document-graph database, for the realtime web
#8 xuxueli/xxl-job
A distributed task scheduling framework (XXL-JOB, a distributed task scheduling platform).
#9 ageron/handson-ml
⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.
#10 taosdata/TDengine
High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
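#11 dianping/cat
CAT is a foundational component for server-side projects. It provides clients in multiple languages, including Java, C/C++, Node.js, Python, and Go, and is deeply integrated into Meituan-Dianping's infrastructure middleware (MVC framework, RPC framework, database framework, cache framework, message queue, configuration system, and so on), giving each Meituan-Dianping business line rich performance metrics, health status, real-time alerting, and more.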
#12 teambit/bit
AI-powered development workspaces with reusable components, architectural clarity and zero overhead.
#13 lightgbm-org/LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
#14 Oneflow-Inc/oneflow
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
#15 orbitdb/orbitdb
Peer-to-Peer Databases for the Decentralized Web
#16 hazelcast/hazelcast
Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.
#17 GreptimeTeam/greptimedb
The open-source Observability 2.0 database. One engine for metrics, logs, and traces — replacing Prometheus, Loki & ES.
#18 Eventual-Inc/Daft
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
#19 microsoft/FluidFramework
Library for building distributed, real-time collaborative web applications
#20 redis/rueidis
A fast Golang Redis client that supports Client Side Caching, Auto Pipelining, RDMA, etc.
#21 optuna/optuna-examples
Examples for https://github.com/optuna/optuna
#22 Mesh-LLM/mesh-llm
Distributed AI/LLM for the people. Share compute privately or publicly to power your agents and chat.
#23 helmholtz-analytics/heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python