Best Open Source Distributed Systems Libraries
A curated list of the most popular GitHub repositories tagged with distributed systems. Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.
#1Snailclimb/JavaGuide
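A guide to Java interviews and general backend interviews, covering computer science fundamentals, databases, distributed systems, high concurrency, and system design. If you are preparing for a backend technical interview, JavaGuide is the first choice!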
#2doocs/advanced-java
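😮 Core Interview Questions & Answers For Experienced Java (Backend) Developers | A complete primer on advanced knowledge for Java engineers at internet companies, covering high concurrency, distributed systems, high availability, microservices, massive data processing, and related areas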
#3redis/redis
For developers building real-time data-driven applications, Redis is the preferred, fastest, and most feature-rich cache, data structure server, and document and vector query engine.
#4binhnguyennus/awesome-scalability
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
#5etcd-io/etcd
Distributed reliable key-value store for the most critical data of a distributed system
#6apache/dubbo
The Java implementation of Apache Dubbo, an RPC and microservice framework.
#7karanpratapsingh/system-design
Learn how to design systems at scale and prepare for system design interviews
#8spacedriveapp/spacedrive
Spacedrive is an open source cross-platform file explorer, powered by a virtual distributed filesystem written in Rust.
#9anoma/anoma
Reference implementation of Anoma
#10ashishps1/awesome-system-design-resources
Learn System Design concepts and prepare for interviews using free resources.
#11conductor-oss/conductor
Conductor is an event-driven agentic orchestration platform providing a durable and highly resilient execution engine for applications and AI agents.
#12seaweedfs/seaweedfs
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lakes, built to handle billions of files. The blob store offers O(1) disk seek and cloud tiering. The Filer supports Cloud Drive, xDC replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, and Erasure Coding. The Enterprise version is at seaweedfs.com.
#13dmlc/xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow.
#14nsqio/nsq
A realtime distributed messaging platform
#15micro/go-micro
A Go microservices framework
#16Vonng/ddia
Chinese translation of "Designing Data-Intensive Applications" (DDIA), first and second editions
#17systemdesign42/system-design-academy
If you want to become good at system design, join this newsletter.
#18nats-io/nats-server
High-performance server for NATS.io, the cloud and edge native messaging system.
#19temporalio/temporal
Temporal service
#20akka/akka-core
A platform to build and run apps that are elastic, agile, and resilient. SDK, libraries, and hosted environments.
#21juicedata/juicefs
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
#22Netflix/conductor
Conductor is a microservices orchestration engine.
#23apache/zookeeper
Apache ZooKeeper
#24trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
#25bigscience-workshop/petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
#26git-bug/git-bug
Distributed, offline-first bug tracker embedded in git
#27cadence-workflow/cadence
Cadence is a distributed, scalable, durable, and highly available orchestration engine for executing asynchronous, long-running business logic resiliently.
#28twitter/finagle
A fault-tolerant, protocol-agnostic RPC system
#29oldratlee/translations
🐼 Chinese translations for classic software development resources
#30robinhood/faust
Python Stream Processing
#31hatchet-dev/hatchet
🪓 Run Background Tasks at Scale
#32hazelcast/hazelcast
Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.