Best Open Source datalake Libraries
A curated list of the most popular GitHub repositories tagged with datalake. Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.
#1 sinaptik-ai/pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
23,211 · Python
#2 trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
12,579 · Java
#3 activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
9,008 · C++