Best Open Source datasets Libraries
A curated list of the most popular GitHub repositories tagged with datasets. Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.
#1 awesomedata/awesome-public-datasets
A topic-centric list of HQ open datasets.
72,958
#2 HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
26,476 TypeScript
#3 huggingface/datasets
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
21,200 Python
#4 activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
9,008 C++