
Best Open Source Data Libraries

A curated list of the most popular GitHub repositories tagged with data. Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.

#1 Asabeneh/30-Days-Of-Python

The 30 Days of Python programming challenge is a step-by-step guide to learning the Python programming language in 30 days. The challenge may take more than 100 days, so go at your own pace. These videos may help too: https://www.youtube.com/channel/UC7PNRuno1rzYPb1xLa4yktw

58,405 Python

#2 TanStack/query

🤖 Powerful asynchronous state management, server-state utilities and data fetching for the web. TS/JS, React Query, Solid Query, Svelte Query and Vue Query.

48,562 TypeScript

#3 run-llama/llama_index

LlamaIndex is the leading document agent and OCR platform

47,100 Python

#4 metabase/metabase

The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊

46,049 Clojure

#5 DataExpert-io/data-engineer-handbook

This is a repo with links to everything you'd ever want to learn about data engineering

40,236 Jupyter Notebook

#6 SheetJS/sheetjs

📗 SheetJS Spreadsheet Data Toolkit -- New home https://git.sheetjs.com/SheetJS/sheetjs

36,192

#7 vercel/swr

React Hooks for Data Fetching

32,312 TypeScript

#8 sinaptik-ai/pandas-ai

Chat with your database or your data lake (SQL, CSV, Parquet). PandasAI makes data analysis conversational using LLMs and RAG.

23,211 Python

#9 PrefectHQ/prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

21,650 Python

#10 airbytehq/airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

20,743 Python

#11 faker-js/faker

Generate massive amounts of fake data in the browser and Node.js

14,896 TypeScript

#12 oxnr/awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

14,239

#13 bchavez/Bogus

📇 A simple fake data generator for C#, F#, and VB.NET. Based on and ported from the famed faker.js.

9,610 C#

#14 D4Vinci/Scrapling

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

9,101 Python

#15 rawgraphs/rawgraphs-app

A web interface to create custom vector-based visualizations on top of RAWGraphs core

8,934 JavaScript

#16 mage-ai/mage-ai

🧙 Build, run, and manage data pipelines for integrating and transforming data.

8,651 Python

#17 flyteorg/flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

6,747 Go