Best Open Source Data Analysis Libraries

A curated list of the most popular GitHub repositories tagged with data analysis. Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.

#1 apache/superset

Apache Superset is a Data Visualization and Data Exploration Platform

70,618 · TypeScript

#2 scikit-learn/scikit-learn

scikit-learn: machine learning in Python

65,186 · Python

#3 pandas-dev/pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

47,933 · Python

#4 sansan0/TrendRadar

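⭐ AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts. 🎯 Say goodbye to information overload with an AI public-opinion monitoring assistant and trending-topic filter. It aggregates trending topics from multiple platforms plus RSS subscriptions and supports precise keyword filtering. AI translation and AI analysis briefings are pushed straight to your phone, and it can also connect to an MCP architecture, enabling AI natural-language conversational analysis, sentiment insight, trend prediction, and more. Supports Docker, with data self-hosted locally or in the cloud. Integrates smart push notifications through WeChat, Feishu, DingTalk, Telegram, email, ntfy, bark, slack, and other channels.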

46,718 · Python

#5 metabase/metabase

The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊

46,049 · Clojure

#6 streamlit/streamlit

Streamlit — A faster way to build and share data apps.

43,570 · Python

#7 gradio-app/gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

41,779 · Python

#8 666ghj/BettaFish

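微舆 (Weiyu): a multi-agent public-opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Built from scratch, with no dependency on any framework.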

35,654 · Python

#9 gchq/CyberChef

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

34,087 · JavaScript

#10 microsoft/Data-Science-For-Beginners

10 Weeks, 20 Lessons, Data Science for All!

33,972 · Jupyter Notebook

#11 AMAI-GmbH/AI-Expert-Roadmap

Roadmap to becoming an Artificial Intelligence Expert in 2022

30,751 · JavaScript

#12 dataease/dataease

🔥 An open-source BI tool that anyone can use and a powerful tool for data visualization. An open-source BI tool alternative to Tableau.

23,425 · Java

#13 lukasmasuch/best-of-ml-python

🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

23,240

#14 sinaptik-ai/pandas-ai

Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

23,211 · Python

#15 airbytehq/airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

20,743 · Python

#16 allinurl/goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through your browser.

20,242 · C

#17 ydataai/ydata-profiling

One-line-of-code data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.

13,387 · Python

#18 tangyudi/Ai-Learn

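An artificial intelligence learning roadmap that collects nearly 200 hands-on case studies and projects, with free accompanying course materials. It starts from zero background and is geared toward job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch, tensorflow, machine-learning, deep-learning, data-analysis, data-mining, mathematics, data-science, artificial-intelligence, python, tensorflow, tensorflow2, caffe, keras, pytorch, algorithm, numpy, pandas, matplotlib, seaborn, nlp, and cv.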

12,642

#19 yzhao062/pyod

A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques

9,722 · Python

#20 rapidsai/cudf

cuDF - GPU DataFrame Library

9,495 · C++

#21 K-Dense-AI/claude-scientific-skills

A set of ready-to-use scientific skills for Claude

9,023 · Python

#22 flyteorg/flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

6,747 · Go

#23 rhiever/Data-Analysis-and-Machine-Learning-Projects

Repository of teaching materials, code, and data for my data analysis and machine learning projects.

6,628 · Jupyter Notebook