Best Open Source Data Analysis Libraries
A curated list of the most popular GitHub repositories tagged with data analysis.
#1 apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
#2 scikit-learn/scikit-learn
scikit-learn: machine learning in Python
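#3 sansan0/TrendRadar
⭐ AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts. 🎯 Say goodbye to information overload with an AI public-opinion monitoring assistant and trending-topic filter. Aggregates trending topics from multiple platforms plus RSS subscriptions, with precise keyword filtering. AI news filtering, AI translation, and AI analysis briefings are pushed straight to your phone, and it can connect through the MCP architecture to support natural-language conversational analysis, sentiment insight, and trend prediction. Supports Docker, with data self-hosted locally or in the cloud. Integrates smart push notifications over WeChat, Feishu, DingTalk, Telegram, email, ntfy, bark, Slack, and other channels.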
#4 pandas-dev/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
#5 metabase/metabase
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data
#6 streamlit/streamlit
Streamlit — A faster way to build and share data apps.
#7 gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
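#8 666ghj/BettaFish
微舆 (Weiyu): a multi-agent public opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Built from scratch, with no dependency on any framework.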
#9 gchq/CyberChef
The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis
#10 microsoft/Data-Science-For-Beginners
10 Weeks, 20 Lessons, Data Science for All!
#11 AMAI-GmbH/AI-Expert-Roadmap
Roadmap to becoming an Artificial Intelligence Expert in 2022
#12 akfamily/akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
#13 Kanaries/pygwalker
PyGWalker: Turn your dataframe into an interactive UI for visual analysis
#14 K-Dense-AI/claude-scientific-skills
A set of ready-to-use Agent Skills for research, science, engineering, analysis, finance, and writing.
#15 yzhao062/pyod
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
#16 rapidsai/cudf
cuDF - GPU DataFrame Library
#17 growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
#18 cloudquery/cloudquery
Data pipelines for cloud config and security data. Build cloud asset inventory, CSPM, FinOps, and vulnerability management solutions. Extract from AWS, Azure, GCP, and 70+ cloud and SaaS sources.
#19 Nyandwi/machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
#20 alandefreitas/matplotplusplus
Matplot++: A C++ Graphics Library for Data Visualization 📊🗾
#21 SpiderClub/weibospider
A distributed crawler for Weibo, built with Celery and Requests.
#22 sacridini/Awesome-Geospatial
Long list of geospatial tools and resources
#23 bruin-data/bruin
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
#24 Canner/wren-engine
The open context engine for AI agents, supporting 15+ data sources. Built on Rust and Apache DataFusion.
#25 glotzerlab/freud
Powerful, efficient particle trajectory analysis in scientific Python.
#26 Hack23/cia
Citizen Intelligence Agency. Open-source intelligence platform analyzing Swedish political activities using AI and data visualization. Tracks politicians, government institutions, and parliamentary data, offering detailed insights, performance metrics, and advanced analytics.