Best Open Source Data Analysis Libraries
A curated list of the most popular GitHub repositories tagged with data analysis. Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.
#1 apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
#2 scikit-learn/scikit-learn
scikit-learn: machine learning in Python
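#3 sansan0/TrendRadar
AI-driven public opinion and trend monitor with multi-platform aggregation, RSS, and smart alerts. Say goodbye to information overload: an AI assistant for public opinion monitoring and trending-topic filtering. It aggregates trending topics from multiple platforms plus RSS subscriptions and supports precise keyword filtering. AI-filtered news, AI translation, and AI-generated analysis briefings are pushed directly to your phone. It can also connect to the MCP architecture to enable natural-language conversational analysis, sentiment insight, and trend prediction. Supports Docker, with data self-hosted locally or in the cloud. Integrates smart push notifications through WeChat, Feishu, DingTalk, Telegram, email, ntfy, Bark, Slack, and other channels.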
#4 pandas-dev/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
#5 metabase/metabase
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data
#6 streamlit/streamlit
Streamlit — A faster way to build and share data apps.
#7 gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
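#8 666ghj/BettaFish
Weiyu (微舆): a multi-agent public opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.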
#9 gchq/CyberChef
The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis
#10 microsoft/Data-Science-For-Beginners
10 Weeks, 20 Lessons, Data Science for All!
#11 AMAI-GmbH/AI-Expert-Roadmap
Roadmap to becoming an Artificial Intelligence Expert in 2022
#12 akfamily/akshare
AKShare is an elegant and simple open-source financial data interface library for Python, built for human beings!
#13 Kanaries/pygwalker
PyGWalker: Turn your dataframe into an interactive UI for visual analysis
#14 K-Dense-AI/claude-scientific-skills
A set of ready-to-use Agent Skills for research, science, engineering, analysis, finance, and writing.
#15 yzhao062/pyod
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
#16 rapidsai/cudf
cuDF - GPU DataFrame Library
#17 lance-format/lance
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
#18 Nyandwi/machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
#19 alandefreitas/matplotplusplus
Matplot++: A C++ Graphics Library for Data Visualization 📊🗾
#20 SpiderClub/weibospider
A distributed crawler for Weibo, built with Celery and Requests.
#21 sacridini/Awesome-Geospatial
Long list of geospatial tools and resources
#22 rilldata/rill
The fastest business intelligence tool for humans and agents.
#23 probabl-ai/skore
Track your Data Science. Skore's open-source Python library accelerates ML model development with automated evaluation reports, smart methodological guidance, and comprehensive cross-validation analysis.
#24 ILoveBingLu/CipherTalk
Export your WeChat data locally and keep a backup of what you love.
#25 spectrochempy/spectrochempy
SpectroChemPy is a framework for processing, analyzing and modeling spectroscopic data for chemistry with Python