
apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.

2,915 stars
762 forks
833 issues
Java, Python, JavaScript

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing apache/gravitino in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

To optimize performance, source files are loaded only when you start an analysis.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/apache/gravitino)

Repository Overview (README excerpt)


Apache Gravitino™

Introduction

Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages metadata directly in different sources, types, and regions, providing users with unified metadata access for data and AI assets.

🚀 Key Features

• **Unified Metadata Management**: Manage diverse metadata sources through a single model and API (e.g., Hive, MySQL, HDFS, S3).
• **End-to-End Data Governance**: Features like access control, auditing, and discovery across all metadata assets.
• **Direct Metadata Integration**: Changes in underlying systems are immediately reflected via Gravitino’s connectors.
• **Geo-Distribution Support**: Share metadata across regions and clouds to support global architectures.
• **Multi-Engine Compatibility**: Seamlessly integrates with query engines without modifying SQL dialects.
• **AI Asset Management (WIP)**: Support for AI model and feature tracking.

🌐 Common Use Cases

• Federated metadata discovery across data lakes and data warehouses
• Multi-region metadata synchronization for hybrid or multi-cloud setups
• Data and AI asset governance with unified audit and access control
• Plug-and-play access for engines like Trino or Spark
• Support for evolving metadata standards, including AI model lineage

📚 Documentation

The latest Gravitino documentation is available at gravitino.apache.org/docs/latest. This README provides a basic overview; visit the site for full installation, configuration, and development documentation.

🧪 Quick Start

Use Gravitino Playground (Recommended)

Gravitino provides a Docker Compose–based playground for a full-stack experience. Clone or download the Gravitino Playground repository and follow its README.

Run Gravitino Locally

• Download and extract a binary release.
• Edit to configure settings.
• Start the server:
• To stop: Press to stop.
• (Optional) Use the new UI
• To switch to the new UI at runtime: edit (or set the environment variable before starting) and set to :
• Alternatively, you can remove the line from (the template defaults to ); removing that line will revert the service to the legacy UI behavior.

🧊 Iceberg REST Catalog

Gravitino provides a native Iceberg REST catalog service. See: Iceberg REST catalog service

🗄️ Lance REST Catalog

Gravitino provides a native Lance REST catalog service. See: Lance REST catalog service

🔌 Trino Integration

Gravitino includes a Trino connector for federated metadata access. See: Using Trino with Gravitino

🛠️ Building from Source

Gravitino uses Gradle. Windows is not currently supported.

Clean build without tests:
Build a distribution:
Or compressed package:

Artifacts are output to the directory. More build options: How to build Gravitino

👨‍💻 Developer Resources

• How to build Gravitino
• How to test Gravitino
• Publish Docker images

🤝 Contributing

We welcome all kinds of contributions: code, documentation, testing, connectors, and more! To get started, please read our CONTRIBUTING.md guide.

🔗 ASF Resources

• 📬 Mailing List: dev@gravitino.apache.org (subscribe)
• 🐞 Issue Tracker: GitHub Issues

🪪 License

Apache Gravitino is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

Apache®, Apache Gravitino™, Apache Hadoop®, Apache Hive™, Apache Iceberg™, Apache Kafka®, Apache Spark™, Apache Submarine™, Apache Thrift™, and Apache Zeppelin™ are trademarks of the Apache Software Foundation in the United States and/or other countries.