apache / polaris

Apache Polaris, the interoperable, open source catalog for Apache Iceberg

1,879 stars
400 forks
319 issues
Java, Python, Shell

Repository Overview (README excerpt)

Apache Polaris

Apache Polaris™ is an open-source, fully-featured catalog for Apache Iceberg™. It implements Iceberg's REST API, enabling seamless multi-engine interoperability across a wide range of platforms, including Apache Doris™, Apache Flink®, Apache Spark™, Dremio® OSS, StarRocks, and Trino.

Documentation is available at https://polaris.apache.org. The REST OpenAPI specifications are available here: Polaris management API doc and Polaris Catalog API doc.

Subscribe to the dev mailing list (dev-subscribe@polaris.apache.org) to join discussions via email or browse the archives. Check out the CONTRIBUTING guide for contribution guidelines.

Polaris Overview

Click here for a quick overview of Polaris.

Quickstart

Click here for the quickstart experience, which will help you set up a Polaris instance locally or on any supported cloud provider.

Project Structure

Apache Polaris is organized into the following modules:

• Primary modules:
  • - The main Polaris entity definitions and core business logic
  • API modules - Build scripts for generating Java classes from the OpenAPI specifications:
    • - Polaris Management API model classes
    • - Polaris Management API service classes
    • - The Iceberg REST service classes
    • - The Polaris Catalog API service classes
  • Runtime modules:
    • - The Polaris Admin Tool; mainly for bootstrapping persistence
    • - The runtime configuration defaults
    • - The Polaris distribution
    • - The Polaris Quarkus Server
    • - The package containing the Polaris service.
    • - Integration tests for the Polaris Spark plugin
    • - Test utilities
  • Persistence modules:
    • - The JDBC implementation of BasePersistence to be used via AtomicMetaStoreManager
  • Extensions modules:
    • - The Hadoop federation extension
    • - The Hive federation extension
• Secondary modules:
  • - Generates the aggregated license report
  • - The Bill of Materials (BOM) for Polaris
  • - Establishes consistent build logic
  • - Normative integration tests for reuse in downstream projects
• Tool modules:
  • Documentation configuration:
    • - Annotations for documentation generator
    • - Generates Polaris reference docs
    • - The configuration documentation site
  • Other Tools:
    • - Helper for container specifications
    • - Predefined Immutables configuration & annotations for Polaris
    • - Minio test container
    • - RustFS test container
    • - Miscellaneous types for Polaris
    • - Versioning for Polaris

In addition to modules, there are:

• API specifications - The OpenAPI specifications
• Python client - The Python client
• codestyle - The code style guidelines
• getting-started - A collection of getting started examples
• gradle - The Gradle wrapper and Gradle configuration files including banned dependencies
• helm - The Helm charts for Polaris.
• Spark Plugin - The Polaris Spark plugin
• regtests - Regression tests
• server-templates - OpenAPI Generator templates to generate the server code
• site - The Polaris website

Outside of this repository, there are several other tools that can be found in a separate Polaris-Tools repository.

Building and Running

Apache Polaris is built using Gradle with Java 21+ and Docker 27+.

• - To build and run tests. Make sure Docker is running, as the integration tests depend on it.
• - To skip tests.
• - To run all checks, including unit tests and integration tests.
• - To run the Polaris server locally; the server is reachable at localhost:8181. This is also suitable for running regression tests, or for connecting with Spark.
  Set your own credentials by specifying system property where:
  • is the realm
  • is the CLIENT_ID
  • is the CLIENT_SECRET
  • If credentials are not set, the server uses preset credentials
• - To connect from Spark SQL. Here are some example commands to run in the Spark SQL shell:
• - To run regression tests locally, see more options here.

Makefile Convenience Commands

To streamline the developer experience, especially for common setup and build tasks, a root-level Makefile is available. This Makefile acts as a convenient wrapper around various Gradle commands and other tooling, simplifying interactions. While Gradle remains the primary build system, the Makefile provides concise shortcuts for frequent operations like:

• Building Polaris components: e.g.,
• Managing development clusters: e.g.,
• Automating Helm tasks: e.g.,
• Handling dependencies: e.g.,
• Managing client operations: e.g.,

To see available commands:

For example, to build the Polaris server and its container image, you can simply run:

More build and run options

Running in Docker

• To build the image locally:
• - To run the image.

The Polaris codebase contains some docker compose examples to quickly get started with Polaris, using different configurations. Check the directory for more information.

Running in Kubernetes

• See README in for more information.

Configuring Polaris

Polaris servers can be configured in a variety of ways. Please see the Configuration Guide for more information.

Default configuration values can be found in .

Building docs

• Docs are generated with Hugo using the Docsy theme.
• To view the site locally, run
• See README in for more information.

Publishing Build Scans to develocity.apache.org

CI builds from a branch or tag in the repository on GitHub publish build scans to the ASF Develocity instance at develocity.apache.org, if the workflow runs have access to the Apache organization-level secret .
Local developer builds publish build scans only if the Gradle command line option is used. Those build scans are published to Gradle's public Develocity instance (see advanced configuration options below). Note that build scans on Gradle's public Develocity instance are publicly accessible to anyone. You have to accept Gradle's terms of service to publish to Gradle's public Develocity instance. CI builds originating from pull requests against the GitHub repository are published to Gradle's _…
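
Returning to the local server described under Building and Running: the README states that it is reachable at localhost:8181. As a quick sanity check before pointing Spark or the regression tests at it, the sketch below tests whether anything is accepting TCP connections on that address. The helper name `is_port_open` is hypothetical and not part of Polaris. It only confirms that a listener exists, not that the listener is a healthy Polaris instance.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; it is not part of Polaris.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # localhost:8181 is the address the README gives for a locally run server.
    state = "accepting" if is_port_open("localhost", 8181) else "not accepting"
    print(f"localhost:8181 is {state} TCP connections")
```

A successful check says nothing about credentials or catalog state, so treat it as a first step only.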