
AltimateAI / altimate-code

Open-source agentic data engineering harness for dbt, SQL, and cloud warehouses. 100+ tools, 10 warehouses, AI-powered.

256 stars
16 forks
80 issues

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing AltimateAI/altimate-code in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

Source files are only loaded when you start an analysis to optimize performance.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/AltimateAI/altimate-code)

Repository Overview (README excerpt)


**The open-source data engineering harness.**

The intelligence layer for data engineering AI: 100+ deterministic tools for SQL analysis, column-level lineage, dbt, FinOps, and warehouse connectivity across every major cloud platform. Run standalone in your terminal, embed underneath Claude Code or Codex, or integrate into CI pipelines and orchestration DAGs. Precision data tooling for any LLM.

---

Install

Then, in order:

**Step 1: Configure your LLM provider** (required before anything works):

Or set an environment variable directly:

**Step 2 (optional): Auto-detect your data stack** (read-only, safe for production connections):

[…] auto-detects dbt projects, warehouse connections (from […], Docker, environment variables), and installed tools (dbt, sqlfluff, airflow, dagster, and more). Skip this and start building; you can always run it later.

> **Headless / scripted usage:** […] auto-approves all permission prompts. Not recommended with live warehouse connections.

> **Zero additional setup.** One command install.

Why a specialized harness?

General AI coding agents can edit SQL files. They cannot *understand* your data stack. altimate gives any LLM a deterministic data engineering intelligence layer: no hallucinated SQL advice, no guessing at schema, no missed PII.

| Capability | General coding agents | altimate |
|---|---|---|
| SQL anti-pattern detection | None | 19 rules, confidence-scored |
| Column-level lineage | None | Automatic from SQL, any dialect |
| Schema-aware autocomplete | None | Live-indexed warehouse metadata |
| Cross-dialect SQL translation | None | Snowflake ↔ BigQuery ↔ Databricks ↔ Redshift |
| FinOps & cost analysis | None | Credits, expensive queries, right-sizing |
| PII detection | None | 30+ regex patterns, 15 categories |
| dbt integration | Basic file editing | Manifest parsing, test gen, model scaffolding, lineage |
| Data visualization | None | Auto-generated charts from SQL results |
| Observability | None | Local-first tracing of AI sessions and tool calls |

> **Benchmarked precision:** 100% F1 on SQL anti-pattern detection (1,077 queries, 19 rules, 0 false positives).
> 100% edge-match on column-level lineage (500 queries, 13 categories).
> See methodology →

**What the harness provides:**

• **SQL Intelligence Engine**: deterministic SQL parsing and analysis (not LLM pattern matching). 19 rules, 100% F1, 0 false positives. Built for data engineers who've been burned by hallucinated SQL advice.
• **Column-Level Lineage**: automatic extraction from SQL across dialects. 100% edge-match on 500 benchmark queries.
• **Live Warehouse Intelligence**: indexed schemas, query history, and cost data from your actual warehouse. Not guesses.
• **dbt Native**: manifest parsing, test generation, model scaffolding, medallion patterns, impact analysis.
• **FinOps**: credit consumption, expensive query detection, warehouse right-sizing, idle resource cleanup.
• **PII Detection**: 15 categories, 30+ regex patterns, enforced pre-execution.

**Works seamlessly with Claude Code and Codex.** Use […] or […] to set up integration in one step. altimate is the data engineering tool layer: use it standalone in your terminal, or mount it as the harness underneath whatever AI agent you already run. The two are complementary.
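
To make the "100% F1, 0 false positives" benchmark claim above concrete, here is a minimal sketch of how an F1 score is computed from detection counts. The counts are hypothetical and are not taken from the altimate benchmark; the point is that a perfect F1 requires both zero false positives and zero missed detections.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp: true positives, fp: false positives, fn: false negatives.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
perfect = f1_score(tp=50, fp=0, fn=0)       # no false alarms, nothing missed
with_misses = f1_score(tp=50, fp=0, fn=10)  # no false alarms, 10 missed

print(f"perfect={perfect:.3f} with_misses={with_misses:.3f}")
```

Zero false positives alone fixes precision at 1.0, but any missed anti-pattern lowers recall and therefore F1.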

altimate-code is a fork of OpenCode rebuilt for data teams. Model-agnostic: bring your own LLM or run locally with Ollama.

Quick demo

Key Features

All features are deterministic: they parse, trace, and measure. Not LLM pattern matching.

SQL Anti-Pattern Detection

19 rules with confidence scoring. Catches SELECT *, cartesian joins, non-sargable predicates, correlated subqueries, and more. **100% accuracy** on 1,077 benchmark queries.

Column-Level Lineage

Automatic lineage extraction from SQL. Trace any column back through joins, CTEs, and subqueries to its source. Works standalone or with dbt manifests for project-wide lineage. **100% edge match** on 500 benchmark queries.

FinOps & Cost Analysis

Credit analysis, expensive query detection, warehouse right-sizing, unused resource cleanup, and RBAC auditing.

Cross-Dialect Translation

Transpile SQL between Snowflake, BigQuery, Databricks, Redshift, PostgreSQL, MySQL, SQL Server, and DuckDB.

PII Detection & Safety

Automatic column scanning for PII across 15 categories with 30+ regex patterns. Safety checks and policy enforcement before query execution.

dbt Native

Manifest parsing, test generation, model scaffolding, incremental model detection, and lineage-aware refactoring. 12 purpose-built skills including medallion patterns, yaml config generation, and dbt docs.

Data Visualization

Interactive charts and dashboards from SQL results. The data-viz skill generates publication-ready visualizations with automatic chart type selection based on your data.

Local-First Recap

Built-in observability for AI interactions: recap tool calls, token usage, and session activity locally. No external services required. View recaps with […]. Features include loop detection, post-session summary, and shareable HTML exports.

AI Teammate Training

Teach your AI teammate project-specific patterns, naming conventions, and best practices. The training system learns from examples and applies rules automatically across sessions.
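
The PII Detection feature above is described as regex-based column scanning. As a rough illustration of that general technique, the sketch below matches column names against a few category patterns. The patterns, category names, and the `scan_columns` helper are all hypothetical; they are not altimate's actual rules or API.

```python
import re

# Hypothetical patterns for illustration; NOT altimate's actual rule set.
PII_PATTERNS = {
    "email": re.compile(r"(^|_)e?mail(_|$)", re.IGNORECASE),
    "phone": re.compile(r"(^|_)(phone|mobile)(_|$)", re.IGNORECASE),
    "ssn": re.compile(r"(^|_)(ssn|social_security)(_|$)", re.IGNORECASE),
}


def scan_columns(columns):
    """Map each column name that looks like PII to its matching categories."""
    findings = {}
    for col in columns:
        for category, pattern in PII_PATTERNS.items():
            if pattern.search(col):
                findings.setdefault(col, []).append(category)
    return findings


findings = scan_columns(["user_email", "order_id", "Phone_Number", "ssn"])
for column, categories in findings.items():
    print(column, categories)
```

A real scanner would likely inspect sampled values as well as names; this sketch only looks at column names to keep the idea visible.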

Agent Modes

Each mode has scoped permissions, tool access, and SQL write-access control.

| Mode | Role | Access |
|---|---|---|
| **Builder** | Create dbt models, SQL pipelines, and data transformations | Full read/write (SQL writes prompt for approval; […] / […] / […] hard-blocked) |
| **Analyst** | Explore data, run SELECT queries, FinOps analysis, and generate insights | Read-only enforced (SELECT only, no file writes) |
| **Plan** | Outline an approach before acting | Minimal (read files only, no SQL or bash) |

> **New to altimate?** Start with **Analyst mode**: it's read-only and safe to run against production connections.

Need specialized workflows (validation, migration, research)? Create custom agent…
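
The access column of the Agent Modes table can be read as a small decision function over the mode and the kind of SQL statement. The sketch below is one hypothetical way to model it; the `Mode`, `Decision`, and `check_sql` names are assumptions for illustration, not altimate's implementation, and the hard-blocked statements are left out because the excerpt does not name them.

```python
from enum import Enum


class Mode(Enum):
    BUILDER = "builder"
    ANALYST = "analyst"
    PLAN = "plan"


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"    # prompt the user for approval
    DENY = "deny"


def first_keyword(sql: str) -> str:
    """Naive statement classifier: the first whitespace-separated token, uppercased."""
    stripped = sql.strip()
    return stripped.split(None, 1)[0].upper() if stripped else ""


def check_sql(mode: Mode, sql: str) -> Decision:
    """Decide whether a SQL statement may run in the given mode."""
    is_select = first_keyword(sql) == "SELECT"
    if mode is Mode.PLAN:
        return Decision.DENY  # Plan: no SQL at all
    if mode is Mode.ANALYST:
        return Decision.ALLOW if is_select else Decision.DENY  # SELECT only
    # Builder: reads run directly, writes prompt for approval
    return Decision.ALLOW if is_select else Decision.ASK


print(check_sql(Mode.ANALYST, "select * from orders"))
print(check_sql(Mode.BUILDER, "INSERT INTO t VALUES (1)"))
```

Classifying by first keyword is deliberately simplistic (it would, for example, misclassify a query that begins with a CTE); a deterministic parser would be the more robust choice.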