hud-evals / hud-python

OSS RL environment + evals toolkit

318 stars

54 forks

11 issues

Python, Shell


AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing hud-evals/hud-python in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.


Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/hud-evals/hud-python)


Repository Overview (README excerpt)


The HUD SDK is an open-source Python toolkit for building, evaluating, and training AI agents. Use a unified API for any model provider, wrap your code as MCP environments, run A/B evals at scale, and train with reinforcement learning.

To learn more, check out our Documentation and API Reference.

Install

Get your API key at hud.ai and set it:

> For CLI tools ( , , etc.):

Usage

Unified Model API

Use Claude, GPT, Gemini, or Grok through one OpenAI-compatible endpoint:

Every call is traced at hud.ai. → Docs

Environments

Turn your code into tools agents can call. Define how to evaluate them:

The agent runs between the yields. First yield sends the prompt, second yield scores the result. → Docs · Templates

A/B Evals

Test different models. Repeat runs to see the distribution:

**Variants** test configurations. **Groups** repeat for distribution. Results stream to hud.ai. → Docs

Deploy & Train

Push to GitHub, connect on hud.ai, run at scale:

Every run generates training data. Use it to fine-tune or run RL. → Docs

Links

• 📖 Documentation
• ⌨️ CLI Reference
• 🏆 Leaderboards
• 🌐 Environment Templates
• 🤖 Supported Models
• 💬 Discord

Enterprise

Building agents at scale? We work with teams on custom environments, benchmarks, and training.

📅 Book a call · 📧 founders@hud.ai

Contributing

We welcome contributions! See CONTRIBUTING.md.

Key areas: Agents · Tools · Environments

Citation

MIT License · LICENSE
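
Illustration: the two-yield pattern

The Environments description above says the agent runs between two yields: the first sends the prompt, the second scores the result. The excerpt does not include the SDK's own code, so the following is only a plain-Python sketch of that generator pattern. It does not use the HUD SDK, and the names `count_letter_scenario`, `run_scenario`, and `stub_agent` are hypothetical, chosen for illustration.

```python
from typing import Callable, Generator


def count_letter_scenario(word: str, letter: str) -> Generator[object, str, None]:
    """Hypothetical scenario: NOT the HUD SDK API, only the generator pattern."""
    # First yield: hand the prompt to the runner and pause here.
    # The runner resumes us by sending back the agent's answer.
    answer = yield f"How many times does '{letter}' appear in '{word}'? Reply with a number."

    # Second yield: score the answer (1.0 if correct, 0.0 otherwise).
    expected = word.count(letter)
    yield 1.0 if answer.strip() == str(expected) else 0.0


def run_scenario(scenario: Generator[object, str, None],
                 agent: Callable[[str], str]) -> float:
    """Hypothetical runner: drive the generator with an agent callable."""
    prompt = next(scenario)          # advance to the first yield, get the prompt
    answer = agent(str(prompt))      # the agent runs between the two yields
    reward = scenario.send(answer)   # resume the scenario, get the score
    scenario.close()                 # nothing further to run
    return float(reward)


def stub_agent(prompt: str) -> str:
    """Stand-in for a real model call; always answers '3'."""
    return "3"


reward = run_scenario(count_letter_scenario("strawberry", "r"), stub_agent)
print(reward)
```

Here the stub agent's answer matches the true count, so the scenario scores it 1.0. A real runner would replace `stub_agent` with a model call and could repeat the run to collect a distribution of scores.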