Best Open Source LLM Evaluation Libraries

A curated list of the most popular GitHub repositories tagged with LLM evaluation. Select any project to visualize its architecture and explore the codebase using RepoMind's AI engine.

#1 mlflow/mlflow

The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

24,349 Python

#2 langfuse/langfuse

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

22,137 TypeScript

#3 comet-ml/opik

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

17,798 Python

#4 confident-ai/deepeval

The LLM Evaluation Framework

13,745 Python
