kubernetes-sigs / kueue
Kubernetes-native Job Queueing
Repository Overview (README excerpt)
Kueue is a set of APIs and a controller for job queueing. It is a job-level manager that decides when a job should be admitted to start (as in pods can be created) and when it should stop (as in active pods should be deleted).
Read the overview and watch the Kueue-related talks & presentations to learn more.
Features overview
• **Job management:** Supports job queueing based on priorities, with a choice of queueing strategies.
• **Advanced Resource management:** Comprises resource flavor fungibility, Fair Sharing, cohorts, and preemption with a variety of policies between different tenants.
• **Integrations:** Built-in support for popular jobs, e.g. BatchJob, Kubeflow training jobs, RayJob, RayCluster, JobSet, plain Pod and Pod Groups.
• **System insight:** Built-in Prometheus metrics to help monitor the state of the system, and an on-demand visibility endpoint for monitoring pending workloads.
• **AdmissionChecks:** A mechanism for internal or external components to influence whether a workload can be admitted.
• **Advanced autoscaling support:** Integration with cluster-autoscaler's ProvisioningRequest via AdmissionChecks.
• **All-or-nothing with ready Pods:** A timeout-based implementation of all-or-nothing scheduling.
• **Partial admission and dynamic reclaim:** Mechanisms to run a job with reduced parallelism, based on available quota, and to release quota as the job's pods complete.
• **Mixing training and inference:** Simultaneous management of batch workloads along with serving workloads (such as Deployments or StatefulSets).
• **Multi-cluster job dispatching:** Called MultiKueue, it searches for capacity across clusters and offloads work from the main cluster.
• **Topology-Aware Scheduling:** Optimizes Pod-to-Pod communication throughput by making scheduling aware of the data-center topology.
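To make the ideas of priority-based queueing and partial admission concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not Kueue's actual algorithm or API: the `Job` class, the `admit` function, the single CPU quota, and all field names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int         # higher value is considered first
    parallelism: int      # pods the job would like to run
    min_parallelism: int  # smallest pod count the job accepts
    cpu_per_pod: int      # CPU units requested by each pod


def admit(jobs, cpu_quota):
    """Return {job name: admitted pod count}; 0 means the job stays queued."""
    remaining = cpu_quota
    admitted = {}
    # Visit jobs by descending priority; sorted() is stable, so jobs with
    # equal priority keep their submission order.
    for job in sorted(jobs, key=lambda j: -j.priority):
        affordable = remaining // job.cpu_per_pod
        # Partial admission: run with fewer pods if the quota is short.
        pods = min(job.parallelism, affordable)
        if pods < job.min_parallelism:
            pods = 0  # not enough quota even at reduced parallelism
        admitted[job.name] = pods
        remaining -= pods * job.cpu_per_pod
    return admitted


jobs = [
    Job("train-a", priority=10, parallelism=4, min_parallelism=4, cpu_per_pod=2),
    Job("train-b", priority=5, parallelism=6, min_parallelism=2, cpu_per_pod=1),
    Job("train-c", priority=1, parallelism=3, min_parallelism=3, cpu_per_pod=1),
]
result = admit(jobs, cpu_quota=12)
print(result)  # {'train-a': 4, 'train-b': 4, 'train-c': 0}
```

In this run `train-a` takes 8 of the 12 CPU units, `train-b` is admitted with 4 pods instead of the 6 it asked for, and `train-c` stays queued because nothing is left. A real controller also has to handle flavors, cohorts, borrowing, and preemption, none of which this sketch models.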
Production Readiness status
• ✔️ API version: v1beta2, respecting Kubernetes Deprecation Policy
• ✔️ Up-to-date documentation.
• ✔️ Test Coverage:
    • ✔️ Unit Test testgrid.
    • ✔️ Integration Test testgrid
    • ✔️ Integration MultiKueue Tests testgrid
    • ✔️ E2E Tests for Kubernetes 1.33, 1.34, 1.35, on Kind.
    • ✔️ E2E TAS Test testgrid
    • ✔️ E2E Custom Configs Test testgrid
    • ✔️ E2E Cert Manager Test testgrid
    • ✔️ Performance Test testgrid
• ✔️ Scalability verification via performance tests.
• ✔️ Monitoring via metrics.
• ✔️ Security: RBAC-based accessibility.
• ✔️ Stable release cycle (2-3 months).
• ✔️ Adopters running in production.
_Based on community feedback, we continue to simplify and evolve the API to address new use cases_.
Installation
**Requires Kubernetes 1.29 or newer**.
To install the latest release of Kueue in your cluster, run the following command:
The controller runs in the namespace.
Read the installation guide to learn more.
Usage
A minimal configuration can be set by running the examples:
Then you can run a job with:
Learn more about:
• Kueue concepts.
• Common and advanced tasks.
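As an illustration of what submitting a job to a queue can look like, the sketch below builds a `batch/v1` Job manifest in Python and prints it as JSON. It is not taken from the Kueue examples: the helper `queued_job`, the job name `sample-job`, the queue name `user-queue`, and the `busybox` image are all hypothetical placeholders. It assumes a LocalQueue with that name already exists in the target namespace, and that the Job is linked to it through the `kueue.x-k8s.io/queue-name` label.

```python
import json


def queued_job(name, queue_name, parallelism, cpu):
    """Build a batch/v1 Job manifest that points at a Kueue LocalQueue."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": name,
            # Label that names the LocalQueue the Job is submitted to.
            "labels": {"kueue.x-k8s.io/queue-name": queue_name},
        },
        "spec": {
            "parallelism": parallelism,
            "completions": parallelism,
            # Created suspended: no pods exist until the Job is admitted.
            "suspend": True,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "main",
                        "image": "busybox",  # placeholder image
                        "command": ["sleep", "30"],
                        "resources": {"requests": {"cpu": cpu}},
                    }],
                }
            },
        },
    }


# "sample-job" and "user-queue" are hypothetical names for illustration.
manifest = queued_job("sample-job", "user-queue", parallelism=3, cpu="1")
print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -`. The resource requests matter here, since they are what quota is counted against when the job is considered for admission.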
Roadmap
High-level overview of the main priorities for 2026:
• Improve user experience for MultiKueue - multi-cluster Job dispatching, in particular:
    • Support Elastic RayJob #8712
    • Workload-Level Admission Constraints and Preference-Aware MultiKueue Dispatching #8729
    • Prevent starting preemptions in multiple worker clusters #8303
    • Support long running services #8526
    • Log retrieval from worker clusters #3526
• Improve user experience for Topology Aware Scheduling, in particular:
    • Support for ResourceTransformations #8860
    • Support for Elastic Workloads #8160
    • Evict workloads which are running on nodes which become tainted #8838
• Integration with the k8s native Workload-Aware Scheduler (WAS) and Topology-Aware Scheduling #8871
• Support for Concurrent Workload Admission #8691
• Support for running hero workloads #8826
• Consider preemption cost when finding preemption candidates #7990
• Progress towards Beta for the integration with Dynamic Resource Allocation (DRA) #8243
Long-term aspirational goals:
• Partial preemption of serving workloads #3762
• Integration with workflow frameworks #74
• Budget support #28
• Flavor assignment strategies, e.g. _minimizing cost_ vs _minimizing borrowing_ #312
• Cooperative preemption support for workloads that implement checkpointing #477
• Delayed preemption for two-stage admission #3758
• Support Structured Parameters (DRA) in Kueue #2941
• Graduate the API to v1 #3476
Community, discussion, contribution, and support
Learn how to engage with the Kubernetes community on the community page and the contributor's guide.
You can reach the maintainers of this project at:
• Slack
• Mailing List
Graphic assets
• Kueue
• KueueViz
Code of conduct
Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.