# Pavelevich / llm-checker
Advanced CLI tool that scans your hardware and tells you exactly which LLM or sLLM models you can run locally, with full Ollama integration.
## AI Architecture Analysis
This repository is indexed by RepoMind. By analyzing Pavelevich/llm-checker in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.
Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.
## Repository Overview (README excerpt)
## LLM Checker

**Intelligent Ollama Model Selector**

AI-powered CLI that analyzes your hardware and recommends optimal LLM models. Deterministic scoring across **200+ dynamic models** (35+ curated fallback) with hardware-calibrated memory estimation.

Start Here • Installation • Quick Start • Calibration Quick Start • Docs • Claude MCP • Commands • Scoring • Hardware • Discord

---

### Why LLM Checker?

Choosing the right LLM for your hardware is complex. With thousands of model variants, quantization levels, and hardware configurations, finding the optimal model requires understanding memory bandwidth, VRAM limits, and performance characteristics.

**LLM Checker solves this.** It analyzes your system, scores every compatible model across four dimensions (Quality, Speed, Fit, Context), and delivers actionable recommendations in seconds.

---

### Features

| | Feature | Description |
|:---:|---|---|
| **200+** | Dynamic Model Pool | Uses the full scraped Ollama catalog/variants when available (with curated fallback) |
| **4D** | Scoring Engine | Quality, Speed, Fit, Context — weighted by use case |
| **Multi-GPU** | Hardware Detection | Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, CPU, integrated/dedicated inventory visibility |
| **Calibrated** | Memory Estimation | Bytes-per-parameter formula validated against real Ollama sizes |
| **Zero** | Native Dependencies | Pure JavaScript — works on any Node.js 16+ system |
| **Optional** | SQLite Search | Install to unlock , , and |

---

### Documentation

Docs Hub • Usage Guide • Advanced Usage • Technical Reference • Changelog • Calibration Fixtures

---

### Comparison with Other Tooling (e.g. llmfit)

LLM Checker and llmfit solve related but different problems:

| Tool | Primary Focus | Typical Output |
|------|---------------|----------------|
| **LLM Checker** | Hardware-aware **model selection** for local inference | Ranked recommendations, compatibility scores, pull/run commands |
| **llmfit** | LLM workflow support and model-fit evaluation from another angle | Different optimization workflow and selection heuristics |

If your goal is *"What should I run on this exact machine right now?"*, use **LLM Checker** first. If your goal is broader experimentation across custom pipelines, the two tools can be complementary.

---

### Installation

**Termux (Android):**

**Requirements:**

- Node.js 16+ (any version: 16, 18, 20, 22, 24)
- Ollama installed for running models

**Optional:** For database search features ( , , ):

---

### Start Here (2 Minutes)

If you are new, use this exact flow:

If you already calibrated routing:

---

### Distribution

LLM Checker is published in all primary channels:

- npm (latest, recommended):
- GitHub Releases: Release history
- GitHub Packages (legacy mirror, may lag):

**Important: Use npm for Latest Builds.** If you need the newest release, install from npm ( ), not the scoped GitHub Packages mirror. If you installed and the version looks old:

### v3.3.0 Highlights

- Calibrated routing is now first-class in and :
  - support with default discovery path.
  - clear precedence: > > deterministic fallback.
  - routing provenance output (source, route, selected model).
- New calibration fixtures and end-to-end tests for:
  - →
- Hardened Jetson CUDA detection to avoid a false CPU-only fallback.
- Documentation reorganized under with clearer onboarding paths.

**Optional (Legacy): Install from GitHub Packages.** Use this only if you explicitly need GitHub Packages. It may not match npm latest.

---

### Quick Start

---

### Calibration Quick Start (10 Minutes)

This path produces both calibration artifacts and verifies calibrated routing in one pass.
1) Use the sample prompt suite

2) Generate calibration artifacts (dry-run)

   Artifacts created:
   - (calibration contract)
   - (routing policy for runtime commands)

3) Apply calibrated routing

Notes:

- has precedence over .
- If has no path, discovery uses .
- currently requires .
- shows the expected policy structure.

---

### Claude Code MCP

LLM Checker includes a built-in Model Context Protocol (MCP) server, allowing **Claude Code** and other MCP-compatible AI assistants to analyze your hardware and manage local models directly.

#### Setup (One Command)

Or generate the exact command directly from the CLI:

Or with npx (no global install needed):

Restart Claude Code and you're done.

#### Available MCP Tools

Once connected, Claude can use these tools:

**Core Analysis:**

| Tool | Description |
|------|-------------|
| | Detect your hardware (CPU, GPU, RAM, acceleration backend) |
| | Full compatibility analysis with all models ranked by score |
| | Top model picks by category (coding, reasoning, multimodal, etc.) |
| | Rank your already-downloaded Ollama models |
| | Search the Ollama model catalog with filters |
| | Advanced recommendations using the full scoring engine |
| | Build a capacity plan for local models with recommended context/parallel/memory settings |
| | Return ready-to-paste env vars from the recommended or fallback plan profile |
| | Validate a policy file against the v1 schema and return structured validation output |
| | Run policy compliance export ( / / / ) for or flows |
| | Generate calibration artifacts from a prompt suite with typed MCP inputs |

**Ollama Management:**

| Tool | Description |
|------|-------------|
| | List all downloaded models with params, quant, family, and size |
| | Download a model from the Ollama registry |
| | Run a prompt against a local model (with tok/s metrics) |
| | Delete a model to free disk space |

**Advanced (MCP-exclusive):**

| Tool | Description |
|------|-------------|
| | Generate optimal Ollama env vars for your hardware (NUM_GPU, PARALLEL, FLASH_ATTENTION, etc.) |
| | Benchmark a model with 3 standardized prompts — measures tok/s, load time, prompt eval |
| | Head-to-head comparison of two models on the same prompt with speed + response side-by-side |
| | Analyze installed models — find redundancies, cloud-only models, oversized models, and upgrade candid…
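The MCP tools above are invoked over the Model Context Protocol's JSON-RPC 2.0 transport. As a rough illustration of the message shape, here is a minimal sketch of a `tools/call` request; the tool name `detect_hardware` and its empty argument object are hypothetical placeholders, not identifiers confirmed by llm-checker's server.

```javascript
// Build an MCP JSON-RPC 2.0 `tools/call` request envelope.
// `name` and `args` are placeholders: substitute a real tool name
// exposed by the server's `tools/list` response.
function buildToolCall(id, name, args) {
  return JSON.stringify({
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  });
}

// On the stdio transport, a client writes this (newline-terminated)
// to the MCP server's stdin and reads the result from its stdout.
const request = buildToolCall(1, 'detect_hardware', {});
console.log(request);
```

In practice Claude Code constructs these messages itself; the sketch only shows why registering the server (one command, as described in Setup) is all the wiring that is needed.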
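The Features section describes a 4D scoring engine (Quality, Speed, Fit, Context) weighted by use case. A minimal sketch of that idea follows, with invented weights; llm-checker's actual weights, normalization, and use-case list are defined in its source, not here.

```javascript
// Per-use-case weights for the four scoring dimensions.
// These numbers are illustrative, not llm-checker's real weights.
const USE_CASE_WEIGHTS = {
  coding:  { quality: 0.4, speed: 0.2, fit: 0.3, context: 0.1 },
  general: { quality: 0.3, speed: 0.3, fit: 0.3, context: 0.1 },
};

// Combine normalized 0-100 dimension scores into one weighted score.
function scoreModel(dims, useCase) {
  const w = USE_CASE_WEIGHTS[useCase];
  if (!w) throw new Error(`unknown use case: ${useCase}`);
  return (
    dims.quality * w.quality +
    dims.speed * w.speed +
    dims.fit * w.fit +
    dims.context * w.context
  );
}

// A model that fits well but is mid-speed still ranks high for coding.
const candidate = { quality: 80, speed: 60, fit: 90, context: 50 };
console.log(scoreModel(candidate, 'coding').toFixed(1)); // → "76.0"
```

Because the weights differ per use case, the same hardware can yield a different top recommendation for coding than for general chat, which is the point of weighting by use case.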
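The calibrated memory estimation feature is described as a bytes-per-parameter formula validated against real Ollama sizes. A hedged sketch of that approach: the bytes-per-parameter constants and the fixed overhead below are illustrative guesses, not llm-checker's calibrated values.

```javascript
// Rough bytes-per-parameter by quantization level (illustrative values;
// llm-checker calibrates its own constants against real Ollama sizes).
const BYTES_PER_PARAM = {
  f16: 2.0,
  q8_0: 1.0,
  q5_K_M: 0.6875,
  q4_K_M: 0.5625,
};

// Estimate a model's memory footprint in GiB: weights (parameter count
// times bytes per parameter) plus a fixed allowance for KV cache and
// runtime buffers.
function estimateMemoryGiB(paramsBillions, quant, overheadGiB = 1.5) {
  const bpp = BYTES_PER_PARAM[quant];
  if (bpp === undefined) throw new Error(`unknown quantization: ${quant}`);
  const weightsGiB = (paramsBillions * 1e9 * bpp) / 2 ** 30;
  return weightsGiB + overheadGiB;
}

// Example: an 8B model at Q4_K_M lands near 5.7 GiB under these numbers.
console.log(estimateMemoryGiB(8, 'q4_K_M').toFixed(1)); // → "5.7"
```

An estimate like this is what lets the Fit dimension compare a model's footprint against detected RAM/VRAM before recommending it.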