# Intelligent Ollama Model Selector
AI-powered CLI that analyzes your hardware and recommends optimal LLM models
Deterministic scoring across 35+ curated models with hardware-calibrated memory estimation
Installation • Quick Start • Claude MCP • Commands • Scoring • Hardware
Choosing the right LLM for your hardware is complex. With thousands of model variants, quantization levels, and hardware configurations, finding the optimal model requires understanding memory bandwidth, VRAM limits, and performance characteristics.
LLM Checker solves this. It analyzes your system, scores every compatible model across four dimensions (Quality, Speed, Fit, Context), and delivers actionable recommendations in seconds.
| Feature | Description |
|---|---|
| **35+ Curated Models** | Hand-picked catalog covering all major families and sizes (1B-32B) |
| **4D Scoring Engine** | Quality, Speed, Fit, Context — weighted by use case |
| **Multi-GPU Hardware Detection** | Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, CPU |
| **Calibrated Memory Estimation** | Bytes-per-parameter formula validated against real Ollama sizes |
| **Zero Native Dependencies** | Pure JavaScript — works on any Node.js 16+ system |
| **Optional SQLite Search** | Install `sql.js` to unlock `sync`, `search`, and `smart-recommend` |
## Installation

```bash
# Install globally
npm install -g llm-checker

# Or run directly with npx
npx llm-checker hw-detect
```

Requirements:
- Node.js 16+ (any version: 16, 18, 20, 22, 24)
- Ollama installed for running models
Optional: For database search features (`sync`, `search`, `smart-recommend`):

```bash
npm install sql.js
```

## Quick Start

```bash
# 1. Detect your hardware capabilities
llm-checker hw-detect

# 2. Get full analysis with compatible models
llm-checker check

# 3. Get intelligent recommendations by category
llm-checker recommend

# 4. (Optional) Sync full database and search
llm-checker sync
llm-checker search qwen --use-case coding
```

## Claude MCP

LLM Checker includes a built-in Model Context Protocol (MCP) server, allowing Claude Code and other MCP-compatible AI assistants to analyze your hardware and manage local models directly.
```bash
# Install globally first
npm install -g llm-checker

# Add to Claude Code
claude mcp add llm-checker -- llm-checker-mcp
```

Or with npx (no global install needed):

```bash
claude mcp add llm-checker -- npx llm-checker-mcp
```

Restart Claude Code and you're done.
Once connected, Claude can use these tools:
**Core Analysis:**

| Tool | Description |
|---|---|
| `hw_detect` | Detect your hardware (CPU, GPU, RAM, acceleration backend) |
| `check` | Full compatibility analysis with all models ranked by score |
| `recommend` | Top model picks by category (coding, reasoning, multimodal, etc.) |
| `installed` | Rank your already-downloaded Ollama models |
| `search` | Search the Ollama model catalog with filters |
| `smart_recommend` | Advanced recommendations using the full scoring engine |

**Ollama Management:**

| Tool | Description |
|---|---|
| `ollama_list` | List all downloaded models with params, quant, family, and size |
| `ollama_pull` | Download a model from the Ollama registry |
| `ollama_run` | Run a prompt against a local model (with tok/s metrics) |
| `ollama_remove` | Delete a model to free disk space |

**Advanced (MCP-exclusive):**

| Tool | Description |
|---|---|
| `ollama_optimize` | Generate optimal Ollama env vars for your hardware (NUM_GPU, PARALLEL, FLASH_ATTENTION, etc.) |
| `benchmark` | Benchmark a model with 3 standardized prompts — measures tok/s, load time, prompt eval |
| `compare_models` | Head-to-head comparison of two models on the same prompt, with speed and responses side by side |
| `cleanup_models` | Analyze installed models — find redundancies, cloud-only models, oversized models, and upgrade candidates |
| `project_recommend` | Scan a project directory (languages, frameworks, size) and recommend the best model for that codebase |
| `ollama_monitor` | Real-time system status: RAM usage, loaded models, memory headroom analysis |
After setup, you can ask Claude things like:
- "What's the best coding model for my hardware?"
- "Benchmark qwen2.5-coder and show me the tok/s"
- "Compare llama3.2 vs codellama for coding tasks"
- "Clean up my Ollama — what should I remove?"
- "What model should I use for this Rust project?"
- "Optimize my Ollama config for maximum performance"
- "How much RAM is Ollama using right now?"
Claude will automatically call the right tools and give you actionable results.
## Commands

**Core:**

| Command | Description |
|---|---|
| `hw-detect` | Detect GPU/CPU capabilities, memory, backends |
| `check` | Full system analysis with compatible models and recommendations |
| `recommend` | Intelligent recommendations by category (coding, reasoning, multimodal, etc.) |
| `installed` | Rank your installed Ollama models by compatibility |

**Database (optional, requires `sql.js`):**

| Command | Description |
|---|---|
| `sync` | Download the latest model catalog from the Ollama registry |
| `search <query>` | Search models with filters and intelligent scoring |
| `smart-recommend` | Advanced recommendations using the full scoring engine |

**AI-powered:**

| Command | Description |
|---|---|
| `ai-check` | AI-powered model evaluation with meta-analysis |
| `ai-run` | AI-powered model selection and execution |
### `hw-detect` example

```bash
llm-checker hw-detect
```

```
Summary:
  Apple M4 Pro (24GB Unified Memory)
  Tier: MEDIUM HIGH
  Max model size: 15GB
  Best backend: metal

CPU:
  Apple M4 Pro
  Cores: 12 (12 physical)
  SIMD: NEON

Metal:
  GPU Cores: 16
  Unified Memory: 24GB
  Memory Bandwidth: 273GB/s
```
### `recommend` example

```bash
llm-checker recommend
```

```
INTELLIGENT RECOMMENDATIONS BY CATEGORY
Hardware Tier: HIGH | Models Analyzed: 205

Coding:
  qwen2.5-coder:14b (14B)
  Score: 78/100
  Command: ollama pull qwen2.5-coder:14b

Reasoning:
  deepseek-r1:14b (14B)
  Score: 86/100
  Command: ollama pull deepseek-r1:14b

Multimodal:
  llama3.2-vision:11b (11B)
  Score: 83/100
  Command: ollama pull llama3.2-vision:11b
```
### `search` examples

```bash
llm-checker search llama -l 5
llm-checker search coding --use-case coding
llm-checker search qwen --quant Q4_K_M --max-size 8
```

| Option | Description |
|---|---|
| `-l, --limit <n>` | Number of results (default: 10) |
| `-u, --use-case <type>` | Optimize for: general, coding, chat, reasoning, creative, fast |
| `--max-size <gb>` | Maximum model size in GB |
| `--quant <type>` | Filter by quantization: Q4_K_M, Q8_0, FP16, etc. |
| `--family <name>` | Filter by model family |
## Model Catalog

The built-in catalog includes 35+ models from the most popular Ollama families:
| Family | Models | Best For |
|---|---|---|
| Qwen 2.5/3 | 7B, 14B, Coder 7B/14B/32B, VL 3B/7B | Coding, general, vision |
| Llama 3.x | 1B, 3B, 8B, Vision 11B | General, chat, multimodal |
| DeepSeek | R1 8B/14B/32B, Coder V2 16B | Reasoning, coding |
| Phi-4 | 14B | Reasoning, math |
| Gemma 2 | 2B, 9B | General, efficient |
| Mistral | 7B, Nemo 12B | Creative, chat |
| CodeLlama | 7B, 13B | Coding |
| LLaVA | 7B, 13B | Vision |
| Embeddings | nomic-embed-text, mxbai-embed-large, bge-m3, all-minilm | RAG, search |
Models are automatically combined with any locally installed Ollama models for scoring.
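A rough sketch of what that merge might look like, assuming models are deduped by name and that installed entries override catalog duplicates (the entry shape and override rule are illustrative, not the project's actual data structures):

```javascript
// Illustrative sketch of the catalog/installed merge, not the real code.
function mergeModelPool(catalog, installed) {
  const byName = new Map();
  for (const model of [...catalog, ...installed]) {
    // Later entries win, so locally installed metadata overrides the catalog.
    byName.set(model.name, { ...byName.get(model.name), ...model });
  }
  return [...byName.values()];
}

const pool = mergeModelPool(
  [{ name: 'qwen2.5-coder:14b', params: 14 }],
  [{ name: 'qwen2.5-coder:14b', params: 14, installed: true }]
);
console.log(pool); // one deduped entry, marked as installed
```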
## Scoring

Models are evaluated across four dimensions, weighted by use case:
| Dimension | Description |
|---|---|
| Q Quality | Model family reputation + parameter count + quantization penalty |
| S Speed | Estimated tokens/sec based on hardware backend and model size |
| F Fit | Memory utilization efficiency (how well it fits in available RAM) |
| C Context | Context window capability vs. target context length |
Three scoring systems are available, each optimized for a different workflow. The two weight-based systems are shown below; the third, AI-powered path backs `ai-check` and `ai-run`.

**Deterministic Selector** (primary — used by `check` and `recommend`):

| Category | Quality | Speed | Fit | Context |
|---|---|---|---|---|
| `general` | 45% | 35% | 15% | 5% |
| `coding` | 55% | 20% | 15% | 10% |
| `reasoning` | 60% | 10% | 20% | 10% |
| `multimodal` | 50% | 15% | 20% | 15% |
**Scoring Engine** (used by `smart-recommend` and `search`):

| Use Case | Quality | Speed | Fit | Context |
|---|---|---|---|---|
| `general` | 40% | 35% | 15% | 10% |
| `coding` | 55% | 20% | 15% | 10% |
| `reasoning` | 60% | 15% | 10% | 15% |
| `chat` | 40% | 40% | 15% | 5% |
| `fast` | 25% | 55% | 15% | 5% |
| `quality` | 65% | 10% | 15% | 10% |
All weights are centralized in `src/models/scoring-config.js`.
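Concretely, a category score is a weighted sum of the four normalized dimensions. Below is a minimal sketch, assuming each dimension is already normalized to 0..1; the weights are copied from the deterministic table above, and `score4D` is an illustrative name, not the project's API:

```javascript
// Minimal sketch of the 4D weighted score; weights copied from the
// deterministic table above, dimension values assumed pre-normalized to 0..1.
const WEIGHTS = {
  general:    { quality: 0.45, speed: 0.35, fit: 0.15, context: 0.05 },
  coding:     { quality: 0.55, speed: 0.20, fit: 0.15, context: 0.10 },
  reasoning:  { quality: 0.60, speed: 0.10, fit: 0.20, context: 0.10 },
  multimodal: { quality: 0.50, speed: 0.15, fit: 0.20, context: 0.15 },
};

function score4D(dims, category) {
  const w = WEIGHTS[category];
  return Math.round(100 * (
    w.quality * dims.quality +
    w.speed   * dims.speed +
    w.fit     * dims.fit +
    w.context * dims.context
  ));
}

// A strong but slower coding model that fits memory comfortably:
console.log(score4D({ quality: 0.9, speed: 0.45, fit: 0.8, context: 0.7 }, 'coding')); // 78
```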
## Memory Estimation

Memory requirements are calculated using calibrated bytes-per-parameter values:
| Quantization | Bytes/Param | 7B Model | 14B Model | 32B Model |
|---|---|---|---|---|
| Q8_0 | 1.05 | ~8 GB | ~16 GB | ~35 GB |
| Q4_K_M | 0.58 | ~5 GB | ~9 GB | ~20 GB |
| Q3_K | 0.48 | ~4 GB | ~8 GB | ~17 GB |
The selector automatically picks the best quantization that fits your available memory.
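As a worked example: the estimate is roughly parameter count times bytes per parameter, plus headroom for context and runtime buffers. In the sketch below, the 15% overhead constant is an assumption tuned to roughly match the table, not the project's actual calibration:

```javascript
// Sketch of the bytes-per-parameter estimate. BYTES_PER_PARAM comes from the
// table above; the 15% overhead margin is an assumption, not the real constant.
const BYTES_PER_PARAM = { Q8_0: 1.05, Q4_K_M: 0.58, Q3_K: 0.48 };
const OVERHEAD = 1.15; // hypothetical margin for KV cache + runtime buffers

function estimateMemoryGB(paramsBillions, quant) {
  return paramsBillions * BYTES_PER_PARAM[quant] * OVERHEAD;
}

// Pick the highest-quality quantization that fits the memory budget,
// mirroring the selection behavior described above.
function bestQuant(paramsBillions, budgetGB) {
  const byQuality = ['Q8_0', 'Q4_K_M', 'Q3_K']; // best quality first
  return byQuality.find(q => estimateMemoryGB(paramsBillions, q) <= budgetGB) ?? null;
}

console.log(estimateMemoryGB(14, 'Q4_K_M').toFixed(1)); // ≈ 9.3 GB, matching the ~9 GB row
console.log(bestQuant(14, 15));                         // 'Q4_K_M' on a 15 GB budget
```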
## Supported Hardware

**Apple Silicon**
- M1, M1 Pro, M1 Max, M1 Ultra
- M2, M2 Pro, M2 Max, M2 Ultra
- M3, M3 Pro, M3 Max
- M4, M4 Pro, M4 Max
**NVIDIA (CUDA)**
- RTX 50 Series (5090, 5080, 5070 Ti, 5070)
- RTX 40 Series (4090, 4080, 4070 Ti, 4070, 4060 Ti, 4060)
- RTX 30 Series (3090 Ti, 3090, 3080 Ti, 3080, 3070 Ti, 3070, 3060 Ti, 3060)
- Data Center (H100, A100, A10, L40, T4)
**AMD (ROCm)**
- RX 7900 XTX, 7900 XT, 7800 XT, 7700 XT
- RX 6900 XT, 6800 XT, 6800
- Instinct MI300X, MI300A, MI250X, MI210
**Intel**
- Arc A770, A750, A580, A380
- Integrated Iris Xe, UHD Graphics
**CPU Backends**
- AVX-512 + AMX (Intel Sapphire Rapids, Emerald Rapids)
- AVX-512 (Intel Ice Lake+, AMD Zen 4)
- AVX2 (Most modern x86 CPUs)
- ARM NEON (Apple Silicon, AWS Graviton, Ampere Altra)
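As a simplified illustration of the CPU side of detection, the sketch below uses Node's built-in `os` module; the AVX2 fallback is an assumption, since the real detectors in `src/hardware/` probe GPUs and SIMD feature flags directly:

```javascript
// Simplified, illustrative CPU detection; not the project's actual detector.
const os = require('os');

const cpu = os.cpus()[0].model;
const arch = os.arch();
const ramGB = os.totalmem() / 2 ** 30;

// arm64 implies NEON (Apple Silicon, Graviton, Altra). On x86 a real detector
// would check CPUID feature flags for AVX2/AVX-512, which Node does not
// expose directly, so AVX2 here is just an assumption for the sketch.
const simd = arch === 'arm64' ? 'NEON' : 'AVX2 (assumed)';

console.log(`${cpu} | ${arch} | ${ramGB.toFixed(0)} GB RAM | SIMD: ${simd}`);
```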
## Architecture

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Hardware     │────>│      Model      │────>│  Deterministic  │
│    Detection    │     │  Catalog (35+)  │     │    Selector     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
  Detects GPU/CPU         JSON catalog +            4D scoring
  Memory / Backend       Installed models      Per-category weights
 Usable memory calc         Auto-dedup          Memory calibration
                                                         │
                                                         v
                                                ┌─────────────────┐
                                                │     Ranked      │
                                                │ Recommendations │
                                                └─────────────────┘
```
**Selector Pipeline** (a condensed sketch follows the list):

1. Hardware profiling — CPU, GPU, RAM, acceleration backend
2. Model pool — merge catalog + installed Ollama models (deduped)
3. Category filter — keep models relevant to the use case
4. Quantization selection — best quant that fits in the memory budget
5. 4D scoring — Q, S, F, C with category-specific weights
6. Ranking — top N candidates returned
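A condensed, self-contained sketch of those six steps; the model entries, category tags, fit check, and scoring placeholder are all illustrative stand-ins for the real implementation:

```javascript
// Condensed sketch of the selector pipeline; data and scoring are placeholders.
const catalog = [
  { name: 'qwen2.5-coder:14b', params: 14, categories: ['coding'], quality: 0.9 },
  { name: 'llama3.2:3b', params: 3, categories: ['general', 'coding'], quality: 0.6 },
];
const installed = [{ name: 'llama3.2:3b', params: 3, categories: ['general', 'coding'], quality: 0.6 }];

function recommend(category, budgetGB, topN = 3) {
  // 1-2. model pool: merge catalog + installed, deduped by name
  const pool = [...new Map([...catalog, ...installed].map(m => [m.name, m])).values()];
  return pool
    .filter(m => m.categories.includes(category))                // 3. category filter
    .map(m => ({ ...m, quant: m.params * 0.58 * 1.15 <= budgetGB // 4. fit check (Q4_K_M
      ? 'Q4_K_M' : null }))                                      //    only, for brevity)
    .filter(m => m.quant)
    .map(m => ({ ...m, score: Math.round(100 * m.quality) }))    // 5. scoring placeholder
    .sort((a, b) => b.score - a.score)                           // 6. ranking
    .slice(0, topN);
}

console.log(recommend('coding', 15)); // both fit a 15 GB budget; the 14B coder ranks first
```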
## Examples

Detect your hardware:

```bash
llm-checker hw-detect
```

Get recommendations for all categories:

```bash
llm-checker recommend
```

Full system analysis with compatible models:

```bash
llm-checker check
```

Find the best coding model:

```bash
llm-checker recommend --category coding
```

Search for small, fast models under 5GB:

```bash
llm-checker search "7b" --max-size 5 --use-case fast
```

Get high-quality reasoning models:

```bash
llm-checker smart-recommend --use-case reasoning
```

## Development

```bash
git clone https://github.com/Pavelevich/llm-checker.git
cd llm-checker
npm install
node bin/enhanced_cli.js hw-detect
```

Project structure:

```
src/
  models/
    deterministic-selector.js    # Primary selection algorithm
    scoring-config.js            # Centralized scoring weights
    scoring-engine.js            # Advanced scoring (smart-recommend)
    catalog.json                 # Curated model catalog (35+ models)
  ai/
    multi-objective-selector.js  # Multi-objective optimization
    ai-check-selector.js         # LLM-based evaluation
  hardware/
    detector.js                  # Hardware detection
    unified-detector.js          # Cross-platform detection
  data/
    model-database.js            # SQLite storage (optional)
    sync-manager.js              # Database sync from Ollama registry
bin/
  enhanced_cli.js                # CLI entry point
```
## License

MIT License — see LICENSE for details.
