Features

Built for people who take prompts seriously

Everything runs locally. Two runtime dependencies. No API keys consumed by the engine itself.

Core Capabilities

Seven pillars of deterministic prompt governance

🔍

Quality Scoring

0–100 score across five dimensions: clarity, specificity, completeness, constraints, and efficiency.

🛡️

Ambiguity Detection

Deterministic rules catch scope explosion, missing constraints, hallucination risk, and more.

🎯

Structured Compilation

Outputs prompts with role, goal, and constraints — targeting Claude XML, OpenAI system/user, or Markdown.

Context Compression

Multi-stage compression pipeline strips irrelevant content and reports token savings. Zone-aware, preserves structure.

💰

Multi-Provider Cost

Token and cost estimates across 11 models from 4 providers: Anthropic, OpenAI, Google, and Perplexity.

🔒

Offline License

Ed25519-signed keys verified locally. No accounts, no network calls, no tracking. Your prompts stay private.

📦

Programmatic API

import { optimize } — pure functions, zero side effects. Use as a library, not just a server.

Why You Can Trust the Output

Reproducible: Same prompt always produces the same score, routing, and cost estimate. No randomness.
Accurate Pricing: Cost estimates verified against live provider rates across Anthropic, OpenAI, Google, and Perplexity.
Risk Detection: Catches scope explosion, missing constraints, hallucination risk, and underspecified prompts before they reach an LLM.
Safe Compression: Guaranteed to never increase token count. Structured content (code, tables, lists) is always protected.
Multi-Provider: Routes to the right model at the right price, with a full decision trail explaining why.
Auditable: Every decision is logged with hash-chained integrity. Enterprise teams get cryptographic proof of compliance.

Fully offline. Zero LLM calls. Extensively tested. Run the suite yourself to verify.

Architecture

Full request lifecycle

Every prompt passes through 8 sequential gates. All decisions are deterministic — zero LLM calls, zero randomness.

User Prompt
  → STEP 1: Input Hardening
  → Freemium Gate (denied → BLOCKED: quota exceeded; allowed → continue)
  → Policy Gate (Enterprise; BLOCKED: policy violation)
  → STEP 2: Analysis Phase (intent)
  → Scoring Phase (QualityScore 0–100)
  → Compilation Phase (Claude / OpenAI / Generic)
  → Estimation Phase (5 providers + routing)

Legend: normal flow · blocked · decision gate
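Conceptually, a sequential gate pipeline like the one above can be sketched as a list of deterministic checks that short-circuits on the first block. This is an illustrative sketch, not PCP's source; the `Gate` and `GateResult` types are hypothetical:

```typescript
// Illustrative gate pipeline (hypothetical types and names).
// Each gate either allows the prompt through or blocks with a reason;
// the first block stops the pipeline, mirroring the flow above.
type GateResult = { allowed: boolean; reason?: string };
type Gate = (prompt: string) => GateResult;

function runGates(prompt: string, gates: Gate[]): GateResult {
  for (const gate of gates) {
    const result = gate(prompt);
    if (!result.allowed) return result; // blocked: stop here
  }
  return { allowed: true };
}
```

Because every gate is a pure function of the prompt and configuration, the same input always takes the same path.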

Complete Capability Reference

20 capabilities across every tier — 17 always free, 3 metered. Enterprise governance unlocks with the Enterprise plan.

# Capability Tier What it does
Core Capabilities
1 Optimize Metered Full analysis: score, compile, and estimate cost in one pass. Enterprise: blocked if policy violations are detected in enforce mode
2 Refine Metered Iterate on a result — answer blocking questions or add context to update the compiled output
3 Approve Free Sign off on a result to get the final compiled prompt. Enterprise: blocked on violations or high risk score in enforce mode
Analysis & Utilities
4 Estimate Cost Free Token count + cost across Anthropic, OpenAI, Google, and Perplexity before you run anything
5 Compress Context Free Shrink long context with a multi-stage heuristic pipeline — no lossy summarization
6 Quick Check Free Instant pass/fail score and top issues — no compilation, no session
7 Configure Free Enterprise Set mode, thresholds, and output target. Enterprise: policy enforcement mode, audit trail, config lock, retention policy
8 View Usage Free Check how many optimizations you've used, your limits, and what's remaining
9 Usage Stats Free Aggregated metrics: average score, top task types, estimated token savings
Decision Engine
10 Classify Task Free Identify task type, reasoning complexity, risk level, and recommended optimization profile
11 Route Model Free Pick the optimal model for cost, speed, or quality — with a full audit trail of the routing decision
12 Pre-flight Metered One-shot analysis: classify, assess risk, route model, and score quality — before committing to a full optimization
Context & Session Management
13 Prune Context Free Score and rank context by relevance to the task; optionally remove low-signal sections to save tokens
14 List Sessions Free Browse your optimization history — metadata only, raw prompts never stored in the list
15 Export Session Free Full session export with reproducibility hash. Enterprise: includes policy enforcement data and custom rule metadata
16 Delete Session Free Remove a single session from your history
17 Purge Sessions Free Bulk-clear old sessions by age, with a dry-run preview and keep-last safeguard
Account & Licensing
18 Activate License Free Enter your Pro, Power, or Enterprise license key to unlock your tier — validated offline, no account required
19 View License Status Free See your current tier, license expiry date, and where to upgrade if needed
Enterprise Governance
20 Save Custom Rules Free Enterprise Deploy custom governance rules built in the Enterprise Console directly to your Prompt Control Plane
Scoring

Quality in 5 dimensions

Every prompt gets a 0–100 score. Each dimension contributes 20 points. After optimization, every prompt scores 90+.

Clarity

How clearly the intent is stated. Vague terms and short goals lose points.

Clear goal (>15 words): +15
Per vague term (“improve”, “fix it”): -5
Goal too short (<5 words): -5

Specificity

File paths, code blocks, audience, tone, and platform all add specificity.

File path referenced: +5
Code block included: +3
Audience specified: +5
Tone specified: +4

Completeness

Does the prompt define what “done” looks like? Are success criteria present?

2+ success criteria: +10
1 success criterion: +5
Task type detected: +3
Output format specified: +2

Constraints

Scope limits, forbidden actions, and time budgets protect against scope explosion.

Scope constraints: +5
Forbidden actions listed: +5
High-risk + no constraints: -5
Preservation plan present: +2

Efficiency

Token budget and repetition. Concise prompts with no duplication score highest.

Start score: 18
Per 1K tokens over 5K: -2
Repetitive content detected: -4
Concise + no duplication: +2
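As an illustration of how deterministic rubric scoring works, the clarity rules above can be approximated like this. This is a sketch, not the shipped scorer; the vague-term list and function name are assumptions, while the weights follow the table:

```typescript
// Illustrative clarity scorer (not PCP's real implementation).
// Weights mirror the rubric: +15 for a clear goal over 15 words,
// -5 per vague term, -5 if the goal is under 5 words.
// The result is clamped to the 0-20 range of a single dimension.
const VAGUE_TERMS = ['improve', 'fix it', 'better', 'optimize it'];

function clarityScore(goal: string): number {
  const words = goal.trim().split(/\s+/).length;
  let score = 0;
  if (words > 15) score += 15;
  if (words < 5) score -= 5;
  for (const term of VAGUE_TERMS) {
    if (goal.toLowerCase().includes(term)) score -= 5;
  }
  return Math.max(0, Math.min(20, score));
}
```

Because the rules are plain string checks with fixed weights, the same prompt always produces the same score.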

Risk Score Dimensions (0–100)

Underspec: 65
Scope: 45
Hallucination: 30
Constraint: 50

Example: “Delete all inactive users from production database” — high underspec + constraint risk. In enforce mode (threshold=60), this prompt gets blocked.

Enterprise

Governance Controls

Enterprise-only capabilities that turn scoring into enforcement

🏛️ Enterprise Console

A browser-based admin panel for managing your entire governance configuration. Build custom rules with a visual editor, toggle policies, configure audit settings, and deploy changes — all gated by your Enterprise license key.

Open Enterprise Console →

🛡️ Policy Enforcement

Switch between advisory and enforce mode. In enforce mode, prompts that fail quality rules or exceed risk thresholds are blocked before they reach any LLM. Both built-in and custom rules are enforced.

Configure via Enterprise Console → Advisory warns, Enforce blocks

🔒 Config Lock

Lock the governance configuration with a passphrase. Once locked, no settings can be changed until unlocked with the original passphrase. Prevents unauthorized changes to quality thresholds, policy mode, and audit settings.

Passphrase is SHA-256 hashed, never stored in plaintext

📜 Hash-Chained Audit Trail

Append-only JSONL audit log with SHA-256 hash chaining. Every optimization, approval, deletion, and configuration change is logged. If any entry is modified or deleted, all subsequent hashes break — integrity violation detected.

Local-only, no prompt content stored, privacy-safe
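The integrity property described above can be sketched in a few lines: each entry's hash covers its payload plus the previous entry's hash, so changing any entry breaks every hash after it. This is an illustrative model with hypothetical type and function names, not PCP's actual log format:

```typescript
import { createHash } from 'node:crypto';

// Illustrative hash-chained audit log (names hypothetical).
// Each entry's SHA-256 hash is computed over the previous hash plus
// the event payload, so any tampering invalidates the rest of the chain.
interface AuditEntry { event: string; prevHash: string; hash: string }

function appendEntry(log: AuditEntry[], event: string): AuditEntry[] {
  const prevHash = log.length ? log[log.length - 1].hash : 'GENESIS';
  const hash = createHash('sha256').update(prevHash + event).digest('hex');
  return [...log, { event, prevHash, hash }];
}

function verifyChain(log: AuditEntry[]): boolean {
  let prev = 'GENESIS';
  for (const entry of log) {
    const expected = createHash('sha256').update(prev + entry.event).digest('hex');
    if (entry.prevHash !== prev || entry.hash !== expected) return false;
    prev = entry.hash;
  }
  return true;
}
```

Verification walks the chain from the first entry, so a valid chain proves no entry was modified, reordered, or deleted.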

📏 Custom Governance Rules

Build up to 25 organization-specific rules in the Enterprise Console with a visual editor. Define match patterns, risk dimensions, severity levels (blocking or non-blocking), and weights. Deploy with one click — rules take effect on the next optimization.

Deploy via the Enterprise Console or CLI

⏱️ Session Retention

Automatic session cleanup with configurable retention policies. Set how long optimization sessions are kept before they're purged. Supports dry-run to preview what would be deleted before committing.

Configure retention window via Enterprise Console

⚖️ Risk Threshold Gating

Block prompts that exceed a risk score threshold based on your strictness setting. Relaxed blocks above 40, Standard above 60, Strict above 75. Risk is scored across four dimensions: hallucination, scope, constraint, and underspec.

Three strictness levels: Relaxed, Standard, Strict — each with progressively tighter thresholds
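The gating check itself is simple enough to sketch. This illustration uses the thresholds stated above (Relaxed 40, Standard 60, Strict 75); the names are hypothetical, not PCP's API:

```typescript
// Illustrative risk threshold gate using the documented thresholds:
// Relaxed blocks above 40, Standard above 60, Strict above 75.
const THRESHOLDS = { relaxed: 40, standard: 60, strict: 75 } as const;
type Strictness = keyof typeof THRESHOLDS;

function isBlocked(riskScore: number, strictness: Strictness): boolean {
  return riskScore > THRESHOLDS[strictness];
}
```

For example, the production-database prompt scored 65 on underspec risk would be blocked under Standard but pass under Strict.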

All governance controls are deterministic, offline, and zero-LLM. Configuration changes are audit-logged when the trail is enabled.
See full plan comparison →

Model Routing

Two-step deterministic routing

PCP classifies prompt complexity and risk, then picks the cheapest model that can still handle the task correctly.

STEP 1: Complexity + Risk → Tier
  simple_factual + low → small (Haiku · Flash)
  analytical + medium → mid (Sonnet · GPT-4o)
  multi_step + high → top (Opus · GPT-4-32k)
  agent_orchestration → top (Opus · GPT-4-32k)

STEP 2: Apply Overrides
  budget_sensitivity = high → downgrade 1 tier
  latency_sensitivity = high → downgrade 1 tier
  profile = quality_first → upgrade 1 tier
  risk_score ≥ 40 → force escalate

RECOMMENDATION: primary model claude-3-5-sonnet · fallback: gpt-4o · confidence: 82%

COST COMPARISON (500 tokens): Gemini Flash $0.00022 · GPT-3.5 Turbo $0.00115 · GPT-4o (recommended) $0.00725 · Perplexity $0.00650 · Claude 3.5 $0.01350

DECISION PATH (audit trail):
  Input: analytical + medium risk
  Step 1: complexity=analytical → tier=mid
  Step 2: profile=balanced → no override
  Final: GPT-4o (confidence: 82%)
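The two-step logic can be sketched as a tier lookup followed by overrides. This is an illustrative reconstruction of the routing described above, with hypothetical names; the tier table and override rules follow the diagram, not PCP's real source:

```typescript
// Illustrative two-step router (hypothetical helper, not PCP's API).
// Step 1: map complexity class to a base tier.
// Step 2: apply overrides (budget/latency downgrade, quality upgrade,
// risk-driven forced escalation).
type Tier = 'small' | 'mid' | 'top';
const TIERS: Tier[] = ['small', 'mid', 'top'];

const BASE_TIER: Record<string, Tier> = {
  simple_factual: 'small',
  analytical: 'mid',
  multi_step: 'top',
  agent_orchestration: 'top',
};

function routeTier(
  complexity: string,
  opts: { budgetSensitive?: boolean; qualityFirst?: boolean; riskScore?: number } = {},
): Tier {
  let idx = TIERS.indexOf(BASE_TIER[complexity] ?? 'mid');
  if (opts.budgetSensitive) idx = Math.max(0, idx - 1); // downgrade 1 tier
  if (opts.qualityFirst) idx = Math.min(2, idx + 1);    // upgrade 1 tier
  if ((opts.riskScore ?? 0) >= 40) idx = 2;             // force escalate
  return TIERS[idx];
}
```

Because both steps are table lookups and fixed rules, the routing decision is reproducible and can be replayed from the audit trail.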

Use Cases

Who benefits from deterministic prompt governance

Developers building AI features

Score and compile prompts before they reach production. Catch ambiguity that leads to unpredictable LLM output.

Engineering teams managing LLM spend

Multi-provider cost estimates across 11 models. Know exactly what each prompt will cost before you send it.

CI/CD pipelines

Run the CLI as a quality gate. Fail builds when prompt quality drops below your threshold.

MCP-powered workflows

Integrates natively with Claude Desktop, Claude Code, Cursor, Windsurf, and any MCP-compatible client.

Reducing LLM costs

Context compression strips irrelevant tokens. Tool pruning removes unnecessary tools from the context window.

Enterprise & compliance teams

Enforce prompt quality policies before deployment. Hash-chained audit trail, config lock, and custom rules provide governance without runtime overhead.

How It Compares

Alternatives and their trade-offs

Method Pros Cons
Manual prompt rewriting Full control, no tooling needed Inconsistent quality, not scalable, no scoring
Fine-tuning models Optimized for your domain Expensive, slow iteration, requires ML expertise
Trial-and-error Quick to start No structured feedback, wastes tokens, not reproducible
Prompt Control Plane Deterministic scoring, policy enforcement, multi-provider routing, offline, CI/CD ready Deterministic rules only — no semantic understanding of domain context

Programmatic API

Use as a library — no MCP server required

Install
npm install claude-prompt-optimizer-mcp
import { optimize } from 'claude-prompt-optimizer-mcp';

const result = await optimize('Summarize this document');
console.log(result.quality_score, result.compiled);

// Target any LLM
const forOpenAI = await optimize('Analyze sales data', { target: 'openai' });
const forClaude = await optimize('Analyze sales data', { target: 'claude' });
const generic = await optimize('Analyze sales data', { target: 'generic' });

Start optimizing prompts today

Get Started Free →