feat(router): add intelligent model routing with auto selectors by vfeitoza · Pull Request #431 · ENTERPILOT/GoModel

vfeitoza · 2026-06-26T19:44:14Z

Summary

Adds an intelligent routing layer that classifies incoming chat requests with a cheap analyzer model and selects the best concrete model from the provider catalog before normal resolution runs.
Clients use virtual selectors (auto, smart, auto-cost, auto-quality) in the model field instead of concrete model names.
Three modes: off (default), observe (dry-run, logs recommendation without changing execution), and enforce (rewrites the selector before dispatch).

How it works

Client sends "model": "auto" (or another intelligent selector).
Gateway intercepts before provider resolution.
A cheap analyzer model classifies the request (complexity, task type, required capabilities, confidence score).
The scorer ranks all eligible catalog models against the active strategy (cost, balanced, quality, latency).
In enforce mode, the winning model replaces the selector; the request then goes through normal authorization and provider routing.

Key components

internal/intelligentrouter/ — classifier, selector, scorer, catalog filtering, prompt builder, types
config/intelligent_routing.go — configuration struct, defaults, validation, env var overrides
internal/gateway/inference_intelligent.go — orchestrator hook that rewrites the model selector before resolution
internal/server/model_validation.go — middleware skip for intelligent selectors to prevent premature model_not_found errors
internal/observability/metrics.go — Prometheus counters and histograms for decisions, latency, fallbacks, and low-confidence cases
internal/virtualmodels/intelligent.go — intelligent virtual model support with strategy-qualified selectors
docs/features/intelligent-routing.mdx — full feature documentation

Configuration

intelligent_routing:
  enabled: true
  mode: "enforce"        # off | observe | enforce
  analyzers:
    - model: "gpt-4o-mini"
      provider: "openai"
      max_tokens: 256
  defaults:
    strategy: "balanced" # cost | balanced | quality | latency
    min_savings_ratio: 0.15
    min_confidence: 0.7
  fallback_model: "openai/gpt-4o-mini"
  analysis_user_path: "/intelligent-router"

Test plan

Unit tests pass: make test-race
"model": "auto" with mode: observe logs intelligent routing decision without changing the executed model
"model": "auto" with mode: enforce rewrites the selector and returns 200 with the selected model
"model": "auto-cost" selects a cheaper model when pricing/tags are configured
All analyzers failing falls back to fallback_model with analysis_failed=true in logs
Intelligent selectors with mode: off return model_not_found
Analyzer usage appears under analysis_user_path in usage reports

Summary by CodeRabbit

New Features
- Added Intelligent Routing for automatic model selection with off/observe/enforce modes and cost/quality/balanced/latency strategies, including analyzer-based classification, candidate allow/deny filtering, and fallback model handling.
- Added intelligent selector aliases and intelligent virtual-model targets, plus routing-guidance-aware classification.
- Conversation-aware routing via X-GoModel-Conversation-ID, with in-memory routing history in enforce mode.
- Added Prometheus metrics for routing requests, decision latency, fallbacks, and low-confidence events.
Bug Fixes
- Ensured intelligent selectors pass through model validation in translated requests.
Documentation
- Added a complete Intelligent Routing documentation page and updated config/environment templates and examples.

Introduces an intelligent routing layer that classifies incoming chat requests with a cheap analyzer model and selects the best concrete model from the provider catalog before normal resolution runs. Clients send an intelligent selector in the model field instead of a concrete model name: - auto / smart balanced cost-quality selection - auto-cost prefer the cheapest eligible model - auto-quality prefer the highest-quality eligible model The router operates in three modes configured via intelligent_routing.mode: - off intelligent selectors return model_not_found (default) - observe classifies and logs the recommendation; executes the original requested model unchanged (dry-run) - enforce replaces the selector with the recommended model before provider dispatch A pool of analyzer models is tried in order with per-call timeout and automatic failover. When all analyzers fail, requests fall back to intelligent_routing.fallback_model or return model_not_found if unset. Analyzer usage is attributed to a dedicated analysis_user_path (/intelligent-router by default) so classification cost is separated from application request cost in usage reports. Key components: - internal/intelligentrouter: classifier, selector, scorer, catalog, prompt builder, and type definitions - config/intelligent_routing.go: configuration struct, defaults, validation, and env var overrides - internal/gateway/inference_intelligent.go: orchestrator hook that rewrites the model selector before resolution - internal/server/model_validation.go: middleware skip for intelligent selectors to prevent premature model_not_found errors - internal/observability/metrics.go: Prometheus counters and histograms for routing decisions, latency, fallbacks, and low-confidence cases - internal/virtualmodels: intelligent virtual model support with strategy-qualified selectors (intelligent:cost, intelligent:quality) - docs/features/intelligent-routing.mdx: full feature documentation All existing tests pass. New tests cover config validation, selector detection, catalog filtering, scoring, and middleware passthrough for intelligent selectors.

coderabbitai · 2026-06-26T19:44:41Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds intelligent routing configuration, analysis and ranking logic, runtime wiring through server and gateway paths, selector validation bypasses, and virtual-model support.

Changes

Intelligent routing

Layer / File(s)	Summary
Config contract and docs `config/intelligent_routing.go`, `config/config.go`, `config/intelligent_routing_test.go`, `.env.template`, `CLAUDE.md`, `config/config.example.yaml`, `docs/features/intelligent-routing.mdx`, `internal/core/types.go`	`intelligent_routing` config is added with defaults, validation, tests, metadata, and documentation for modes, analyzers, thresholds, filters, and fallback behavior.
Classification primitives `internal/intelligentrouter/types.go`, `selectors.go`, `prompt.go`, `classifier.go`, `selectors_test.go`, `classifier_test.go`	Selector aliases, analyzer prompts, classification parsing, and classifier failover/repair behavior are added with tests.
Candidate ranking `internal/intelligentrouter/catalog.go`, `scorer.go`, `catalog_test.go`	Eligible catalog models are filtered and scored for routing strategies, with tests for filtering and ranking.
Selector decision engine `internal/observability/metrics.go`, `internal/intelligentrouter/selector.go`, `selector_test.go`, `internal/intelligentrouter/conversation_memory.go`, `conversation_memory_test.go`	Routing decisions classify requests, rank candidates, apply observe/enforce mode, record conversation history, and emit intelligent-routing metrics and logs.
Startup and request wiring `internal/app/app.go`, `internal/app/intelligent_routing.go`, `internal/gateway/*.go`, `internal/server/handlers.go`, `internal/server/http.go`, `internal/server/translated_inference_service.go`, `internal/server/request_support.go`, `internal/server/request_support_test.go`, `internal/server/model_validation.go`, `internal/server/model_validation_test.go`, `internal/virtualmodels/intelligent.go`, `internal/virtualmodels/validation.go`, `internal/virtualmodels/snapshot.go`	The app builds the router, injects it into inference, rewrites translated request selectors, skips validation for intelligent aliases, extracts conversation IDs, and adjusts virtual-model handling.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Possibly related PRs

ENTERPILOT/GoModel#264: Also changes internal/core/types.go, which this PR extends for routing guidance metadata.

Poem

🐇 I sniffed the config, hop by hop,
then watched the router choose a top.
In observe I blinked, in enforce I sped,
with metrics twinkling overhead.
A cozy burrow learned to swap and shop.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 30.89% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the main change: adding intelligent model routing and new auto-style selectors.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

codecov-commenter · 2026-06-26T19:47:35Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 67.21612% with 358 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/intelligentrouter/selector.go	64.39%	59 Missing and 14 partials ⚠️
internal/app/intelligent_routing.go	0.00%	51 Missing ⚠️
internal/intelligentrouter/scorer.go	75.92%	27 Missing and 12 partials ⚠️
internal/gateway/inference_intelligent.go	9.37%	29 Missing ⚠️
internal/intelligentrouter/catalog.go	74.11%	14 Missing and 8 partials ⚠️
internal/intelligentrouter/prompt.go	79.41%	12 Missing and 9 partials ⚠️
internal/virtualmodels/intelligent.go	0.00%	21 Missing ⚠️
internal/intelligentrouter/classifier.go	85.18%	9 Missing and 7 partials ⚠️
internal/intelligentrouter/health_tracker.go	64.44%	9 Missing and 7 partials ⚠️
config/intelligent_routing.go	84.44%	7 Missing and 7 partials ⚠️
... and 10 more

📢 Thoughts on this report? Let us know!

greptile-apps · 2026-06-26T19:50:43Z

Confidence Score: 3/5

Merge safety is moderate because the new routing hook can invoke analyzer work before the request has passed the normal authorization and budget gates.

The change is well-scoped and includes tests and documentation, but the request preparation order creates a concrete billing and access-control edge case for virtual selectors.

internal/gateway/inference_prepare.go

T-Rex Logs

What T-Rex did

Created a focused Go repro test harness to exercise the analyzer invocation before final authorization denial.
Tried to run the focused Go repro test with go test, but the environment lacked the Go toolchain, blocking runtime verification.
Ran the PrepareChatRequest test path, which exercises evaluateIntelligentRouter and normal workflow resolution, observing not_found_error for the auto model before analysis and a successful enforce path for mini.
Compared base and head selector harness attempts, noting base lacked intelligentrouter and exited with code 127, and head run also failed due to go not found.
Captured analyzer fallback traces showing missing intelligentrouter and missing Go toolchain in the base run, and go: not found plus an environment check in the head run.
Observed the base compile failure due to missing intelligent routing config symbols, and in the after state the system resolved to default off with YAML/env resolution, plus emitted gomodel_intelligent_routing metrics.

_{Ran code and verified through T-Rex}

_{Reviews (1): Last reviewed commit: "feat(router): add intelligent model rout..." | Re-trigger Greptile}

greptile-apps · 2026-06-26T19:50:47Z

+	requested := core.NewRequestedModelSelector(*model, *provider)
+	requested = o.evaluateIntelligentRouter(ctx, req, requested)
+	*model = requested.Model
+	*provider = requested.ProviderHint
 	workflow, err := o.ensureTranslatedRequestWorkflow(ctx, meta.Workflow, meta.RequestID, meta.Endpoint, model, provider)


Authorize before analysis
evaluateIntelligentRouter runs the analyzer before ensureTranslatedRequestWorkflow resolves and authorizes the final model, and dispatchChatCompletion enforces budget only later. A request with model: "auto" therefore sends its prompt to the configured analyzer and can incur analyzer usage even when the caller is unauthorized for the selected model or already over budget; move intelligent evaluation after workflow authorization/budget checks or add an equivalent pre-check before classification.

coderabbitai

Actionable comments posted: 13

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.env.template:
- Around line 278-279: The INTELLIGENT_ROUTING_MODE and
INTELLIGENT_ROUTING_DEFAULT_STRATEGY entries still use inline value-line
comments, which can be parsed as part of the value by env loaders. Update the
.env.template entries to match the rest of the template by moving the option
descriptions to separate preceding comment lines and keeping the assignment
lines as plain values; use the existing INTELLIGENT_ROUTING_MODE and
INTELLIGENT_ROUTING_DEFAULT_STRATEGY keys to locate the affected lines.
- Line 275: Update the stale intelligent-routing documentation reference so it
points to docs/features/intelligent-routing.mdx instead of
docs/dev/intelligent-model.md. Make the same link change wherever the old path
appears in the related config/docs references, including the .env.template entry
and the intelligent routing config/docs references such as
config/intelligent_routing.go and CLAUDE.md, so all unique symbols and comments
consistently reference the new doc.

In `@CLAUDE.md`:
- Line 131: The intelligent routing docs describe the default analyzer pool with
an inconsistent provider name for the first model entry; update the wording in
CLAUDE.md so it matches the shipped examples and feature docs that use the
OpenAI provider for gpt-5.4-mini. Use the intelligent routing section and its
analyzer pool description as the target, and keep the provider/model naming
aligned with config/config.example.yaml and
docs/features/intelligent-routing.mdx so the documented default pool is
consistent.

In `@config/intelligent_routing_test.go`:
- Around line 99-111: Add coverage in
TestValidateIntelligentRoutingConfig_RatioBounds for the explicit zero boundary
on IntelligentRoutingConfig.Defaults.MinSavingsRatio: create a case with ratio 0
alongside the existing out-of-range values, then assert
ValidateIntelligentRoutingConfig accepts it and does not overwrite it. Use the
existing ValidateIntelligentRoutingConfig helper and
IntelligentRoutingConfig/Defaults.MinSavingsRatio symbols so the test locks in
the “always cheapest” behavior.

In `@config/intelligent_routing.go`:
- Around line 238-254: `applyIntelligentRoutingDefaults` is overwriting
explicitly configured zero values for `MinSavingsRatio` and `MinConfidence`,
making the documented `0` behavior unreachable. Update the defaulting logic in
`applyIntelligentRoutingDefaults` so it only seeds these fields when the config
is truly unset, not when a user intentionally set `0`; use the existing
`DefaultIntelligentRoutingConfig`/merge flow or sentinel/pointer-based presence
tracking to distinguish unset from explicit values, while keeping the validation
in `IntelligentDefaults` unchanged.

In `@docs/features/intelligent-routing.mdx`:
- Around line 234-247: The environment variable table in the intelligent routing
docs is missing several top-level config fields, so update the table in
intelligent-routing.mdx to include INTELLIGENT_ROUTING_ANALYSIS_USER_PATH,
INTELLIGENT_ROUTING_CANDIDATES_ALLOW, and INTELLIGENT_ROUTING_CANDIDATES_DENY
alongside the existing entries, or change the “All top-level fields” wording to
match what is actually documented; use the env-tagged fields from
intelligent_routing.go as the source of truth and keep the table aligned with
the .env.template values.

In `@internal/intelligentrouter/catalog.go`:
- Line 25: BuildCandidates currently accepts requestedContextChars but never
uses it, so the signature is misleading and the selector’s passed value has no
effect. Update BuildCandidates and its call sites (including the selector path)
so requestedContextChars is either incorporated into the long-context decision,
likely via modelSupportsLongContext, or removed entirely if it is not needed.
Use the existing BuildCandidates, modelSupportsLongContext, and
class.RequiresLongContext symbols to keep the candidate filtering logic
consistent.

In `@internal/intelligentrouter/prompt.go`:
- Around line 33-56: The analyzerUserPrompt summary is dropping important
routing signals by reducing non-text content to a generic placeholder and
truncating text without any size metadata. Update analyzerUserPrompt to preserve
lightweight modality hints and original-size indicators for each message so
Classifier.Classify can still infer vision, audio, and long-context requests
without exposing full payloads. Keep the change localized to analyzerUserPrompt
and its message formatting logic, using msg.Content, core.ExtractTextContent,
and the per-message preview builder to add modality flags and original
length/count summaries.

In `@internal/intelligentrouter/scorer.go`:
- Around line 69-83: Balanced scoring in scoreCandidate currently ignores
class.Confidence, so uncertain classifications still over-prioritize cheap
models. Update the default balanced branch in
internal/intelligentrouter/scorer.go to incorporate Confidence into the
weighting before cost dominates, using the existing scoreCandidate switch and
class fields like Confidence, so low-confidence requests are less likely to
select the cheapest model and stay aligned with selector expectations.

In `@internal/intelligentrouter/selector.go`:
- Around line 190-195: The meta SelectionMeta parameter in
Selector.fallbackSelector is unused, so simplify the API by removing it from the
method signature and updating all call sites that pass it, including the one
using SelectionMeta{}. Keep the behavior the same by still preferring
parseFallback(s.fallback) and otherwise returning requestedSelector(requested).
- Around line 272-274: The low-confidence metric in logDecision is still using a
hardcoded 0.7, so it can drift from the selector’s configured threshold. Add the
configured minimum confidence to Decision in Evaluate, then update
choose/logDecision to use that Decision field instead of the literal when
incrementing IntelligentRoutingLowConfidenceTotal. Use the existing choose,
Evaluate, logDecision, and Decision symbols to thread the threshold through
cleanly.
- Around line 51-54: `MinSavingsRatio` is only initialized in `NewSelector` and
not enforced during routing, so the selector can still rewrite models even when
savings are below the configured threshold. Update `Evaluate`, `choose`, or
`RankCandidates` in `selector.go` to check the computed savings ratio against
`minSavings` before allowing a rewrite, and keep the fallback default only if
the setting is meant to remain configurable; otherwise remove the unused field
entirely.

In `@internal/virtualmodels/validation.go`:
- Around line 58-60: The truncation branch in the validation logic is
unreachable because the earlier non-intelligent redirect guard already rejects
multi-target VMs, so remove the dead `!intelligent && len(vm.Targets) > 1` path
from `validateVirtualModel` in `validation.go`. Keep the behavior explicit by
relying on the existing guard/contract around `vm.Targets` instead of silently
trimming targets, and ensure the remaining validation flow in
`validateVirtualModel` clearly reflects the supported strategy rules.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 978ea972-e48d-4a79-8ebf-6d4c9b0cdee7

📥 Commits

Reviewing files that changed from the base of the PR and between f291083 and 9ddbb3d.

📒 Files selected for processing (32)

.env.template
CLAUDE.md
config/config.example.yaml
config/config.go
config/intelligent_routing.go
config/intelligent_routing_test.go
docs/features/intelligent-routing.mdx
internal/app/app.go
internal/app/intelligent_routing.go
internal/gateway/inference_intelligent.go
internal/gateway/inference_orchestrator.go
internal/gateway/inference_prepare.go
internal/intelligentrouter/catalog.go
internal/intelligentrouter/catalog_test.go
internal/intelligentrouter/classifier.go
internal/intelligentrouter/classifier_test.go
internal/intelligentrouter/prompt.go
internal/intelligentrouter/scorer.go
internal/intelligentrouter/selector.go
internal/intelligentrouter/selector_test.go
internal/intelligentrouter/selectors.go
internal/intelligentrouter/selectors_test.go
internal/intelligentrouter/types.go
internal/observability/metrics.go
internal/server/handlers.go
internal/server/http.go
internal/server/model_validation.go
internal/server/model_validation_test.go
internal/server/translated_inference_service.go
internal/virtualmodels/intelligent.go
internal/virtualmodels/snapshot.go
internal/virtualmodels/validation.go

coderabbitai · 2026-06-26T19:53:21Z

+# model for execution. Only triggers for intelligent selectors (auto/smart/
+# auto-cost/auto-quality) or intelligent virtual models, unless mode is observe.
+# Configure analyzers/selectors in config.yaml under intelligent_routing.
+# See docs/dev/intelligent-model.md.


📐 Maintainability & Code Quality | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check whether the referenced dev doc exists and list any intelligent-routing docs. fd -t f 'intelligent' docs rg -n 'docs/dev/intelligent-model\.md' --glob '!**/vendor/**'

Repository: ENTERPILOT/GoModel

Length of output: 194

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show the relevant references and whether the referenced dev doc exists. git ls-files '.env.template' 'config/intelligent_routing.go' 'CLAUDE.md' 'docs/dev/*' 'docs/features/*' printf '\n--- refs ---\n' rg -n 'docs/dev/intelligent-model\.md|intelligent-model\.md|intelligent-routing\.mdx' .env.template config/intelligent_routing.go CLAUDE.md docs -g '!**/vendor/**' || true printf '\n--- docs/dev listing ---\n' fd -t f . docs/dev || true printf '\n--- docs/features listing ---\n' fd -t f . docs/features || true

Repository: ENTERPILOT/GoModel

Length of output: 2366

🏁 Script executed:

#!/bin/bash set -euo pipefail # Inspect the exact lines around the references. sed -n '268,280p' .env.template printf '\n--- config/intelligent_routing.go ---\n' sed -n '1,40p' config/intelligent_routing.go printf '\n--- CLAUDE.md ---\n' sed -n '120,140p' CLAUDE.md

Repository: ENTERPILOT/GoModel

Length of output: 8641

Update the intelligent-routing doc link
.env.template, config/intelligent_routing.go, and CLAUDE.md still reference docs/dev/intelligent-model.md; point them to docs/features/intelligent-routing.mdx instead.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.env.template at line 275, Update the stale intelligent-routing documentation reference so it points to docs/features/intelligent-routing.mdx instead of docs/dev/intelligent-model.md. Make the same link change wherever the old path appears in the related config/docs references, including the .env.template entry and the intelligent routing config/docs references such as config/intelligent_routing.go and CLAUDE.md, so all unique symbols and comments consistently reference the new doc.

coderabbitai · 2026-06-26T19:53:21Z

+# INTELLIGENT_ROUTING_MODE=off # off | observe | enforce
+# INTELLIGENT_ROUTING_DEFAULT_STRATEGY=balanced # cost | balanced | quality | latency


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Avoid inline # … comments on value lines.

Lines 278-279 put the enumerations on the same line as the assignment (INTELLIGENT_ROUTING_MODE=off # off | observe | enforce). The rest of this template keeps descriptions on separate comment lines, and several env_file loaders (e.g. Docker Compose) do not strip trailing inline comments, so an uncommented value would become off # off | observe | enforce. Move the option lists to preceding comment lines for consistency and safety.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.env.template around lines 278 - 279, The INTELLIGENT_ROUTING_MODE and INTELLIGENT_ROUTING_DEFAULT_STRATEGY entries still use inline value-line comments, which can be parsed as part of the value by env loaders. Update the .env.template entries to match the rest of the template by moving the option descriptions to separate preceding comment lines and keeping the assignment lines as plain values; use the existing INTELLIGENT_ROUTING_MODE and INTELLIGENT_ROUTING_DEFAULT_STRATEGY keys to locate the affected lines.

coderabbitai · 2026-06-26T19:53:21Z

 - **HTTP client:** `HTTP_TIMEOUT` (600s), `HTTP_RESPONSE_HEADER_TIMEOUT` (600s)
 - **Resilience:** Configured via `config/config.yaml` - global `resilience.retry.*` and `resilience.circuit_breaker.*` defaults with optional per-provider overrides under `providers.<name>.resilience.retry.*` and `providers.<name>.resilience.circuit_breaker.*`. Retry defaults: `max_retries` (3), `initial_backoff` (1s), `max_backoff` (30s), `backoff_factor` (2.0), `jitter_factor` (0.1). Circuit breaker defaults: `failure_threshold` (5), `success_threshold` (2), `timeout` (30s)
 - **Metrics:** `METRICS_ENABLED` (false), `METRICS_ENDPOINT` (/metrics)
+- **Intelligent routing:** Disabled by default. When enabled, the gateway classifies each request with a cheap analyzer model and selects the best catalog model for execution. Configured via `config/config.yaml` under `intelligent_routing` (key env vars: `INTELLIGENT_ROUTING_ENABLED`, `INTELLIGENT_ROUTING_MODE` (`off`/`observe`/`enforce`), `INTELLIGENT_ROUTING_DEFAULT_STRATEGY` (`cost`/`balanced`/`quality`/`latency`), `INTELLIGENT_ROUTING_MAX_ANALYSIS_TOKENS`, `INTELLIGENT_ROUTING_TIMEOUT`, `INTELLIGENT_ROUTING_MIN_SAVINGS_RATIO`, `INTELLIGENT_ROUTING_FALLBACK_MODEL`). It only triggers for intelligent selectors (`auto`, `smart`, `auto-cost`, `auto-quality`) or intelligent virtual models, unless `mode` is `observe` (dry-run that records the recommendation without changing the executed model). The default example pool ships with `codex/gpt-5.4-mini`, `zai/glm-5-turbo`, and `anthropic/claude-haiku-4-5` as ordered analyzers (tried in order with failover). Analysis cost is attributed to `analysis_user_path` (`/intelligent-router` by default) to keep it separate from the main execution in usage reports. See `docs/dev/intelligent-model.md`.


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Analyzer provider inconsistency with the example config.

This line documents the default analyzer pool as codex/gpt-5.4-mini, but config/config.example.yaml (Line 190-191) registers gpt-5.4-mini under provider: "openai", and docs/features/intelligent-routing.mdx also uses openai. Align these so users aren't pointed at a provider name (codex) that isn't in the shipped examples.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CLAUDE.md` at line 131, The intelligent routing docs describe the default analyzer pool with an inconsistent provider name for the first model entry; update the wording in CLAUDE.md so it matches the shipped examples and feature docs that use the OpenAI provider for gpt-5.4-mini. Use the intelligent routing section and its analyzer pool description as the target, and keep the provider/model naming aligned with config/config.example.yaml and docs/features/intelligent-routing.mdx so the documented default pool is consistent.

coderabbitai · 2026-06-26T19:53:21Z

+func TestValidateIntelligentRoutingConfig_RatioBounds(t *testing.T) {
+	for _, ratio := range []float64{-0.1, 1.5} {
+		cfg := &IntelligentRoutingConfig{
+			Enabled:   true,
+			Mode:      IntelligentRoutingModeObserve,
+			Analyzers: []AnalyzerModelConfig{{Model: "gpt-5.4-mini"}},
+		}
+		cfg.Defaults.MinSavingsRatio = ratio
+		err := ValidateIntelligentRoutingConfig(cfg)
+		require.Error(t, err)
+		require.Contains(t, err.Error(), "min_savings_ratio")
+	}
+}


📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add coverage for the min_savings_ratio: 0 boundary.

RatioBounds only tests out-of-range values. Given the defaults-overwrite behavior flagged in intelligent_routing.go, a test asserting that an explicit 0 survives validation (and is honored) would lock in the documented "always cheapest" contract.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@config/intelligent_routing_test.go` around lines 99 - 111, Add coverage in TestValidateIntelligentRoutingConfig_RatioBounds for the explicit zero boundary on IntelligentRoutingConfig.Defaults.MinSavingsRatio: create a case with ratio 0 alongside the existing out-of-range values, then assert ValidateIntelligentRoutingConfig accepts it and does not overwrite it. Use the existing ValidateIntelligentRoutingConfig helper and IntelligentRoutingConfig/Defaults.MinSavingsRatio symbols so the test locks in the “always cheapest” behavior.

coderabbitai · 2026-06-26T19:53:21Z

+func applyIntelligentRoutingDefaults(d *IntelligentDefaults) {
+	if strings.TrimSpace(d.Strategy) == "" {
+		d.Strategy = IntelligentStrategyBalanced
+	}
+	if d.MaxAnalysisTokens == 0 {
+		d.MaxAnalysisTokens = 256
+	}
+	if d.Timeout == 0 {
+		d.Timeout = 1500 * time.Millisecond
+	}
+	if d.MinSavingsRatio == 0 {
+		d.MinSavingsRatio = 0.15
+	}
+	if d.MinConfidence == 0 {
+		d.MinConfidence = 0.7
+	}
+}


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

min_savings_ratio: 0 (and min_confidence: 0) are silently overwritten with defaults, breaking a documented use case.

applyIntelligentRoutingDefaults cannot distinguish an explicit 0 from an unset zero value, so it forces MinSavingsRatio to 0.15 and MinConfidence to 0.7. The validation gate at Lines 186-190 already accepts 0 (< 0 only fails), and docs/features/intelligent-routing.mdx (Line 129) explicitly documents min_savings_ratio: 0 as "always prefer the cheapest model." With this logic that configuration is unreachable.

Consider gating these defaults so they only apply when the parent config wasn't user-populated, or use sentinel/pointer fields. A pragmatic fix is to apply the float defaults only before user values are merged (i.e. seed via DefaultIntelligentRoutingConfig and drop the == 0 reset here), keeping validation as the bound check.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@config/intelligent_routing.go` around lines 238 - 254, `applyIntelligentRoutingDefaults` is overwriting explicitly configured zero values for `MinSavingsRatio` and `MinConfidence`, making the documented `0` behavior unreachable. Update the defaulting logic in `applyIntelligentRoutingDefaults` so it only seeds these fields when the config is truly unset, not when a user intentionally set `0`; use the existing `DefaultIntelligentRoutingConfig`/merge flow or sentinel/pointer-based presence tracking to distinguish unset from explicit values, while keeping the validation in `IntelligentDefaults` unchanged.

coderabbitai · 2026-06-26T19:53:21Z

+	switch strategy {
+	case StrategyCost:
+		return 10*costScore + 0.1*float64(quality) + capabilityBonus(model, class)
+	case StrategyQuality:
+		base := float64(quality)
+		if class.RequiresReasoning || class.QualitySensitivity == "high" {
+			base *= 1.5
+		}
+		return base + capabilityBonus(model, class)
+	case StrategyLatency:
+		// Prefer cheaper tiers as a latency proxy and small-context models.
+		return 5*costScore + float64(4-quality) + capabilityBonus(model, class)
+	default: // balanced
+		return 4*costScore + float64(quality) + capabilityBonus(model, class)
+	}


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Balanced scoring ignores low-confidence classifications.

scoreCandidate never reads class.Confidence, so the balanced path still maximizes on cost even when the analyzer is unsure. That breaks the downstream contract in internal/intelligentrouter/selector_test.go:95-107, which expects low confidence to avoid the cheapest model. Fold confidence into the balanced weighting before cost dominates, or that selector path will keep recommending mini for uncertain requests.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/scorer.go` around lines 69 - 83, Balanced scoring in scoreCandidate currently ignores class.Confidence, so uncertain classifications still over-prioritize cheap models. Update the default balanced branch in internal/intelligentrouter/scorer.go to incorporate Confidence into the weighting before cost dominates, using the existing scoreCandidate switch and class fields like Confidence, so low-confidence requests are less likely to select the cheapest model and stay aligned with selector expectations.

coderabbitai · 2026-06-26T19:53:21Z

+	minSavings := cfg.MinSavingsRatio
+	if minSavings <= 0 {
+		minSavings = 0.15
+	}


🎯 Functional Correctness | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash rg -nP '\b(minSavings|MinSavingsRatio|MinSavings)\b' --type=go -C3

Repository: ENTERPILOT/GoModel

Length of output: 156

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate the target file and inspect relevant surrounding code. git ls-files | rg '^internal/intelligentrouter/selector\.go$|^internal/intelligentrouter/' printf '\n--- selector.go outline ---\n' ast-grep outline internal/intelligentrouter/selector.go --view expanded || true printf '\n--- selector.go relevant lines ---\n' nl -ba internal/intelligentrouter/selector.go | sed -n '1,260p' printf '\n--- search for savings-related usage in intelligentrouter ---\n' rg -n "MinSavingsRatio|minSavings|savings ratio|savings" internal/intelligentrouter -C 3 || true

Repository: ENTERPILOT/GoModel

Length of output: 2793

🏁 Script executed:

#!/bin/bash set -euo pipefail # Inspect selector.go in smaller slices if it is long. wc -l internal/intelligentrouter/selector.go nl -ba internal/intelligentrouter/selector.go | sed -n '1,140p' printf '\n---\n' nl -ba internal/intelligentrouter/selector.go | sed -n '141,280p'

Repository: ENTERPILOT/GoModel

Length of output: 239

🏁 Script executed:

#!/bin/bash set -euo pipefail printf '%s\n' '--- exact references across repo ---' rg -n "MinSavingsRatio|minSavings" . -g '!**/vendor/**' || true printf '\n%s\n' '--- selector.go with line numbers via cat -n ---' sed -n '1,320p' internal/intelligentrouter/selector.go | cat -n printf '\n%s\n' '--- scorer.go outline ---' ast-grep outline internal/intelligentrouter/scorer.go --view expanded || true printf '\n%s\n' '--- scorer.go relevant lines ---' sed -n '1,260p' internal/intelligentrouter/scorer.go | cat -n

Repository: ENTERPILOT/GoModel

Length of output: 19745

🏁 Script executed:

#!/bin/bash set -euo pipefail printf '%s\n' '--- selector.go (full, numbered) ---' sed -n '1,320p' internal/intelligentrouter/selector.go | cat -n printf '\n%s\n' '--- scorer.go (full, numbered) ---' sed -n '1,260p' internal/intelligentrouter/scorer.go | cat -n

Repository: ENTERPILOT/GoModel

Length of output: 17631

Apply MinSavingsRatio before rewriting. minSavings is only defaulted and stored in NewSelector; Evaluate, choose, and RankCandidates never use it, so enforce mode can still switch models even when projected savings are below the configured threshold. Gate the rewrite on the savings ratio, or remove the setting if it’s intentionally unused.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/selector.go` around lines 51 - 54, `MinSavingsRatio` is only initialized in `NewSelector` and not enforced during routing, so the selector can still rewrite models even when savings are below the configured threshold. Update `Evaluate`, `choose`, or `RankCandidates` in `selector.go` to check the computed savings ratio against `minSavings` before allowing a rewrite, and keep the fallback default only if the setting is meant to remain configurable; otherwise remove the unused field entirely.

coderabbitai · 2026-06-26T19:53:21Z

+func (s *Selector) fallbackSelector(requested core.RequestedModelSelector, meta SelectionMeta) core.ModelSelector {
+	if fb := parseFallback(s.fallback); fb.Model != "" {
+		return fb
+	}
+	return requestedSelector(requested)
+}


📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Unused meta parameter in fallbackSelector.

meta SelectionMeta is never read; both call sites pass it (one as SelectionMeta{}). Drop it to simplify the signature.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/selector.go` around lines 190 - 195, The meta SelectionMeta parameter in Selector.fallbackSelector is unused, so simplify the API by removing it from the method signature and updating all call sites that pass it, including the one using SelectionMeta{}. Keep the behavior the same by still preferring parseFallback(s.fallback) and otherwise returning requestedSelector(requested).

coderabbitai · 2026-06-26T19:53:21Z

+	if d.Confidence > 0 && d.Confidence < 0.7 {
+		observability.IntelligentRoutingLowConfidenceTotal.WithLabelValues(d.Mode, d.Strategy).Inc()
+	}


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Low-confidence metric uses a hardcoded 0.7 instead of the configured minConf.

choose keys its low-confidence behavior off the configurable s.minConf, but logDecision records IntelligentRoutingLowConfidenceTotal using a literal 0.7. When MinConfidence is overridden, the metric will diverge from the actual decision threshold. Since logDecision is a package function without access to the selector, thread the threshold through the Decision (e.g. a MinConfidence field set in Evaluate).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/selector.go` around lines 272 - 274, The low-confidence metric in logDecision is still using a hardcoded 0.7, so it can drift from the selector’s configured threshold. Add the configured minimum confidence to Decision in Evaluate, then update choose/logDecision to use that Decision field instead of the literal when incrementing IntelligentRoutingLowConfidenceTotal. Use the existing choose, Evaluate, logDecision, and Decision symbols to thread the threshold through cleanly.

coderabbitai · 2026-06-26T19:53:21Z

+	if !intelligent && len(vm.Targets) > 1 {
+		vm.Targets = vm.Targets[:1]
 	}


📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Dead truncation branch — unreachable given the line 34 guard.

For non-intelligent strategies, line 34 already rejects any redirect where len(vm.Targets) > 1 (multi-target "not yet supported"). So by the time execution reaches line 58, a non-intelligent VM always has len(vm.Targets) <= 1, making !intelligent && len(vm.Targets) > 1 impossible. This contradicts the apparent intent (silently truncating extras) and may mislead future maintainers about the real contract.

♻️ Remove the unreachable branch

- if !intelligent && len(vm.Targets) > 1 { - vm.Targets = vm.Targets[:1] - }

Based on coding guidelines: "Keep the implementation explicit and maintainable rather than relying on clever abstractions" and follow KISS.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if !intelligent && len(vm.Targets) > 1 {

vm.Targets = vm.Targets[:1]

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/virtualmodels/validation.go` around lines 58 - 60, The truncation branch in the validation logic is unreachable because the earlier non-intelligent redirect guard already rejects multi-target VMs, so remove the dead `!intelligent && len(vm.Targets) > 1` path from `validateVirtualModel` in `validation.go`. Keep the behavior explicit by relying on the existing guard/contract around `vm.Targets` instead of silently trimming targets, and ensure the remaining validation flow in `validateVirtualModel` clearly reflects the supported strategy rules.

Source: Coding guidelines

…bled newIntelligentRouterFromConfig returns a nil *intelligentrouter.Selector when the feature is inactive (the default). Assigning that typed nil directly to the gateway.IntelligentRouter interface field produced a non-nil interface wrapping a nil pointer, so the orchestrator's `== nil` guard was skipped and ShouldEvaluate panicked on every chat request, surfacing as HTTP 500. Only assign the router to the interface when the concrete pointer is non-nil, keeping the field at a true nil interface otherwise. Add a reflection-based typed-nil check in evaluateIntelligentRouter as defense in depth against future regressions. Reproduced and verified against the full integration suite.

SantiagoDePolonia

General thoughts

I definitely want to add this kind of feature. Also, the most probable scenario is that this kind of feature will be in paid version of GoModel (because I need to pay my bills).

The biggest problem I have with this kind of huge external PRs is that they are not composed into the vision - which I still mainly have in my head so far. Also, reviewing 2,500 lines of code line-by-line is impossible with my limited capacity.

The solutions I see are:

working full-time with someone, so we can be aligned on the same vision
agreeing on the specification before implementing huge changes like this
or just better, more adjusted to the vision PRs - but it might be hard

Considering that I don't have money to hire anyone full-time right now - or even to pay myself - only the 2nd and 3rd option is valid.

The problem with the second option is that sharing those specifications is very often... more time-consuming than developing and testing the changes.

3rd option is still valid

Detailed feedback

I see those changes as kind of cumbersome. There are a lot of env variables / config.yaml properties, and they are not easy to understand what those change.

Also, those changes are not in line with the recently improved virtual models, which actually allow creating aliases like auto, smart, etc. There shouldn't be another mechanism for this. In this case:

selectors:
  - name: "auto"
    strategy: "balanced"
  - name: "smart"
    strategy: "balanced"
  - name: "auto-cost"
    strategy: "cost"
  - name: "auto-quality"
    strategy: "quality"

shouldn't be there, because we already have virtual models for that.

Also, as far as I understand, this PR also:

misses the UI configuration for this feature
doesn't address the prompt caching persistence issue, especially for the /responses endpoint

Also, I can't guarantee that I'll merge it after the adjustments described here. If you want to give it a try - go ahead.

SantiagoDePolonia · 2026-06-27T08:52:58Z

+	Enabled bool `yaml:"enabled" env:"INTELLIGENT_ROUTING_ENABLED"`
+
+	// Mode selects how decisions are applied:
+	//   - off:     no analysis; intelligent selectors behave as unknown models


When we have Enabled flag, do we need to have 'off' mode here? I think observe and enforce might be enough. With default to observe.

SantiagoDePolonia · 2026-06-27T08:56:01Z

+	Analyzers []AnalyzerModelConfig `yaml:"analyzers"`
+
+	// Defaults holds resolved defaults shared by every selector.
+	Defaults IntelligentDefaults `yaml:"defaults"`


I'm not sure why we need this one - another type/structure to hold defaults?

SantiagoDePolonia · 2026-06-27T09:18:22Z

+# See docs/dev/intelligent-model.md.
+# ----------------------------------------------------------------------------
+# INTELLIGENT_ROUTING_ENABLED=false
+# INTELLIGENT_ROUTING_MODE=off # off | observe | enforce


Again, IIUC:
INTELLIGENT_ROUTING_ENABLED=false AND INTELLIGENT_ROUTING_MODE=off ===

INTELLIGENT_ROUTING_ENABLED=true AND INTELLIGENT_ROUTING_MODE=off

What's the reason for mode=off ?

…nd gradual context penalty Refine the candidate ranking so the cost and context dimensions contribute signal only when they actually discriminate between models, instead of flooding the weighted sum with neutral noise. - Cost abstention: when every candidate shares the same estimated cost (the common case for gateways whose model registry has no pricing), the cost dimension contributes nothing, letting quality and capability decide. The cost score is now computed once across all candidates in RankCandidates and passed into scoreCandidate. - Free-model tier: models tagged "free", "local", or "self-hosted" (local Ollama/vLLM) now win the cost dimension at 1.0, while paid models are capped at 0.5 when a free model is present, preserving a proportional, visible gap. - Gradual context penalty: requests that approach a model's context window receive a proportional penalty instead of a binary exclude. Requests above 80% of the window decay from 1.0 toward 0.10; requests that exceed it are still hard-excluded. The request size is now estimated from message text and passed through the Candidate.ContextScore field. All existing tests pass. New table-driven tests cover cost abstention, the free-model advantage, the risk-zone penalty, the hard exclude, and unknown windows.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/intelligentrouter/catalog_test.go`:
- Around line 148-172: Update the RankCandidates cost-dimension coverage in
TestRankCandidates_FreeModelWinsOnCostDimension so it also exercises a
free/local model with no Pricing alongside other candidates that likewise omit
Pricing. Use BuildCandidates and RankCandidates to verify the free-tag signal
still wins instead of cost abstention hiding the behavior, and keep the existing
free/paid case as a separate check if needed.

In `@internal/intelligentrouter/scorer.go`:
- Around line 191-205: The cost-scoring logic in the scorer path is returning
too early when minCost/maxCost are flat, which prevents explicit free-tier
signals from being applied. In the cost evaluation around the minCost/maxCost
spread check, make sure free/local/self-hosted tags are converted into a
non-zero cost signal before the abstention branch can return zeroed scores.
Adjust the logic so the early return only happens after tag-based cost inference
has been considered, keeping the weighted sum able to prefer tagged free models
even when pricing data is missing.

In `@internal/intelligentrouter/selector.go`:
- Around line 231-238: The estimateRequestChars helper is counting UTF-8 bytes
instead of characters, which can overestimate non-ASCII prompts and affect
routing. Update the loop in estimateRequestChars to count runes for each
message’s extracted text content rather than using len on the string, and keep
the nil req handling unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7c01fa73-f17a-457b-87e2-19e9f5fb105a

📥 Commits

Reviewing files that changed from the base of the PR and between 140e94c and c601282.

📒 Files selected for processing (4)

internal/intelligentrouter/catalog.go
internal/intelligentrouter/catalog_test.go
internal/intelligentrouter/scorer.go
internal/intelligentrouter/selector.go

coderabbitai · 2026-06-27T23:56:03Z

+// TestRankCandidates_FreeModelWinsOnCostDimension verifies Item 2: a model
+// tagged "free" beats paid models on the cost dimension, and paid models are
+// capped proportionally so the gap stays visible.
+func TestRankCandidates_FreeModelWinsOnCostDimension(t *testing.T) {
+	paid := 1.0
+	free := 0.0
+	cat := fakeCatalog{
+		models: []core.Model{
+			{ID: "free-local", Metadata: &core.ModelMetadata{
+				Modes: []string{"chat"}, Tags: []string{"free", "local"},
+				Pricing: &core.ModelPricing{InputPerMtok: &free, OutputPerMtok: &free},
+			}},
+			{ID: "cheap-paid", Metadata: &core.ModelMetadata{
+				Modes: []string{"chat"}, Tags: []string{"mini"},
+				Pricing: &core.ModelPricing{InputPerMtok: &paid, OutputPerMtok: &paid},
+			}},
+		},
+		provider: map[string]string{"free-local": "ollama", "cheap-paid": "openai"},
+	}
+
+	cands := BuildCandidates(cat, CandidateFilter{}, nil, Classification{}, 0)
+	ranked := RankCandidates(cands, nil, StrategyCost, Classification{})
+	require.NotEmpty(t, ranked)
+	require.Equal(t, "free-local", ranked[0].Candidate.Selector.Model)
+}


🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Add coverage for free-tagged models without pricing.

This test only proves the free path when paid pricing is present. Add a case where "free"/"local" has no Pricing and other candidates also lack pricing, so cost abstention cannot silently bypass the free-tier signal. As per coding guidelines, **/*_test.go: “Add or update tests for behavior changes.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/catalog_test.go` around lines 148 - 172, Update the RankCandidates cost-dimension coverage in TestRankCandidates_FreeModelWinsOnCostDimension so it also exercises a free/local model with no Pricing alongside other candidates that likewise omit Pricing. Use BuildCandidates and RankCandidates to verify the free-tag signal still wins instead of cost abstention hiding the behavior, and keep the existing free/paid case as a separate check if needed.

Source: Coding guidelines

coderabbitai · 2026-06-27T23:56:04Z

+func estimateRequestChars(req *core.ChatRequest) int {
+	if req == nil {
+		return 0
+	}
+	total := 0
+	for _, msg := range req.Messages {
+		total += len(core.ExtractTextContent(msg.Content))
+	}


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify the estimator no longer uses byte length for extracted chat text. rg -n 'estimateRequestChars|len\(core\.ExtractTextContent|RuneCountInString' internal/intelligentrouter internal/core

Repository: ENTERPILOT/GoModel

Length of output: 827

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '210,250p' internal/intelligentrouter/selector.go printf '\n---\n' sed -n '130,160p' internal/intelligentrouter/selector.go

Repository: ENTERPILOT/GoModel

Length of output: 2560

Count runes instead of bytes in estimateRequestChars. UTF-8 text is currently overcounted, which can skew routing toward larger models for non-ASCII prompts.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/selector.go` around lines 231 - 238, The estimateRequestChars helper is counting UTF-8 bytes instead of characters, which can overestimate non-ASCII prompts and affect routing. Update the loop in estimateRequestChars to count runes for each message’s extracted text content rather than using len on the string, and keep the nil req handling unchanged.

Allow operators to attach routing_guidance to model metadata so the analyzer prompt can see explicit hints about when a model should be preferred. The classifier now builds its prompt from a first-pass candidate pool so guided models are visible before classification runs, while the final candidate list is still rebuilt afterward to apply capability-derived filters. Improve analyzer robustness by giving the same analyzer one repair attempt when its first response is not valid JSON. The repair call reuses the system prompt but asks only for the compact JSON object, without re-sending the user request. If repair succeeds, the router keeps the same analyzer; otherwise failover continues to the next analyzer as before. Also document routing_guidance in the intelligent routing feature guide.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/intelligentrouter/classifier_test.go (1)
23-49: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Assert that the repair call replays the original analyzer prompt.

These tests only prove a second call happened. Because the fake executor ignores the request body, they still pass when the repair path drops the original prompt and returns arbitrary JSON. Capture the second ChatRequest and assert it still contains the request summary. Based on learnings: Add or update tests for behavior changes, with coverage for error handling.

Also applies to: 156-193
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/intelligentrouter/classifier_test.go` around lines 23 - 49, The
repair-path test is only verifying that a second call occurs, but not that the
original analyzer prompt is preserved. Update the fake executor in
fakeExecutor.ChatCompletion (or the specific test harness using it) to capture
the full second ChatRequest, then add an assertion that the retry/repair request
still includes the original request summary or analyzer prompt content. Use the
existing ChatCompletion, ChatRequest, and repair-call test setup to ensure the
second request body is validated, not just the model call count.
Source: Learnings

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/intelligentrouter/classifier.go`:
- Around line 141-160: The repair path in Classifier.repairClassification is
missing the original request context, so it can return valid JSON for the wrong
intent. Update the repair call in Classifier.repairClassification to include the
original request prompt/summary from the classification attempt alongside the
JSON-only instruction, using the existing callCtx/analyzer flow so the model
repairs formatting without changing the underlying classification target.

---

Outside diff comments:
In `@internal/intelligentrouter/classifier_test.go`:
- Around line 23-49: The repair-path test is only verifying that a second call
occurs, but not that the original analyzer prompt is preserved. Update the fake
executor in fakeExecutor.ChatCompletion (or the specific test harness using it)
to capture the full second ChatRequest, then add an assertion that the
retry/repair request still includes the original request summary or analyzer
prompt content. Use the existing ChatCompletion, ChatRequest, and repair-call
test setup to ensure the second request body is validated, not just the model
call count.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 68850f2e-9b68-43be-b119-ce8dbb544459

📥 Commits

Reviewing files that changed from the base of the PR and between c601282 and 995a4c8.

📒 Files selected for processing (6)

docs/features/intelligent-routing.mdx
internal/core/types.go
internal/intelligentrouter/classifier.go
internal/intelligentrouter/classifier_test.go
internal/intelligentrouter/prompt.go
internal/intelligentrouter/selector.go

coderabbitai · 2026-06-28T00:36:42Z

+	repaired, repairErr := c.repairClassification(callCtx, analyzer, temperature, maxTokens)
+	if repairErr == nil {
+		slog.Info("intelligent router analyzer repair succeeded",
+			"analyzer", analyzer.QualifiedModel(),
+		)
+		return repaired, analyzer, nil
+	}
+	return Classification{}, analyzer, fmt.Errorf("parse analyzer response: %w (repair failed: %v)", err, repairErr)
+}
+
+func (c *Classifier) repairClassification(ctx context.Context, analyzer core.ModelSelector, temperature float64, maxTokens int) (Classification, error) {
+	resp, err := c.executor.ChatCompletion(ctx, &core.ChatRequest{
+		Model:       analyzer.Model,
+		Provider:    analyzer.Provider,
+		Temperature: &temperature,
+		MaxTokens:   &maxTokens,
+		Messages: []core.Message{
+			{Role: "system", Content: analyzerSystemPrompt},
+			{Role: "user", Content: "Your previous response was invalid JSON. Return ONLY the compact JSON object described in the system prompt."},
+		},


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Pass the original request prompt into the repair call.

Line 159 asks the analyzer to resend valid JSON, but repairClassification no longer includes the original request summary. A syntactically valid repair can therefore succeed with an unrelated classification and misroute the request instead of just recovering from formatting noise.

Suggested fix

- repaired, repairErr := c.repairClassification(callCtx, analyzer, temperature, maxTokens) + repaired, repairErr := c.repairClassification(callCtx, analyzer, prompt, temperature, maxTokens) ... -func (c *Classifier) repairClassification(ctx context.Context, analyzer core.ModelSelector, temperature float64, maxTokens int) (Classification, error) { +func (c *Classifier) repairClassification(ctx context.Context, analyzer core.ModelSelector, prompt string, temperature float64, maxTokens int) (Classification, error) { resp, err := c.executor.ChatCompletion(ctx, &core.ChatRequest{ Model: analyzer.Model, Provider: analyzer.Provider, Temperature: &temperature, MaxTokens: &maxTokens, Messages: []core.Message{ {Role: "system", Content: analyzerSystemPrompt}, - {Role: "user", Content: "Your previous response was invalid JSON. Return ONLY the compact JSON object described in the system prompt."}, + {Role: "user", Content: prompt + "\n\nYour previous response was invalid JSON. Return ONLY the compact JSON object described in the system prompt."}, }, })

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

repaired, repairErr := c.repairClassification(callCtx, analyzer, temperature, maxTokens)

if repairErr == nil {

slog.Info("intelligent router analyzer repair succeeded",

"analyzer", analyzer.QualifiedModel(),

)

return repaired, analyzer, nil

}

return Classification{}, analyzer, fmt.Errorf("parse analyzer response: %w (repair failed: %v)", err, repairErr)

}

func (c *Classifier) repairClassification(ctx context.Context, analyzer core.ModelSelector, temperature float64, maxTokens int) (Classification, error) {

resp, err := c.executor.ChatCompletion(ctx, &core.ChatRequest{

Model: analyzer.Model,

Provider: analyzer.Provider,

Temperature: &temperature,

MaxTokens: &maxTokens,

Messages: []core.Message{

{Role: "system", Content: analyzerSystemPrompt},

{Role: "user", Content: "Your previous response was invalid JSON. Return ONLY the compact JSON object described in the system prompt."},

},

repaired, repairErr := c.repairClassification(callCtx, analyzer, prompt, temperature, maxTokens)

if repairErr == nil {

slog.Info("intelligent router analyzer repair succeeded",

"analyzer", analyzer.QualifiedModel(),

)

return repaired, analyzer, nil

}

return Classification{}, analyzer, fmt.Errorf("parse analyzer response: %w (repair failed: %v)", err, repairErr)

}

func (c *Classifier) repairClassification(ctx context.Context, analyzer core.ModelSelector, prompt string, temperature float64, maxTokens int) (Classification, error) {

resp, err := c.executor.ChatCompletion(ctx, &core.ChatRequest{

Model: analyzer.Model,

Provider: analyzer.Provider,

Temperature: &temperature,

MaxTokens: &maxTokens,

Messages: []core.Message{

{Role: "system", Content: analyzerSystemPrompt},

{Role: "user", Content: prompt + "\n\nYour previous response was invalid JSON. Return ONLY the compact JSON object described in the system prompt."},

},

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/classifier.go` around lines 141 - 160, The repair path in Classifier.repairClassification is missing the original request context, so it can return valid JSON for the wrong intent. Update the repair call in Classifier.repairClassification to include the original request prompt/summary from the classification attempt alongside the JSON-only instruction, using the existing callCtx/analyzer flow so the model repairs formatting without changing the underlying classification target.

Add optional conversation-aware memory to the intelligent router so repeated requests in the same conversation can keep model choice more consistent across multi-turn sessions. Clients can now send X-GoModel-Conversation-ID on translated chat/responses routes. The transport metadata is propagated into the intelligent routing selection meta, where the selector reads a short in-memory history keyed by (user_path, conversation_id) and includes it in the analyzer prompt as "Previous routing decisions". When intelligent routing applies a concrete model in enforce mode, the router records the applied qualified model back into the same in-memory store. Entries expire automatically after one hour and each conversation is capped to the most recent 50 routing decisions. Also document the optional header in the intelligent routing feature guide and add unit coverage for header extraction, history prompt rendering, memory expiry/limits, and applied-model recording.

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/intelligentrouter/selector.go (1)

136-147: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Record the applied fallback before this early return.

When classification fails in enforce mode, decision.applyForMode(...) can still produce an applied fallback, but this branch returns before addRoutingDecision(...) runs. That drops the applied model from conversation memory and breaks the new continuity contract on analyzer outages.

Proposed fix

 	if err != nil {
 		decision.AnalysisFailed = true
 		decision.Analyzers = s.classifier.Analyzers()
 		decision.Duration = time.Since(start)
 		decision.SelectedModel = s.fallbackSelector(requested, meta)
 		decision.Reason = "analysis failed: " + err.Error()
 		// In enforce, still apply the fallback so analysis failure never blocks the
 		// user's request; in observe, execute the requested model unchanged.
 		decision.applyForMode(s.mode, requested)
+		if decision.Applied && decision.AppliedModel.Model != "" && strings.TrimSpace(meta.ConversationID) != "" {
+			addRoutingDecision(meta.UserPath, meta.ConversationID, decision.AppliedModel.QualifiedModel())
+		}
 		logDecision(decision)
 		return decision
 	}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/intelligentrouter/selector.go` around lines 136 - 147, Record the
applied fallback before returning from the classification failure branch in
selector.go. In the ClassifyWithCandidates error path inside the routing logic,
keep the existing decision.applyForMode(s.mode, requested) call, but also
persist the resulting applied model by invoking addRoutingDecision with the same
request/decision state before logDecision and return. This ensures the fallback
chosen by s.fallbackSelector and applyForMode is captured in conversation memory
even when analysis fails.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/features/intelligent-routing.mdx`:
- Around line 278-282: Clarify in the intelligent-routing docs that GoModel’s
in-memory history is only written when routing actually applies a model, not for
every request with X-GoModel-Conversation-ID. Update the description around the
short-lived per-(user_path, conversation_id) history and “Previous routing
decisions” to note that observe-only deployments remain stateless and do not
accumulate history unless a model is applied.

In `@internal/gateway/inference_prepare.go`:
- Around line 97-99: The routing update in evaluateIntelligentRouter is being
overwritten because ensureTranslatedRequestWorkflow still receives the original
meta.Workflow, so the stale resolved selector can replace the new model/provider
choice. Update inference_prepare.go so the workflow passed into
ensureTranslatedRequestWorkflow is dropped or rebuilt when
requested.Model/requested.ProviderHint change after evaluateIntelligentRouter,
and make sure ApplyResolvedSelector uses the routed selector instead of the
preexisting one. Add a regression test covering the existing workflow plus
enforced reroute path to verify the routed choice is preserved.

In `@internal/intelligentrouter/conversation_memory.go`:
- Around line 27-33: The routing memory key in routingMemoryKey is built by
concatenating userPath and conversationID with a delimiter, which can collide
when either value contains "::". Replace this with a structured key approach,
such as a small composite struct or nested map keyed by the two fields, and
update any callers or map usages in conversation_memory.go accordingly so
distinct tuples cannot alias each other.

In `@internal/intelligentrouter/selector_test.go`:
- Around line 109-140: Add a regression test for the Evaluate analysis-failure
fallback path in selector_test.go. The current
TestSelector_RecordsAppliedModelInConversationMemory and
TestSelector_DoesNotRecordHistoryInObserveMode only cover the happy enforce flow
and observe mode, so add a test that forces classification/analysis to fail in
ModeEnforce and verifies the fallback still records the applied model in routing
memory. Use the existing Evaluate, getRoutingHistory, defaultRoutingMemory, and
fakeExecutor symbols to locate and exercise the behavior.

---

Outside diff comments:
In `@internal/intelligentrouter/selector.go`:
- Around line 136-147: Record the applied fallback before returning from the
classification failure branch in selector.go. In the ClassifyWithCandidates
error path inside the routing logic, keep the existing
decision.applyForMode(s.mode, requested) call, but also persist the resulting
applied model by invoking addRoutingDecision with the same request/decision
state before logDecision and return. This ensures the fallback chosen by
s.fallbackSelector and applyForMode is captured in conversation memory even when
analysis fails.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fff161ad-82ab-44c9-9c87-e75dc0198284

📥 Commits

Reviewing files that changed from the base of the PR and between 995a4c8 and 6901217.

📒 Files selected for processing (13)

docs/features/intelligent-routing.mdx
internal/gateway/inference_intelligent.go
internal/gateway/inference_orchestrator.go
internal/gateway/inference_prepare.go
internal/intelligentrouter/classifier_test.go
internal/intelligentrouter/conversation_memory.go
internal/intelligentrouter/conversation_memory_test.go
internal/intelligentrouter/selector.go
internal/intelligentrouter/selector_test.go
internal/intelligentrouter/types.go
internal/server/request_support.go
internal/server/request_support_test.go
internal/server/translated_inference_service.go

coderabbitai · 2026-06-28T01:00:31Z

+	requested = o.evaluateIntelligentRouter(ctx, req, requested, meta)
+	*model = requested.Model
+	*provider = requested.ProviderHint


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Drop the stale workflow when routing rewrites the selector.

Lines 97-99 update model and provider, but Line 100 still passes the original meta.Workflow into ensureTranslatedRequestWorkflow. If that workflow already has a resolution, the old selector is reused and ApplyResolvedSelector overwrites the routed choice, so intelligent routing never takes effect on that path.

Proposed fix

requested := core.NewRequestedModelSelector(*model, *provider) requested = o.evaluateIntelligentRouter(ctx, req, requested, meta) *model = requested.Model *provider = requested.ProviderHint + if meta.Workflow != nil && meta.Workflow.Resolution != nil { + current := meta.Workflow.Resolution.ResolvedSelector + if current.Model != requested.Model || current.Provider != requested.ProviderHint { + meta.Workflow = nil + } + } workflow, err := o.ensureTranslatedRequestWorkflow(ctx, meta.Workflow, meta.RequestID, meta.Endpoint, model, provider)

Please add a regression test for the "existing workflow + enforced reroute" case as well.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

requested = o.evaluateIntelligentRouter(ctx, req, requested, meta)

*model = requested.Model

*provider = requested.ProviderHint

requested = o.evaluateIntelligentRouter(ctx, req, requested, meta)

*model = requested.Model

*provider = requested.ProviderHint

if meta.Workflow != nil && meta.Workflow.Resolution != nil {

current := meta.Workflow.Resolution.ResolvedSelector

if current.Model != requested.Model || current.Provider != requested.ProviderHint {

meta.Workflow = nil

}

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/gateway/inference_prepare.go` around lines 97 - 99, The routing update in evaluateIntelligentRouter is being overwritten because ensureTranslatedRequestWorkflow still receives the original meta.Workflow, so the stale resolved selector can replace the new model/provider choice. Update inference_prepare.go so the workflow passed into ensureTranslatedRequestWorkflow is dropped or rebuilt when requested.Model/requested.ProviderHint change after evaluateIntelligentRouter, and make sure ApplyResolvedSelector uses the routed selector instead of the preexisting one. Add a regression test covering the existing workflow plus enforced reroute path to verify the routed choice is preserved.

coderabbitai · 2026-06-28T01:00:31Z

+func routingMemoryKey(userPath, conversationID string) string {
+	userPath = strings.TrimSpace(userPath)
+	conversationID = strings.TrimSpace(conversationID)
+	if conversationID == "" {
+		return ""
+	}
+	return userPath + "::" + conversationID


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Use a structured key instead of delimiter concatenation.

userPath + "::" + conversationID aliases distinct tuples whenever either component contains ::. That can merge unrelated routing histories and feed the wrong “previous decisions” back into the analyzer. A struct key (or nested map) avoids that state-corruption edge case entirely.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/conversation_memory.go` around lines 27 - 33, The routing memory key in routingMemoryKey is built by concatenating userPath and conversationID with a delimiter, which can collide when either value contains "::". Replace this with a structured key approach, such as a small composite struct or nested map keyed by the two fields, and update any callers or map usages in conversation_memory.go accordingly so distinct tuples cannot alias each other.

coderabbitai · 2026-06-28T01:00:31Z

+func TestSelector_RecordsAppliedModelInConversationMemory(t *testing.T) {
+	defaultRoutingMemory = &routingMemoryStore{data: make(map[string][]routingMemoryEntry)}
+	t.Cleanup(func() {
+		defaultRoutingMemory = &routingMemoryStore{data: make(map[string][]routingMemoryEntry)}
+	})
+
+	exec := &fakeExecutor{responses: map[string]string{"a-mini": validClassification}}
+	s := newTestSelector(t, ModeEnforce, exec)
+	req := &core.ChatRequest{Messages: []core.Message{{Role: "user", Content: "hi"}}}
+	requested := core.NewRequestedModelSelector("auto", "")
+	d := s.Evaluate(context.Background(), req, requested, SelectionMeta{Mode: ModeEnforce, UserPath: "/team", ConversationID: "conv-1"})
+
+	require.True(t, d.Applied)
+	history := getRoutingHistory("/team", "conv-1", 5)
+	require.Equal(t, []string{d.AppliedModel.QualifiedModel()}, history)
+}
+
+func TestSelector_DoesNotRecordHistoryInObserveMode(t *testing.T) {
+	defaultRoutingMemory = &routingMemoryStore{data: make(map[string][]routingMemoryEntry)}
+	t.Cleanup(func() {
+		defaultRoutingMemory = &routingMemoryStore{data: make(map[string][]routingMemoryEntry)}
+	})
+
+	exec := &fakeExecutor{responses: map[string]string{"a-mini": validClassification}}
+	s := newTestSelector(t, ModeObserve, exec)
+	req := &core.ChatRequest{Messages: []core.Message{{Role: "user", Content: "hi"}}}
+	requested := core.NewRequestedModelSelector("auto", "")
+	d := s.Evaluate(context.Background(), req, requested, SelectionMeta{Mode: ModeObserve, UserPath: "/team", ConversationID: "conv-1"})
+
+	require.False(t, d.Applied)
+	require.Nil(t, getRoutingHistory("/team", "conv-1", 5))
+}


📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add a regression test for enforce-mode fallback recording.

These cases only cover successful enforce and observe. The analysis-failure branch in Evaluate can still apply a fallback and return before writing to conversation memory, so this behavior is currently unprotected by tests.

As per coding guidelines, **/*_test.go: Add or update tests for behavior changes, and ensure tests cover request translation, response normalization, error handling, default configuration, and provider-specific parameter mapping.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/intelligentrouter/selector_test.go` around lines 109 - 140, Add a regression test for the Evaluate analysis-failure fallback path in selector_test.go. The current TestSelector_RecordsAppliedModelInConversationMemory and TestSelector_DoesNotRecordHistoryInObserveMode only cover the happy enforce flow and observe mode, so add a test that forces classification/analysis to fail in ModeEnforce and verifies the fallback still records the applied model in routing memory. Use the existing Evaluate, getRoutingHistory, defaultRoutingMemory, and fakeExecutor symbols to locate and exercise the behavior.

Source: Coding guidelines

Track the outcome of every provider call in a lightweight in-memory health tracker (exponential decay, configurable window/half-life, Bayesian smoothing). Candidates whose recent weighted error rate crosses the circuit-breaker threshold are hard-excluded from the intelligent-router candidate pool; others receive a proportional penalty so a degraded-but-alive model still competes at a lower score. Health outcomes are recorded at two points in the gateway: - executeWithUsage: credits the model that actually completed the request, and debits the originally-requested model when a fallback took over. - tryFallbackResponse: debits both the primary and each fallback attempt that fails before a winner is found. The IntelligentRouter interface gains RecordExecution(qualifiedModel, success) so the gateway can report outcomes without importing the intelligentrouter package directly. Config: new optional intelligent_routing.defaults.health block with defaults window=20m, half_life=5m, pseudo_counts=2.0, circuit_breaker=0.9. All values are configurable via env vars (INTELLIGENT_ROUTING_HEALTH_*).

Expose active intelligent selectors through the models endpoint and use the configured selector set during workflow resolution so custom selectors do not fail early as unsupported models.

SantiagoDePolonia · 2026-06-28T13:59:02Z

+# virtual models, unless mode is observe. See docs/dev/intelligent-model.md.
+intelligent_routing:
+  enabled: false
+  mode: "off" # off | observe | enforce; observe classifies but keeps the requested model


There is still 'off' mode here. What is it for?

vfeitoza · 2026-06-28T16:52:39Z

I am still carefully reviewing the entire codebase and adding new ideas to the PR.
I will examine your comments more closely and align the implementation with the existing standards in Gomodel.
I’ve already modified the code to create a virtual model, though I haven't yet considered the dashboard experience.
I’ll take a closer look, and once everything is fully polished, I’ll let you know so you can test it.

greptile-apps Bot reviewed Jun 26, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 26, 2026

View reviewed changes

vfeitoza mentioned this pull request Jun 26, 2026

Feature: Intelligent Routing #432

Closed

SantiagoDePolonia requested changes Jun 27, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 27, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 28, 2026

View reviewed changes

vfeitoza added 2 commits June 27, 2026 22:29

fix(router): support configured intelligent selectors

11b1f42

Expose active intelligent selectors through the models endpoint and use the configured selector set during workflow resolution so custom selectors do not fail early as unsupported models.

SantiagoDePolonia reviewed Jun 28, 2026

View reviewed changes

		# INTELLIGENT_ROUTING_MODE=off # off \| observe \| enforce
		# INTELLIGENT_ROUTING_DEFAULT_STRATEGY=balanced # cost \| balanced \| quality \| latency

	if !intelligent && len(vm.Targets) > 1 {
	vm.Targets = vm.Targets[:1]
	}

Uh oh!

Uh oh!

Conversation

vfeitoza commented Jun 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

How it works

Key components

Configuration

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov-commenter commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

greptile-apps Bot commented Jun 26, 2026

Confidence Score: 3/5

T-Rex Logs

Uh oh!

greptile-apps Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

SantiagoDePolonia left a comment

Choose a reason for hiding this comment

Uh oh!

SantiagoDePolonia Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

SantiagoDePolonia Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

SantiagoDePolonia Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

vfeitoza commented Jun 26, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading

codecov-commenter commented Jun 26, 2026 •

edited

Loading