The default stack is backwards

Most AI systems put the LLM in the runtime decision path. Agent needs to decide whether to grant access? Ask the LLM. Need to route a request to the right data source? Let the model figure it out. This is fast to prototype and impossible to audit. The LLM’s reasoning is opaque, non-deterministic, and changes with every model update. You can prompt-engineer your way to 95% accuracy, but the remaining 5% is a different 5% every time you redeploy.

For governance decisions — who can access what, which actions are authorized, how data flows between classification boundaries — that’s disqualifying. Compliance frameworks don’t have a carve-out for “the model usually gets it right.” They require deterministic, traceable decision chains. Putting an LLM in that path means every decision is a black box that produces different outputs under identical inputs.

The inversion

Nautilus v2 flips the stack: LLMs work as knowledge engineers during maintenance windows, and CLIPS provides deterministic inference at runtime. The Curator persona watches request patterns across the system, identifies emerging data relationships, and proposes new routing rules. Those proposals go through a validation pipeline — tested against historical requests, reviewed for consistency with existing policy, scored for certainty — before they enter the production rule base. The LLM never touches a live request.
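The split described above can be sketched in a few lines. This is an illustrative toy, not Nautilus v2's actual API — the names `RuleBase`, `curator_maintenance_window`, and the rule tuples are assumptions made up for the example:

```python
# Toy sketch of the inversion: the runtime path is pure rule matching,
# and LLM-authored rules enter only during an offline maintenance phase.
# All names here are illustrative, not the real Nautilus v2 interfaces.

class RuleBase:
    """Deterministic runtime store: contents change only via deploys."""
    def __init__(self):
        self.rules = {}  # name -> (match predicate, routing action)

    def evaluate(self, request):
        # Runtime path: iterate rules in a fixed order, no model call anywhere.
        for name, (matches, action) in sorted(self.rules.items()):
            if matches(request):
                return name, action(request)
        return "default-deny", None

def curator_maintenance_window(rule_base, proposals):
    """Offline phase: validated, Curator-authored proposals enter the base."""
    for name, rule, validated in proposals:
        if validated:  # the validation pipeline ran before this point
            rule_base.rules[name] = rule

rb = RuleBase()
curator_maintenance_window(rb, [
    ("route-geo",
     (lambda r: r.get("task") == "threat-assessment",
      lambda r: ["threat-intel", "geolocation"]),
     True),
])
fired, routed = rb.evaluate({"task": "threat-assessment"})
```

The point of the shape is the boundary: `evaluate` never sees the model, and the model never sees a live request — it only populates `rb.rules` through the maintenance window.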

This means the intelligence is in the authoring, not the execution. The model’s strengths — pattern recognition across large corpora, natural language understanding of policy documents, identification of implicit relationships — are applied where non-determinism is acceptable. The execution layer, where decisions must be auditable and repeatable, runs on a rule engine that has been doing exactly this since 1985.

Meta-rules and the validation pipeline

A meta-rule is a rule that generates other rules, and the Curator's proposals are exactly that. When the Curator detects a pattern — say, analysts consistently need both threat intel and geolocation data for threat assessments — it proposes a new routing rule that pre-fetches both sources. The proposed rule isn't YAML or JSON configuration. It's a CLIPS rule with explicit fact patterns, salience, and module scope, ready to slot into the existing rule base.
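A proposal might look something like the following. Both the schema (`RuleProposal` and its fields) and the CLIPS rule text are hypothetical — invented for illustration, not Nautilus v2's real format — but they show the shape: a complete CLIPS rule with a pattern, salience, and module scope, plus metadata the pipeline fills in later:

```python
# Hypothetical shape of a Curator proposal. Field names, the module
# name ROUTING, and the rule body are all illustrative assumptions.
from dataclasses import dataclass

@dataclass
class RuleProposal:
    name: str
    module: str
    certainty: float   # 0.0 until the validation pipeline scores it
    clips_source: str  # a complete CLIPS rule, not YAML/JSON config

proposal = RuleProposal(
    name="prefetch-geo-with-threat-intel",
    module="ROUTING",
    certainty=0.0,
    clips_source="""
(defrule ROUTING::prefetch-geo-with-threat-intel
  (declare (salience 10))
  (request (task threat-assessment) (analyst ?a))
  =>
  (assert (route (source threat-intel) (for ?a)))
  (assert (route (source geolocation) (for ?a))))
""",
)
```

Because the payload is a real `defrule` rather than configuration, the validation pipeline can load it into a scratch CLIPS environment and exercise it exactly as production would.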

The validation pipeline tests the proposed rule against six months of audit logs, checks for conflicts with existing policy rules, assigns a certainty factor based on the strength of the observed pattern, and stages it for human review. Bad proposals die in the pipeline — a rule that would grant broader access than current policy allows gets flagged and discarded automatically. Good proposals improve routing efficiency over time without anyone hand-writing CLIPS syntax or editing configuration files.
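The four stages above can be condensed into a sketch. The function names, the 0.8 staging threshold, and the log format are assumptions for illustration; the real pipeline's thresholds and replay mechanics are not specified here:

```python
# Sketch of the validation stages: replay against logs, policy-conflict
# check, certainty scoring, and staging. Thresholds and the log format
# are assumptions, not the real pipeline's values.

def validate(proposal_matches, historical_requests, policy_allows):
    """Return (certainty, staged) for one proposed rule."""
    # 1. Replay: which historical requests would the proposed rule match?
    hits = [r for r in historical_requests if proposal_matches(r)]
    # 2. Conflict check: a rule granting access beyond current policy
    #    is flagged and discarded automatically.
    for r in hits:
        if not policy_allows(r):
            return 0.0, False
    # 3. Certainty factor from the strength of the observed pattern
    #    (here: simple support over the replay window).
    certainty = len(hits) / max(len(historical_requests), 1)
    # 4. Stage for human review only if the pattern is strong enough.
    return certainty, certainty >= 0.8

logs = [{"task": "threat-assessment"}] * 9 + [{"task": "triage"}]
matches = lambda r: r["task"] == "threat-assessment"
cf, staged = validate(matches, logs, lambda r: True)
```

With nine matching requests out of ten, the sketch assigns a certainty of 0.9 and stages the rule; swap in a `policy_allows` that rejects any hit and the same proposal dies in stage 2 with certainty zero.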

Why this matters for compliance

Auditors don’t care about your model weights. They care about decision chains: what rule fired, what facts it matched, what the output was. A CLIPS rule base gives you exactly that. Every evaluation produces a complete trace — the rules that fired in order, the facts they matched against working memory, the modules they traversed, and the final routing decision. You can replay any historical decision and get the same result.
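The replay property is easy to demonstrate with a toy engine. A real CLIPS trace comes from the engine itself (e.g. its rule and fact watch facilities); this minimal sketch only shows the invariant auditors need — identical facts in, identical ordered trace out:

```python
# Toy forward-chaining evaluator with a recorded trace. Rule names and
# fact strings are illustrative; the point is determinism on replay.

def evaluate(facts, rules):
    """rules: list of (name, salience, condition_facts, conclusion)."""
    trace, wm = [], set(facts)
    # Fire in salience order, ties broken by name: fully deterministic.
    for name, salience, cond, concl in sorted(
            rules, key=lambda r: (-r[1], r[0])):
        if cond <= wm:  # every condition fact is in working memory
            wm.add(concl)
            trace.append((name, tuple(sorted(cond)), concl))
    return trace

rules = [
    ("route-geo",   10, {"task:threat-assessment"}, "route:geolocation"),
    ("route-intel",  5, {"task:threat-assessment"}, "route:threat-intel"),
]
t1 = evaluate({"task:threat-assessment"}, rules)
t2 = evaluate({"task:threat-assessment"}, rules)  # replay: same trace
```

Each trace entry records what fired, what it matched, and what it concluded — the decision chain an auditor asks for — and replaying the same facts yields the same chain, byte for byte.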

The LLM’s contribution is authoring better rules over time, not making runtime decisions. When you swap from one model to another — or when your provider ships a new version that changes behavior on edge cases — the runtime doesn’t change. The rules in production are the same rules. The audit trail is complete, deterministic, and decoupled from whichever model happens to be running the Curator. That’s the difference between “we use AI” and “we use AI responsibly.”