← Back to home
● The capability behind every Mise answer

One question. Numbers, procedures, and a cited answer.

Most copilots answer half the question. They can pull a sales number, or they can quote your manual — not both at once. Mise reasons across both at the same time, on the same prompt, with citations that point at the underlying data and the underlying SOP.


The split that other tools force you to live with

Today, an above-store leader running a hard question lives in two tools. A BI dashboard tells them what happened — sales were down 7% at three stores last weekend. A document search tool, or a slow Slack thread with a GM, tells them what should happen — the closing playbook, the labor adjustment procedure, the brand standard for incident reporting.

The dashboard doesn't know about the playbook. The playbook doesn't know about the numbers. The leader stitches them together by hand, every time, on every question.

Cross-stack reasoning means: in one prompt, Mise reaches into your numerical stack and your document stack, composes the answer, and cites both. No tab-switching. No "let me also pull up the procedure for that."

What it looks like in practice

mise.to / chat / above-store
Why was Aurora's Friday labor 6% over plan, and what does our playbook say to do about a Friday closing-shift over-run?
M

Aurora ran 6.4% over plan on Friday — driven by the closing shift.

Scheduled hours were 142 and actual hours landed at 151.1 [1] labor_daily. The gap concentrates between 9pm and close: Friday's drive-thru count was up 11% vs. the trailing 4-Friday average [2] sales_daily, and the closing manager kept a third line cook on past 10pm rather than cutting at the 9:30 demand point.

Your closing playbook covers this exact scenario. "Demand re-check at 21:30" says the closing manager should pull the 30-minute trailing ticket count and cut to the matrix value [3] Closing Playbook v4 · p. 12. The matrix value at Friday's volume is two cooks, not three.

The same playbook flags this as a recurring-pattern issue if it happens 3+ times in a 30-day window. Aurora has hit it twice in 22 days [4] alerts.history.

[1] labor_daily · Aurora · 2026-04-24 [2] sales_daily · drive-thru tickets [3] Closing Playbook v4 · p. 12 [4] alerts.history · last 30d

This is a real shape of answer Mise produces. The numbers and the procedure are reasoned about together. The citations point at the underlying source — a row in the warehouse or a page of your manual — and clicking one opens the source.

How the reasoning works

Under the hood, every question goes through four stages. None of them is exotic on its own; the work is in making them cooperate.

Stage 1

Route

Classify the question: metric, knowledge, or — almost always — both.

Stage 2

Plan in parallel

SQL plane picks a named template; RAG plane retrieves chunks. Both run at once.

Stage 3

Compose

Synthesizer reads tool results and retrieved chunks together; produces a single grounded answer.

Stage 4

Cite

Citations are filtered to only the sources the model actually referenced. The rest are dropped.

1 — Route

A small router classifies each question into one of three buckets: pure metric ("net sales last week at Aurora"), pure knowledge ("what's the closing-time temp log requirement"), or mixed. Mixed is the realistic case for above-store work; almost every question that matters needs both.

2 — Plan in parallel

The SQL plane does not let the model write SQL freely. It picks a named template from a versioned registry and fills typed slots. A store_id slot must resolve to a store in your org; a date_range slot must be bounded; a metric slot must be one of the registered metrics. Anything off-list is rejected before SQL is touched.

While the SQL plane works, the RAG plane retrieves: vector search over pgvector, BM25 over your document chunks, reciprocal-rank fusion to combine, then a reranker pass to drop chunks that look superficially relevant but don't actually answer the question.

3 — Compose

The synthesizer is the moment the two halves meet. It receives the SQL results as structured tool output and the retrieved chunks as supporting context, and produces a single answer. The model is instructed to ground every load-bearing claim in either a tool result or a retrieved chunk — and to refuse rather than guess if the data isn't there.

4 — Cite

Citations are filtered post-hoc. We parse the model's answer for [N] markers and only surface sources actually referenced. A distance floor (currently 0.55) keeps low-relevance chunks out, even if the retriever's top-K included them. The result: the citation list reads like a bibliography of the answer, not a dragnet of nearby chunks.

Why the SQL plane is locked down

Free-form NL→SQL is the kind of feature that demos beautifully and breaks in production. Wrong joins. Tenant filters that go missing. Long-running queries that lock a table during peak. The named-template pattern trades a little flexibility for properties you actually need — predictable performance, auditable behavior, and a code path that cannot ever return another tenant's data.

When a new question pattern shows up enough to matter, we add a template. Adding one is cheap. Operating without guard rails is not.

Why citations are filtered

An unfiltered citation list is worse than no citations at all. If the bottom of every answer carries five plausible-looking links and only two of them were actually used, your team learns to ignore the citation chip. Once they ignore it, the trust property is gone — and trust is the whole point of grounding.

Mise filters citations to the chunks the model named. If the model didn't reference a chunk, it doesn't get cited. The list is short, on purpose.

The shapes of question this unlocks

Performance + procedure

Why is X over plan, and what's our playbook for it?

The numerical "why" plus the documented "do this." The most common shape of question on Mise.

Audit + history

Which stores failed this week's check, and what does our remediation policy say?

Audit data joined to the brand-standard remediation steps, with severity scored against your tolerance.

Pattern + escalation

Has this same alert fired three times in 30 days? When does it escalate?

Operational history reasoned about against your written escalation procedure.

Comparative + standard

How does Aurora's labor mix compare to brand standard at this volume?

Live numbers from the warehouse, brand-standard ratios from the manual, single answer.

Where it doesn't apply

Not every question is a cross-stack question. "What's our net sales last week" is metric-only; "what does the closing playbook say about the alarm code" is knowledge-only. Mise routes those cleanly to one plane and skips the other. The cross-stack capability is the moment the two reach for each other on the same prompt — which, for above-store work, is most of the time.


See it on your data

The shortest path to evaluating cross-stack reasoning is to ask a real above-store question against your own stack. Book a 30-minute call — we'll spin up a sandbox connected to your sandbox POS and a folder of your manuals, and run the question shapes above through it together.

Request a Demo → See how it gets deployed Read the technical brief →