A reference for the engineering, IT, and security teams evaluating Mise. It opens with the security posture and the controls behind it, then walks through tenant isolation, architecture, the data model, integrations, deployment topology, evaluations, and SLAs. Everything in this document is contractable.
Mise is a tenant-isolated reasoning service for multi-unit restaurant franchisees. It ingests POS, labor, inventory, financials, and food-safety signals from the systems your stores already run, layers your brand-standard documents on top as a private knowledge base, and answers your above-store leaders' questions with grounded citations.
Mise is read-mostly. The only writes it performs are inside its own tenant database — alerts, snoozes, assignments, and the audit log of who saw what and when. It does not write back into your POS, your scheduling system, or your ERP.
Security is the load-bearing property of Mise — every other section of this brief assumes the controls described here are in place. Your data stays inside your trust boundary, your credentials stay encrypted, and the path between Claude and your systems is audited end-to-end.
- org_id on issue and re-validated on every request.
- audit_log. The table has no UPDATE or DELETE permission for the application role; rows can only be inserted. 7-year retention.
- org_id, request_id, and user_id on every line. A Logback filter scrubs known sensitive field names (password, token, secret, ssn, card) as a backstop against accidental leakage.
- mvn org.owasp:dependency-check run on every CI build. Critical CVEs block the build.
- org_id fails before merge.
- X-Anthropic-Privacy: do-not-train equivalents enabled per the provider's enterprise privacy terms.
Tenant isolation is enforced at three independent layers. Each layer is sufficient on its own for many threat models; together they create a defense-in-depth that fails closed.
- org_id in its JWT. An auth filter writes it to a request-scoped TenantContext before any controller runs. Code outside the filter cannot see anonymous or cross-tenant requests.
- org_id as a parameter, drawn from TenantContext. Plain @Query without an org_id clause is forbidden, and a CI ArchUnit rule fails the build if one slips in.
- USING (org_id = current_setting('app.current_org')::uuid) as a database-level net beneath the application; even a compromised application role cannot read across tenants.
Vector retrieval has the same rule: the retriever refuses to run without an org_id filter, and after retrieval every returned chunk's org_id is re-validated against the request context before it reaches the synthesizer. A document chunk from another tenant cannot reach your prompt.
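A minimal sketch of the request-scoped tenant holder and the post-retrieval re-check described above, in plain Java without Spring. The names `TenantGuard`, `Chunk`, `enter`, `exit`, and `revalidate` are hypothetical, and throwing on a mismatch (rather than silently dropping the chunk) is an assumption about what "fails closed" means here, not a statement of Mise's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical sketch: request-scoped tenant holder plus chunk re-validation. */
final class TenantGuard {
    // One org_id per request thread; an auth filter would set it and clear it.
    private static final ThreadLocal<UUID> CURRENT_ORG = new ThreadLocal<>();

    static void enter(UUID orgId) { CURRENT_ORG.set(orgId); }

    static void exit() { CURRENT_ORG.remove(); }

    /** Fails closed: with no tenant in context, no work is done. */
    static UUID requireOrg() {
        UUID org = CURRENT_ORG.get();
        if (org == null) {
            throw new IllegalStateException("no tenant in context");
        }
        return org;
    }

    /** Minimal stand-in for a retrieved document chunk. */
    static final class Chunk {
        final UUID orgId;
        final String text;

        Chunk(UUID orgId, String text) {
            this.orgId = orgId;
            this.text = text;
        }
    }

    /** Re-checks every retrieved chunk against the request's tenant. */
    static List<Chunk> revalidate(List<Chunk> retrieved) {
        UUID org = requireOrg();
        List<Chunk> safe = new ArrayList<>();
        for (Chunk c : retrieved) {
            if (!org.equals(c.orgId)) {
                // Assumption: a foreign chunk aborts the request instead of being skipped.
                throw new SecurityException("cross-tenant chunk rejected");
            }
            safe.add(c);
        }
        return safe;
    }
}
```

In this sketch the synthesizer would only ever receive the list returned by `revalidate`, so a retrieval bug surfaces as an error rather than as a leaked chunk.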
Mise is a single Spring Boot service backed by Postgres + pgvector and Redis. There is no Kafka, no Kubernetes, no service mesh. Boring on purpose: fewer moving parts means a smaller attack surface and a shorter incident MTTR.
A question enters through chat or the web UI. The router classifies it as either a metric question (handled by the NL→SQL plane) or a knowledge question (handled by RAG). Most real questions cross both, so the synthesizer composes results from each side and emits an answer with inline citations.
HTTP request
  → AuthFilter (JWT → TenantContext.org_id)
  → Router (metric? knowledge? mixed?)
  → SQL plane                  RAG plane
      ─ template registry        ─ vector search (pgvector, org-scoped)
      ─ slot validation          ─ BM25 + RRF fusion
      ─ Postgres execute         ─ bge reranker
  → Synthesizer (LLM call w/ tool results + retrieved chunks)
  → SSE stream → client
Ingestion is one-way. Webhooks (Toast, Jolt) and pollers (7shifts, Square, Clover, Shopify, Lightspeed, QuickBooks, Deputy) land raw events, which a normalization stage maps into provider-agnostic warehouse tables. Documents follow a separate path: PDF/DOCX upload → chunking → embedding → indexed in pgvector.
Spring Web MVC + SSE for streaming answers. Tenant context is set on every request from a JWT-derived org_id; repositories refuse to run without it.
Operational tables, document chunks, and 1024-d embeddings live in one database. Row-level security is enforced at the database role, and the application layer repeats the tenant check as a second safeguard.
Redis backs soft holds during ingestion, rate limits, and async jobs. There is no durable message queue; recovery is built around replaying from the source of truth.
Reasoning calls go through a single LlmClient interface. Production code never references a specific provider directly. Self-host target: Qwen 2.5 32B-Instruct on Ollama for tenants that require it.
Operational data lives in provider-agnostic warehouse tables — sales_daily, labor_daily, inventory_snapshots, financials_monthly, food_safety_events. Each row carries org_id, store_id, the date, the source (square, clover, toast, …), and the normalized metric. The SQL plane reads from these, never from raw provider tables.
Documents are split into chunks (≈800 tokens, 120-token overlap) and indexed with both a 1024-d embedding and a BM25 column. Each chunk stores its org_id, document_id, page_number, and a stable citation_anchor so the UI can deep-link.
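The chunking step above amounts to a sliding window with overlap. The sketch below illustrates the idea under stated assumptions: tokens arrive as a pre-split list (the actual tokenizer is not specified in this brief), and the class name `SlidingChunker` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of overlap chunking; not Mise's actual chunker. */
final class SlidingChunker {
    /**
     * Splits tokens into windows of at most `size` tokens, where each window
     * starts `size - overlap` tokens after the previous one.
     */
    static List<List<String>> chunk(List<String> tokens, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("need 0 <= overlap < size");
        }
        List<List<String>> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < tokens.size(); start += step) {
            int end = Math.min(start + size, tokens.size());
            chunks.add(new ArrayList<>(tokens.subList(start, end)));
            if (end == tokens.size()) {
                break; // the last window already reached the end of the document
            }
        }
        return chunks;
    }
}
```

With the figures quoted above, the call would be `SlidingChunker.chunk(tokens, 800, 120)`, so each chunk begins 680 tokens after the previous one and repeats its last 120 tokens.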
| Table | Holds | Tenant key | Retention |
|---|---|---|---|
| stores | Store records, addresses, brand, GM | org_id | Lifetime of contract |
| sales_daily | Net sales, tickets, daypart splits | org_id + RLS | 5y rolling |
| labor_daily | Hours, OT, scheduled vs actual | org_id + RLS | 5y rolling |
| document_chunks | Embedded prose from your manuals | org_id + RLS | Until document deleted |
| alerts | Generated by rule engine; in-memory overlay for snooze/ack/assign | org_id + RLS | 2y rolling |
| audit_log | Auth, queries, mutations, exports | org_id + RLS, append-only | 7y |
Mise does not let the model write SQL freely. Free-form NL→SQL is the kind of feature that demos well and breaks in production — wrong joins, missing tenant filters, queries that lock a table during peak. Instead, the router picks a named template from a versioned registry (in docs/sql-templates/) and fills typed slots.
Slot values are validated against types and an allowlist before the query runs. A store_id slot must resolve to a store in the caller's org; a date_range slot must be bounded; a metric slot must be one of the registered metrics. Anything else is rejected before SQL is touched.
// Example: "How did Aurora compare to plan last week?"
template_id: store_vs_plan_weekly
slots:
  store_id: 7e3c-...    (validated: belongs to org_id)
  iso_week: 2026-W16    (validated: ≤ today)
  metric:   net_sales   (validated: in registry)
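The slot checks in the example above can be sketched as a small validator that rejects a request before any SQL is built. Everything named here (`SlotValidator`, its constructor arguments, the error messages) is hypothetical, and reading "≤ today" as "not later than the current ISO week" is an assumption.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch of typed-slot validation for one template. */
final class SlotValidator {
    private final Set<UUID> storesInOrg;      // stores owned by the caller's org
    private final Set<String> metricRegistry; // registered metric names

    SlotValidator(Set<UUID> storesInOrg, Set<String> metricRegistry) {
        this.storesInOrg = storesInOrg;
        this.metricRegistry = metricRegistry;
    }

    /** Throws IllegalArgumentException on the first slot that fails. */
    void validate(UUID storeId, String isoWeek, String metric, LocalDate today) {
        if (!storesInOrg.contains(storeId)) {
            throw new IllegalArgumentException("store_id not in caller's org");
        }
        if (!metricRegistry.contains(metric)) {
            throw new IllegalArgumentException("metric not in registry");
        }
        if (!isoWeek.matches("\\d{4}-W\\d{2}")) {
            throw new IllegalArgumentException("iso_week must look like 2026-W16");
        }
        int year = Integer.parseInt(isoWeek.substring(0, 4));
        int week = Integer.parseInt(isoWeek.substring(6));
        if (week < 1 || week > 53) {
            throw new IllegalArgumentException("iso_week out of range");
        }
        int currentYear = today.get(IsoFields.WEEK_BASED_YEAR);
        int currentWeek = today.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
        if (year > currentYear || (year == currentYear && week > currentWeek)) {
            throw new IllegalArgumentException("iso_week is in the future");
        }
    }
}
```

Passing `today` in as a parameter, instead of calling `LocalDate.now()` inside, keeps the check deterministic in tests.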
RAG retrieval runs two searches in parallel: vector search (pgvector cosine) and BM25 (Postgres tsvector). Their rankings are merged with reciprocal-rank fusion, then a bge-reranker pass drops chunks that don't actually answer the question. A distance floor (currently 0.55) keeps low-relevance chunks out, and the citation list is filtered post-hoc to only the chunks the LLM actually referenced in its answer.
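For readers unfamiliar with reciprocal-rank fusion, the sketch below shows the standard formula: each chunk scores the sum of 1 / (k + rank) across the rankings it appears in. The class name `RankFusion` is hypothetical, and the constant k = 60 used in the usage note is the value commonly seen in the literature, not a figure taken from Mise.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of reciprocal-rank fusion over ranked chunk ids. */
final class RankFusion {
    /** score(d) = sum over rankings of 1 / (k + rank), with rank starting at 1. */
    static List<String> fuse(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest fused score first; ties broken by id so the order is deterministic.
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }
}
```

Given a vector ranking `[c1, c2, c3]` and a BM25 ranking `[c2, c4, c1]` with k = 60, the fused order is `[c2, c1, c4, c3]`: chunks found by both searches rise above chunks found by only one.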
All integrations are BYOA — bring your own app. Your team registers a developer app in the vendor's portal and pastes the client ID and secret into Mise's settings. Tokens are encrypted at rest with AES-256-GCM, keyed off a master key held in the host's secret store, and decrypted in memory only at request time.
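A minimal sketch of AES-256-GCM token encryption using the standard `javax.crypto` API. The class name `TokenVault` and the storage layout (IV prepended to the ciphertext) are assumptions, and the sketch uses the 32-byte master key directly; whether Mise derives per-tenant keys from the master key is not stated in this brief.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of token encryption at rest with AES-256-GCM. */
final class TokenVault {
    private static final int IV_BYTES = 12;  // 96-bit nonce, the usual GCM choice
    private static final int TAG_BITS = 128; // authentication tag length
    private static final SecureRandom RNG = new SecureRandom();
    private final SecretKey key;

    TokenVault(byte[] masterKey) {
        if (masterKey.length != 32) {
            throw new IllegalArgumentException("AES-256 needs a 32-byte key");
        }
        this.key = new SecretKeySpec(masterKey, "AES");
    }

    /** Returns iv || ciphertext+tag. A fresh random IV is generated per call. */
    byte[] encrypt(String token) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        RNG.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(token.getBytes(StandardCharsets.UTF_8));
        return ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
    }

    /** Throws if the blob was tampered with or encrypted under another key. */
    String decrypt(byte[] blob) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
        byte[] pt = cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
        return new String(pt, StandardCharsets.UTF_8);
    }
}
```

Because GCM is authenticated, a stored blob that has been altered fails decryption outright rather than yielding a corrupted token.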
| Vendor | Domain | Mechanism | Scopes |
|---|---|---|---|
| Square | POS / payments | OAuth 2.0 | Read-only: payments, orders, locations |
| Clover | POS | OAuth 2.0 | Read-only: orders, payments, employees |
| Toast | POS | API key + webhook | Read-only: orders, checks, sales summary |
| Shopify | POS / e-comm | OAuth 2.0 | Read-only: orders, locations, inventory |
| Lightspeed | POS | OAuth 2.0 | Read-only: sales, items, locations |
| 7shifts | Labor | API key (poll) | Read-only: shifts, time punches |
| Deputy | Labor | OAuth 2.0 | Read-only: roster, timesheets |
| QuickBooks (Intuit) | Financials | OAuth 2.0 | Read-only: P&L, COGS, payroll |
| Jolt | Food safety | Webhook | Read-only: tasks, photos, temp logs |
Every connection records the org, store, granting user, granted scopes, and last-refreshed timestamp. Revoking the app in the vendor's portal kills Mise's access immediately; token validity is re-confirmed on every poll, with failures surfaced in your settings page within minutes.
Each customer is a single-tenant deployment by default — your data, your application instance, your encryption keys. The reference target is a Hetzner CCX23 (4 vCPU, 16GB) with Caddy + systemd; a single VM is sufficient through ~60 stores. Beyond that, the deployment scales vertically to CCX33 / CCX43, then splits read replicas. There is no multi-tenant SaaS pool that mixes customer data.
┌─────────────────────────────┐
│  Caddy (TLS, HSTS, HTTP/3)  │
└──────────────┬──────────────┘
               │ :8080
┌──────────────▼──────────────┐
│   Mise (Spring Boot, JVM)   │
└──────────────┬──────────────┘
               │
   ┌───────────┴───────────┐
┌──▼───┐               ┌───▼───┐
│ pg16 │               │ redis │
│+pgvec│               └───────┘
└──────┘
   │
┌──▼───────┐
│ R2 blobs │ (uploaded docs, photos)
└──────────┘
Deploys flow through GitHub Actions → SCP → systemctl reload. Migrations run via Flyway on startup; schema is forward-only. A bad migration is rolled forward with a compensating migration, never backward — so the tenant database never sits in an indeterminate state.
The reasoning plane has a versioned eval suite (docs/eval/cases.json). Every change to a SQL template, a prompt, or the retrieval pipeline runs against the suite in CI. New features add 1–2 cases. Cases come in three flavors:
Groundedness is scored by a separate LLM judge. Three numbers are tracked on the internal dashboard: citation precision (cited chunks that are actually relevant), citation recall (relevant chunks that were cited), and refusal rate on out-of-scope questions. A regression on any of the three blocks the deploy.
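The two citation numbers above follow the usual set-based definitions, sketched below. The class name `CitationMetrics` is hypothetical, the relevance labels are assumed to come from the judge, and returning 0.0 for an empty denominator is a convention chosen for this sketch.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of citation precision and recall over chunk ids. */
final class CitationMetrics {
    /** Share of cited chunks that are actually relevant. */
    static double precision(Set<String> cited, Set<String> relevant) {
        if (cited.isEmpty()) {
            return 0.0; // convention for this sketch: nothing cited scores zero
        }
        return (double) overlap(cited, relevant) / cited.size();
    }

    /** Share of relevant chunks that were cited. */
    static double recall(Set<String> cited, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // convention for this sketch: nothing relevant scores zero
        }
        return (double) overlap(cited, relevant) / relevant.size();
    }

    private static int overlap(Set<String> a, Set<String> b) {
        Set<String> both = new HashSet<>(a);
        both.retainAll(b);
        return both.size();
    }
}
```

For example, an answer citing chunks {a, b, c, d} when {a, b, e} are relevant has precision 2/4 and recall 2/3.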
- status.mise.to: incidents posted within 15 minutes of detection.
The most useful part of a brief is what's not inside the box. Mise does not:
Engineering, IT, and security teams reviewing Mise can request the full security packet — DPA template, sub-processor list, control matrix, pen-test summary, and a sandbox connected to a synthetic POS — by booking a working session: calendly.com/max-mise/intro-call-for-mise.