
Tenant Suspend Flow

How the Console communicates tenant suspension and resumption to the instance, and how the instance should handle it for users who have access to multiple tenants.


Overview

Tenant suspension is tenant-scoped, not instance-scoped. When a tenant is suspended, only users of that tenant are affected — other tenants on the same instance continue serving normally. The instance needs its own local cache of tenant statuses so it can enforce this per-request without calling Console on every request.


Console → Instance: Tenant Status Push

When an admin suspends or resumes a tenant, Console immediately notifies the instance via a best-effort push call (fire-and-forget goroutine). The instance updates its local cache on receipt.

POST {apiBaseUrl}/internal/tenant-status
Authorization: Bearer <zitadel-worker-token>
Content-Type: application/json

Request body:

Field      Type     Description
tenantId   string   External tenant ID (7-char code, e.g. ABC1234) — same ID used in provision-tenant
status     string   New status: suspended or active
```json
{
  "tenantId": "ABC1234",
  "status": "suspended"
}
```

Expected response: HTTP 2xx. Failures are logged by Console but do not fail the suspend operation.

Best-effort only. If the instance is unreachable at the time of suspension, it will not receive this push. The instance recovers full state from its own local DB on next boot — no Console call is needed.


Instance: Local Tenant State Cache

The instance maintains an in-memory map of tenantId → status, updated from two sources:

  1. On startup — load all tenant records from the instance's local DB
  2. On push — write to local DB first, then update in-memory cache
```go
type TenantStateCache struct {
    mu     sync.RWMutex
    states map[string]string // tenantId → status
}

// NewTenantStateCache initializes the map; writing to a nil map would panic.
func NewTenantStateCache() *TenantStateCache {
    return &TenantStateCache{states: make(map[string]string)}
}

func (c *TenantStateCache) Set(tenantID, status string) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.states[tenantID] = status
}

func (c *TenantStateCache) IsSuspended(tenantID string) bool {
    c.mu.RLock()
    defer c.mu.RUnlock()
    return c.states[tenantID] == "suspended"
}
```

On push received (POST /internal/tenant-status):

```go
// 1. Write to DB (source of truth)
if _, err := db.Tenants.UpdateOne(ctx,
    bson.M{"tenantId": body.TenantID},
    bson.M{"$set": bson.M{"status": body.Status}},
); err != nil {
    return err // skip the cache update so the cache never runs ahead of the DB
}
// 2. Update in-memory cache
tenantCache.Set(body.TenantID, body.Status)
```

On startup, before the HTTP server accepts traffic:

```go
cur, err := db.Tenants.Find(ctx, bson.M{"status": bson.M{"$ne": "archived"}}) // non-archived
if err != nil {
    log.Fatalf("load tenant states: %v", err) // fail fast: serving without state is unsafe
}
var tenants []Tenant
if err := cur.All(ctx, &tenants); err != nil {
    log.Fatalf("decode tenant states: %v", err)
}
for _, t := range tenants {
    tenantCache.Set(t.TenantID, t.Status)
}
```

No Console API call is made on startup. The local DB is the source of truth — if the instance was down when a suspend/resume push was sent, it will already be in the DB from a previous successful push (or from initial provisioning). Either way the cache is correct before traffic is accepted.


Instance: Per-Request Tenant Check

After the instance-level availability middleware (which checks instance status), add a tenant-level check that reads the xto-session's tenant claim:

```go
func TenantAvailabilityMiddleware(cache *TenantStateCache) gin.HandlerFunc {
    return func(c *gin.Context) {
        tenantID := getTenantIDFromSession(c) // from xto-session JWT claim
        if tenantID == "" {
            c.Next()
            return
        }
        if cache.IsSuspended(tenantID) {
            // Do not 503 — redirect to tenant picker so user can switch.
            c.Redirect(http.StatusFound, "/tenant-picker?reason=suspended&from="+tenantID)
            c.Abort()
            return
        }
        c.Next()
    }
}
```

Do not return a 503. Users often have access to multiple tenants on the same instance. A redirect to the tenant picker lets them continue working in another tenant.


Tenant Picker Page

When a user is redirected to the tenant picker (either at login or mid-session after suspension), the page should:

  1. List all tenants the user has access to on this instance (fetched from local auth/Zitadel)
  2. Show the current status of each tenant
  3. Highlight the suspended tenant and explain why they were redirected

URL: /tenant-picker?reason=suspended&from=ABC1234

UI behaviour:

Your workspace is currently unavailable

The tenant you were using (ABC1234) has been suspended.
Please select another tenant to continue, or contact support.

  ● ABC1234  [Suspended]   ← last used, grayed out
  ○ DEF5678  [Active]       ← clickable
  ○ GHI9012  [Active]       ← clickable

  Need help? → support.exto360.com

On tenant selection: issue a new xto-session for the selected tenant and redirect to the app.

If all tenants are suspended or the user has only one tenant: show a message with a link to the support portal. Do not auto-redirect anywhere.
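The picker's data can be assembled from the user's tenant list plus the local status cache. A minimal sketch, assuming the tenant IDs come from local auth/Zitadel and statuses from the cache; the type and function names here are illustrative:

```go
package main

// PickerEntry is one row in the tenant picker.
type PickerEntry struct {
	TenantID   string
	Status     string // "active" or "suspended"
	Selectable bool   // only active tenants are clickable
	Highlight  bool   // the tenant the user was redirected away from
}

// buildPickerEntries filters and annotates the user's tenants for display.
// statuses maps tenantId → status (from the local cache); fromTenant is the
// ?from= query parameter, empty at normal login.
func buildPickerEntries(userTenants []string, statuses map[string]string, fromTenant string) []PickerEntry {
	entries := make([]PickerEntry, 0, len(userTenants))
	for _, id := range userTenants {
		status := statuses[id]
		// provisioning and archived tenants are never shown (see Status Reference)
		if status != "active" && status != "suspended" {
			continue
		}
		entries = append(entries, PickerEntry{
			TenantID:   id,
			Status:     status,
			Selectable: status == "active",
			Highlight:  id == fromTenant,
		})
	}
	return entries
}

// anySelectable reports whether the user can continue at all; if false, the
// picker shows the support-portal message instead of auto-redirecting.
func anySelectable(entries []PickerEntry) bool {
	for _, e := range entries {
		if e.Selectable {
			return true
		}
	}
	return false
}
```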


Admin API: Suspend and Resume

Suspend a Tenant

POST /api/v1/tenants/:id/suspend
Authorization: Bearer <zitadel-admin-jwt>

Response (200 OK):

```json
{ "status": "suspended" }
```

Error responses:

Condition           HTTP   Body
Not found           404    {"error": "tenant not found"}
Already suspended   200    {"status": "suspended"} — idempotent, no error
Archived            409    {"error": "tenant cannot be suspended (archived)"}

What happens:

  1. Tenant status set to suspended in PostgreSQL.
  2. Console fires a best-effort POST /internal/tenant-status to the instance with status: "suspended".
  3. Instance updates its local cache. Users with an active session for this tenant are redirected to the tenant picker on their next request (within one request cycle).

Resume a Tenant

POST /api/v1/tenants/:id/resume
Authorization: Bearer <zitadel-admin-jwt>

Response (200 OK):

```json
{ "status": "active" }
```

Error responses:

Condition       HTTP   Body
Not found       404    {"error": "tenant not found"}
Not suspended   409    {"error": "tenant is not suspended"}

What happens:

  1. Tenant status set to active in PostgreSQL.
  2. Console fires a best-effort POST /internal/tenant-status to the instance with status: "active".
  3. Instance updates its cache. The tenant picker will show the tenant as active on next load.
  4. The Console worker will re-provision the tenant if needed (e.g., if the instance was previously decommissioned).
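The error tables for both endpoints reduce to a small status-transition check. A sketch of that decision logic (names are illustrative; the 404 for an unknown tenant is produced by the lookup before this check, and handler wiring, DB update, and push are omitted):

```go
package main

// transition validates a suspend/resume action against the tenant's current
// status and returns the HTTP status code plus response body from the error
// tables above.
func transition(currentStatus, action string) (int, string) {
	switch action {
	case "suspend":
		switch currentStatus {
		case "suspended":
			return 200, `{"status": "suspended"}` // idempotent: already suspended
		case "archived":
			return 409, `{"error": "tenant cannot be suspended (archived)"}`
		default:
			// active: perform the DB update, fire the best-effort push,
			// then report the new status.
			return 200, `{"status": "suspended"}`
		}
	case "resume":
		if currentStatus != "suspended" {
			return 409, `{"error": "tenant is not suspended"}`
		}
		return 200, `{"status": "active"}`
	default:
		return 400, `{"error": "unknown action"}`
	}
}
```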

Sequence: Suspension During an Active Session

User is in app → xto-session for tenant ABC1234
Admin: POST /tenants/:id/suspend
  → Console: DB update (status: suspended)
  → Console: goroutine POST /internal/tenant-status {tenantId: "ABC1234", status: "suspended"}
  → Instance: cache.Set("ABC1234", "suspended")

User makes next request (any route)
  → TenantAvailabilityMiddleware: cache.IsSuspended("ABC1234") = true
  → Redirect 302 → /tenant-picker?reason=suspended&from=ABC1234

User sees tenant picker
  → Selects DEF5678
  → New xto-session issued for DEF5678
  → User continues working

Sequence: Instance Restart After Suspension

Instance restarts (deploy, crash, scale)
  → Startup: load all tenant records from local DB
  → [{tenantId: "ABC1234", status: "suspended"}, ...]
  → Instance: cache initialized with correct state before accepting traffic

User request arrives
  → TenantAvailabilityMiddleware sees suspended state from cache
  → Redirect to tenant picker (no gap, no window where suspended tenant was accessible)

Status Reference

Tenant status   Accessible via instance?   Shown in tenant picker?
provisioning    No (not yet provisioned)   No
active          Yes                        Yes
suspended       No (redirect to picker)    Yes, labeled [Suspended]
archived        No                         No
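A compact encoding of this table that both the availability middleware and the picker could share (a sketch; the function names are illustrative, not existing instance code):

```go
package main

// accessible reports whether the instance serves requests for a tenant in
// this status (suspended tenants are redirected to the picker instead).
func accessible(status string) bool {
	return status == "active"
}

// shownInPicker reports whether a tenant in this status appears in the
// tenant picker list.
func shownInPicker(status string) bool {
	return status == "active" || status == "suspended"
}
```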