
Tenant CLI Tools

Command-line utilities for tenant data migration, backup, cleanup, and verification against a MongoDB database. All tools share one package, internal/migrate/, and speak the same archive format.

Build all tools:

bash
go build -o bin/ ./cmd/tenant-dump ./cmd/tenant-import ./cmd/tenant-delete ./cmd/tenant-verify

Contents

  1. Shared concepts
  2. tenant-dump — export
  3. tenant-import — ingest
  4. tenant-delete — wipe
  5. tenant-verify — confirm wipe
  6. End-to-end lifecycle

Shared concepts

Tenant code

Every tenant has a 7-char short code (e.g. DahyCZM). Stored in MongoDB as tenantId / tenantID / first element of tenantIDs, plus the byTenant.<code> key on user docs. Same code also appears inside tenant-namespaced collection names.
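The combined match can be pictured as a single $or filter. A minimal sketch, using plain Go maps in place of the driver's bson.M (tenantFilter is a hypothetical stand-in for the real helper):

```go
package main

import "fmt"

// tenantFilter builds a Mongo-style $or filter covering every
// tenant-reference shape described above. Illustrative only; the real
// helpers build per-collection filters with the driver's bson.M.
func tenantFilter(code string) map[string]any {
	return map[string]any{
		"$or": []map[string]any{
			{"tenantId": code},  // scalar, lowercase d
			{"tenantID": code},  // scalar, uppercase D
			{"tenantIDs": code}, // array membership match
			{"byTenant." + code: map[string]any{"$exists": true}}, // user docs
		},
	}
}

func main() {
	f := tenantFilter("DahyCZM")
	fmt.Println(len(f["$or"].([]map[string]any))) // 4 match clauses
}
```

The real filters reportedly vary by collection (user docs match byTenant.<code>, others the scalar/array fields), so treat this as the union of shapes rather than the exact per-collection query.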

Tenant-namespaced collections

Some collections embed the tenant code in their name. Recognised prefixes (see collections.go:20):

| Prefix | Example |
|---|---|
| custom_<code>_ | custom_DahyCZM_field |
| x_<code>_ | x_DahyCZM_foo |
| x_mt_<code>_ | x_mt_DahyCZM_bar |
| cx_s_<code>_ | cx_s_DahyCZM_baz |

Dump includes them whole (no tenant filter). Import renames the prefix (<srcCode> → <tgtCode>). Delete drops them entirely.
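
How detection and renaming might look, assuming prefix matching per the table above (function names here are illustrative, not the actual collections.go API):

```go
package main

import (
	"fmt"
	"strings"
)

// Recognised tenant-namespaced prefixes (mirrors the table above).
var nsPrefixes = []string{"custom_", "x_mt_", "cx_s_", "x_"}

// isTenantNamespaced reports whether coll embeds code under a known prefix.
func isTenantNamespaced(coll, code string) bool {
	for _, p := range nsPrefixes {
		if strings.HasPrefix(coll, p+code+"_") {
			return true
		}
	}
	return false
}

// renameForTarget swaps the embedded source code for the target code,
// e.g. custom_DahyCZM_field -> custom_NEWTENT_field.
func renameForTarget(coll, srcCode, tgtCode string) string {
	return strings.Replace(coll, srcCode, tgtCode, 1)
}

func main() {
	fmt.Println(isTenantNamespaced("custom_DahyCZM_field", "DahyCZM")) // true
	fmt.Println(renameForTarget("custom_DahyCZM_field", "DahyCZM", "NEWTENT"))
	// custom_NEWTENT_field
}
```

Because the tenant code is embedded between fixed delimiters, a single first-occurrence replace is enough for the rename.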

Shared collections vs skip lists

| Coll | Dump | Import | Delete |
|---|---|---|---|
| system.* | skip | skip | skip |
| appAudit, version-history, test, exto-modules-old | skip | skip (appAudit always) | full wipe (when NoExclusions is on, as the CLI sets it) |
| user, user-session | include | special path (reuse / remap) | tenant-ref strip, not drop |
| customer | include | identity rewrite (name, code) | wipe tenant rows |
| everything else | tenant-filtered | full idempotent pipeline | wipe tenant rows |

Archive layout

Zip produced by tenant-dump, consumed by tenant-import:

_metadata.json                      # tenant + source DB metadata
<db>/<coll>.jsonl                   # one Mongo Extended JSON doc per line
<db>/<coll>.indexes.jsonl           # one index spec per line (no _id_, no v/ns)
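
A round-trip sketch of the jsonl-in-zip layout using only the standard library; plain encoding/json stands in here for the tool's Mongo Extended JSON encoding:

```go
package main

import (
	"archive/zip"
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// writeArchive marshals docs one-per-line into a single <db>/<coll>.jsonl
// zip entry. Sketch only: the real dump writes Extended JSON and metadata.
func writeArchive(entry string, docs []map[string]any) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, _ := zw.Create(entry)
	for _, d := range docs {
		line, _ := json.Marshal(d)
		w.Write(append(line, '\n'))
	}
	zw.Close()
	return buf.Bytes()
}

// readDocs streams the entry back, decoding one doc per line.
func readDocs(archive []byte, entry string) []map[string]any {
	zr, _ := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	var out []map[string]any
	for _, f := range zr.File {
		if f.Name != entry {
			continue
		}
		rc, _ := f.Open()
		sc := bufio.NewScanner(rc)
		for sc.Scan() {
			var d map[string]any
			if json.Unmarshal(sc.Bytes(), &d) == nil {
				out = append(out, d)
			}
		}
		rc.Close()
	}
	return out
}

func main() {
	a := writeArchive("mydb/tasks.jsonl", []map[string]any{
		{"_id": "a1", "tenantId": "DahyCZM"},
		{"_id": "a2", "tenantId": "DahyCZM"},
	})
	fmt.Println(len(readDocs(a, "mydb/tasks.jsonl"))) // 2
}
```

One-doc-per-line means both sides can stream: the importer never has to hold a whole collection in memory.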

Metadata shape (import.go:23):

json
{
  "tenantId":   "DahyCZM",
  "tenantCode": "DahyCZM",
  "tenantName": "American Transmission Company",
  "dbName":     "exto-core-plat-prod-01",
  "format":     "jsonl",
  "exportedAt": "2026-04-16T22:16:38Z"
}

tenant-dump

Source: cmd/tenant-dump/main.go, runner: migrate/dump.go.

Export one tenant's data from a live MongoDB instance into a portable zip.

Usage

bash
./bin/tenant-dump \
  --mongo-uri "mongodb://host:27017/exto_prod" \
  --tenant-code DahyCZM \
  --tenant-name "Acme Corp" \
  --output acme-dump.zip

Flags

| Flag | Short | Required | Default | Description |
|---|---|---|---|---|
| --mongo-uri | | yes | | Mongo URI, DB in path |
| --tenant-code | | yes | | 7-char code |
| --tenant-name | | no | | Label, metadata only |
| --output | -o | no | <name>_<code>_<ts>.zip | Zip path |
| --dry-run | | no | false | Count docs, no write |
| --verbose | -v | no | false | Info logs |
| --report | -r | no | | JSON report path |

What gets written

  • _metadata.json — tenant + source DB.
  • <db>/<coll>.jsonl — per coll, Extended JSON, one doc per line.
  • <db>/<coll>.indexes.jsonl — index specs, skips _id_, strips v / ns.

Older archives without index files still import (docs-only fallback).

Filter rules per collection

  • system.* → skip.
  • Tenant-namespaced (e.g. custom_<code>_*) → include all docs, no filter.
  • Excluded list (appAudit, version-history, test, exto-modules-old) → skip.
  • Everything else → filter via CollectionFilter(name, tenantCode) (matches tenantId / tenantID / tenantIDs / byTenant.<code> depending on coll).
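
The four rules collapse into one dispatch. A sketch (dispositionForDump is a made-up name; the excluded set is copied from the list above):

```go
package main

import (
	"fmt"
	"strings"
)

// Collections never dumped (see the filter rules above).
var excluded = map[string]bool{
	"appAudit": true, "version-history": true, "test": true, "exto-modules-old": true,
}

// dispositionForDump decides what happens to one collection:
// skip it, dump every doc, or dump with a tenant filter.
func dispositionForDump(coll, code string) string {
	switch {
	case strings.HasPrefix(coll, "system."):
		return "skip"
	case excluded[coll]:
		return "skip"
	case strings.HasPrefix(coll, "custom_"+code+"_"),
		strings.HasPrefix(coll, "x_"+code+"_"),
		strings.HasPrefix(coll, "x_mt_"+code+"_"),
		strings.HasPrefix(coll, "cx_s_"+code+"_"):
		return "all-docs" // tenant-namespaced: no filter needed
	default:
		return "tenant-filter" // CollectionFilter(name, tenantCode)
	}
}

func main() {
	for _, c := range []string{"system.profile", "appAudit", "custom_DahyCZM_field", "tasks"} {
		fmt.Println(c, "->", dispositionForDump(c, "DahyCZM"))
	}
}
```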

Flow

mermaid
flowchart TD
  A[Start] --> B[Connect Mongo, list collections]
  B --> C{For each coll}
  C --> D{"system.* or excluded?"}
  D -- yes --> C
  D -- no --> E{tenant-namespaced for code?}
  E -- yes --> F["filter = bson.M{}"]
  E -- no --> G[filter = tenantId/IDs/byTenant match]
  F --> H[Count docs]
  G --> H
  H --> I{count == 0?}
  I -- yes --> J[Mark skipped, next]
  I -- no --> K{dry-run?}
  K -- yes --> L[Record count, next]
  K -- no --> M[Create zip entry docs.jsonl]
  M --> N[Cursor.Find w/ filter, write ExtJSON line per doc]
  N --> O[List indexes, drop _id_, write indexes.jsonl]
  O --> C
  C -- done --> P[Write _metadata.json]
  P --> Q[Close zip, print summary]
  Q --> R{HadErrors?}
  R -- yes --> S[exit 1]
  R -- no --> T[exit 0]

Decision tree — what happens to one collection

mermaid
flowchart TD
  X[Collection name] --> Y{system.*?}
  Y -- yes --> Z1[skip silently]
  Y -- no --> AA{starts with custom_/x_/x_mt_/cx_s_ + tenantCode?}
  AA -- yes --> BB[dump all docs unfiltered + indexes]
  AA -- no --> CC{in dumpExcludedCollections?}
  CC -- yes --> Z2[skip]
  CC -- no --> DD[dump docs matching tenant filter + indexes]

tenant-import

Source: cmd/tenant-import/main.go, runner: migrate/import.go. Idempotent design: tenant-import-idempotent.md.

Import a dump archive into a target DB, rewriting tenant refs and handling _id collisions.

Usage

bash
./bin/tenant-import \
  --zip acme-dump.zip \
  --mongo-uri "mongodb://host:27017/targetdb" \
  --tenant-code NEWTENT \
  --tenant-name "Acme Corp (QA)"

Flags

| Flag | Short | Required | Default | Description |
|---|---|---|---|---|
| --zip | -z | yes | | Archive path |
| --mongo-uri | | yes | | Target URI, DB in path |
| --tenant-code | | yes | | New tenant code |
| --tenant-name | | yes | | New tenant name (avoids customer.name_1 clash) |
| --batch-size | | no | 1000 | Bulk batch size |
| --remap | -m | no | | User email remap JSON |
| --reuse-existing-users | | no | false | Dedup against existing target users |
| --dry-run | | no | false | Parse + classify, no writes |
| --verbose | -v | no | false | Info logs |
| --report | -r | no | | JSON report path |

Both --tenant-code and --tenant-name are hard requirements — without the rewrite, the target's unique indexes on customer.name_1 / customer.code_1 would collide.

Stages

  1. Read metadata — _metadata.json out of the zip.

  2. Preflight unique — abort if target customer already has a doc with matching code or name (preflight.go).

  3. Index preflight — apply each collection's indexes before data writes. Unique-index failure → abort (target schema not compatible). Non-unique failure → log, continue.

  4. User pre-scan (if --reuse-existing-users or --remap) — scan target user coll across all tenants, build email → _id map.

  5. Pass 1 — classify — for every non-skip coll, stream source _ids, $in against target in 10 000-id chunks, bucket each into case 1 / 2 / 3:

     | Case | Target state | Action |
     |---|---|---|
     | 1 | exists under target tenant | keep _id, upsert is no-op or update |
     | 2 | does not exist | keep _id, upsert inserts |
     | 3 | exists under different tenant | allocate fresh _id, add srcHex → newOID to global idMap, mark for swap |
  6. Pass 2 — stream + write — per coll, stream docs, apply path-specific transform, upsert in batches of --batch-size. Three paths:

    • User + reuse/remap — email dedupe, may skip insert + record target OID into idMap for ref rewrite.
    • Skip list (user w/o remap, user-session, customer) — fresh _id, blind insert. customer gets name + code forced to target values. user-session in importSkipCollections is silently dropped.
    • All others — apply idMap _id swap if case 3, deep-walk OIDs, rewrite user refs, upsert {_id: …}.
  7. Report — JSON summary incl. userRemapDetails[] with per-user action (reused / inserted / inserted_renamed / remapped).
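
The Pass 1 bucketing from step 5 can be sketched as pure logic, with the target's id → tenant ownership abstracted into a map (a real run resolves it via $in queries in 10 000-id chunks):

```go
package main

import "fmt"

// chunk splits ids into batches for $in queries (the tool uses 10 000).
func chunk(ids []string, n int) [][]string {
	var out [][]string
	for len(ids) > n {
		out = append(out, ids[:n])
		ids = ids[n:]
	}
	if len(ids) > 0 {
		out = append(out, ids)
	}
	return out
}

// classify buckets a source _id into case 1/2/3 given which tenant owns
// that _id in the target ("" absent from the map means: not present).
func classify(targetTenant map[string]string, id, tgtCode string) int {
	owner, exists := targetTenant[id]
	switch {
	case !exists:
		return 2 // keep _id, upsert inserts
	case owner == tgtCode:
		return 1 // keep _id, upsert is a no-op or update
	default:
		return 3 // collision with another tenant: allocate fresh _id
	}
}

func main() {
	target := map[string]string{"a": "NEWTENT", "b": "OTHER"}
	for _, id := range []string{"a", "b", "c"} {
		fmt.Println(id, "-> case", classify(target, id, "NEWTENT"))
	}
	fmt.Println(len(chunk(make([]string, 25000), 10000))) // 3 batches
}
```

Only case 3 touches the global idMap; cases 1 and 2 are what make a re-run of the same import idempotent.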

Rewrites applied to each doc

| Rewrite | Scope | Trigger |
|---|---|---|
| rewriteTenantRefs | tenantId, tenantID, tenantIDs[], byTenant.<code> | non-namespaced coll |
| ensureTenantID | backfill scalar tenantId where schema requires | non-user/customer, non-namespaced |
| _id swap | case-3 docs only | Pass 1 marked |
| rewriteDocObjectIDs | every field at every depth — replace OID values found in idMap | all non-skip colls |
| rewriteDocUserRefs | known email/OID paths + customFields subtree | non-skip colls + user-session/customer carriers |
| customer.name / customer.code | force target values | customer only |

Index preflight

  • Read each <coll>.indexes.jsonl entry.
  • Rename coll if tenant-namespaced (<srcCode> → <tgtCode>).
  • Try batched createIndexes first (atomic).
  • On batch reject → per-index fallback; non-conflicting ones still land.
  • Unique-index failure → hard abort, migration is not safe.
  • Count = numIndexesAfter - numIndexesBefore (identical specs count as 0, no error).

User remap

--remap rewrites all user email references across every collection. JSON:

json
{
  "users": [
    { "from": "alice@oldcorp.com", "to": "alice@newcorp.com" },
    { "from": "bob@oldcorp.com",   "to": "bob@newcorp.com" }
  ],
  "default": "testuser@newcorp.com"
}
  • users — explicit per-email map.
  • default — fallback for any source email not in users.

Effective email resolution:

  1. If in users[] → use mapped value.
  2. Else if default set → use default.
  3. Else → keep source email.
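
The three resolution rules translate directly into code. A sketch with an assumed Remap struct mirroring the JSON shape:

```go
package main

import "fmt"

// Remap mirrors the --remap JSON: explicit per-email map plus fallback.
type Remap struct {
	Users   map[string]string // from -> to
	Default string            // fallback for unmapped emails; "" = none
}

// resolveEmail applies the three resolution rules, in order.
func resolveEmail(r Remap, src string) string {
	if to, ok := r.Users[src]; ok {
		return to // 1. explicit mapping wins
	}
	if r.Default != "" {
		return r.Default // 2. fallback for everyone else
	}
	return src // 3. keep the source email
}

func main() {
	r := Remap{
		Users:   map[string]string{"alice@oldcorp.com": "alice@newcorp.com"},
		Default: "testuser@newcorp.com",
	}
	fmt.Println(resolveEmail(r, "alice@oldcorp.com")) // alice@newcorp.com
	fmt.Println(resolveEmail(r, "carol@oldcorp.com")) // testuser@newcorp.com
}
```

Note that with a default set, every unmapped source email collapses onto one address, which is exactly the "all users → one test user" recipe shown further down.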

Rewrite scope:

  • Tier 1 — known paths (top-level, nested, OID arrays): createdBy, updatedBy, deletedBy, owner, recepient, userId, doc.createdBy, doc.updatedBy, workflowMeta.lastStepCompletedUser, workflowMeta.ballInCourt, space.spaceAdmins, responsibleUserNames, checklist[].createdBy, checklist[].updatedBy, checklist.items[].filledBy.userName, workflowMeta.participants, workflowMeta.responsibleUsers, responsibleUsers.users.
  • Tier 2 — deep-walk: recursively scans customFields subtree.
  • User coll: rewrites email, username.

--reuse-existing-users

Prevents leaking source-origin users into the target DB.

  • Off (default): every incoming user inserted. --remap still rewrites emails per config.
  • On: pre-scan target user across all tenants (email & username are practically unique DB-wide). For each incoming user:
    • Resolve effective email (per --remap rules).
    • If effective email already exists in target → do not insert; grant the existing user membership in the new tenant (tenantIDs + byTenant.<newCode>); rewire all refs to existing _id.
    • If effective email does not exist → insert with effective email (explicit remap honored literally even if it looks prod-shaped).
  • Multiple source users collapsing to the same effective email share one target user.

Example — target has raj@qa.com, test@qa.com; remap alice@prod.com → test@qa.com, default throwaway@qa.com (not in target); dump has raj@qa.com, alice@prod.com, bob@prod.com:

| Incoming | Effective | Action |
|---|---|---|
| raj@qa.com | raj@qa.com | reuse QA's raj, skip insert |
| alice@prod.com | test@qa.com | reuse QA's test, skip insert |
| bob@prod.com | throwaway@qa.com | insert one throwaway@qa.com |

Flow

mermaid
flowchart TD
  A[Start] --> B[Open zip, read _metadata.json]
  B --> C[Connect target]
  C --> D{dry-run?}
  D -- no --> E[Preflight customer.code + customer.name unique]
  E --> F{clash?}
  F -- yes --> FA[abort]
  F -- no --> G[Index preflight per coll: batched createIndexes, per-index fallback]
  G --> H{unique idx failed?}
  H -- yes --> HA[abort]
  H -- no --> I
  D -- yes --> I[Pre-scan target users if reuse/remap]
  I --> J[Pass 1: classify each coll via $in chunks → idMap + remapSets]
  J --> K{For each coll, stream docs}
  K --> L{coll == user AND reuse/remap?}
  L -- yes --> M[Email-dedupe path: reuse or insert, update idMap/emailMap]
  L -- no --> N{coll in skip list user-wo-remap/user-session/customer?}
  N -- yes --> O[Fresh _id insert; customer rewrites name/code]
  N -- no --> P[Generic path: _id swap if case3, deep-walk OIDs, user refs, upsert by _id]
  M --> Q[Flush batch if full]
  O --> Q
  P --> Q
  Q --> K
  K -- done --> R[Finalize report: counts, userRemapDetails, index failures]
  R --> S{HadErrors?}
  S -- yes --> T[exit 1]
  S -- no --> U[exit 0]

Decision tree — per source _id (classify)

mermaid
flowchart TD
  A[Source _id X] --> B{exists in target coll?}
  B -- no --> C[Case 2: keep _id, upsert inserts]
  B -- yes --> D{target doc tenantId == targetCode?}
  D -- yes --> E[Case 1: keep _id, upsert no-op or update]
  D -- no --> F[Case 3: allocate new OID, record X → newOID in idMap, mark remapSet]

Decision tree — per incoming user (reuse path)

mermaid
flowchart TD
  U[Incoming user doc] --> A{in explicit remap users[]?}
  A -- yes --> B[effective = mapped email]
  A -- no --> C{default set?}
  C -- yes --> D[effective = default]
  C -- no --> E[effective = source email]
  B --> F{effective exists in target user coll?}
  D --> F
  E --> F
  F -- yes --> G[Action reused: skip insert, rewire refs to existing _id, grant tenant membership]
  F -- no --> H{effective != source?}
  H -- yes --> I[Action inserted_renamed: insert new, email=effective]
  H -- no --> J[Action inserted: insert new as-is]

Examples

bash
# Dry run to validate archive
./bin/tenant-import -z acme.zip \
  --mongo-uri "mongodb://host:27017/qa" \
  --tenant-code NEWTENT --tenant-name "Acme QA" \
  --dry-run -v

# Prod → staging with reuse + remap
./bin/tenant-import -z prod-DahyCZM.zip \
  --mongo-uri "mongodb+srv://.../ex-core-plat-stg-01" \
  --tenant-code DahyCZM --tenant-name "ATC" \
  --reuse-existing-users \
  --remap user-remap.json \
  --report stg-import-report.json -v

# All users → one test user
echo '{"default":"test@company.com"}' > remap.json
./bin/tenant-import -z acme.zip \
  --mongo-uri "mongodb://host:27017/qa" \
  --tenant-code NEWTENT --tenant-name "Acme QA" \
  -m remap.json

tenant-delete

Source: cmd/tenant-delete/main.go, runner: migrate/delete.go.

Wipe one tenant's data. Single-tenant or batch via JSON file.

Usage

bash
./bin/tenant-delete \
  --mongo-uri "mongodb://host:27017/mydb" \
  --tenant-code DahyCZM \
  --verify

Flags

| Flag | Short | Required | Default | Description |
|---|---|---|---|---|
| --mongo-uri | | yes | | Mongo URI, DB in path |
| --tenant-code | | yes* | | 7-char code (*or --file) |
| --file | -f | yes* | | JSON array of {tenantId, name} |
| --dry-run | | no | false | Count, no delete |
| --verbose | -v | no | false | Info logs |
| --report | -r | no | | JSON report path |
| --verify | | no | false | Post-delete verification scan |
| --yes | -y | no | false | Skip confirmation prompt |

Note: CLI passes NoExclusions: true, so it wipes even appAudit, test, etc. for the tenant. Only system.* are always skipped.

Batch file

json
[
  { "tenantId": "X4KM92P", "name": "Acme Corp" },
  { "tenantId": "B7TN31Q", "name": "Beta Inc" }
]

Per-collection behaviour

| Collection | Behaviour |
|---|---|
| system.* | skip |
| tenant-namespaced (custom_<code>_*, x_<code>_*, …) | drop collection |
| user, user-session | $pull tenantIDs, $unset byTenant.<code>; then delete users whose tenantIDs is empty/missing (except username == "dev") |
| anything else | DeleteMany matching tenantFilter(code) |
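
The user / user-session handling boils down to an UpdateMany with $pull/$unset followed by a DeleteMany for orphaned users. A sketch of just the update and filter documents, written as plain maps rather than the driver's bson.M:

```go
package main

import "fmt"

// userStripUpdate is the membership-strip update applied to user docs
// for one tenant: pull the code from tenantIDs, drop the byTenant entry.
func userStripUpdate(code string) map[string]any {
	return map[string]any{
		"$pull":  map[string]any{"tenantIDs": code},
		"$unset": map[string]any{"byTenant." + code: ""},
	}
}

// orphanUserFilter matches users left with no tenant membership at all,
// sparing the special "dev" account.
func orphanUserFilter() map[string]any {
	return map[string]any{
		"$or": []map[string]any{
			{"tenantIDs": map[string]any{"$size": 0}},
			{"tenantIDs": map[string]any{"$exists": false}},
		},
		"username": map[string]any{"$ne": "dev"},
	}
}

func main() {
	fmt.Println(userStripUpdate("DahyCZM"))
	fmt.Println(orphanUserFilter())
}
```

The exact shape of the real filter may differ (for example, how "empty" is tested), but the two-step strip-then-prune order is what keeps multi-tenant users alive while removing single-tenant ones.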

Confirmation

Destructive by default. Unless -y or --dry-run, prompts Type 'yes' to confirm:. Batch prompt lists all tenants first.

Flow

mermaid
flowchart TD
  A[Start] --> B{tenantCode or file?}
  B -- file --> C[Parse JSON array]
  B -- code --> D[single-entry list]
  C --> E{dry-run or -y?}
  D --> E
  E -- no --> F[Print warning + prompt 'yes']
  F --> G{confirmed?}
  G -- no --> GA[Abort]
  G -- yes --> H
  E -- yes --> H[For each tenant]
  H --> I[List all collections]
  I --> J{For each coll}
  J --> K{tenant-namespaced for code?}
  K -- yes --> L[Drop collection]
  K -- no --> M{system.*?}
  M -- yes --> J
  M -- no --> N{user/user-session?}
  N -- yes --> O["$pull tenantIDs + $unset byTenant; delete users with no tenantIDs (keep 'dev')"]
  N -- no --> P[DeleteMany tenant filter]
  L --> J
  O --> J
  P --> J
  J -- done --> Q{--verify?}
  Q -- yes --> R[Run RunVerifyWithDB]
  Q -- no --> S
  R --> S[Collect report]
  S --> H
  H -- all done --> T[Write report, print summary]
  T --> U{any errors?}
  U -- yes --> V[exit 1]
  U -- no --> W[exit 0]

Decision tree — what happens to one collection

mermaid
flowchart TD
  X[Collection name] --> Y{tenant-namespaced for code?}
  Y -- yes --> Z1[DROP coll]
  Y -- no --> A{system.*?}
  A -- yes --> Z2[skip]
  A -- no --> B{user or user-session?}
  B -- yes --> C["pull tenantIDs, unset byTenant.code; delete empty users except 'dev'"]
  B -- no --> D[DeleteMany where tenantId/tenantIDs/byTenant.code matches]

Examples

bash
# Dry-run single tenant
./bin/tenant-delete --mongo-uri "…/mydb" --tenant-code X4KM92P --dry-run

# Delete with verification, no prompt
./bin/tenant-delete --mongo-uri "…/mydb" --tenant-code X4KM92P --verify -y

# Batch delete from file, write report
./bin/tenant-delete --mongo-uri "…/qa" -f delete-qa1-tenants.json \
  -r delete-report.json -y

tenant-verify

Source: cmd/tenant-verify/main.go, runner: migrate/verify.go.

Scan all collections for any remaining tenant data. Run after tenant-delete (or as --verify inline).

Usage

bash
./bin/tenant-verify \
  --mongo-uri "mongodb://host:27017/mydb" \
  --tenant-code DahyCZM

Flags

| Flag | Short | Required | Default | Description |
|---|---|---|---|---|
| --mongo-uri | | yes | | Mongo URI, DB in path |
| --tenant-code | | yes* | | 7-char code (*or --file) |
| --file | -f | yes* | | JSON array of tenants to batch verify |
| --verbose | -v | no | false | Info logs |
| --report | -r | no | | JSON report path |

Checks per collection

| Collection | Check |
|---|---|
| system.* | skipped |
| tenant-namespaced (custom_<code>_*, etc.) | presence alone = FAIL; coll should have been dropped |
| everything else | count docs matching CollectionFilter(name, code) (scalar tenantId + byTenant.<code> for user/session) |
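
The per-collection verdict can be sketched as follows (verdict is an illustrative name; matchCount stands for the result of the tenant-filter count query):

```go
package main

import (
	"fmt"
	"strings"
)

// verdict mirrors the checks above: tenant-namespaced collections fail
// by existing at all; everything else fails on any matching doc.
func verdict(coll, code string, matchCount int64) string {
	if strings.HasPrefix(coll, "system.") {
		return "skip"
	}
	namespaced := strings.HasPrefix(coll, "custom_"+code+"_") ||
		strings.HasPrefix(coll, "x_"+code+"_") ||
		strings.HasPrefix(coll, "x_mt_"+code+"_") ||
		strings.HasPrefix(coll, "cx_s_"+code+"_")
	if namespaced {
		return "FAIL: tenant-prefixed collection exists"
	}
	if matchCount > 0 {
		return fmt.Sprintf("FAIL: %d docs with tenant refs", matchCount)
	}
	return "pass"
}

func main() {
	fmt.Println(verdict("custom_DahyCZM_field", "DahyCZM", 0))
	fmt.Println(verdict("tasks", "DahyCZM", 3))
	fmt.Println(verdict("tasks", "DahyCZM", 0))
}
```

Any non-pass verdict on any collection flips the overall result to FAILED and exits 1.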

Output

  • PASSED — no findings across all colls.
  • FAILED — lists collection, doc count, type (tenant-prefixed collection exists / docs with tenantID/tenantId/tenantIDs / user/session with tenant refs). Process exits 1.

Flow

mermaid
flowchart TD
  A[Start] --> B[Connect, list collections]
  B --> C{For each coll}
  C --> D{system.*?}
  D -- yes --> C
  D -- no --> E{tenant-namespaced for code?}
  E -- yes --> F["Count all docs → FAIL if coll exists"]
  E -- no --> G[Count docs matching tenant filter]
  F --> H{count > 0 OR coll exists?}
  G --> H
  H -- yes --> I[Append finding, Passed=false]
  H -- no --> C
  I --> C
  C -- done --> J{Passed?}
  J -- yes --> K[Print PASSED, exit 0]
  J -- no --> L[Print FAILED + findings, exit 1]

End-to-end lifecycle

mermaid
flowchart LR
  subgraph SOURCE["Source DB (prod)"]
    S[(MongoDB)]
  end
  subgraph TARGET["Target DB (staging/QA)"]
    T[(MongoDB)]
  end

  S -- tenant-dump --> Z["<code>.zip _metadata + .jsonl + .indexes.jsonl"]
  Z -- tenant-import --> T
  T -- tenant-verify --> VR[Verify report]
  T -. tenant-delete .-> T
  T -- tenant-verify --> VR

Common recipes:

bash
# 1. Clone prod tenant into staging
./bin/tenant-dump  --mongo-uri "$PROD" --tenant-code DahyCZM --tenant-name "ATC" -o prod-DahyCZM.zip -v
./bin/tenant-import -z prod-DahyCZM.zip --mongo-uri "$STG" \
  --tenant-code DahyCZM --tenant-name "ATC" \
  --reuse-existing-users --remap user-remap.json -v

# 2. Remove tenant from QA + confirm
./bin/tenant-delete --mongo-uri "$QA" --tenant-code X4KM92P --verify -y
./bin/tenant-verify --mongo-uri "$QA" --tenant-code X4KM92P

# 3. Batch cleanup QA
./bin/tenant-delete --mongo-uri "$QA" -f delete-qa1-tenants.json -r delete-report.json -y
./bin/tenant-verify --mongo-uri "$QA" -f delete-qa1-tenants.json -r verify-report.json