
Tenant CLI Tools

Command-line utilities for tenant data migration, backup, cleanup, and verification against a MongoDB database. All tools share one package, internal/migrate/, and speak the same archive format.

Build all tools:

bash
go build -o bin/ ./cmd/tenant-dump ./cmd/tenant-import ./cmd/tenant-delete ./cmd/tenant-verify

Contents

  1. Shared concepts
  2. tenant-dump — export
  3. tenant-import — ingest
  4. tenant-delete — wipe
  5. tenant-verify — confirm wipe
  6. End-to-end lifecycle

Shared concepts

Tenant code

Every tenant has a 7-char short code (e.g. DahyCZM). Stored in MongoDB as tenantId / tenantID / first element of tenantIDs, plus the byTenant.<code> key on user docs. Same code also appears inside tenant-namespaced collection names.
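The combined match can be pictured as a single $or filter. A minimal sketch, using plain Go maps in place of the driver's bson.M (tenantFilter is a hypothetical stand-in for the real helper):

```go
package main

import "fmt"

// tenantFilter builds a Mongo-style $or filter covering every
// tenant-reference shape described above. Illustrative only; the real
// helpers build per-collection filters with the driver's bson.M.
func tenantFilter(code string) map[string]any {
	return map[string]any{
		"$or": []map[string]any{
			{"tenantId": code},  // scalar, lowercase d
			{"tenantID": code},  // scalar, uppercase D
			{"tenantIDs": code}, // array membership match
			{"byTenant." + code: map[string]any{"$exists": true}}, // user docs
		},
	}
}

func main() {
	f := tenantFilter("DahyCZM")
	fmt.Println(len(f["$or"].([]map[string]any))) // 4 match clauses
}
```

The real filters reportedly vary by collection (user docs match byTenant.<code>, others the scalar/array fields), so treat this as the union of shapes rather than the exact per-collection query.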

Tenant-namespaced collections

Some collections embed the tenant code in their name. Recognised prefixes (see collections.go:20):

| Prefix | Example |
|---|---|
| custom_<code>_ | custom_DahyCZM_field |
| x_<code>_ | x_DahyCZM_foo |
| x_mt_<code>_ | x_mt_DahyCZM_bar |
| cx_s_<code>_ | cx_s_DahyCZM_baz |

Dump includes them whole (no tenant filter). Import renames the prefix (<srcCode> → <tgtCode>). Delete drops them entirely.
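
How detection and renaming might look, assuming prefix matching per the table above (function names here are illustrative, not the actual collections.go API):

```go
package main

import (
	"fmt"
	"strings"
)

// Recognised tenant-namespaced prefixes (mirrors the table above).
var nsPrefixes = []string{"custom_", "x_mt_", "cx_s_", "x_"}

// isTenantNamespaced reports whether coll embeds code under a known prefix.
func isTenantNamespaced(coll, code string) bool {
	for _, p := range nsPrefixes {
		if strings.HasPrefix(coll, p+code+"_") {
			return true
		}
	}
	return false
}

// renameForTarget swaps the embedded source code for the target code,
// e.g. custom_DahyCZM_field -> custom_NEWTENT_field.
func renameForTarget(coll, srcCode, tgtCode string) string {
	return strings.Replace(coll, srcCode, tgtCode, 1)
}

func main() {
	fmt.Println(isTenantNamespaced("custom_DahyCZM_field", "DahyCZM")) // true
	fmt.Println(renameForTarget("custom_DahyCZM_field", "DahyCZM", "NEWTENT"))
	// custom_NEWTENT_field
}
```

Because the tenant code is embedded between fixed delimiters, a single first-occurrence replace is enough for the rename.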

Shared collections vs skip lists

| Coll | Dump | Import | Delete |
|---|---|---|---|
| system.* | skip | skip | skip |
| appAudit, version-history, test, exto-modules-old | skip | skip (appAudit always) | full wipe (when NoExclusions is on, as the CLI sets it) |
| user, user-session | include | special path (reuse / remap) | tenant-ref strip, not drop |
| customer | include | identity rewrite (name, code) | wipe tenant rows |
| everything else | tenant-filtered | full idempotent pipeline | wipe tenant rows |

Archive layout

Zip produced by tenant-dump, consumed by tenant-import:

_metadata.json                      # tenant + source DB metadata
<db>/<coll>.jsonl                   # one Mongo Extended JSON doc per line
<db>/<coll>.indexes.jsonl           # one index spec per line (no _id_, no v/ns)
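
A round-trip sketch of the jsonl-in-zip layout using only the standard library; plain encoding/json stands in here for the tool's Mongo Extended JSON encoding:

```go
package main

import (
	"archive/zip"
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// writeArchive marshals docs one-per-line into a single <db>/<coll>.jsonl
// zip entry. Sketch only: the real dump writes Extended JSON and metadata.
func writeArchive(entry string, docs []map[string]any) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, _ := zw.Create(entry)
	for _, d := range docs {
		line, _ := json.Marshal(d)
		w.Write(append(line, '\n'))
	}
	zw.Close()
	return buf.Bytes()
}

// readDocs streams the entry back, decoding one doc per line.
func readDocs(archive []byte, entry string) []map[string]any {
	zr, _ := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	var out []map[string]any
	for _, f := range zr.File {
		if f.Name != entry {
			continue
		}
		rc, _ := f.Open()
		sc := bufio.NewScanner(rc)
		for sc.Scan() {
			var d map[string]any
			if json.Unmarshal(sc.Bytes(), &d) == nil {
				out = append(out, d)
			}
		}
		rc.Close()
	}
	return out
}

func main() {
	a := writeArchive("mydb/tasks.jsonl", []map[string]any{
		{"_id": "a1", "tenantId": "DahyCZM"},
		{"_id": "a2", "tenantId": "DahyCZM"},
	})
	fmt.Println(len(readDocs(a, "mydb/tasks.jsonl"))) // 2
}
```

One-doc-per-line means both sides can stream: the importer never has to hold a whole collection in memory.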

Metadata shape (import.go:23):

json
{
  "tenantId":   "DahyCZM",
  "tenantCode": "DahyCZM",
  "tenantName": "American Transmission Company",
  "dbName":     "exto-core-plat-prod-01",
  "format":     "jsonl",
  "exportedAt": "2026-04-16T22:16:38Z"
}

tenant-dump

Source: cmd/tenant-dump/main.go, runner: migrate/dump.go.

Export one tenant's data from a live MongoDB instance into a portable zip.

Usage

bash
./bin/tenant-dump \
  --mongo-uri "mongodb://host:27017/exto_prod" \
  --tenant-code DahyCZM \
  --tenant-name "Acme Corp" \
  --output acme-dump.zip

Flags

| Flag | Short | Required | Default | Description |
|---|---|---|---|---|
| --mongo-uri | | yes | | Mongo URI, DB in path |
| --tenant-code | | yes | | 7-char code |
| --tenant-name | | no | | Label, metadata only |
| --output | -o | no | <name>_<code>_<ts>.zip | Zip path |
| --dry-run | | no | false | Count docs, no write |
| --verbose | -v | no | false | Info logs |
| --report | -r | no | | JSON report path |

What gets written

  • _metadata.json — tenant + source DB.
  • <db>/<coll>.jsonl — per coll, Extended JSON, one doc per line.
  • <db>/<coll>.indexes.jsonl — index specs, skips _id_, strips v / ns.

Older archives without index files still import (docs-only fallback).

Filter rules per collection

  • system.* → skip.
  • Tenant-namespaced (e.g. custom_<code>_*) → include all docs, no filter.
  • Excluded list (appAudit, version-history, test, exto-modules-old) → skip.
  • Everything else → filter via CollectionFilter(name, tenantCode) (matches tenantId / tenantID / tenantIDs / byTenant.<code> depending on coll).
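
The four rules collapse into one dispatch. A sketch (dispositionForDump is a made-up name; the excluded set is copied from the list above):

```go
package main

import (
	"fmt"
	"strings"
)

// Collections never dumped (see the filter rules above).
var excluded = map[string]bool{
	"appAudit": true, "version-history": true, "test": true, "exto-modules-old": true,
}

// dispositionForDump decides what happens to one collection:
// skip it, dump every doc, or dump with a tenant filter.
func dispositionForDump(coll, code string) string {
	switch {
	case strings.HasPrefix(coll, "system."):
		return "skip"
	case excluded[coll]:
		return "skip"
	case strings.HasPrefix(coll, "custom_"+code+"_"),
		strings.HasPrefix(coll, "x_"+code+"_"),
		strings.HasPrefix(coll, "x_mt_"+code+"_"),
		strings.HasPrefix(coll, "cx_s_"+code+"_"):
		return "all-docs" // tenant-namespaced: no filter needed
	default:
		return "tenant-filter" // CollectionFilter(name, tenantCode)
	}
}

func main() {
	for _, c := range []string{"system.profile", "appAudit", "custom_DahyCZM_field", "tasks"} {
		fmt.Println(c, "->", dispositionForDump(c, "DahyCZM"))
	}
}
```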

Flow

mermaid
flowchart TD
  A[Start] --> B[Connect Mongo, list collections]
  B --> C{For each coll}
  C --> D{"system.* or excluded?"}
  D -- yes --> C
  D -- no --> E{tenant-namespaced for code?}
  E -- yes --> F["filter = bson.M{}"]
  E -- no --> G[filter = tenantId/IDs/byTenant match]
  F --> H[Count docs]
  G --> H
  H --> I{count == 0?}
  I -- yes --> J[Mark skipped, next]
  I -- no --> K{dry-run?}
  K -- yes --> L[Record count, next]
  K -- no --> M[Create zip entry docs.jsonl]
  M --> N[Cursor.Find w/ filter, write ExtJSON line per doc]
  N --> O[List indexes, drop _id_, write indexes.jsonl]
  O --> C
  C -- done --> P[Write _metadata.json]
  P --> Q[Close zip, print summary]
  Q --> R{HadErrors?}
  R -- yes --> S[exit 1]
  R -- no --> T[exit 0]

Decision tree — what happens to one collection

mermaid
flowchart TD
  X[Collection name] --> Y{system.*?}
  Y -- yes --> Z1[skip silently]
  Y -- no --> AA{starts with custom_/x_/x_mt_/cx_s_ + tenantCode?}
  AA -- yes --> BB[dump all docs unfiltered + indexes]
  AA -- no --> CC{in dumpExcludedCollections?}
  CC -- yes --> Z2[skip]
  CC -- no --> DD[dump docs matching tenant filter + indexes]

tenant-import

Source: cmd/tenant-import/main.go, runner: migrate/import.go. Idempotent design: tenant-import-idempotent.md.

Import a dump archive into a target DB, rewriting tenant refs and handling _id collisions.

Usage

bash
./bin/tenant-import \
  --zip acme-dump.zip \
  --mongo-uri "mongodb://host:27017/targetdb" \
  --tenant-code NEWTENT \
  --tenant-name "Acme Corp (QA)"

Flags

| Flag | Short | Required | Default | Description |
|---|---|---|---|---|
| --zip | -z | yes | | Archive path |
| --mongo-uri | | yes | | Target URI, DB in path |
| --tenant-code | | yes | | New tenant code |
| --tenant-name | | yes | | New tenant name (avoids customer.name_1 clash) |
| --batch-size | | no | 1000 | Bulk batch size |
| --remap | -m | no | | User email remap JSON |
| --reuse-existing-users | | no | false | Dedup against existing target users |
| --dry-run | | no | false | Parse + classify, no writes |
| --verbose | -v | no | false | Info logs |
| --report | -r | no | | JSON report path |

Both --tenant-code and --tenant-name are hard requirements — without the rewrite, the target's unique indexes on customer.name_1 / customer.code_1 would collide.

Stages

  1. Read metadata — _metadata.json out of the zip.

  2. Preflight unique — abort if target customer already has a doc with matching code or name (preflight.go).

  3. Index preflight — apply each collection's indexes before data writes. Unique-index failure → abort (target schema not compatible). Non-unique failure → log, continue.

  4. User pre-scan (if --reuse-existing-users or --remap) — scan target user coll across all tenants, build email → _id map.

  5. Pass 1 — classify — for every non-skip coll, stream source _ids, $in against target in 10 000-id chunks, bucket each into case 1 / 2 / 3:

     | Case | Target state | Action |
     |---|---|---|
     | 1 | exists under target tenant | keep _id, upsert is no-op or update |
     | 2 | does not exist | keep _id, upsert inserts |
     | 3 | exists under different tenant | allocate fresh _id, add srcHex → newOID to global idMap, mark for swap |
  6. Pass 2 — stream + write — per coll, stream docs, apply path-specific transform, upsert in batches of --batch-size. Three paths:

    • User + reuse/remap — email dedupe, may skip insert + record target OID into idMap for ref rewrite.
    • Skip list (user w/o remap, user-session, customer) — fresh _id, blind insert. customer gets name + code forced to target values. user-session in importSkipCollections is silently dropped.
    • All others — apply idMap _id swap if case 3, deep-walk OIDs, rewrite user refs, upsert {_id: …}.
  7. Report — JSON summary incl. userRemapDetails[] with per-user action (reused / inserted / inserted_renamed / remapped).
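
The Pass 1 bucketing from step 5 can be sketched as pure logic, with the target's id → tenant ownership abstracted into a map (a real run resolves it via $in queries in 10 000-id chunks):

```go
package main

import "fmt"

// chunk splits ids into batches for $in queries (the tool uses 10 000).
func chunk(ids []string, n int) [][]string {
	var out [][]string
	for len(ids) > n {
		out = append(out, ids[:n])
		ids = ids[n:]
	}
	if len(ids) > 0 {
		out = append(out, ids)
	}
	return out
}

// classify buckets a source _id into case 1/2/3 given which tenant owns
// that _id in the target ("" absent from the map means: not present).
func classify(targetTenant map[string]string, id, tgtCode string) int {
	owner, exists := targetTenant[id]
	switch {
	case !exists:
		return 2 // keep _id, upsert inserts
	case owner == tgtCode:
		return 1 // keep _id, upsert is a no-op or update
	default:
		return 3 // collision with another tenant: allocate fresh _id
	}
}

func main() {
	target := map[string]string{"a": "NEWTENT", "b": "OTHER"}
	for _, id := range []string{"a", "b", "c"} {
		fmt.Println(id, "-> case", classify(target, id, "NEWTENT"))
	}
	fmt.Println(len(chunk(make([]string, 25000), 10000))) // 3 batches
}
```

Only case 3 touches the global idMap; cases 1 and 2 are what make a re-run of the same import idempotent.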

Rewrites applied to each doc

| Rewrite | Scope | Trigger |
|---|---|---|
| rewriteTenantRefs | tenantId, tenantID, tenantIDs[], byTenant.<code> | non-namespaced coll |
| ensureTenantID | backfill scalar tenantId where schema requires | non-user/customer, non-namespaced |
| _id swap | case-3 docs only | Pass 1 marked |
| rewriteDocObjectIDs | every field at every depth — replace OID values found in idMap | all non-skip colls |
| rewriteDocUserRefs | known email/OID paths + customFields subtree | non-skip colls + user-session/customer carriers |
| customer.name / customer.code | force target values | customer only |

Index preflight

  • Read each <coll>.indexes.jsonl entry.
  • Rename coll if tenant-namespaced (<srcCode> → <tgtCode>).
  • Try batched createIndexes first (atomic).
  • On batch reject → per-index fallback; non-conflicting ones still land.
  • Unique-index failure → hard abort, migration is not safe.
  • Count = numIndexesAfter - numIndexesBefore (identical specs count as 0, no error).

User remap

--remap rewrites all user email references across every collection. JSON:

json
{
  "users": [
    { "from": "alice@oldcorp.com", "to": "alice@newcorp.com" },
    { "from": "bob@oldcorp.com",   "to": "bob@newcorp.com" }
  ],
  "default": "testuser@newcorp.com"
}
  • users — explicit per-email map.
  • default — fallback for any source email not in users.

Effective email resolution:

  1. If in users[] → use mapped value.
  2. Else if default set → use default.
  3. Else → keep source email.
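
The three resolution rules translate directly into code. A sketch with an assumed Remap struct mirroring the JSON shape:

```go
package main

import "fmt"

// Remap mirrors the --remap JSON: explicit per-email map plus fallback.
type Remap struct {
	Users   map[string]string // from -> to
	Default string            // fallback for unmapped emails; "" = none
}

// resolveEmail applies the three resolution rules, in order.
func resolveEmail(r Remap, src string) string {
	if to, ok := r.Users[src]; ok {
		return to // 1. explicit mapping wins
	}
	if r.Default != "" {
		return r.Default // 2. fallback for everyone else
	}
	return src // 3. keep the source email
}

func main() {
	r := Remap{
		Users:   map[string]string{"alice@oldcorp.com": "alice@newcorp.com"},
		Default: "testuser@newcorp.com",
	}
	fmt.Println(resolveEmail(r, "alice@oldcorp.com")) // alice@newcorp.com
	fmt.Println(resolveEmail(r, "carol@oldcorp.com")) // testuser@newcorp.com
}
```

Note that with a default set, every unmapped source email collapses onto one address, which is exactly the "all users → one test user" recipe shown further down.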

Rewrite scope:

  • Tier 1 — known paths (top-level, nested, OID arrays): createdBy, updatedBy, deletedBy, owner, recepient, userId, doc.createdBy, doc.updatedBy, workflowMeta.lastStepCompletedUser, workflowMeta.ballInCourt, space.spaceAdmins, responsibleUserNames, checklist[].createdBy, checklist[].updatedBy, checklist.items[].filledBy.userName, workflowMeta.participants, workflowMeta.responsibleUsers, responsibleUsers.users.
  • Tier 2 — deep-walk: recursively scans customFields subtree.
  • User coll: rewrites email, username.

--reuse-existing-users

Prevents leaking source-origin users into the target DB.

  • Off (default): every incoming user inserted. --remap still rewrites emails per config.
  • On: pre-scan target user across all tenants (email & username are practically unique DB-wide). For each incoming user:
    • Resolve effective email (per --remap rules).
    • If effective email already exists in target → do not insert; grant the existing user membership in the new tenant (tenantIDs + byTenant.<newCode>); rewire all refs to existing _id.
    • If effective email does not exist → insert with effective email (explicit remap honored literally even if it looks prod-shaped).
  • Multiple source users collapsing to the same effective email share one target user.

Example — target has raj@qa.com, test@qa.com; remap alice@prod.com → test@qa.com, default throwaway@qa.com (not in target); dump has raj@qa.com, alice@prod.com, bob@prod.com:

| Incoming | Effective | Action |
|---|---|---|
| raj@qa.com | raj@qa.com | reuse QA's raj, skip insert |
| alice@prod.com | test@qa.com | reuse QA's test, skip insert |
| bob@prod.com | throwaway@qa.com | insert one throwaway@qa.com |

Flow

mermaid
flowchart TD
  A[Start] --> B[Open zip, read _metadata.json]
  B --> C[Connect target]
  C --> D{dry-run?}
  D -- no --> E[Preflight customer.code + customer.name unique]
  E --> F{clash?}
  F -- yes --> FA[abort]
  F -- no --> G[Index preflight per coll: batched createIndexes, per-index fallback]
  G --> H{unique idx failed?}
  H -- yes --> HA[abort]
  H -- no --> I
  D -- yes --> I[Pre-scan target users if reuse/remap]
  I --> J[Pass 1: classify each coll via $in chunks → idMap + remapSets]
  J --> K{For each coll, stream docs}
  K --> L{coll == user AND reuse/remap?}
  L -- yes --> M[Email-dedupe path: reuse or insert, update idMap/emailMap]
  L -- no --> N{coll in skip list user-wo-remap/user-session/customer?}
  N -- yes --> O[Fresh _id insert; customer rewrites name/code]
  N -- no --> P[Generic path: _id swap if case3, deep-walk OIDs, user refs, upsert by _id]
  M --> Q[Flush batch if full]
  O --> Q
  P --> Q
  Q --> K
  K -- done --> R[Finalize report: counts, userRemapDetails, index failures]
  R --> S{HadErrors?}
  S -- yes --> T[exit 1]
  S -- no --> U[exit 0]

Decision tree — per source _id (classify)

mermaid
flowchart TD
  A[Source _id X] --> B{exists in target coll?}
  B -- no --> C[Case 2: keep _id, upsert inserts]
  B -- yes --> D{target doc tenantId == targetCode?}
  D -- yes --> E[Case 1: keep _id, upsert no-op or update]
  D -- no --> F[Case 3: allocate new OID, record X → newOID in idMap, mark remapSet]

Decision tree — per incoming user (reuse path)

mermaid
flowchart TD
  U[Incoming user doc] --> A{in explicit remap users[]?}
  A -- yes --> B[effective = mapped email]
  A -- no --> C{default set?}
  C -- yes --> D[effective = default]
  C -- no --> E[effective = source email]
  B --> F{effective exists in target user coll?}
  D --> F
  E --> F
  F -- yes --> G[Action reused: skip insert, rewire refs to existing _id, grant tenant membership]
  F -- no --> H{effective != source?}
  H -- yes --> I[Action inserted_renamed: insert new, email=effective]
  H -- no --> J[Action inserted: insert new as-is]

Examples

bash
# Dry run to validate archive
./bin/tenant-import -z acme.zip \
  --mongo-uri "mongodb://host:27017/qa" \
  --tenant-code NEWTENT --tenant-name "Acme QA" \
  --dry-run -v

# Prod → staging with reuse + remap
./bin/tenant-import -z prod-DahyCZM.zip \
  --mongo-uri "mongodb+srv://.../ex-core-plat-stg-01" \
  --tenant-code DahyCZM --tenant-name "ATC" \
  --reuse-existing-users \
  --remap user-remap.json \
  --report stg-import-report.json -v

# All users → one test user
echo '{"default":"test@company.com"}' > remap.json
./bin/tenant-import -z acme.zip \
  --mongo-uri "mongodb://host:27017/qa" \
  --tenant-code NEWTENT --tenant-name "Acme QA" \
  -m remap.json

tenant-delete

Source: cmd/tenant-delete/main.go, runner: migrate/delete.go.

Wipe one tenant's data. Single-tenant or batch via JSON file.

Usage

bash
./bin/tenant-delete \
  --mongo-uri "mongodb://host:27017/mydb" \
  --tenant-code DahyCZM \
  --verify

Flags

| Flag | Short | Required | Default | Description |
|---|---|---|---|---|
| --mongo-uri | | yes | | Mongo URI, DB in path |
| --tenant-code | | yes* | | 7-char code (*or --file) |
| --file | -f | yes* | | JSON array of {tenantId, name} |
| --dry-run | | no | false | Count, no delete |
| --verbose | -v | no | false | Info logs |
| --report | -r | no | | JSON report path |
| --verify | | no | false | Post-delete verification scan |
| --yes | -y | no | false | Skip confirmation prompt |

Note: CLI passes NoExclusions: true, so it wipes even appAudit, test, etc. for the tenant. Only system.* are always skipped.

Batch file

json
[
  { "tenantId": "X4KM92P", "name": "Acme Corp" },
  { "tenantId": "B7TN31Q", "name": "Beta Inc" }
]

Per-collection behaviour

| Collection | Behaviour |
|---|---|
| system.* | skip |
| tenant-namespaced (custom_<code>_*, x_<code>_*, …) | drop collection |
| user, user-session | $pull tenantIDs, $unset byTenant.<code>; then delete users whose tenantIDs is empty/missing (except username == "dev") |
| anything else | DeleteMany matching tenantFilter(code) |
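
The user / user-session handling boils down to an UpdateMany with $pull/$unset followed by a DeleteMany for orphaned users. A sketch of just the update and filter documents, written as plain maps rather than the driver's bson.M:

```go
package main

import "fmt"

// userStripUpdate is the membership-strip update applied to user docs
// for one tenant: pull the code from tenantIDs, drop the byTenant entry.
func userStripUpdate(code string) map[string]any {
	return map[string]any{
		"$pull":  map[string]any{"tenantIDs": code},
		"$unset": map[string]any{"byTenant." + code: ""},
	}
}

// orphanUserFilter matches users left with no tenant membership at all,
// sparing the special "dev" account.
func orphanUserFilter() map[string]any {
	return map[string]any{
		"$or": []map[string]any{
			{"tenantIDs": map[string]any{"$size": 0}},
			{"tenantIDs": map[string]any{"$exists": false}},
		},
		"username": map[string]any{"$ne": "dev"},
	}
}

func main() {
	fmt.Println(userStripUpdate("DahyCZM"))
	fmt.Println(orphanUserFilter())
}
```

The exact shape of the real filter may differ (for example, how "empty" is tested), but the two-step strip-then-prune order is what keeps multi-tenant users alive while removing single-tenant ones.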

Confirmation

Destructive by default. Unless -y or --dry-run, prompts Type 'yes' to confirm:. Batch prompt lists all tenants first.

Flow

mermaid
flowchart TD
  A[Start] --> B{tenantCode or file?}
  B -- file --> C[Parse JSON array]
  B -- code --> D[single-entry list]
  C --> E{dry-run or -y?}
  D --> E
  E -- no --> F[Print warning + prompt 'yes']
  F --> G{confirmed?}
  G -- no --> GA[Abort]
  G -- yes --> H
  E -- yes --> H[For each tenant]
  H --> I[List all collections]
  I --> J{For each coll}
  J --> K{tenant-namespaced for code?}
  K -- yes --> L[Drop collection]
  K -- no --> M{system.*?}
  M -- yes --> J
  M -- no --> N{user/user-session?}
  N -- yes --> O["$pull tenantIDs + $unset byTenant; delete users with no tenantIDs (keep 'dev')"]
  N -- no --> P[DeleteMany tenant filter]
  L --> J
  O --> J
  P --> J
  J -- done --> Q{--verify?}
  Q -- yes --> R[Run RunVerifyWithDB]
  Q -- no --> S
  R --> S[Collect report]
  S --> H
  H -- all done --> T[Write report, print summary]
  T --> U{any errors?}
  U -- yes --> V[exit 1]
  U -- no --> W[exit 0]

Decision tree — what happens to one collection

mermaid
flowchart TD
  X[Collection name] --> Y{tenant-namespaced for code?}
  Y -- yes --> Z1[DROP coll]
  Y -- no --> A{system.*?}
  A -- yes --> Z2[skip]
  A -- no --> B{user or user-session?}
  B -- yes --> C["pull tenantIDs, unset byTenant.code; delete empty users except 'dev'"]
  B -- no --> D[DeleteMany where tenantId/tenantIDs/byTenant.code matches]

Examples

bash
# Dry-run single tenant
./bin/tenant-delete --mongo-uri "…/mydb" --tenant-code X4KM92P --dry-run

# Delete with verification, no prompt
./bin/tenant-delete --mongo-uri "…/mydb" --tenant-code X4KM92P --verify -y

# Batch delete from file, write report
./bin/tenant-delete --mongo-uri "…/qa" -f delete-qa1-tenants.json \
  -r delete-report.json -y

tenant-verify

Source: cmd/tenant-verify/main.go, runner: migrate/verify.go.

Scan all collections for any remaining tenant data. Run after tenant-delete (or as --verify inline).

Usage

bash
./bin/tenant-verify \
  --mongo-uri "mongodb://host:27017/mydb" \
  --tenant-code DahyCZM

Flags

| Flag | Short | Required | Default | Description |
|---|---|---|---|---|
| --mongo-uri | | yes | | Mongo URI, DB in path |
| --tenant-code | | yes* | | 7-char code (*or --file) |
| --file | -f | yes* | | JSON array of tenants to batch verify |
| --verbose | -v | no | false | Info logs |
| --report | -r | no | | JSON report path |

Checks per collection

| Collection | Check |
|---|---|
| system.* | skipped |
| tenant-namespaced (custom_<code>_*, etc.) | presence alone = FAIL; coll should have been dropped |
| everything else | count docs matching CollectionFilter(name, code) (scalar tenantId + byTenant.<code> for user/session) |
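
The per-collection verdict can be sketched as follows (verdict is an illustrative name; matchCount stands for the result of the tenant-filter count query):

```go
package main

import (
	"fmt"
	"strings"
)

// verdict mirrors the checks above: tenant-namespaced collections fail
// by existing at all; everything else fails on any matching doc.
func verdict(coll, code string, matchCount int64) string {
	if strings.HasPrefix(coll, "system.") {
		return "skip"
	}
	namespaced := strings.HasPrefix(coll, "custom_"+code+"_") ||
		strings.HasPrefix(coll, "x_"+code+"_") ||
		strings.HasPrefix(coll, "x_mt_"+code+"_") ||
		strings.HasPrefix(coll, "cx_s_"+code+"_")
	if namespaced {
		return "FAIL: tenant-prefixed collection exists"
	}
	if matchCount > 0 {
		return fmt.Sprintf("FAIL: %d docs with tenant refs", matchCount)
	}
	return "pass"
}

func main() {
	fmt.Println(verdict("custom_DahyCZM_field", "DahyCZM", 0))
	fmt.Println(verdict("tasks", "DahyCZM", 3))
	fmt.Println(verdict("tasks", "DahyCZM", 0))
}
```

Any non-pass verdict on any collection flips the overall result to FAILED and exits 1.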

Output

  • PASSED — no findings across all colls.
  • FAILED — lists collection, doc count, type (tenant-prefixed collection exists / docs with tenantID/tenantId/tenantIDs / user/session with tenant refs). Process exits 1.

Flow

mermaid
flowchart TD
  A[Start] --> B[Connect, list collections]
  B --> C{For each coll}
  C --> D{system.*?}
  D -- yes --> C
  D -- no --> E{tenant-namespaced for code?}
  E -- yes --> F["Count all docs → FAIL if coll exists"]
  E -- no --> G[Count docs matching tenant filter]
  F --> H{count > 0 OR coll exists?}
  G --> H
  H -- yes --> I[Append finding, Passed=false]
  H -- no --> C
  I --> C
  C -- done --> J{Passed?}
  J -- yes --> K[Print PASSED, exit 0]
  J -- no --> L[Print FAILED + findings, exit 1]

End-to-end lifecycle

mermaid
flowchart LR
  subgraph SOURCE["Source DB (prod)"]
    S[(MongoDB)]
  end
  subgraph TARGET["Target DB (staging/QA)"]
    T[(MongoDB)]
  end

  S -- tenant-dump --> Z["<code>.zip _metadata + .jsonl + .indexes.jsonl"]
  Z -- tenant-import --> T
  T -- tenant-verify --> VR[Verify report]
  T -. tenant-delete .-> T
  T -- tenant-verify --> VR

Common recipes:

bash
# 1. Clone prod tenant into staging
./bin/tenant-dump  --mongo-uri "$PROD" --tenant-code DahyCZM --tenant-name "ATC" -o prod-DahyCZM.zip -v
./bin/tenant-import -z prod-DahyCZM.zip --mongo-uri "$STG" \
  --tenant-code DahyCZM --tenant-name "ATC" \
  --reuse-existing-users --remap user-remap.json -v

# 2. Remove tenant from QA + confirm
./bin/tenant-delete --mongo-uri "$QA" --tenant-code X4KM92P --verify -y
./bin/tenant-verify --mongo-uri "$QA" --tenant-code X4KM92P

# 3. Batch cleanup QA
./bin/tenant-delete --mongo-uri "$QA" -f delete-qa1-tenants.json -r delete-report.json -y
./bin/tenant-verify --mongo-uri "$QA" -f delete-qa1-tenants.json -r verify-report.json