Release Archive & Env-Version Retention

How old releases are aged out and how env_versions history is reclaimed without breaking the "what's running right now" lookup.

Written for engineers / ops. Code references throughout.


1. Why this exists

The console keeps two append-only histories:

  • releases — every feature/hotfix/config release ever cut.
  • env_versions — every mutation to env-var defaults or per-instance overrides.

Without retention, both grow unbounded. We don't want to delete blindly — a release row is still useful as audit lineage years later, and an env_versions row referenced by a live instance must never disappear or the env-diff lookup breaks.

The design uses release archival as the signal for what's worth keeping. Once a release is archived past its unarchive window, its tied resources (env-version pin, per-service config, command logs) become reclaimable. Env-version cleanup falls out naturally from that signal.


2. Lifecycle at a glance

T=0              release completes (terminal: completed, failed, rolled_back, cancelled)
                 │  full data intact: per_service, env_version pin, instance_commands

T=3 years        auto-archive sweep flips status → 'archived', stamps archived_at,
                 stashes prior status in pre_archive_status
                 │  full data still intact — unarchive available

T=3y + 30d       cascade-purge sweep runs:
                   - DELETE FROM instance_commands WHERE release_id = …
                   - UPDATE releases SET per_service = '{}', env_version = 0
                 │  release row survives: id, name, type, changelog,
                 │  approvals, activity log, archived_at, pre_archive_status

Same sweep tick  env-version retention runs:
                   any env_versions row >30 days old AND no longer pinned by
                   a release or instance state is DELETEd

The 3-year archive age, the 30-day unarchive window, and the 30-day env-version age floor are all defined in code as constants, along with the sweep interval and batch size:

| Constant | Value | Where |
| --- | --- | --- |
| releaseArchiveAge | 3 years | internal/worker/release_archive.go |
| releaseUnarchiveWindow / releases.UnarchiveWindow | 30 days | internal/worker/release_archive.go + internal/handler/releases/service.go |
| releaseArchiveInterval | 24 h | internal/worker/release_archive.go |
| releaseArchiveBatch | 100 rows | internal/worker/release_archive.go |
| envvars.EnvVersionMinAge | 30 days | internal/handler/envvars/repository.go |

3. Release archival

3.1 States

ReleaseStatusArchived = "archived" is added alongside the existing terminal statuses in internal/model/release.go.

Two new columns on releases, added in migration 009:

  • archived_at: TIMESTAMPTZ NULL. Set when the release flips to archived. Drives both the unarchive window and the cascade-purge cutoff.
  • pre_archive_status: TEXT NOT NULL DEFAULT ''. Holds the terminal status the release had before archive, so unarchive can restore it.

Partial index idx_releases_archived_at covers WHERE archived_at IS NOT NULL so the purge selector is cheap.

3.2 Who archives a release

Two paths, same code:

  • Manual: POST /releases/:id/archive. Actor = the requesting user. Available from any terminal status (completed, failed, rolled_back, cancelled).
  • Auto: the worker sweep at T=3y. Actor = "system".

Both go through Service.ArchiveRelease (internal/handler/releases/service.go), which calls Repository.ArchiveRelease (one atomic UPDATE with a status guard). An archived event lands on the release's activity log.

3.3 Unarchive (the 30-day window)

POST /releases/:id/unarchive, available only while now() - archived_at < 30 days. The SQL guard in Repository.UnarchiveRelease enforces this independently of any UI check:

sql
UPDATE releases
SET status             = pre_archive_status,
    archived_at        = NULL,
    pre_archive_status = ''
WHERE id = $1
  AND status = 'archived'
  AND archived_at IS NOT NULL
  AND archived_at >= $3            -- cutoff = now() - 30d
  AND pre_archive_status <> ''

The frontend (ReleaseDetailPage.tsx) hides the button past the deadline and shows a banner stating archive is now permanent.

3.4 Cascade purge (at archived_at + 30d)

When the unarchive window closes, the next sweep runs Repository.PurgeReleaseCascade for each eligible row inside a single transaction:

  1. DELETE FROM instance_commands WHERE release_id = …
  2. UPDATE releases SET per_service = '{}', env_version = 0 WHERE id = …

The release row stays — status stays archived, archived_at stays set, pre_archive_status stays set. A purged event is appended to the activity log so the timeline reflects what happened.

What survives the cascade:

  • Release id, type, priority, name, version, semver bump, release notes
  • Stage statuses + approvals (the audit story for "who approved this and when")
  • Activity log entries (release_events)
  • initiated_by, created_at, deployed_at, completed_at, archived_at, pre_archive_status

What's gone:

  • per_service map (which images deployed where)
  • env_version pin (which env snapshot was active)
  • All instance_commands for the release (per-instance payloads + agent replies)

3.5 The sweep

runReleaseArchive (internal/worker/release_archive.go) runs once at startup and every 24h after. Leader-gated, so only one replica acts per tick. Three steps per tick, in order:

go
archived := w.autoArchive(ctx, now)        // T=3y selector
purged   := w.purgeArchived(ctx, now)      // T=archived+30d selector
envDel,_ := w.envVarRepo.DeleteOrphanedVersions(ctx)
w.tracker.Record(ctx, WorkerReleaseArchive, ..., map[string]any{
    "archived":          archived,
    "purged":            purged,
    "envVersionsPruned": envDel,
})

Order matters: step 2 resets releases.env_version to 0 for newly purged rows, which is exactly what makes those versions eligible for deletion in step 3 (the retention DELETE ignores zero pins). The same tick reconciles both.

Batch size is capped at releaseArchiveBatch = 100 per step per tick to bound transaction time. If there's a backlog (e.g., first run after deploy), it drains across multiple days.

Visible to operators as worker release_archive in the system dashboard.


4. Env-version retention

4.1 The rule

DeleteOrphanedVersions (internal/handler/envvars/repository.go) issues one DELETE against env_versions:

sql
DELETE FROM env_versions
WHERE changed_at < now() - interval '30 days'
  AND version NOT IN (
    SELECT env_version FROM releases
     WHERE env_version IS NOT NULL AND env_version <> 0
    UNION
    SELECT applied_env_version FROM instance_release_states
     WHERE applied_env_version IS NOT NULL AND applied_env_version <> 0
  )

Three reasons to keep a version, any one suffices:

  1. A release pins it: releases.env_version is non-zero. Every release except a cascade-purged one keeps its pin. The cascade purge at archived_at + 30d is the only thing that resets this column to 0, which is exactly what makes the version eligible here.
  2. An instance is running it: instance_release_states.applied_env_version is non-zero. This is the live state pin. It is never deleted by anything except instance decommission, so a running env can never lose its history.
  3. It's younger than EnvVersionMinAge (30 days): the age floor. Recent intermediate edits between releases stay browsable regardless of orphan status.

4.2 Why the age floor

Without (3), an edit storm between two releases would lose its history within ~24h: the next release's pin moves to the latest version, the intermediates instantly orphan, and the next sweep deletes them. The age floor gives at least 30 days for "what did I just change?" to have an answer without re-introducing per-key bookkeeping.

4.3 Visibility window summary

| Scenario | How long is it visible? |
| --- | --- |
| Version pinned by a non-purged release | Until the release is archived and the 30-day cascade window expires |
| Version pinned by a live instance_release_states.applied_env_version | Until the instance moves to a different version (then it joins one of the other buckets) |
| Intermediate edit, never pinned by a release or live instance | Minimum 30 days from changed_at, up to 30 days + 24h depending on where the sweep tick lands |
| Edit on a release that gets cascade-purged | The version is unpinned at archived_at + 30d; from there, the 30-day age floor still applies if it's young enough, else it's deleted on the next tick |

4.4 Tuning

All knobs are constants. To extend the visibility floor:

go
// internal/handler/envvars/repository.go
const EnvVersionMinAge = 30 * 24 * time.Hour  // bump to extend

To slow the sweep (longer guaranteed presence past the age floor):

go
// internal/worker/release_archive.go
const releaseArchiveInterval = 24 * time.Hour  // bump to slow

No schema migration needed for either.


5. Operational notes

5.1 Recovery: "I archived the wrong release"

  • Within 30 days → use the unarchive button on the release detail page, or POST /releases/:id/unarchive. Restores prior status, clears archive metadata. Data was never touched.
  • After 30 days → not recoverable. The cascade has run; instance_commands are gone, per_service is empty, env_version is 0. The release row remains as an audit stub.

5.2 Recovery: "An env version got deleted that I needed"

If the env_version was pinned by a non-purged release or a live instance, this can't happen — the DELETE excludes both. If it was orphaned + older than 30 days, it's gone and not recoverable from the live DB. Restore from a backup if you need the diff.

If you regularly need older env diffs, raise EnvVersionMinAge.

5.3 Backlog after first deploy

When migration 009 lands on an instance with years of release history, the first auto-archive sweep will find every terminal release older than 3 years. Batch cap is 100 per tick, so a few hundred old releases drain over 2–3 days. No spike in DB load — each archive is a single-row UPDATE.
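
The drain time is simple ceiling division. A small sketch (ticksToDrain is a hypothetical helper): because the first tick runs at startup and later ticks are 24 h apart, a backlog that needs n ticks is fully archived n-1 days after deploy.

```go
package main

import "fmt"

// ticksToDrain returns how many sweep ticks are needed to process backlog
// rows when each tick handles at most batch rows.
func ticksToDrain(backlog, batch int) int {
	if backlog <= 0 || batch <= 0 {
		return 0
	}
	return (backlog + batch - 1) / batch // ceiling division
}

func main() {
	for _, backlog := range []int{80, 250, 300} {
		n := ticksToDrain(backlog, 100)
		fmt.Printf("backlog %d: %d tick(s), done %d day(s) after the startup tick\n",
			backlog, n, n-1)
	}
}
```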

The first cascade-purge sweep does nothing on day one because no row has an archived_at older than 30 days yet (the selector needs archived_at < now() - 30d). Cascade starts hitting rows 30 days after they're auto-archived.

The first env-retention sweep may delete a meaningful chunk on day one (anything older than 30 days that never had a release pin and isn't currently applied). It runs as one unbatched DELETE, which is fine for the table's expected size (thousands of rows).

5.4 Activity-log entries

Three new actions land on the release timeline:

  • archived: actorType = user for manual, system for auto. Meta: { priorStatus: "completed" | ... }.
  • unarchived: actorType = user. Meta: { restoredStatus: "completed" | ... }.
  • purged: actorType = system. Cosmetic; confirms the cascade ran.

5.5 Frontend UX

ReleaseDetailPage.tsx:

  • Archive button visible when status is one of completed, failed, rolled_back, cancelled. Shows a confirm dialog with the 30-day + permanent-purge wording.
  • Unarchive button visible only while now() - archivedAt < 30d.
  • Banner under the header for archived releases:
    • Inside window: "Archived <timestamp>. Unarchive available until <deadline>. After that, per-service config, env-version pin, and command logs will be purged."
    • Outside window: "Archived <timestamp>. Archive is permanent — per-service config, env-version pin, and command logs have been purged."

6. What this design rejects, and why

| Rejected idea | Why |
| --- | --- |
| Keep last N versions per (key, scope) in env_versions | Adds per-key bookkeeping for a UI we don't have. Archive-driven retention is a simpler, semantically meaningful signal. |
| Hard-delete the release row when archive becomes permanent | Loses changelog + approvals + activity-log lineage. Audit teams care about "who approved this release in 2024" even if the env diff is gone. |
| Drop the env-version age floor and rely only on release/instance pins | Edit storms between releases lose history within ~24h. Bad UX for "what did I just change?". |
| Cascade-purge immediately at archive time, no 30-day window | No room for "oh wait, I needed that." Unarchive becomes useless. |
| Make archive reversible forever | Indistinguishable from "no retention." Cascade only fires when archive is final; if final never happens, env-version history never reclaims. |

7. Quick reference — files touched

| Layer | File |
| --- | --- |
| Migration | internal/db/migrations/009_release_archive_lifecycle.sql |
| Model | internal/model/release.go |
| Release repository | internal/handler/releases/repository.go (ArchiveRelease, UnarchiveRelease, ListArchivableTerminal, ListPurgeable, PurgeReleaseCascade) |
| Release service | internal/handler/releases/service.go (ArchiveRelease, UnarchiveRelease, UnarchiveWindow) |
| HTTP | internal/handler/releases/handler.go (HandleArchiveRelease, HandleUnarchiveRelease) |
| Routes | cmd/api/main.go (POST /releases/:id/archive, POST /releases/:id/unarchive) |
| Env-version retention | internal/handler/envvars/repository.go (DeleteOrphanedVersions, EnvVersionMinAge) |
| Worker loop | internal/worker/release_archive.go |
| Worker wiring | internal/worker/worker.go, cmd/api/wire.go, cmd/worker/main.go |
| Frontend types | web/src/types/index.ts |
| Frontend API | web/src/api/releases.ts |
| Frontend UI | web/src/pages/ReleaseDetailPage.tsx |