S4: Rewrite existing peer_url rows to canonical form (with collision merge) #14
Reference
posta/server#14
Parent
posta/server#10 — Absorb spec §4.1/§4.2 canonicalization and §9 key-management simplification
What to build
A new versioned schema migration in internal/store that walks every table carrying a peer URL (messages, contacts, and any other peer_url-bearing table) and re-canonicalizes each row to §4.1 canonical form. Where the strict form collapses two pre-existing rows into one, merge them according to the policy below. Ship with a --dry-run mode that prints the merge plan without writing.
End-to-end behaviour after this slice:
- An inbox.db containing both https://x.example/inbox and https://x.example/inbox/ produces one canonical row that keeps the older row's id; counts, pinned status, last-message timestamps, and per-peer read watermarks are folded into the surviving row according to a deterministic policy.
- Every merge is logged via slog.Info with both original URLs, the resulting canonical URL, and the kept/dropped row ids, so an operator can audit exactly what the upgrade did.
- --dry-run (or an equivalent CLI surface on the migrate command) prints the merge plan and exits without writing.
- Re-running the migration after a successful run is a no-op.
This slice is operationally irreversible. The dry-run flag is the safety net;
operators should run it on a copy of the production DB first.
Acceptance criteria
- The new migration is appended to the migrations list in internal/store (alongside the existing transactional migration framework). Existing applied versions are not edited.
- The migration rewrites peer_url in messages and contacts to canonical form using the adapter / posta.Canonicalize.
- When two rows collapse to the same canonical peer_url, keep the row with the older id (or the older created_at when there is no surrogate id).
  - messages: do not merge. Each row is an independent envelope and just gets its peer_url rewritten. (Two duplicate envelopes only collide on (peer_url, msg_id) if msg_id matches, which is a pre-existing invariant.)
  - contacts: fold into the surviving row as follows. display_name: keep the older row's value; pinned: logical OR; last_message_at: max; per-peer read watermark fields: max. Drop the loser row.
- Each merge emits one slog.Info line tagged with the migration version, both input URLs, the canonical URL, and the kept/dropped ids.
- --dry-run mode is reachable via the operator surface (CLI flag on whichever subcommand triggers migrations on demand, or an env var consulted by OpenSQLite). In dry-run, the migration computes the plan, emits the would-merge log lines, and rolls back the transaction.
- A test in internal/store/sqlite_test.go (or a new file) builds a :memory: SQLite, seeds rows that deliberately collide under §4.1 (trailing slash, percent-encoded unreserved like %7E vs ~, IDN U-label vs A-label, case-only-distinct hosts that survive lowering), runs the migration, and asserts: rows merged onto the older id; display_name / pinned / last_message_at / watermarks folded per the policy; loser rows dropped.
- Re-running the migration after success leaves the table identical.
- --dry-run emits the same plan as a real run but leaves the database unchanged.
- go build ./... and go test ./... pass.
Blocked by
posta/spec shipping Canonicalize() (the precondition documented in posta/spec/TODO.md). The migration uses the spec library directly; it does not depend on S3 (the call-site migration). Soft preference to ship after S3 so newly-written rows are already canonical before the historical sweep runs, but the migration is correct in either order.
Precondition resolved.
posta/spec commit 5aa3aa3 lands Canonicalize(s) (string, error) in pkg/posta. The migration can call it directly on each row's peer_url. The server's go.mod already uses replace … => ../spec, so no version bump is needed.
Implementation notes for the agent:
- Append the migration to the migrations list in internal/store/sqlite.go (versioned, transactional; same framework as existing migrations).
- Rows whose peer_url is rejected by Canonicalize (e.g. legacy entries with userinfo or non-HTTPS scheme) need a documented policy. Recommended: log at slog.Warn with the row id and reject category, leave the row untouched, and surface the count in the migration summary. Worth flagging in the slice comment / PR description.
- The --dry-run surface is operator-facing; a CLI flag on whichever subcommand currently triggers OpenSQLite is the natural seam, or an env var if no migration-specific subcommand exists. The dry-run should compute the merge plan inside a transaction that's rolled back at the end.
- Seed the collision tests (%7E vs ~, IDN U-label vs A-label, case-only-distinct hosts) using the canonical outputs from posta/spec/testdata/vectors/url-canonical/.
This slice is operationally irreversible on production data. The --dry-run output is the artifact a human reviews before applying.
Category: enhancement
State: ready-for-agent