---
title: Be Civic — Lifecycle, Branch Policy, and Test Fixtures
type: spec
status: v0.5.1 — post-2026-05-12 tag-resolution reconciliation
date: 2026-05-12
parent_spec: ./README.md
sibling_specs:
  - architecture.md
  - protocol.md
  - schemas.md
  - privacy.md
  - skills.md
  - website.md
tags: ["be-civic", "bc-internal", "architecture"]
---

# Be Civic — Lifecycle, Branch Policy, and Test Fixtures

This sub-spec covers the artefact lifecycle: the state-machine promotion model (§9) with its promotion thresholds and rollback/quarantine mechanics, the branch and CI policy (§10) including retraction and rollback semantics, the citation-rot mitigations for path sources (§11), and the test fixture structure (§12) that exercises every validator and PII detector.

For the schemas of the artefacts that advance through this state machine, see `schemas.md`. For PII scrub mechanics invoked at commit time, see `privacy.md`. For the skills-drafting protocol that produces new artefacts, see `skills.md`.

## 9. State-machine promotion

Validation-by-consensus is the corpus-growth path. The state machine is automated from day 1 (S31): D1 aggregates drive promotion of catalogue rows (volatile values, references) by direct UPDATE on the row, and drive promotion of skill / path bodies by opening a PR that flips the `status` frontmatter field. Maintainer review is reserved for `draft` PRs (target_type=skill | path; S31, brand-new artefacts); amendments, catalogue updates, and concern flows are fully automatic.

Compaction as a primary corpus-growth mechanism is **superseded** in v1 by validation-by-consensus and the deterministic state machine. Any future "extract patterns from concern clusters into proposed amendments" job (compaction in the old sense) is deferred to v1.1 as a separate concern; it would produce `amendment` (target_type=skill) submissions that enter the same state machine as any other PR.
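The split between reviewed and automatic flows described above, together with the flows drawn in the §9.1 diagram, can be read as a small routing function. The following is a minimal sketch under stated assumptions: the type names, the route labels, and the function `routeSubmission` are illustrative and do not come from the Worker source, and combinations this document does not describe return `null` rather than guessing.

```typescript
// Routing sketch for the flows in §9 and the §9.1 diagram.
// Every identifier below is illustrative; none is taken from the Worker source.
type SubmissionType =
  | "draft" | "amendment" | "validation"
  | "concern" | "feedback_channel" | "rating";
type TargetType =
  | "skill" | "path" | "path_source"
  | "volatile_value" | "reference" | "skill_graph";
type Route =
  | "pr_with_maintainer_review" // new skill / path artefacts (S31)
  | "pr_auto_merge"             // amendments to skill / path bodies
  | "d1_insert_with_supersede"  // catalogue fast path, no PR pipeline
  | "d1_immediate"              // validations feed the aggregates directly
  | "staged_and_scrubbed";      // concerns, feedback, ratings

// Returns null for combinations this document does not describe.
function routeSubmission(type: SubmissionType, target: TargetType): Route | null {
  const isBody = target === "skill" || target === "path";
  switch (type) {
    case "draft":
      return isBody ? "pr_with_maintainer_review" : null;
    case "amendment":
      if (isBody || target === "path_source") return "pr_auto_merge";
      if (target === "volatile_value" || target === "reference") {
        return "d1_insert_with_supersede";
      }
      return null;
    case "validation":
      return "d1_immediate";
    default:
      return "staged_and_scrubbed";
  }
}
```

The authoritative table of permitted `target_type` values per submission type lives in `schemas.md`; this sketch only mirrors what this sub-spec states.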
### 9.1 State diagram

```
draft submitted (target_type=skill | path; 24h staging) ── consumer DELETE ──► [cancelled]
    │
    ▼
Worker opens PR (S10 / S18) ──── PR-CI green ──── maintainer review (S31) ──┐
                                                                            │
                                                                            ▼
                                        [alpha] on canonical.md or paths/index.json
                                                                            │
                                                                            │ validations accumulate (D1)
                                                                            ▼
                                                                         [beta]
                                                                            │
                                                                            │ validations accumulate
                                                                            ▼
                                                                        [stable]
                                                                            │
                                                                            │ rejects exceed confirms by ≥2
                                                                            │ (D1 supersession or git revert)
                                                                            ▼
                                        (rollback mechanism; no `rolled_back` enum value)

amendment submitted (target_type=skill | path | path_source; 24h staging) ── PR-CI green ── auto-merge ►
    canonical.md or paths/index.json edit; cohort resets if `version` bumps; status returns to alpha

amendment submitted (target_type=volatile_value | reference) ── INSERT-with-supersede in D1 ─►
    row at alpha; same threshold table promotes (fast-path; no PR pipeline)

Validation submitted (any of six target_types) ── written to D1 immediately ─► aggregates feed state machine

Concern / feedback / rating submitted ── staged + scrubbed ─► concerns / feedback_channel / ratings table
    (concerns: visible via aggregator; feedback: operator-private triage; ratings: on canonical)

(any artefact) → quarantine via D1 supersession or git revert; maintainer reviews
    ≥1 injection_flag from a non-submitter
```

**Naming discipline.** The `draft` feedback type is the **submission** that introduces a new artefact; the resulting on-disk artefact's `status:` starts at `alpha`, NOT `draft`. PR-CI Rule 17 (§10.1 below) rejects any skill/path canonical commit authored via a `draft` submission whose on-disk artefact carries `status: draft`. Agent-facing prose should prefer "proposal" / "new-artefact proposal" to avoid confusion with `status: draft` (which is reserved for skeletons and pre-alpha content).

### 9.2 Promotion thresholds (first-pass per G.5)

These numbers are tunable post-launch — revisit at end of v1's first 90 days based on observed validation volume.
The same threshold table applies to skills and to catalogue rows (volatile values, references) per S12.

| Transition | Conditions |
|------------|------------|
| `draft → alpha` | For skills and paths: 24h staging window elapses on the `draft` submission; Worker opens PR; PR-CI green; the maintainer reviews and merges (S31; the `draft` flow is the one feedback type that requires maintainer review). For catalogue rows: row INSERTed in D1 directly at `status: alpha`. |
| `alpha → beta` | ≥3 confirms, 0 rejects, ≥48h since cohort start, validations from ≥3 distinct IPs (per-artefact salted hash) |
| `beta → stable` | ≥10 confirms, ≥14 days since cohort start, confirm rate >85%, ≥10 distinct IPs |
| Rollback (any → cohort restart) | Rejects exceed confirms by ≥2 — for skill / path bodies: state machine opens a PR reverting `canonical.md` or `paths/index.json` to the prior stable (via `git revert`) and resetting `status` to the prior stable's status. For catalogue rows: state machine performs a fresh INSERT in D1 with the prior stable's value (or null if no prior stable) and supersedes the rejected row. |
| Quarantine | ≥1 `injection_flag: true` from a non-submitter validator (per G.6). For skills / paths: `git revert` + maintainer-review issue opened; the reverted body is the served body. For catalogue rows: D1 supersession + maintainer-review issue. |

**Cohort-start anchor (S22, refined 2026-05-15).** The `cohort_started_at` timestamp is the moment the artefact's current content was last committed (skills: the last commit to `canonical.md` that bumped `version`; paths: the last commit to the entry that bumped `version`; catalogue rows: the row's `committed_at`). Threshold time anchors (`≥48h`, `≥14 days`) are measured from `cohort_started_at`, not from the first validation.
A maintenance edit (typo, citation refresh, whitespace) on `stable` content keeps the cohort because `version` is unchanged (S25); a substantive edit bumps `version` and resets the cohort to `alpha`.

**`cohort_anchor` Worker-stamp (C1, added 2026-05-15).** On every staged row whose `target_type ∈ {skill, path}`, the Worker stamps `cohort_anchor: @` between cross-ref (submit pipeline step 6) and timing (step 7). The Worker reads the current `version:` from the targeted canonical at staging time; agents never carry `cohort_anchor` (the schema rejects it as `additionalProperty`). The state-machine cron uses `cohort_anchor` as the canonical key for "which cohort does this validation belong to": rows whose `cohort_anchor` doesn't match the artefact's current `@` are counted as historical and do not contribute to the current cohort.

**Validation cohort = matched `cohort_anchor`.** Validations whose `cohort_anchor` does not match the artefact's current `@` do not count toward the current cohort. (Under pre-2026-05-15 semantics this was expressed as "validations submitted before `cohort_started_at` do not count"; the new `cohort_anchor` field is the precise mechanism.)

**Distinct-IP counting** uses the **per-artefact salt** described in §8.3 of privacy.md (self-validation prevention subsection). Because the per-artefact salt is stable for the artefact's lifetime in `alpha`/`beta`, validations submitted on different days from the same IP hash to the same value and are correctly de-duplicated. The daily-rotating salt is used only for rate-limit counters; it is **not** used for state-machine distinct-IP counting. The per-artefact IP record is destroyed when the artefact reaches `stable` or is superseded.

**Rollback mechanism (S29, S36).**

- **Catalogue rows (D1).** A rollback is a fresh INSERT — the state machine inserts a new row carrying the prior stable's content (or a null/superseded marker if no prior stable existed) and sets the rejected row's `superseded_at = now()`.
History is preserved: querying `WHERE uid = X ORDER BY committed_at` returns the full chain. The `GET /api/volatile-values//history` and `GET /api/references//history` endpoints (S36) expose this directly.
- **Skill bodies (Git).** A rollback is `git revert ` on `canonical.md`. History is preserved in the Git log; `GET /api/skills//history` (S36) surfaces the commit chain via the GitHub API. The post-revert body returns to its prior `status` (typically the last stable). A future re-attempt at the rejected change opens a fresh PR; there is no "un-revert" shortcut in v1.

**Quarantine.** A non-submitter validator setting `injection_flag: true` triggers immediate quarantine: the offending content is rolled back via the same mechanism (D1 supersession or `git revert`), and a maintainer-review issue is opened. The maintainer reviews; if the flag was bogus, the flagger's IP-hash receives a stronger rate limit (or permanent ban on repeat), and the maintainer reinstates the content via a fresh INSERT (D1) or PR (skill body). (Per G.6.)

### 9.3 State-machine Action

A small Action (`state-machine.yml`, calling `tools/scripts/state-machine-tick.ts`) runs on a schedule (every 5–15 minutes) and on a D1 webhook (when one becomes available). It:

1. Queries D1 for all artefacts at `status: alpha` or `beta` (skills via the skills index; paths via the paths index; catalogue rows directly)
2. For each artefact, computes aggregates from D1: confirms, rejects, distinct IP hashes, elapsed time since `cohort_started_at`, injection flags. Cohort membership is determined by `cohort_anchor = @` matching the artefact's current version (per §9.2 `cohort_anchor` Worker-stamp rule).
3. Applies the threshold table (§9.2) deterministically
4. On `→ beta` / `→ stable` for a catalogue row: UPDATE the row's `status` directly in D1
5. On `→ beta` / `→ stable` for a skill or path: open a PR editing the canonical's frontmatter `status` field (or the path entry's `status` in `paths/index.json`); PR-CI green ⇒ auto-merge. The state-machine bot is the author identity (`Be Civic Bot `); the bot bundles status flips with version bumps in a single PR per the auto-version-bumping amendment (see §9.7 below).
6. On rollback: D1 supersession (catalogue rows) or `git revert` PR (skills / paths); auto-merge on PR-CI green
7. On quarantine: same mechanism as rollback PLUS open a maintainer-review issue (no PII echoed)
8. The Action is excluded from triggering itself (D1 UPDATEs and PRs authored by the state-machine bot do not retrigger it). `[skip ci]` discipline on bot-authored commits follows the `uid-fill.yml` precedent.

The logic is purely deterministic; no LLM. The validation-as-LLM-judgment work happens consumer-side at validation submission time. Maintainer review is required only on `draft` PRs (new skill / path creation, S31).

### 9.4 Optional learning-from-concerns job (deferred to v1.1)

A future job may extract patterns from concern clusters (volatile-value drift, citation-rot accumulation, repeated commune-specific caveats) into proposed `amendment` submissions or catalogue corrections, which then enter the state machine like any other change. This is **deferred to v1.1**; v1 corpus growth does not depend on it.

Per-skill thresholds and recency windows for any such job would live in `tools/compaction/config.json` with PR-reviewed prompt versions in `tools/compaction/prompts/v.md`. Architecture is sketched but not built.

### 9.5 Path and path-source lifecycle

Paths follow the same `draft → alpha → beta → stable` state machine as skills (§9.1, §9.2). Promotion thresholds and cohort semantics are identical, with separate cohorts per path entry.

Sources within a path do not carry their own `status` field; source health is consensus-driven and expressed at the path-entry level.
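Because path entries share the §9.2 threshold table with skills, one evaluator can serve both. The sketch below illustrates steps 2 and 3 of the §9.3 tick under several assumptions that are not stated in this spec: the row shape `ValidationRow` and the function `decide` are hypothetical, `cohort_anchor` is compared as an opaque string, the confirm rate is taken as confirms divided by confirms plus rejects, injection flags are evaluated on current-cohort rows only, and quarantine is checked before rollback.

```typescript
// Hypothetical row shape; the real D1 columns are defined elsewhere.
interface ValidationRow {
  verdict: "confirm" | "reject";
  ipHash: string;        // per-artefact salted hash
  cohortAnchor: string;  // Worker-stamped; treated as an opaque string here
  injectionFlag: boolean;
  fromSubmitter: boolean;
}

type Status = "alpha" | "beta" | "stable";
type Decision =
  | "quarantine" | "rollback"
  | "promote_to_beta" | "promote_to_stable" | "hold";

const HOUR_MS = 3_600_000;
const DAY_MS = 24 * HOUR_MS;

function decide(
  status: Status,
  rows: ValidationRow[],
  currentAnchor: string,
  cohortStartedAt: Date,
  now: Date,
): Decision {
  // Only rows stamped with the artefact's current anchor belong to the cohort.
  const cohort = rows.filter((r) => r.cohortAnchor === currentAnchor);
  const confirms = cohort.filter((r) => r.verdict === "confirm").length;
  const rejects = cohort.filter((r) => r.verdict === "reject").length;
  const distinctIps = new Set(cohort.map((r) => r.ipHash)).size;
  const elapsedMs = now.getTime() - cohortStartedAt.getTime();

  // Assumption: quarantine takes precedence over every other outcome.
  if (cohort.some((r) => r.injectionFlag && !r.fromSubmitter)) return "quarantine";
  if (rejects - confirms >= 2) return "rollback";

  if (status === "alpha" && confirms >= 3 && rejects === 0 &&
      elapsedMs >= 48 * HOUR_MS && distinctIps >= 3) {
    return "promote_to_beta";
  }
  // Assumption: confirm rate = confirms / (confirms + rejects).
  const total = confirms + rejects;
  const confirmRate = total === 0 ? 0 : confirms / total;
  if (status === "beta" && confirms >= 10 && elapsedMs >= 14 * DAY_MS &&
      confirmRate > 0.85 && distinctIps >= 10) {
    return "promote_to_stable";
  }
  return "hold";
}
```

Keeping the function pure (inputs in, decision out) matches the requirement that the tick be deterministic; the D1 reads and the PR or UPDATE side effects would sit outside it.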
When the validation cohort for a path entry accumulates rejects exceeding confirms by two or more across all its sources, the state-machine Action transitions the **path entry's** `status` to `deprecated`. When all sources of a path have been rejected by the validation cohort, the path entry transitions to `status: deprecated`.

When the path entry transitions to `status: deprecated`, the agent surfaces this to the customer at traversal time with a message in the following form:

> "Be Civic doesn't currently have a working online way to obtain this document. Would you like to visit your commune instead?"

This message is surfaced only when no source of any kind (including offline sources) remains viable. If an `offline` source is still available, the agent offers it as the traversal option rather than surfacing the deprecation notice.

Post-2026-05-15 normalization, path submissions land via the unified `amendment` (target_type=path | path_source) and `draft` (target_type=path) shapes — there are no longer separate `path_amendment` / `path_draft` types. The Worker routes by `target_type` to the appropriate D1 table or repository path.

The state-machine Action (§9.3) treats path artefacts identically to skill artefacts for cohort tracking and threshold-driven promotion; no separate tick logic is required.

### 9.6 PR composition — research-report sidecars on submissions carrying provenance

When a submission carries `provenance.research_notes_markdown` (see schemas.md §6.2.2 / §6.2.4), the Worker's PR composition extends to include a `research-report.md` sidecar. Sidecar placement and write mode depend on the submission's (type, target_type) cell:

| `(type, target_type)` | Sidecar path | Write mode |
|---|---|---|
| `(draft, skill)` | `bc-docs/skills//research-report.md` | Create — written alongside `canonical.md` in the same PR; both files land on `main` together. |
| `(draft, path)` | `bc-docs/paths/research-reports/.md` | Create — written alongside the `bc-docs/paths/index.json` entry insert in the same PR; both land on `main` together. |
| `(amendment, skill)` | `bc-docs/skills//research-report.md` | Append — Worker appends a dated section to the existing `research-report.md` in the same PR that applies the amendment to `canonical.md`. If no file exists yet (legacy skill predating this convention), the Worker creates it with just the new section. |
| `(amendment, path)` / `(amendment, path_source)` | `bc-docs/paths/research-reports/.md` | Append — Worker appends a dated section to the existing path research-report in the same PR that applies the field edit or source add to `paths/index.json`. Creates if absent. |

**Atomicity.** The canonical (or path entry) and the sidecar share a single PR and a single commit on `main`; they MUST land or fail together. PR-CI failure on either file fails the PR.

**Sidecar content.** On `draft` (target_type=skill | path), the sidecar is the scrubbed `provenance.research_notes_markdown` with a frontmatter header carrying `kind`, `submitted_at`, `session_count`, `first_session_at`, `last_session_at`, `verified_corpus_refs`, and `research_sources` (per schemas.md §6.2.4). On `amendment` (target_type=skill | path | path_source), the appended section uses an H2 heading like `## 2026-05-14 — amendment ` followed by the same metadata in a fenced YAML block, then the scrubbed body.

**Auto-merge interaction.** `amendment` PRs (target_type=skill | path | path_source) auto-merge on green PR-CI per §10.1; the sidecar file diff is part of the auto-merge surface and PR-CI gates on both the canonical edit AND the appended section (Layer 2 scrub re-applied to the appended content, size cap re-checked after append). `draft` PRs (target_type=skill | path) require maintainer review per S31 and §10.1 regardless of sidecar presence; the sidecar does not change the review gate.
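The create-versus-append behaviour in the table above reduces to one decision per submission. The following is a minimal sketch, assuming the sidecar content has already been scrubbed and composed upstream; the function `writeSidecar` and its parameter names are illustrative, and treating a pre-existing sidecar on a `draft` as an error is an assumption rather than something this spec states.

```typescript
type SidecarSubmission = "draft" | "amendment";

// `existing` is the current research-report content, or null when the file is absent.
// `section` is the already-scrubbed content to write: the whole file for a draft,
// the dated section for an amendment. Both parameter names are illustrative.
function writeSidecar(
  type: SidecarSubmission,
  existing: string | null,
  section: string,
): string {
  if (type === "draft") {
    // Assumption: a draft introduces a new artefact, so an existing sidecar is an error.
    if (existing !== null) throw new Error("draft sidecar already exists");
    return section;
  }
  // Amendment: create with just the new section when no file exists yet.
  if (existing === null) return section;
  // Otherwise append, separated from prior content by one blank line.
  return `${existing.replace(/\s+$/, "")}\n\n${section}`;
}
```

The returned string is the full post-write file, which is the form a size-cap or scrub check would need to inspect.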
**24h staging.** Staging-window semantics (§6.2 in schemas.md) apply to the submission as a whole; provenance content stages with the submission and is not separately gated.

**PR-CI checks on sidecars:**

- File size ≤50KB after the write (`draft`) or after the append (`amendment`).
- New content passes Layer 2 regex scrub.
- Every URL in `provenance.research_sources[]` is well-formed (reachability is deferred to the monthly linkcheck Action per §11).

**Legacy canonicals.** When an `amendment` (target_type=skill | path | path_source) with provenance targets a canonical authored before this convention and no `research-report.md` exists yet, the Worker creates the file with just the new dated section. Subsequent amendments append. No retroactive backfill of pre-convention history is required; the maintainer-side `bc-corpus-creator` walks remain the path for richer backfill.

### 9.7 Auto-version-bumping workflow (added by `2026-05-15-auto-version-bumping.md`)

A scheduled `version-bump.yml` workflow inspects every commit on `main` and bumps the artefact's `version:` frontmatter (skill canonicals) or path-entry `version:` (paths/index.json) according to a deterministic mapping driven by the artefact's `status:` field. The mapping:

| Status | Version range |
|---|---|
| `draft` | `0.0.x` (patch increments per commit) |
| `alpha` | `0.1.x` (patch increments per commit) |
| `beta` | `0.2.x` (patch increments per commit) |
| `stable` | `1.0.x` (patch increments per commit; sub-cohort changes per OPEN-1 stable-lock) |

**Bot identity.** Same as the state-machine bot — `Be Civic Bot `. Differentiated via commit-message prefix (`version-bump: …`). The state-machine bot bundles status flips with version bumps in a single PR (OPEN-3 squash-merge collapse); only the bump observable on `main` post-merge counts as the canonical version.

**Concurrency.** No cancel-in-progress (OPEN-2 locked). Per-push correctness over latency.
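The status-to-version mapping above is small enough to express directly. The sketch below is illustrative only: `nextVersion` and `MAJOR_MINOR` are hypothetical names, and resetting the patch to `0` on a status flip follows the wording of Rule 18 in §10.1 rather than anything in the workflow file itself. It also assumes the prior version is a plain `major.minor.patch` string; hand-authored values that do not fit the mapping (the first-deploy migration case) are not handled.

```typescript
type Status = "draft" | "alpha" | "beta" | "stable";

// major.minor per status, taken from the mapping table.
const MAJOR_MINOR: Record<Status, string> = {
  draft: "0.0",
  alpha: "0.1",
  beta: "0.2",
  stable: "1.0",
};

// Computes the version a commit should carry after the bump.
function nextVersion(prior: string, priorStatus: Status, newStatus: Status): string {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(prior);
  if (!m) throw new Error(`not a major.minor.patch version: ${prior}`);
  // Status flip: move to the new range and restart the patch at 0.
  if (newStatus !== priorStatus) return `${MAJOR_MINOR[newStatus]}.0`;
  // Same status: patch increments by one per commit.
  return `${MAJOR_MINOR[newStatus]}.${Number(m[3]) + 1}`;
}
```

For example, an `alpha` artefact at `0.1.4` moves to `0.1.5` on an ordinary edit and to `0.2.0` when promoted to `beta`.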
**Quarantine demote.** Relies on the existing `previous_stable_sha` field on the skill's frontmatter (OPEN-5 locked); no new audit artefact.

**Audit trail.** Git log on the version-bumping workflow's commits is the audit trail (OPEN-4 locked); no separate JSON artefact.

**First-deploy migration.** When the workflow lands on a corpus where skills already carry hand-authored `version:` values that don't match the auto-bump rule, the workflow emits a warning (not an error) and proceeds (OPEN-6 locked). The operator may sweep manually with the optional rebase script shipped as a companion (OPEN-7).

The full workflow spec lives in `bc-docs/.github/workflows/version-bump.yml`; this spec records the contract, not the implementation.

## 10. Branch policy and retraction

### 10.1 Branches and CI

Single-branch flow on `main`. PRs target `main` and are merged after review; merged skills are immediately live (subject to state-machine gating for non-canonical content). No `develop` branch. Consuming agents fetch skills from `main`. State-machine commits, communes-refresh PRs, and protocol-amendment PRs all target `main`.

Branch protection: `main` requires PR review (self-review acceptable for solo maintainer).
PRs into `main` run a CI suite checking:

- Schema validity (skill frontmatter, concern / amendment / draft / validation / feedback-channel / rating fixtures — renamed from observation / skill-amendment / skill-draft / path-amendment / path-draft per the 2026-05-15 taxonomy normalization)
- Citation URL well-formedness (not reachability — that's the monthly linkcheck Action)
- Skills-index regeneration consistency (the index file in the PR must match what regeneration would produce — no hand-edits)
- Cross-reference validation (`tools/scripts/validate-cross-refs.ts`):
  - Every skill's `submission_contract_version` resolves to an existing `docs/submission-contract-v.mdx`
  - Every skill's frontmatter `id` equals its parent folder name
  - **Frontmatter `status` consistency.** Every `canonical.md` carries `status ∈ {draft, alpha, beta, stable}`. Skeletons are at `draft` (body empty or marker-only). All other content is at one of `alpha`, `beta`, or `stable`. There is no `proposal_id`, `version_status`, `corpus_status`, or `proposals/` / `archive/` directory in round 6.
  - **State-transition validation** (on commits authored by the state-machine bot):
    - Threshold conditions in §9.2 must hold for the asserted transition
    - Distinct-IP counting honoured (queried from D1)
  - **Injection-flag quarantine check**: an artefact with ≥1 `injection_flag: true` validation from a non-submitter must have been quarantined (D1 supersession or `git revert`) and a maintainer-review issue opened
  - Every `requires:` field resolves to an existing skill_id (an existing `skills//canonical.md` folder).
Skills at `status: deprecated` or `status: quarantined` are not valid targets; skills at any of the promotion statuses (`draft`, `alpha`, `beta`, `stable`) are valid (the consumer loads them at their current status; alpha banner applies recursively per §6.1 (see schemas.md))
  - Type-matched inputs/outputs across the DAG; DAG is acyclic
  - Every `category` value matches the `^[a-z][a-z0-9-]+(-[a-z][a-z0-9-]+)*$` regex; deterministic Levenshtein guards (auto-extension on first commit; monthly audit)
  - Every `superseded_by` resolves to an existing skill
  - **Rule 7 — `authority_id` resolves.** When a skill's frontmatter carries `authority_id`, the value resolves to a top-level entry id in `data/authorities.json`
  - **Rule 8 — `` and `` tag uids resolve.** Every `value` and `label` tag in body MDX resolves to an existing row in D1 (queried via `/api/volatile-values/` or `/api/references/`). Tags whose `uid` is empty (authoring stage, before PR-CI uid assignment) are accepted on the authoring branch but rewritten by PR-CI before merge (§6.11 (see schemas.md))
  - **Rule 9 — `requires.selects_on` values resolve.** Each value supplied for a `selects_on` key (`region`, `origin_country`, `sponsor_type`, `entry_type`, `card_outcome`) resolves to the corresponding enum in `schemas/types.json`. `origin_country` is open against ISO-3166-1 alpha-2 lowercase. The validator emits a warning (not an error) when `schemas/types.json` is absent, supporting phased corpus migration
  - **Rule 10 — `origin` invariant.** Every skill carries `origin ∈ {be-civic, community}`. Skeletons (status: draft) MAY have `origin: be-civic` only; community drafts are not skeletons by definition.
  - **Rule 11 — `(name, uid)` consistency on tags.** For every `value` and `label` tag in body MDX, the row in D1 with `uid = Y` MUST carry `name = X`.
PR-CI rewrites the tag's `name` to match D1 when D1's row was renamed since the tag was authored; PR-CI fails the PR when the tag's name + uid combination is inconsistent in a way that can't be auto-resolved. (Rename safety is in §6.11.)
  - **Rule 12 — agents never author uids.** PR-CI rejects any commit where a `` or `` tag's `uid` was filled by a non-bot author identity (the PR validator inspects `git blame` against the line; a uid filled in the same commit as the tag's introduction by a non-bot author is a Rule-12 failure). Authors leave `uid` empty; PR-CI fills it via the `/api/_internal/catalogue-entries` endpoint (§6.11 (see schemas.md)).
  - **Rule 13 — wrapper-tag form on first citation.** Every first appearance of a volatile value or reference in a skill body MUST use the wrapper-tag form `` or `` per §6.10 (see schemas.md). Self-closing forms (``, ``) are rejected on first appearance. Subsequent re-citations of the same reference within the same body MAY use the bibliography-style `` `[ref-id]` `` bracket form per S81.
  - **Rule 14 — tag-only edits do not bump `version`.** When a PR changes ONLY tag-form (e.g. converts `` `[ref-id]` `` brackets to `` wrapper tags, or populates empty children with the current catalogue value), the skill's `version` field MUST NOT bump per §6.1 cohort-reset rules (the semantic content is unchanged). The validator detects tag-only diffs by comparing the post-edit body with the pre-edit body after stripping tag wrappers; if the stripped bodies are equal, the edit is tag-only. A PR that bumps `version` on a tag-only diff fails Rule 14. See S83.
  - **Rule 15 — `target_type` / `target_id` consistency on every submission** (added 2026-05-15). PR-CI rejects submission rows in D1 where `target_type` is not in the permitted set for that submission's `type` (per the §6.2 target_type table in `schemas.md`), or where `target_id` does not resolve to a live artefact in the right table/path.
**Carve-out:** `target_type=skill_graph` (concern only) MAY have an unresolved or empty `target_id` — the corpus-graph-itself-has-a-gap signal. Cross-ref step 6 enforces the resolution at submission time with the `skill_graph` short-circuit; Rule 15 lifts the same check into the formal PR-CI rule list so that any committed row that fails the invariant (e.g. due to a manual D1 edit or a Worker bug) is rejected at PR-CI.
  - **Rule 16 — DROPPED (cohort_stats are render-time-derived, not frontmatter-materialised).** Pass 1 of the 2026-05-15 proposal included a PR-CI rule rejecting any skill commit that authored `cohort_stats:` in frontmatter. Per locked G4, cohort_stats are NOT materialised in canonical frontmatter — the `` element is composed at render time from D1. There is no frontmatter field to author; Rule 16 is unnecessary and dropped from the rule list. (Rule numbering preserved; the slot stays at 16 to keep downstream rule references stable.)
  - **Rule 17 — `draft` submission type vs `status: draft` distinction** (added 2026-05-15). PR-CI rejects skill / path canonical commits where the resulting on-disk artefact has `status: draft` AND was authored via a `draft` submission. A `draft` submission produces an `alpha`-status artefact on commit (per §6.2.4 in `schemas.md`); `status: draft` is reserved for skeletons and pre-alpha content authored directly by maintainers. The validator inspects the PR's commit author identity + the new file's `status:` frontmatter; on conflict it fails the PR with an `invalid_status_for_draft_submission` error.
  - **Rule 18 — auto-bump consistency** (added 2026-05-15 by `2026-05-15-auto-version-bumping.md`; numbered Rule 18 to keep Rule 15 anchored on target_type consistency per the prior taxonomy amendment). PR-CI checks every modified canonical / path-entry's `version` against its `status` and the prior commit's `version`.
The post-commit `version`'s major.minor MUST equal the expected major.minor for `status` per §6.1 (see schemas.md). The patch increment MUST be either `(prior_patch + 1)` (within-major.minor edit), `0` (status flip in the same commit), or unchanged (tag-only edit per Rule 14, or `version_pin: true` per §6.1 override). Any other `version` value fails the PR with an `invalid_version_for_status` error. The version-bump workflow's own commits ARE subject to Rule 18 (defence against bumper bugs); the rule is verifiable purely from the diff so the workflow's commits pass cleanly when the bumper is correct. Author identity is recognised via `git author.email == 'bot@becivic.be'`; the validator silently accepts the bumper's correctly-formed `[skip ci]` commits and applies Rule 18 to the diff regardless of author.
- Every fixture in `tests/fixtures//invalid/` violates exactly one rule and has a corresponding rule in the relevant schema or `regex-rules.json`

**Cloudflare Workers preview deployments** auto-run for PRs that change `bc-infra/site/` (renderer + router) or `bc-infra/api/` (staging Worker). **Renderer preview** auto-runs for PRs that change content (skills, docs, mdx files in bc-docs).

**Path catalogue PR policy.** Path catalogue entries follow the same auto-merge PR policy as skill amendments. Path drafts (new path entries, submitted via `draft` with `target_type=path`) require maintainer review before merging, matching the `target_type=skill` draft policy (S31). Source amendments (updates to URLs, eligibility predicates, `validation_path` shapes, or `priority` values within an existing path entry, submitted via `amendment` with `target_type=path | path_source`) auto-merge on green PR-CI, matching the skill-amendment auto-merge path. The rationale is the same: new entries carry authorial intent risk that warrants a maintainer check; amendments to existing entries are narrow changes that validators and automated checks can gate adequately.
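Rule 18 in the CI list above is stated as arithmetic on the diff, so it lends itself to a small pure check. The sketch below is one reading of that rule, not the validator's actual code: `checkRule18`, `VersionChange`, and the field names are hypothetical, the expected major.minor values are copied from the §9.7 mapping, and two interpretations are assumed (a status flip requires patch `0`, and an unchanged version is accepted only for a tag-only edit or a `version_pin: true` override). New files with no prior version are out of scope.

```typescript
type LifecycleStatus = "draft" | "alpha" | "beta" | "stable";

// Expected major.minor per status, copied from the §9.7 mapping table.
const EXPECTED_MAJOR_MINOR: Record<LifecycleStatus, string> = {
  draft: "0.0",
  alpha: "0.1",
  beta: "0.2",
  stable: "1.0",
};

// Hypothetical summary of one modified canonical or path entry in a PR.
interface VersionChange {
  priorVersion: string;
  priorStatus: LifecycleStatus;
  newVersion: string;
  newStatus: LifecycleStatus;
  tagOnlyEdit: boolean; // diff is tag-only in the Rule 14 sense
  versionPin: boolean;  // `version_pin: true` override
}

function parseVersion(v: string): { majorMinor: string; patch: number } | null {
  const m = /^(\d+\.\d+)\.(\d+)$/.exec(v);
  return m ? { majorMinor: String(m[1]), patch: Number(m[2]) } : null;
}

function checkRule18(c: VersionChange): "ok" | "invalid_version_for_status" {
  const prior = parseVersion(c.priorVersion);
  const next = parseVersion(c.newVersion);
  if (!prior || !next) return "invalid_version_for_status";
  // major.minor must match the range for the post-commit status.
  if (next.majorMinor !== EXPECTED_MAJOR_MINOR[c.newStatus]) {
    return "invalid_version_for_status";
  }
  // Assumption: a status flip in the same commit restarts the patch at 0.
  if (c.newStatus !== c.priorStatus) {
    return next.patch === 0 ? "ok" : "invalid_version_for_status";
  }
  const bumped =
    next.majorMinor === prior.majorMinor && next.patch === prior.patch + 1;
  const unchanged =
    c.newVersion === c.priorVersion && (c.tagOnlyEdit || c.versionPin);
  return bumped || unchanged ? "ok" : "invalid_version_for_status";
}
```

Because the check reads only the before and after values, it applies equally to human-authored and bot-authored commits, which is the property the rule text relies on.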
### 10.2 Skill retraction and rollback

**Status field** — skill `status` encodes both the promotion lifecycle and the terminal maintainer states in a single 6-value enum (`draft | alpha | beta | stable | quarantined | deprecated`):

- `draft | alpha | beta | stable` — the promotion lifecycle; consuming agents render skills at all four states (alpha and beta carry in-page banners per §6.1 (see schemas.md))
- `deprecated` — superseded or no longer recommended; consuming agents render with a prominent warning; state machine skips deprecated skills; `superseded_by` points at the replacement
- `quarantined` — content is harmful or fundamentally wrong; skill body is pulled for cause (history preserved in Git); consuming agents refuse to render; state machine skips; audit-only

**Skill rollback** (a content rejection or quarantine, distinct from retraction) is `git revert` against `canonical.md`, executed automatically by the state-machine bot when validation thresholds trip (§9.2) or when a non-submitter validator sets `injection_flag: true`. The post-revert body returns to its prior `status`; the cohort restarts.

**Catalogue-row rollback** (volatile values, references) is D1 supersession: a fresh INSERT carrying the prior stable's content marks the rejected row's `superseded_at`. History is preserved (S29).

Retraction does not delete files; rollback does not delete history. Git history is the audit trail for skill bodies; D1's `superseded_at` chain is the audit trail for catalogue rows. A quarantined skill may carry any prior `status` value in the audit trail (the quarantine notice replaces the served body); amendments against a quarantined skill are blocked at the Worker.
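The rendering behaviour attached to each of the six status values can be summarised as a lookup on the consumer side. This is a minimal sketch only; `renderPlan` and the `RenderPlan` shape are hypothetical names, and the sketch covers just the three behaviours this section states (whether to render, which banner to show, and whether to show the deprecation warning).

```typescript
type SkillStatus =
  | "draft" | "alpha" | "beta" | "stable"
  | "quarantined" | "deprecated";

// Illustrative shape; not a documented consumer API.
interface RenderPlan {
  render: boolean;                 // false means the agent refuses to render
  banner: "alpha" | "beta" | null; // in-page banner for alpha and beta
  deprecationWarning: boolean;     // prominent warning for deprecated skills
}

function renderPlan(status: SkillStatus): RenderPlan {
  switch (status) {
    case "quarantined":
      // Body is pulled for cause; consumers refuse to render.
      return { render: false, banner: null, deprecationWarning: false };
    case "deprecated":
      return { render: true, banner: null, deprecationWarning: true };
    case "alpha":
    case "beta":
      return { render: true, banner: status, deprecationWarning: false };
    case "draft":
    case "stable":
      return { render: true, banner: null, deprecationWarning: false };
  }
}
```

An exhaustive `switch` over the enum means adding a seventh status value would surface as a compile error rather than a silent default.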
### 10.3 Submission retraction

If a committed submission is later found to contain PII (slipped through both consumer-side and receiving-side scrub AND was released by the maintainer from NER hold as a false positive), the only remediation that removes it from the public record is destructive history rewrite via `git filter-repo`. This requires:

- Maintainer acknowledgement and a Tier C-flagged decision
- Force-push to `main` (the one canonical exception to "no force pushes")
- Notification to consumers that they must refresh
- An incident note in `docs/incidents/-.md` (PII details obviously not included)

Pre-emption is the protection; rewrite is the last resort. The procedure is documented in `docs/retraction-protocol.md`.

## 11. Citation rot

### 11.1 Source rot

Path sources (URLs, deeplinks, form addresses) are inherently rot-prone. URL changes from upstream government portals are the dominant rot vector for the Path Directory. The following mitigations apply:

**Validation submissions.** Every successful or failed traversal generates a `validation` submission with `target_type=path_source` and the `traversal_metadata` block (per §6.2.3 in `schemas.md` — single shape; pre-2026-05-15 the same wire was named `path_validation`). Aggregate validations drive promotion (toward `stable`) or demotion (toward `deprecated`) using the same threshold table as skill artefacts (§9.2). When the validation cohort accumulates rejects exceeding confirms by two or more, the state-machine Action transitions the parent path entry to `status: deprecated`.

**CI probes.** For sources without `audited_document_delivery: true`, CI runs periodic non-mutating probes (HTTP HEAD request, page-load with success-signal check) and submits machine-authored `validation` records on the same schema as customer-submitted validations.
Machine-authored validations carry a `submitter_type: ci_probe` marker; the state machine counts them toward the cohort but applies a lower per-record weight than customer-confirmed validations (weight configuration in `tools/compaction/config.json`).

**30-day staleness flag.** A source whose `last_validated` timestamp is more than 30 days old is flagged by the renderer with a `stale` badge but remains in the catalogue and continues to be offered to customers. After 90 days without a confirming validation of any kind (customer-submitted or CI probe) across all sources of a path entry, the state machine transitions the path entry to `status: deprecated`. The path entry's `last_validated` reflects the most recent confirming validation across any of its sources.

**Audited deliveries.** Sources flagged `audited_document_delivery: true` MUST NOT be CI-probed: each call generates a real, audited document delivery on a live government system. These sources rely entirely on customer-driven validation submissions. The renderer surfaces "last user confirmation" prominently for these sources, in place of the CI-probe-derived last-validated timestamp, so customers can assess freshness before consenting to an audited delivery.

**Source-class template rot.** When a `source_class` itself changes (for example, a government portal restructures its deeplink scheme), all sources of that class are flagged for re-validation regardless of their individual `last_validated` timestamps. This is a maintainer-triggered operation recorded as an `amendment` (target_type=path_source) submission to the spec.

## 12. Test fixtures

`tests/fixtures/` contains synthetic data exercising every validator and detector. Updated alongside any change to detection logic.

- `concerns/valid/*.json` — realistic, fully scrubbed concerns that pass all validators. Cover every `target_type` (skill / volatile_value / reference / path / path_source / skill_graph), every `region`, single and multi-cell cases.
  Renamed from `observations/valid/` per the 2026-05-15 taxonomy normalization.
- `concerns/invalid/*.json` — each file violates exactly one rule, named for the rule (`schema-missing-target-id.json`, `pii-nrn-in-body.json`, `skill_graph_with_resolvable_target_id.json`, etc.).
- `amendments/valid//*.json` — clean amendments grouped by `target_type`: `skill/` covers body (unified-diff) and frontmatter subtypes; `path/` and `path_source/` cover field_edit and source_add; `volatile_value/` and `reference/` cover the fast-path scalar shapes.
- `amendments/invalid//*.json` — each violates exactly one rule (capability tier mismatch, oversized rationale, identity-shaped field, `skill_commit` drift, source_class template non-conformance, etc.).
- `drafts/valid//*.json` — `skill/` covers full main-skill and sub-skill drafts; `path/` covers full path-entry drafts.
- `drafts/invalid//*.json` — each violates one rule (missing required frontmatter, oversized commit_message, missing `requires` resolution, proposed_id already exists, etc.).
- `validations/valid/*.json` — confirm + reject + injection_flag examples for every `target_type`, against active submission IDs in fixtures.
- `feedback_channel/valid/*.json` and `feedback_channel/invalid/*.json` — bug, suggestion, praise, confusion, accessibility shapes (new in 2026-05-15 amendment).
- `ratings/valid/*.json` and `ratings/invalid/*.json` — three-axis star fixtures per §6.2.7 (Lock A, sprint 2026-W23).
- `validations/invalid/*.json` — self-validation attempt, oversized `rationale` or `injection_reason`, missing `rationale` on reject, missing `injection_reason` when `injection_flag: true`, etc.
- `pii-samples/*.txt` — text snippets with known PII, accompanied by `.expected.json` listing detector hits.
- `nrn-checksum/*.json` — known valid and invalid Belgian NRN values with expected outcomes.
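As a rough illustration of what the `nrn-checksum/*.json` fixtures exercise, the sketch below applies the publicly documented mod-97 control rule for Belgian national register numbers. The function name `isValidNrnChecksum` is hypothetical and not taken from this repository; the implementation behind `validate-nrn-checksum.ts` may differ, for example by also validating the embedded birth date.

```typescript
// Hypothetical helper, for illustration only; not the project's implementation.
// Checks the two-digit mod-97 control number of an 11-digit Belgian NRN.
function isValidNrnChecksum(raw: string): boolean {
  // Strip common separators (dots, dashes, whitespace) before validating.
  const digits = raw.replace(/[.\-\s]/g, "");
  if (!/^\d{11}$/.test(digits)) return false;

  const base = Number(digits.slice(0, 9)); // YYMMDD plus 3-digit sequence
  const check = Number(digits.slice(9)); // 2-digit control number

  // Born before 2000: control = 97 - (base mod 97).
  const pre2000 = 97 - (base % 97);
  // Born in 2000 or later: same rule with a "2" prefixed to the 9-digit base.
  const post2000 = 97 - ((2000000000 + base) % 97);

  // The two-digit year is ambiguous, so this sketch accepts either variant.
  return check === pre2000 || check === post2000;
}
```

A fixture runner could then compare the result with each file's expected outcome. Accepting both century variants is a simplification; a stricter check would pick one based on the plausible birth year.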

A dedicated test-fixtures Action runs these on every PR touching `tools/scrub/`, `schemas/`, or any submission schema (separate from the per-commit NER Action).

**Path entry fixtures.** Path entries are tested via two complementary fixture types:

1. **Per-source `validation_path` fixtures** (in `tests/fixtures/path-sources/`) — HTTP response samples and page-text samples paired with each source's `success_signals` and `failure_signals`. The validator checks that the signal classifiers correctly identify each sample as `success`, `failure`, or `ambiguous`. Every `source_class` template MUST have at least one valid and one invalid sample in this directory.
2. **Per-path eligibility fixtures** (in `tests/fixtures/path-eligibility/`) — sample `profile.json` objects with varying `region`, `civic_status`, `residency_status`, and `commune_nis5` values, paired with expected source-inclusion outcomes. The validator confirms that each path's eligibility predicates include and exclude sources correctly for each sample profile. Edge cases MUST include a profile that matches zero sources (expected outcome: `no-eligible-sources`) and a profile that matches multiple sources in priority order.

Path fixture validation runs as part of the existing test-fixtures Action. No separate Action is required.
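The per-source signal check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: signals are treated as case-insensitive substrings, a sample counts as `ambiguous` when neither or both signal families match, and the names `classifySample`, `SourceSignals`, `SignalFixture`, and `failingFixtures` are hypothetical. The real signal format is defined in `schemas.md` and may be richer (HTTP status codes, patterns, and so on).

```typescript
// Hypothetical types and names, for illustration only.
type SignalOutcome = "success" | "failure" | "ambiguous";

interface SourceSignals {
  success_signals: string[]; // assumed: substrings expected on a working page
  failure_signals: string[]; // assumed: substrings expected on a broken or moved page
}

interface SignalFixture {
  sample: string; // page text or response body captured for the fixture
  expected: SignalOutcome;
}

// Assumed rule: a sample is unambiguous only when exactly one signal family matches.
function classifySample(pageText: string, signals: SourceSignals): SignalOutcome {
  const text = pageText.toLowerCase();
  const hit = (needles: string[]): boolean =>
    needles.some((needle) => text.includes(needle.toLowerCase()));

  const ok = hit(signals.success_signals);
  const bad = hit(signals.failure_signals);
  if (ok && !bad) return "success";
  if (bad && !ok) return "failure";
  return "ambiguous";
}

// Returns the fixtures whose classification differs from the expected outcome.
function failingFixtures(fixtures: SignalFixture[], signals: SourceSignals): SignalFixture[] {
  return fixtures.filter((f) => classifySample(f.sample, signals) !== f.expected);
}
```

An empty result from `failingFixtures` would mean every sample for that source was classified as its fixture expects. Treating a page that matches both families as `ambiguous` keeps a contradictory page from counting as a confirmation.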

**Running tests locally.** All test runners are TypeScript and invoked via `npx tsx`:

| Suite | Command |
|-------|---------|
| Schema validation against fixtures | `npx tsx tools/scripts/validate-fixtures.ts` |
| Cross-reference validation | `npx tsx tools/scripts/validate-cross-refs.ts` |
| State-machine logic against synthetic validations | `npx tsx tools/scripts/state-machine-tick.ts --dry-run` |
| PII-samples regression | `npx tsx tools/scripts/validate-pii-samples.ts` |
| NRN checksum implementation | `npx tsx tools/scripts/validate-nrn-checksum.ts` |
| Category audit | `npx tsx tools/scripts/audit-categories.ts` |

## Cross-references

_Cross-doc references are inlined throughout this document in the form §X.Y (see .md). The list below was the pre-reconciliation manifest from the 2026-05-11 split, retained for audit; it can be deleted at the next split-or-merge cycle._

- §3 (Non-negotiable principles) — see `architecture.md` §3
- §6.1 (Skill schema / status enum / alpha banner) — see `schemas.md` §6.1
- §6.2 (Submission schemas / staging windows) — see `schemas.md` §6.2
- §6.2.4 (Validation submission / immediate D1 write) — see `schemas.md` §6.2.4
- §6.3 (Volatile values / INSERT-with-supersede) — see `schemas.md` §6.3
- §6.11 (Catalogue UID convention / PR-CI uid assignment) — see `schemas.md` §6.11
- §6.12 (Path Directory schema — new section) — see `schemas.md` §6.12
- §7 (Trust model / maintainer review queue) — see `protocol.md` §7
- §8.3 (Per-artefact salted IP hash for self-validation prevention) — see `privacy.md` §8.3
- §8.7.2.2 (path_history files) — see `privacy.md` §8.7.2.2
- §8.10.4 (Paths anonymous-by-construction) — see `privacy.md` §8.10.4
- §15.1 (Skill-drafting protocol / submission flow) — see `skills.md` §15.1