dual-research · spec dashboard

read-only view of specs/, handoffs/ and dashboard/events/ at main

v1.68.0 liveupdated 2026-05-31 15:51 UTCrepo ↗

pause_circle

Queue · idle

Nothing in flight. Run /dev-next in your queue session to start the next spec.

0 in flight0 queuedlast shipped 58s ago

58s ago

last deploy · 0260

Drafts

ideation

Queued

pending · queue empty

In flight

idle

Shipped

124

all-time · 120 with full timings

Avg cycle (last 10)

12m 49s

↓ 4m 35s vs prior 10

Queue

/dev-next picks position 1 first

Queue is empty. Promote a draft with /spec-promote <id> or queue a new one with /spec-queue.

Recent activity

last 24 hours

15:50:29 UTC

bookmark

handoff written0260 · handoffs/2026-05-31-spec-0260-name-flag-doubles-as-run-display-title.md

—

15:50:10 UTC

circle

deploy health check ok0260

—

15:49:59 UTC

check_circle

deployed0260 · Make --name double as the run display title by overriding brief.md's first H1 · v1.68.0

—

15:48:10 UTC

circle

deploy started0260

—

15:48:05 UTC

merge

merged0260 · PR #302 · admin squash

—

15:47:40 UTC

merge

pr opened0260 · https://github.com/Lexiz/dual-research/pull/302

—

15:46:54 UTC

task_alt

tests green0260 · 2454 passed

—

15:46:15 UTC

circle

tests started0260

—

15:46:14 UTC

rule

implement0260 · 0 commits

—

15:44:42 UTC

flag_circle

branched0260 · branch spec/0260-name-flag-doubles-as-run-display-title

—

Drafts

backlog · promote with /spec-promote <id>

001

JSX cache-bust query strategy is fragile during in-spec iteration

unclassified

park · 5d 15h

002

User-facing classifier over-flags entries that cite UI files only for context

unclassified

park · 5d 15h

003

`/dev-next` skill: filter the queue head on `disposition: ship` so backfilled-as-`archive` carve-outs are no longer picked

unclassified

park · 4d 15h

004

`/spec-queue` + `/spec-promote` + `/spec-draft` skills: prompt for `disposition` interactively rather than relying on the validator error to surface omissions after the fact

unclassified

park · 4d 15h

008

Spec 0257 §6.3 live acceptance re-run — backend-language-choice, scored via /dr-run-assess

test

park · 1d 15h

All specs

124 shipped · 0 queued · 0 in flight

Spec

Title

Type

Status

Lifetime

Cycle

0260

Make --name double as the run display title by overriding brief.md's first H1

new-feature

deployed

15h 50m

6m 8s

0259

Refactor: delete the dead legacy phase3 runner (orchestrator/phase3.py:run_phase3)

refactoring

deployed

7h 38m

7m 37s

0258

Flag spec citations unreachable from the live entry points at reconcile time

new-feature

deployed

1d 7h

9m 38s

↳ 02570257.2

Fix: raiser self-ADDRESSes its own already-addressed items (group-3 instruction omits the no-self-address prohibition)

bug

deployed

4h 56m

5m 36s

↳ 02570257.1

Refactor: delete the dead legacy standing-items surface (build_standing_items_section + legacy phase2/phase4 runners)

refactoring

deployed

1d 4h

12m 57s

0257

Make the phase-2/4 standing-items surface role-aware so the addressee ADDRESSes and the raiser stops self-addressing and self-resolving

new-feature

deployed

20h 33m

19m 2s

0256

Guard the drafter-delta apply path against anchor failure, resync the drafter on no-op, and tighten the anchor contract (strict, not fuzzy)

new-feature

deployed

12h 49m

17m 20s

0255

Decouple the closeout-urge gate from effective status: fire should_urge_closeout on RAW self-reported AGREED so a spec-0229 addressee-obligation demotion no longer disarms the closeout → ghost-cap escape valve and deadlocks phase 2/4 to the hard cap

new-feature

deployed

14h 24m

18m 47s

0254

Fix: parked spec's activity event renders as QUEUED instead of Parked

bug

deployed

9h 12m

13m 25s

↳ 02530253.1

Fix: supabase detail path lacks the zero-event creation-time floor the list path now has

bug

parked

—

0253

Fix: All-Runs card status diverges from run-detail status for abandoned runs

bug

deployed

8h 19m

17m 49s

↳ 02520252.2

Fix: backfill-critique --push times out re-uploading whole session; push metrics-only

bug

deployed

7h 56m

9m 42s

↳ 02520252.1

Refactor: delete orphaned DesignLanguageButton + ActiveRunChip declarations

refactoring

deployed

14m 49s

11m 45s

0252

All Runs Comments tally + critique backfill CLI, universal chrome, and nav-label wrap fix

new-feature

deployed

24m 25s

0251

Make the carve-out disposition gate executable in the queue picker

new-feature

deployed

23h 11m

17m 28s

0250

Refactor: buffer /dev-next pre-merge telemetry into two batched flushes

refactoring

deployed

22h 50m

12m 51s

0249

Fix: path-filter deploy.yml so only image-affecting commits deploy

bug

deployed

22h 18m

11m 9s

0248

All Runs card refinements — chrome cleanup, avatar menu restore, inline archive tray, note-width fix, rich provider metric bands

new-feature

deployed

21h 30m

49m 21s

↳ 02470247.1

Refactor: make /dev-next step-18 merged buffer survive step-19's force re-detach

refactoring

deployed

20h 37m

13m 1s

0247

Refactor: stop /dev-next pre-merge step-18 --push-to-main commits from racing the merge-commit deploy

refactoring

deployed

20h 14m

14m 33s

↳ 02460246.1

Refactor: reconcile the theme-persistence localStorage key (app `dr.theme` vs spec 0246 mock `dr-theme`)

refactoring

deployed

19h 25m

10m 2s

0246

All Runs landing page — card layout rewrite + summary stats panel

new-feature

deployed

19h 2m

40m 21s

0245

Soft-delete on runs + admin hover-archive on the runs list

new-feature

deployed

16h 45m

41m 4s

0244

Verifier: promote I2.6 (RAISED cross-check), I2.7 (empty-turn retry hardening), I2.8 (turn termination) from reporting → gating, atomic + reversible

new-feature

deployed

12h 8m

21m 21s

0243

Operational guard: dual-research CLI refuses to run inside Claude Code (env-var detection + env-var escape + CLAUDE.md rule + skill amend)

new-feature

deployed

11h 41m

17m 27s

0241

Orchestrator: per-turn liveness — heartbeat thread, per-turn BaseException capture, whole-turn cap wrapping stream consumption; verifier I2.8

new-feature

deployed

23h 33m

18m 49s

0240

Fixture regen machinery + I2.6 verdict flip on the 142625 fixture (unblocks I2.6 → gating promotion)

new-feature

deployed

23h 8m

19m 5s

0239

Orchestrator: bound `empty_turn_detected` retries — fail-fast on byte-identical retry input; hard-cap at N=2 per (agent, phase, round)

new-feature

deployed

19h 55m

22m 40s

0238

Parser: consolidate every specific-literal-heading anchor onto one tolerant `section_heading_re` primitive in markers.py

new-feature

deployed

19h 3m

20m 54s

0237

Refactor: stop losing cycle-start anchor events on the first `--push-to-main` resync by adding a `queue_state flush` subcommand and wiring `/dev-next` to call it at step 12

refactoring

deferred

—

0236

Spec dashboard renders the `disposition` field on each spec card so backfilled-as-`archive` carve-outs are visually distinguishable from `disposition: ship` queue heads

new-feature

deferred

—

0235

Refactor: split `backfill_disposition.py` reason-text by spec-ID era so post-0229 specs no longer carry the misleading `Pre-spec-0229 carve-out` marker

refactoring

deferred

—

0234

Refactor: amend spec 0229.1 §3 acceptance criterion to scope the post-backfill validator claim to disposition-class checks only, aligning the spec text with the test that ships (`test_no_existing_spec_fails_specifically_on_disposition`)

refactoring

deferred

—

0233

Refactor: add `disposition:` + `disposition_reason:` placeholders to all five spec templates so verbatim-template authors don't trip the spec 0229.1 validator gate

refactoring

deployed

1d 13h

14m 38s

0232

Verifier I2.6 — STATUS-RAISED-array event cross-check (reporting)

new-feature

deployed

18h 28m

50m 28s

↳ 02310231.1

Tests: regenerate the four anchor-run `expected.json` baselines post-0231 parser fix and add the missing `test_snapshot_*` assertion for the 20260527-054652 fixture

test

deferred

—

0231

Parser heading tolerance and graceful repair fallback

new-feature

deployed

13h 20m

24m 28s

0230

Reconciler: configurable prefix-skip list (default `cowork/`) treats out-of-tree citations as informational, eliminating the spurious exit-3 class for specs citing external Cowork briefs

new-feature

deployed

12h 30m

10m 50s

↳ 02290229.1

Validator: enforce the `disposition:` + `disposition_reason:` frontmatter fields on dev specs and drafts — operationalises the carve-out-disposition convention shipped doctrinally in spec 0229 §2.5

new-feature

deployed

9h 55m

23m 49s

0229

Enforce addressee-obligation: emit ProtocolViolation on AGREED with open addressed-at-me items; surface those items in closeout requests; promote verifier I2.4 from reporting to gating; codify the carve-out-disposition convention

new-feature

deployed

13m 44s

20m 16s

0228

Emit ProtocolViolation when an op is silently dropped for a state-machine reason; promote verifier invariant I4.4 from reporting to gating

new-feature

deployed

19m 58s

↳ 02270227.1

Refactor: reconciler skips markdown-link display-text citations so external `../cowork/...` href references stop producing spurious exit-3 semantic drift

refactoring

deployed

8m 48s

0227

Refactor: reclassify 4 contract-amending specs (0137 / 0140 / 0218 / 0219) from `bug` + add CLAUDE.md contract-change process rule

refactoring

deployed

22h 6m

9m 9s

0225

Add lifecycle-trace verifier asserting the 0114 unified contract

new-feature

deployed

21h 18m

30m 3s

0224

Fix: CLI registers SIGTERM/SIGHUP/SIGINT cancel handlers so the 0222 tombstone fires on raw signal kills

bug

deployed

20h 32m

11m 12s

0223

Fix: Phase 4 turn cards drop Issues + Comments per-round Δ chips (Q→D→I→C contract violated)

bug

deployed

19h 34m

13m 54s

0222

Fix: liveness tombstone (BaseException + finally-fallback) and remove asymmetric stdout streaming

bug

deployed

18h 59m

23m 33s

0221

Fix: Document modal labels the converged/final draft tab "Output" instead of "Content"

bug

deployed

17h

13m 58s

↳ 02200220.1

Fix: Changelog side-menu version-num span collides with summary on long versions

bug

deployed

14h 29m

9m 51s

0220

In-app Changelog auto-generated from CHANGELOG.md

new-feature

deployed

14h 3m

59m 2s

0219

Fix: §3.2 section-delta drafter contract — reviewer/validator collision, heading-mismatch corruption, REPLACE_SECTION budget overrun, mid-phase resume waste

new-feature

deployed

8h 45m

32m 50s

0218

Fix: phase-4 drafter truncation destroys STATUS — STATUS-first ordering, section-delta drafter contract, and dr_run.py repair-flow wiring

breaking

deployed

26m 43s

↳ 02170217.1

Fix: tighten STATUS.RAISED_THIS_TURN prompt contract to canonical IDs only

bug

deployed

16h 13m

11m 26s

0217

Fix: phase-2/4 ledger reconstructors must honor STATUS.RESOLVED_THIS_TURN / WITHDRAWN_THIS_TURN

bug

deployed

15h 49m

30m 23s

0216

Make raiser self-ADDRESS visible via ProtocolViolation

bug

deployed

13h 27m

16m 31s

0215

Fix in-round partner blindness in list_turns

bug

deployed

13h 2m

13m 23s

0214

Drop Phase 4 agent-emitted SHA-256 gate; converge on version equality

bug

deployed

12h 41m

18m 10s

0213

Refactor: collapse 11-stage timeline to 7 honest spans + decimal sub-spec treatment

refactoring

deployed

11h 37m

23m 18s

↳ 02120212.1

Tests: lock the spec 0212 explicit-cleanup ordering invariant (re-detach must precede `git branch -D`)

test

deployed

10h 53m

10m 12s

0212

Refactor: /dev-next — eliminate the post-merge race window (split `gh pr merge` from branch-cleanup, buffer all post-merge `--push-to-main` writes until step 23)

refactoring

deployed

10h 29m

16m 15s

↳ 02110211.3

Refactor: /dev-next step 20 — survive deploy-main concurrency-group cancellation of the merge-commit run

refactoring

deployed

9h 57m

9m 24s

↳ 02110211.2

Refactor: /dev-next step 20 — capture MERGE_SHA at merge time, not after subsequent --push-to-main commits advance origin/main

refactoring

deployed

1d 9h

16m 15s

↳ 02110211.1

Refactor: scripts/sweep_stale_blues.sh — prefer flyctl over fly so the CI sweep step in deploy.yml actually runs

refactoring

deployed

1d 9h

24m 9s

0211

Refactor: /dev-next step 21 — delegate fly deploy to .github/workflows/deploy.yml (eliminate local-vs-GHA deploy race)

refactoring

deployed

13h 33m

15m 56s

0210

Refactor: /dev-next step 19/20/24 — eliminate worktree cross-conflict on main checkout (push-via-plumbing for handoff + archive, queue worktree never holds main)

refactoring

deployed

13h 14m

22m 34s

0209

Fix: CollapsibleSection bodies stay hidden when open due to legacy how-it-works CSS bleed

bug

deployed

12h 34m

22m 54s

0208

Fix: provider chip dim in critique — promote tone-claude/gpt override into the base Chip primitive

bug

deployed

12h 8m

20m 22s

0207

Fix: link-variant icon missing from registry — evidence/sources badges render as blank blue square

bug

deployed

11h 44m

7m 21s

0206

Tests: canonicalize the JSX-loaded-via-babel test doctrine (Playwright harness or source-pattern as the documented project pattern)

test

deployed

11h 30m

15m 18s

↳ 02050205.2

Refactor: remove dead .kind-tab / .kind-tabs CSS + shared.jsx variant="kind" / variant="kind-tabs" branches (post-spec-0205 cleanup)

refactoring

deployed

11h 7m

18m 20s

↳ 02050205.1

Tests: ItemCard 8-capture parity grid — refresh references against post-0205 anatomy and retroactively attach to spec 0205

test

deployed

10h 44m

39m 43s

0205

Fix: P4 critique card — five visual regressions (sources layout/icon, lifecycle-first ordering, coloured kind filter chips, raiser badge defaulting to System)

bug

deployed

3h 59m

29m 53s

0204

Timeline pane V2 — promote workshop lock to live (T1 card height, T2.a click/modal split, T2.b cost precision parity)

new-feature

deployed

3h 23m

23m 43s

↳ 02030203.2

Fix: cycle-mutable status readers — overlay queue-state.json in current_queue AND queue_drain_supervisor

bug

deployed

2h 54m

12m 32s

↳ 02030203.1

Fix: dashboard live API + L-spec event vocab parity with queue-state.json

bug

deployed

2h 17m

21m 45s

0203

Critique V2 → live promotion (C1–C8 + V2.A/B/C)

new-feature

deployed

1h 51m

1h 2m

0202

Main-commit noise reduction — queue-state file + handoff archival + event consolidation

new-feature

deployed

4m 21s

31m 7s

0201

Branch & safety hygiene — verified deletes, dirty-tree refusal, pre-push branch assertion

refactoring

deployed

10m 57s

0200

Fly deploy simplification — drop bluegreen, no-retry-on-failure rule, codify recovery path

refactoring

deployed

12m 56s

0199

Queue mechanics — decimal sub-numbering, promote-as-next, drop queue_position

new-feature

deployed

22h 47m

22m 41s

0198

Dev-spec quality guardrails — no questions, source traceability, BDD acceptance criteria

new-feature

deployed

22h 4m

20m 14s

0197

Refactor: rewrite dashboard mockup to use canonical DS token names so the parity allowlist shrinks to empty

refactoring

deployed

17h 17m

8m 18s

0196

Fix: authoring funnel `promo_pct` numerator counts all queued specs, blows past 100% when most specs skip the draft step

bug

deployed

17h 3m

8m 21s

0195

Fix: All-Runs renders `0s ago` for abandoned rows that have no transcript

bug

deployed

16h 52m

10m 1s

0194

Fix: build_headless_command passes --cwd which the installed claude CLI rejects

bug

deployed

15m 28s

9m 30s

0193

Stale-blue sweep filter — catch machines on an out-of-release image

new-feature

deployed

16h 39m

11m 13s

0192

L-spec checkpoint budget heuristic — wall-clock session age trigger

new-feature

deployed

16h 26m

8m 30s

0191

Refactor: extract queue-drain supervisor to Python with unit tests

refactoring

deployed

16h 15m

9m 3s

0190

Refactor: TimelineTabs `Tab variant=solid` branch from `.is-active` to `data-active=true`, drop the dual CSS selector

refactoring

deployed

16h 2m

10m 1s

0189

Fix: statusCounts.open misses items in `addressed` state — All ≠ Open + Resolved + Drift

bug

deployed

13h 31m

30m 28s

0188

Route ItemCardIssueBody and ItemCardCommentBody through ItemCardThreadView

new-feature

deployed

12h 58m

9m 11s

0187

Refactor: DS reference §13 ItemCard examples — match post-0173 head, evidence-inline, collapse, and QuestionThread bubbles

refactoring

deployed

12h 47m

15m 17s

0186

Queue-drain session isolation and L-spec checkpointing

new-feature

deployed

18m 31s

0185

Refactor: cycle-time line chart Y-axis cap from fixed 60m to a percentile-based / auto-ranged cap

refactoring

deployed

1d 12h

16m 41s

0184

Tests: pixel + structural fidelity check of the live dashboard against `dashboard-redesign-v3-horizontal.html`

test

deployed

1d 12h

9m 24s

0183

Fix: authoring funnel DRAFTS bucket only counts promoted drafts, omits current backlog under `specs/drafts/`

bug

deployed

1d 11h

21m

0182

Fix: bootstrap timeline shows `—` for completed-stage durations after `/api/data` refresh

bug

deployed

1d 11h

8m 4s

0181

Fix: All-Runs reports `running` for runs that died days ago — add time-based liveness check + new `abandoned` lifecycle status

bug

deployed

1d 11h

18m 52s

0180

Fix: Consumption card V2 anatomy — split combined Total tokens bar into total-input + total-output, add output totals block, move cache-savings line

bug

deployed

1d 10h

23m 44s

0179

Fix: critique-card body redundancies (verdict row, bottom anchor, seen-row) + mandate side-by-side parity grid in PR

bug

deployed

1d 10h

9m 23s

0178

Fix: three-section Input panel reveals empty body — drop inner per-piece default-collapse so one click on outer chevron shows content

bug

deployed

1d 10h

7m 57s

0177

Dashboard redesign v3 — horizontal hero + timeline, full-width counter row, populated Metrics tab, pagination, pastel chart palette, light default

new-feature

deployed

21h 40m

38m

0176

new-feature

deployed

1d 9h

22m 26s

0175

Summary tab v2 — celebratory close-out with verdict, stats, critique outcomes, and markdown download

new-feature

deployed

1d 9h

21m 27s

0174

Fix: /api/data subrequest-limit blowup (502) + drop dashboard poll from 15s to 5s

bug

deployed

20h 28m

11m 52s

0173

Drain deferrals — pulse-info dot, .tab-group-solid rename, segment counts, item-card head/lifecycle/sources/collapse, upstream [object Object] fix

new-feature

deployed

0172

Fix: critique cards render literal ** markdown and re-show cryptic compound IDs

bug

deployed

1d 8h

11m 25s

0171

Fix: Agent Input sub-tab renders two narrow cards causing horizontal scroll in split-pane modals

bug

deployed

1d 8h

8h 14m

0170

/canvas workshop skill — registry-driven pane iteration sandbox with verbatim live + DS snapshots

new-feature

deployed

23h 58m

28m 20s

0169

Dashboard redesign v2 — condensed callouts, tabs, light/dark themes, History total-elapsed banner

new-feature

deployed

22h

50m

0168

Critique pane — item-card frame + head rebuild + expanded lifecycle view + source attribution + affordances

new-feature

deployed

20h 56m

0167

Critique pane — bar2 segmented controls + bar1 drift chip + phase-tab + kind-cluster DS catch-up

new-feature

deployed

20h 45m

10m

0166

Timeline pane — System + Error chip primitives + brief-card refactor + live-state agent-strip wiring + turn-render data-layer fix

new-feature

deployed

20h 31m

16m

0165

Timeline pane — chip polish (identity / activity / category bubble / cost) + light-mode token drift fix

new-feature

deployed

20h 13m

13m 30s

0164

Timeline pane — M3 card chrome + phase header simplification + narrow-view strip equalisation

new-feature

deployed

19h 57m

2h 42m

0163

Push /dev-next events to main during feature-branch phase

new-feature

deployed

16h 38m

33m 30s

0162

Fix: post-deploy sweep for safe_to_destroy blue machines

bug

deployed

15h 50m

17m 28s

0161

Tests: JS test stack for Pages Function and dashboard-bootstrap.js

test

deployed

15h 12m

13m 49s

0160

Dashboard live data via Cloudflare Pages Function

new-feature

deployed

14h 50m

14m 10s

0159

Fix: Fly machines-API mid-rolling-deploy timeout (9-in-a-row)

bug

deployed

14h 22m

4m 20s

0158

Deferred-spec subagent in /dev-next — auto-capture in-flight deferrals as queued specs or drafts

new-feature

deployed

14h 7m

6m 8s

0157

/spec-queue auto-decomposition — conservative bundle-by-default, auto-chained sub-specs

new-feature

deployed

13h 55m

5m 19s

0156

Dashboard live-ness — cycle-started anchor, browser auto-refresh, live elapsed ticker, deploy-pages cleanup

new-feature

deployed

13h 45m

18m 23s

0155

Fix: lifecycle skills' session-title stamping is unimplemented

bug

deployed

13h 21m

7m 55s

0154

Spec workflow hardening — design-system steering at spec time, run-queue-until-empty skill, project CLAUDE.md

new-feature

deployed

13h 6m

0153

Dashboard redesign — design-system primitives, expandable in-flight hero with stage timeline, activity feed

new-feature

deployed

12h 13m

20m 43s

0152

Spec lifecycle system v1 — typed drafts, dev queue, dashboard, session naming

new-feature

deployed

10h 50m

0151

Design-system parity for critique surface + canonical Agent Input grouping + run-ID copy affordance

bug

deployed

7h 16m

—

0150

Legacy-shim sunset + historical input-bundle backfill

refactoring

deployed

2h 53m

—

0149

Post-batch cleanup + Anthropic cache engagement + protocol follow-ups

new-feature

deployed

—

0148

Consumption-card backend follow-ups + protocol-violation UI surface

—

ready

—

0147

Phase 0 critique section grouping + live timeline rendering determinism

—

ready

—

0146

Consumption card visual rework — CcxCard M3 polish + spec-preview rendering

—

ready

—

0145

Canonical prompt-pieces registry + per-attachment token tracking — protocol, persistence, and consumption surfaces

—

ready

—

0144

Sources & provenance — investigation outcome and per-critique-card surface (incl. Phase 4 Issue/Comment patches)

—

ready

—

0143

Cost & token attribution correction + run-detail header total-cost copy affordance

—

ready

—

0142

Prompt capture for full-view modals — Initial Brief persistence and modal hydration

—

ready

—

0141

Critique aggregation invariants and resolved-view integrity

—

ready

—

0140

Phase 4 deadlock — draft extractor body retention + escape-valve breadth for terminal-ledger AGREED

new-feature

ready

—

0138

Active-agent gradient pulse · uniform critique card heights · run-id chip in header row 2

—

ready

—

0137

Substantive-convergence escape valve — canonical-promote when both AGREED with terminal ledger but artifact hashes drift

new-feature

ready

—

0136

Unify run-status derivation — single source of truth across the All-Runs list, the run-detail page, and the orchestrator's exit-code emission

—

ready

—

0135

Phase 0 round cards + side-by-side critique modal — close the round-by-round visibility gap for the brief negotiation

—

ready

—

0134

Migrate hardcoded JSX fontWeight numbers to --md-w-* tokens (close the bypass gap)

—

proposed

—

0133

Diagram skill v2.0.0 — mode-aware split + Material mode + how-it-works regen

—

in-flight

—

0133

Run-detail surface rework — agent chips into Timeline pane, narrow critique compaction, M3 segmented phase progress, timeline card chip slim-down

—

in-flight

—

0132

Design-system pollution scrub (post-arc cleanup)

—

proposed

—

0131

CSS finalization + v1 token block removal + IBM Plex removal (close the migration arc)

—

proposed

—

0130

Remaining JSX (app / errors / compare / auth / search / shared) v1 → v2 token migration

—

proposed

—

0129

design-language.jsx v1 → v2 token migration

—

accepted

—

0128

run-detail.jsx v1 → v2 token migration

—

proposed

—

0127

Design system v2 canonicalization — promote v2 to single source of truth, archive v1

—

proposed

—

0126

Changelog — spec rendered in a full-view modal + working screenshots (or none)

—

proposed

—

0125

Onboarding tour visual rewrite + unified Admin/Settings + server-persisted onboarding state

—

proposed

—

0124

Critique filter header height parity + responsive compaction + timeline card right-alignment

—

in-flight

—

0123

How-It-Works as a full-page route + click-to-enlarge SVG viewer

—

proposed

—

0122

Persist item lifecycle events to the transcript + backfill historical runs

—

proposed

—

0121

How-It-Works overlay + Changelog tab — full content & component rewrite

—

proposed

—

0120

Turn-modal items panel — provider badge, segment labelling, Claims panel removal

—

implemented

—

0119

Badge governance — unified chip primitive, vocabulary, and surface-level rollout

—

proposed

—

0118

Deep Research consumption & cost tracking — collapsed/unfolded redesign, canonical piece aggregation

—

implemented

—

0117

Deep Research artifact naming, how-it-works diagrams, and timeline full-view

—

proposed

—

0116

Turn / Cross-review modal cleanup + Timeline-Critique pane visual harmonisation

—

merged

—

0115

Deep Research UI — timeline, critique cards with sources, appendix, validate-run CLI

—

proposed

—

0114

Deep Research protocol — canonical methodology, lifecycle, and prompts

—

proposed

—

0113

Full-view modals — fixed 92vh height + remove the 360px accordion-body cap

—

merged

—

0112

Agent strip — stop wrapping the model id and activity label when the box is tight

—

merged

—

0111

Critique cards — bucket correctness, expanded-card scroll, badge cleanup, height parity with Timeline

—

merged

—

0110

Modal sizing + phase rail anchoring + button contrast + critique card layout

—

proposed

—

0109

Modal chrome + overlay JSX M3 token migration

—

proposed

—

0108

Run-list page M3 token migration

—

proposed

—

0107

Timeline phase/turn card treatment + RunDetailHeader children M3 token migration

—

proposed

—

0105

M3 chrome + run-detail header JSX wiring (plus restore live agent strip dropped by spec 0099)

—

proposed

—

0104

Loading + States + A11y + Light + Responsive verification sweep (cross-cutting polish + verification of all earlier specs in both themes and three breakpoints)

—

proposed

—

0103

Onboarding tour overlay + ProgressSegs admin (8-step tour over the live app via data-tour-anchor attributes; no redraw of underlying surfaces; ProgressSegs 8-segment per-user track)

—

proposed

—

0102

How It Works overlay + right-side menu + Changelog (full-screen M3 dialog · sticky right menu · How It Works ↔ Changelog toggle · 9 collapsible sub-sections with inline diagrams)

—

proposed

—

0101

Agent input panel + PhaseRail + RoundScrubber M3 anatomy sweep

—

proposed

—

0100

Consumption pane full rework — collapsed + unfolded with sub-rows + uniform width across phases with round chip above card + sticky bottom legend

—

proposed

—

0099

Timeline pane M3 rework — header chrome + vertical phase rail outside column anchored to header centers + tl-turn variants + single dashed top border on unfold + REPAIR row variant with explainer

—

proposed

—

0098

Critique pane M3 rework — Bar 1 (title · phase tabs · totals · drift chip) + Bar 2 (kind tabs · agent · status filters) + collapsible status-grouped sections + Σ Summary state + phase-header sizing taller than card headers

—

proposed

—

0097

QuestionThread + unified item-card family (Q · D · I · C cards with one anatomy — who → when → what → quote, six-word verdict vocabulary, tonal bubble + quote inside bubble, dashed footer)

—

proposed

—

0096

M3 modal primitive (basic 560 dp + rich 1080 dp, shape-xl, elevation-3, agent-tinted left border)

—

proposed

—

0095

M3 tabs + top app bar + chrome compaction (run-list 3→2 headers, top-bar layout when viewing a run)

—

proposed

—

0094

M3 cards + AgentStrip + badge inventory + hover elevation-2 rule + AgentStrip badge sizing/symmetry

—

proposed

—

0093

M3 atoms — buttons (5 variants) + FAB + icon button + chips (4 kinds) + status pills (canonical OK) + switches + segmented buttons

—

proposed

—

0092

Material 3 token & foundation layer (palette, type, shape, elevation, state, motion, density, fonts, icons, base)

—

proposed

—

0091

Phase 4 drafter-engagement gate (close the round-1 sycophantic-APPROVED loophole)

—

proposed

—

0090

Parser robustness for cross-round Q/A/issue linkage + code-fence awareness (data-integrity overhaul)

—

proposed

—

0089

Convergence escape hatches for stuck-AGREED loops (canonical-FSD synthesis, stuck-AGREED escape valve, hard ledger feedback)

—

proposed

—

0088

Stop hiding Phase 3 / Phase 4 timeline rows on errored & deadlocked runs

—

proposed

—

0087

Cross-cutting polish — every remaining gap from the 2026-05-18 tweak-cycle audit

—

in-review

—

0086

Consumption tab rework — phase headers above rows, single card per agent, no duplicate compact bar

—

in-review

—

0085

Agent Input panel — system-prompt fallback, structural hierarchy, modal vertical space

—

in-review

—

0084

Unified LoadingState primitive — one delightful loading visual everywhere

—

merged

—

0083

Robust run-list loading state + spinner visual + hide 5455

—

merged

—

0082

Run-list loading state + server-side hidden-runs filter

—

merged

—

0081

Cache /api/runs/{id} snapshot by (run_id, latest_event_seq)

—

merged

—

0080

Hotfix: disable Fly auto_stop_machines to stop the proxy flap

—

merged

—

0079

Hosted UI modal-load perf — warm machine, gzip, immutable cache headers, server-side LRU

—

merged

—

0078

Fly VM memory bump to fix UI-server OOM-on-boot

—

merged

—

0077

Hotfix: run-detail.jsx parse error (white-screen regression)

—

merged

—

0076

Stable-worktree helper — isolate CLI runs from active orchestrator/feature-branch work

—

merged

—

0075

Consumption tab agent-card restructure — equal-height, data-top-bars-bottom, wider bars

—

merged

—

0074

Agent Input tab rework — rename, reorder, structural restructure

—

merged

—

0073

Question/Disagreement render unification + markdown rendering fix

—

merged

—

0072

Critique pane structural — filter strip, Phase 4 split, summary copy

—

merged

—

0071

Timeline structural pass — PhaseRail, phase headers, card size, collapsibility

—

merged

—

0070

Run-detail header — agent strip equalization, phase-tab info hierarchy, remove blocking banner

—

merged

—

0069

Run-list & chrome polish (status pills, top bar, top tabs, right cluster)

—

merged

—

0068

Brand-icon system + Design-page DNA reskin

—

merged

—

0067

Chip vocabulary + code-cluster expansion

—

merged

—

0061

Onboarding (3-screen first-time flow) + landing demo capsule

—

merged

—

0060

Cross-run dashboards — /compare (two-run side-by-side) + /search (cross-run query)

—

merged

—

0059

Keyboard contract + shortcuts overlay + search palette

—

merged

—

0058

Modal primitive (single + split) + sub-tabs + RoundScrubber + provider-symmetric SourceCard

—

merged

—

0057

Timeline + critique restructure (PhaseRail + ChipCluster + 3-axis filter + DriftCluster + Summary panel + CardHeadline migration)

—

merged

—

0056

Run detail header + chrome restructure + ActiveRunChip

—

merged

—

0055

Run list -- sort + attention promotion + filter Tabs + URL state + /-bound search

—

merged

—

0054

QuestionThread + QuestionRef + AP-01 enforcement

—

merged

—

0053

Tab system (3 variants) + table header

—

merged

—

0052

Primitive vocabulary — Button, StatusBadge, Chip, RunIDChip, ThemeToggle, Card, AgentStrip + today's-component migration

—

merged

—

0051

Consumption tab — content-vs-billing split + output bar + cross-turn lineage

—

proposed

—

0050

Design-system foundation — tokens, base, a11y, MDI icons, emoji removal

—

merged

—

0049

Reconcile-costs reads run-cost data from Supabase (re-enable daily cron)

—

merged

—

0048

Always-on cost verification against provider invoices + pricing-version snapshot

—

merged

—

0047

Run-detail resilience + Phase 4 sibling-key separation for repair turns

—

merged

—

0046

Critique panel + Summary tab + Consumption tab rework + design unification

—

merged

—

0045

Full-view shell standardisation + model pill layout

—

merged

—

0044

Turn-input semantics + per-turn badges + side-by-side framing

—

merged

—

0043

Cross-round ledger + standing-items input + conservative convergence

—

merged

—

0042

Critique data integrity — parser coverage, badge wiring, modal load paths, count reconciliation

—

merged

—

0041

Critique classification + load-time resilience + sentiment paragraph + tighter cards

—

merged

—

0040

Critique rework — Phase 4 answer linkage fix, compact cards, summary tab, timeline-pane re-alignment

—

merged

—

0039

Cost-pipeline integrity — preserve metrics on resume, price the cache tier that was actually used, fold tool spend into the headline

—

merged

—

0038

Web search audit UI + agent-pill alignment fix

—

merged

—

0036

Web search audit foundation + protocol parser fixes + resume hardening + --notion repeatable

—

merged

—

0035

Consumption rework + header-placement fix + app-version chip

—

merged

—

0034

Critique navigation — first-class Q+D, side-by-side rework, sentiment cards, click-to-highlight

—

merged

—

0033

Inputs foundation — universal Input view, Phase 0 split, two-row live header

—

merged

—

0032

Phase-2 hash-drift escape, P2 summaries, live-push flag, dual-research-run skill

—

merged

—

0031

Consumption-tab follow-ups — tier-lookup window, click-to-expand bars, per-phase web-search count

—

merged

—

0030

Timeline UX pass — inline unfold, per-input segments, real context windows, parser repairs

—

merged

—

0029

Token-consumption tab — per-turn context-window visualisation

—

merged

—

0028

Cross-review inline comments — Phase 4 side-by-side modal

—

merged

—

0027

Negotiate inline comments — side-by-side modal with anchored critique cards

—

merged

—

0026

How-it-works restructure — chat-lifecycle diagram, phase accordions, v3.5 process map

—

merged

—

0025

Visualisation foundations — modal pattern, summary cards, preflight tabs, attachment ingest

—

merged

—

0024

Run-detail header pass 2 + compact theme toggle

—

merged

—

0023

Compact run-detail header, "How it works" page, and release notes

—

merged

—

0022

Admin allowlist UI, profile menu, and landing-page redesign

—

merged

—

0021

Google OAuth + email allowlist via Supabase Auth

—

merged

—

0020

Fly.io deployment of the UI server with Supabase-backed aggregator

—

merged

—

0019

Supabase schema + `--push` CLI for hosted-deployment track

—

merged

—

0018

Hosted deployment kickoff handoff package

—

merged

—

0017

Render the last round in deadlocked / errored phase-2 timelines

—

merged

—

0016

Live data fidelity — round counts, disagreement parsing, terminal-status pills, chip math

—

merged

—

0015

Integration kickoff handoff package

—

merged

—

0014

Clearer card stats — Phase 1 badges and explicit labels

—

merged

—

0013

Run-id pill and timeline card stats

—

merged

—

0012

UI polish and navigation

—

merged

—

0011

UI bundle integration

—

merged

—

0010

UI HTTP server with SSE

—

merged

—

0009

UI run aggregator

—

merged

—

0008

Frontend handoff package

—

merged

—

0007

Rate-limit-aware retry + resume from prior session

—

merged

—

0006

Prompt caching to unblock prod-tier rate limit and cut multi-round cost

—

merged

—

0005

Web search wiring + prod-tier full-convergence E2E

—

merged

—

0004

Phases 3 + 4 — drafting, review, and final document emission

—

merged

—

0003

Phase 2 — plan negotiation with caps, repair, and drafter tiebreak

—

merged

—

0002

Orchestrator scaffold + Phase 0/1 end-to-end

—

merged

—

0001

Engineering workflow — specs, branches, PRs, semver

—

merged

—

↓

Cycle time improving

-49% week-over-week

Last 7d: avg 19m 14s, vs 37m 29s the week prior.

⏱

Where time goes

58% in pre-flight

Mean pre-flight = 20m 37s. Next: Implement 5m 17s, Read & plan 4m 30s.

Reconcile drift

2 of 10

20% needed mechanical patches before code landed.

Cycle time over last 22 cycles

Each point is one deployed spec. Dashed = rolling-10 mean. Outliers > 45m (1) clipped to the top.

Cycle timeRolling-10 mean

Where time goes (mean stage durations)

Stacked, last 10 deployed cycles · total mean = 35m 16s

Pre-flightRead & planImplementTestShipDeployHandoff

Throughput · specs deployed per week

last 8 ISO weeks · current week in purple

By type · last 30 days

124 specs total · mean cycle by category

new-feature

54 · 37m 53s

bug

38 · 27m 49s

refactoring

26 · 13m 47s

test

5 · 17m 41s

breaking

1 · 26m 43s

Success rate · last 30 days

deployed vs failed cycles

Deployed · 124Failed · 0

Spec authoring funnel

how ideas reach deploy

Last 30 days · 5 drafts in backlog + 10 promoted · 67% reached queue · 157% of queued shipped