CHECKPOINT - 2026-05-05 ~20:45 AEST

0AAA. Latest Override - RPP Spend Permission Is Open For Pillar 1

Jon explicitly clarified on 2026-05-06 that the previous hesitation around live RPP spend is no longer in play.

Jon reconfirmed this later on 2026-05-06: the old "ask before spending RPP" rule is not active. If RPP answers the Pillar 1 question faster or improves the chance of reaching the 95% target, use it.

Operational rule from here: use RPP whenever it helps Pillar 1, especially for canonical 50 subjects, known GT comparables, direct-ID checks, bounded production replays, request-shape diagnostics, and validation slices. Do not pause merely to ask for RPP-budget approval.

The guardrail is traceability, not avoidance:

name every run/diagnostic;
keep it tied to a concrete Pillar 1 question;
persist run IDs, search_run_id, api_call_log, response IDs, request params, candidate rows, and miss buckets wherever the harness supports it;
avoid blind unlimited blasts, but do not avoid RPP calls that materially increase the chance of reaching the 95% target.
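
A minimal sketch of how the traceability guardrail could be checked before a run starts. Everything here is an assumption for illustration: the helper names and the fields runName, pillar1Question, and requestParams are hypothetical, not the harness schema. Persisted IDs such as search_run_id are left out because they only exist after the run and depend on harness support.

```javascript
// Hypothetical pre-run check; field names are illustrative, not the harness schema.
const REQUIRED_TRACE_FIELDS = ['runName', 'pillar1Question', 'requestParams'];

// Returns the names of required trace fields that are absent or empty.
function missingTraceFields(run) {
  return REQUIRED_TRACE_FIELDS.filter((field) => {
    const value = run[field];
    return value === undefined || value === null || value === '';
  });
}

// Throws if the run cannot be traced back to a named Pillar 1 question.
function assertTraceable(run) {
  const missing = missingTraceFields(run);
  if (missing.length > 0) {
    throw new Error(`RPP run is not traceable; missing: ${missing.join(', ')}`);
  }
  return run;
}

// Example values are made up for the sketch.
const plannedRun = {
  runName: 'example-bounded-commercial-no-floor-area',
  pillar1Question: 'Do the known GT comps enter the pool without the floorArea filter?',
  requestParams: { types: ['BUSINESS', 'COMMERCIAL'] },
};
assertTraceable(plannedRun);
```

The check blocks only untraceable runs; it says nothing about budget, which matches the rule that traceability, not avoidance, is the guardrail.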

This supersedes older notes below that mention live-spend gating, RPP_BUDGET_APPROVED, or per-conversation RPP approval. If a legacy harness still requires RPP_BUDGET_APPROVED=1 as a mechanical switch, set it for the approved Pillar 1 run; do not treat it as a fresh decision gate.

0AA. Latest Live State - Mitcham Production-Retained Clarification

Mitcham has been reinterpreted with production semantics, using persisted DB evidence only and no new RPP calls.

Report refreshed: reports/pillar1-mitcham-live-diagnostic-2026-05-05.md
JSON refreshed: reports/pillar1-mitcham-live-diagnostic-2026-05-05.json
Runner updated: scripts/pillar1-mitcham-live-diagnostic.js
Test updated: test/pillar1-mitcham-runner.test.js

Corrected Mitcham result:

gross target coverage: 3/3
legacy diagnostic post-filter retained target coverage: 2/3
production scoring.js hard-filter retained target coverage: 3/3
production retained missed CLIDs: none

Why this matters: autopilot/index.js imports top-level scoring.js, where commercial/industrial area range issues are soft flags. The older 2/3 retained metric came from the live diagnostic's explicit legacy floorAreaMin: 100 post-filter, not from the production scorer. CLID 12354803 is therefore a returned/production-retained row under current production semantics, not a true RPP miss and not a proven production retained miss.
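
A minimal sketch of the difference between the two retained metrics, assuming a hypothetical row shape with a floorArea field. This is not the code in scoring.js or in the live diagnostic; the function names, the flag name, and the example rows are made up to show why a hard post-filter and a soft flag give different retained counts for the same gross rows.

```javascript
// Illustrative only: not the code in scoring.js or the live diagnostic.
const LEGACY_FLOOR_AREA_MIN = 100; // mirrors the diagnostic's explicit legacy floorAreaMin: 100

// Legacy diagnostic behaviour: a row below the floor is dropped from the retained set.
function legacyRetained(rows) {
  return rows.filter((row) => row.floorArea >= LEGACY_FLOOR_AREA_MIN);
}

// Production-style behaviour for commercial/industrial: keep every row and
// attach a soft flag when the area is out of range (flag name is hypothetical).
function productionRetained(rows) {
  return rows.map((row) => ({
    ...row,
    softFlags: row.floorArea >= LEGACY_FLOOR_AREA_MIN ? [] : ['area_out_of_range'],
  }));
}

// Made-up rows: two above the floor, one below it.
const exampleRows = [
  { id: 'a', floorArea: 250 },
  { id: 'b', floorArea: 180 },
  { id: 'c', floorArea: 80 },
];
console.log(legacyRetained(exampleRows).length, productionRetained(exampleRows).length);
```

With these made-up rows the legacy filter keeps two of three while the soft-flag path keeps all three, the same shape as the 2/3 versus 3/3 Mitcham figures.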

Current next Pillar 1 action: re-run/update the whole-cohort miss reclassification using production-retained semantics. Use RPP freely where it helps classify or recover residual misses under that corrected classification.

Fresh verification:

node --test test/pillar1-mitcham-runner.test.js test/pillar1-geebung-runner.test.js autopilot/adapters/rpp-direct-adapter-classify-unit-family.test.js
node --check scripts/pillar1-mitcham-live-diagnostic.js
node -e "JSON.parse(require('fs').readFileSync('reports/pillar1-mitcham-live-diagnostic-2026-05-05.json','utf8')); console.log('json ok')"

Result: 15 focused tests pass; syntax/JSON checks pass.

0A. Latest Live State - Geebung + Grid2

As of 2026-05-05 ~20:00 AEST:

Grid2 local code now defaults to All complete runs instead of pinning May 4 50; May 4 50 remains an explicit cohort option and ?cohort=may4_50 still works.
Local dashboard route is healthy: http://127.0.0.1:8880/grid2 requested with Host: truemarket.palermostudio.com returns 200, and the local API returns current 2026-05-05 complete runs.
Public https://truemarket.palermostudio.com/grid2 still returns 502 because truemarket.palermostudio.com resolves to tailnet node 100.89.228.28 (opencode), while this workspace/dashboard is on 100.76.227.78 (ubuntu). Treat this as tailnet/DNS/ingress, not a Grid2 data bug.
Geebung final canonical diagnostic: reports/pillar1-geebung-live-diagnostic-2026-05-05.md, pipeline run b10b67b4-ca4d-4825-9a63-28cac53209a8.
Geebung result changed the production request shape: RPP floorArea >= 100 on a bounded COMMERCIAL sold search returned 0/3; the same bounded search without the RPP floorArea request filter returned 3/3 from 139 gross / 121 retained, with 18 area post-filter discards.
Production adapter now keeps industrial priceMin >= 400000 but strips floorAreaMin / floorAreaMax before calling RPP. Area is a post-filter evidence gate. See autopilot/adapters/rpp-direct-adapter.js and docs/PILLAR_1_LEARNINGS.md §34.
Canvas plan has been regenerated from reports/truemarket-plan.source.json and copied to canvas/public.
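
The adapter change noted above (keep priceMin, strip floorAreaMin / floorAreaMax from the RPP request, apply area as a post-filter) could be sketched as follows. This is a hypothetical illustration, not the code in autopilot/adapters/rpp-direct-adapter.js: the function names, the propertyType and floorArea fields, and the area_post_filter discard reason are assumptions.

```javascript
// Hypothetical sketch; not the code in autopilot/adapters/rpp-direct-adapter.js.

// Splits search params into what is sent to RPP and what is held back as an area gate.
function toRppRequestParams(searchParams) {
  const { floorAreaMin, floorAreaMax, ...rppParams } = searchParams;
  return { rppParams, areaGate: { floorAreaMin, floorAreaMax } };
}

// Applies the held-back area bounds after the fetch and records a discard reason.
// Rows with no floorArea are retained here, because the comparisons are false for undefined.
function applyAreaPostFilter(rows, areaGate) {
  const retained = [];
  const discarded = [];
  for (const row of rows) {
    const tooSmall = areaGate.floorAreaMin != null && row.floorArea < areaGate.floorAreaMin;
    const tooLarge = areaGate.floorAreaMax != null && row.floorArea > areaGate.floorAreaMax;
    if (tooSmall || tooLarge) {
      discarded.push({ ...row, discardReason: 'area_post_filter' });
    } else {
      retained.push(row);
    }
  }
  return { retained, discarded };
}

// Example values are illustrative.
const { rppParams, areaGate } = toRppRequestParams({
  propertyType: 'INDUSTRIAL',
  priceMin: 400000,
  floorAreaMin: 100,
});
console.log(rppParams, areaGate);
```

Keeping the discard reason on each row is what lets a report state gross, retained, and area-discard counts separately.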

Fresh verification:

node --test test/pillar1-geebung-runner.test.js test/pillar1-gt-preflight.test.js test/recall-pool-coverage.test.js test/api-call-log-property-ids.test.js test/candidate-rejection-log.test.js test/test-gamblor16-dashboard-db.js autopilot/adapters/rpp-direct-adapter-classify-unit-family.test.js
node --check scripts/pillar1-geebung-live-diagnostic.js
node --check scripts/pillar1-gt-preflight.js
node --check autopilot/adapters/rpp-direct-adapter.js
node --check scripts/build-canvas-live-status.mjs
node --check scripts/render-plan-vector.mjs

Result: 58 tests pass; syntax/JSON checks pass.

Mitcham bounded live diagnostic was completed and then clarified with production retained semantics:

Runner: scripts/pillar1-mitcham-live-diagnostic.js
Test: test/pillar1-mitcham-runner.test.js
Report: reports/pillar1-mitcham-live-diagnostic-2026-05-05.md
JSON: reports/pillar1-mitcham-live-diagnostic-2026-05-05.json
Pipeline run: 217a1231-c3fc-421f-a327-75c33b6b0f03
API log verified: 7/7 RPP calls, all status 200.
Direct-ID checks: 3/3 found as BUSINESS.
BUSINESS|COMMERCIAL with RPP floorArea >= 100: 540 RPP total, 200 gross page rows, 0/3 hits.
BUSINESS|COMMERCIAL without RPP floorArea: 2036 RPP total, 200 gross page rows, 3/3 gross hits, 2/3 legacy diagnostic retained hits, 3/3 production hard-filter retained hits.
UNIT page-0 legs: 0/3 hits with and without RPP floorArea.
Key clarification: CLID 12354803 is rejected only by the diagnostic legacy post-filter. Production scoring.js treats the area mismatch as a soft flag, so it is production hard-filter retained.

0. Latest Override - Pillar 1 Objective Re-Anchored

Do not lose sight of the cohort. Pillar 1 is currently focused on improving the 50-subject commercial/industrial pilot cohort, specifically the cleanly measured residual pool-coverage misses.

Pillar 1 success is retrieval / pool coverage:

Can the known ground-truth comparable enter the candidate pool?
Can it do so with a type-aware, defensible request shape?
Can we avoid uncontrolled pool growth that still fails to recover coverage?

Jon clarified the pool-size trade-off on 2026-05-05: gross pool size is not the primary objection. A broader gross pull is acceptable if post-filtering produces a workable pool and materially improves coverage. Example: 2,000 gross results post-filtered to 1,000 with 100% coverage is acceptable. The bad outcome is a large or medium gross pull with weak/no post-filtering and poor coverage, e.g. 1,000 gross results and only 50% coverage. Therefore, treat pool delta as a ranking/cost risk to instrument and control, not as an automatic Pillar 1 failure when coverage is strong.

Pillar 1 is not ranking. R@20/R@30 and scorer behaviour are downstream Pillar 2 evidence. Do not treat ranking as a Pillar 1 acceptance criterion.

Current known state:

PR #115 merged: docs(pillar-1): add type-aware recovery plan
Merge commit: de0267d1a679dd1e7e56d3cf51030630c0b4b71d
Young + Robina are proof cases, not the whole cohort.
Young + Robina proved type-aware retrieval can recover missed GT comps into persisted stage-3 pools.
The offline stage-3 ranking scout found poor top-K placement, but that is a Pillar 2 handoff/risk, not a Pillar 1 blocker.

Active Pillar 1 focus from here:

Work the residual miss subcohort from the 50-subject C/I pilot.
Use the type-aware recovery matrix to choose the next micro-plan, with live RPP allowed whenever it will answer the question faster.
Prioritize subjects such as McDougalls Hill, Portsmith, Noosaville, Mitcham, Bathurst, Eagle Farm, Broadmeadows, and any other cleanly measured C/I residual misses.
For each subject, use live RPP freely when the question is concrete, traceable, and tied to GT recovery; do not wait for fresh budget approval.
Only after the 50-subject C/I pilot is cleanly improved should the work promote to the 132-cohort gate.

Do not tell the next agent that the active Pillar 1 objective is ranking, Pricefinder, dashboard polish, or Young/Robina alone.

McDougalls Hill is now the next concrete no-RPP micro-plan:

Report: reports/pillar1-mcdougalls-hill-microplan-2026-05-05.md
JSON sidecar: reports/pillar1-mcdougalls-hill-microplan-2026-05-05.json
Baseline: 0/5, raw pool 137.
Local cache evidence: 2,310 cached properties within 50km; target-preserving filters retain 16; explicit post-filtering reduces the evidence pool to 6 while preserving all 5 GT targets.
Strategic purpose: prove gross retrieval + explainable post-filtering can recover coverage without relying on tiny gross pools.

RPP SOP login/root-cause check is now complete:

Report: reports/pillar1-mcdougalls-hill-rpp-root-cause-2026-05-05.md
JSON sidecar: reports/pillar1-mcdougalls-hill-rpp-root-cause-2026-05-05.json
Stale /tmp/rpp-cookies-fresh.json failed with HTTP 401; Playwright refresh succeeded and preflight passed.
Direct RPP ID lookup found all five GT comps. This is not CoreLogic data absence.
May 4 actual request omitted LAND and added hidden floorArea >= 100; that explains the close land/development misses.
Live bounded probes recovered all five targets with three tiny shapes: close LAND got 3/5, close COMMERCIAL got 1/5, remote COMMERCIAL/RETAIL got 1/5.
Do not solve this with a global 50km LAND expansion. Next action is a bounded McDougalls runner that logs gross rows, retained rows, discard reasons, response IDs, and strict hit CLIDs.

Approved live McDougalls diagnostic is now complete:

Runner: scripts/pillar1-mcdougalls-live-diagnostic.js
Focused test: test/pillar1-mcdougalls-runner.test.js
Live report: reports/pillar1-mcdougalls-hill-live-diagnostic-2026-05-05.md
Live JSON: reports/pillar1-mcdougalls-hill-live-diagnostic-2026-05-05.json
RPP spend: 3 bounded search calls after SOP login/preflight.
Result: 5/5 strict pool coverage from 7 gross RPP rows and 6 retained unique rows after post-filter/de-dup.
Leg results:
- close LAND: 3 gross / 3 retained / hits 16593218, 16593224, 4162254
- close COMMERCIAL: 2 gross / 2 retained / hit 17726059
- remote COMMERCIAL/RETAIL: 2 gross / 1 retained / hit 45197164, one discard for distance_outside_ring
The earlier diagnosis is confirmed for this subject: targeted gross retrieval plus explicit post-filtering recovers coverage; global broadening is not needed for McDougalls Hill.
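
The leg arithmetic above can be re-derived with a short sketch. The leg counts and hit CLIDs are copied from the leg results; the summariseLegs helper and its output shape are hypothetical, and summing per-leg retained counts assumes no row is retained by more than one leg.

```javascript
// Leg figures copied from the McDougalls live diagnostic leg results.
const legs = [
  { name: 'close LAND', gross: 3, retained: 3, hits: ['16593218', '16593224', '4162254'] },
  { name: 'close COMMERCIAL', gross: 2, retained: 2, hits: ['17726059'] },
  { name: 'remote COMMERCIAL/RETAIL', gross: 2, retained: 1, hits: ['45197164'] },
];
// The five GT targets are the five strict hit CLIDs (5/5 coverage).
const gtTargets = ['16593218', '16593224', '4162254', '17726059', '45197164'];

// Hypothetical helper: totals gross/retained rows and computes strict coverage.
function summariseLegs(legList, targets) {
  const hitSet = new Set(legList.flatMap((leg) => leg.hits));
  const covered = targets.filter((clid) => hitSet.has(clid));
  return {
    gross: legList.reduce((sum, leg) => sum + leg.gross, 0),
    // Assumes no cross-leg duplicates among retained rows.
    retained: legList.reduce((sum, leg) => sum + leg.retained, 0),
    coverage: `${covered.length}/${targets.length}`,
    missed: targets.filter((clid) => !hitSet.has(clid)),
  };
}

console.log(summariseLegs(legs, gtTargets));
```

The totals come out at 7 gross, 6 retained, and 5/5 coverage with no missed CLIDs, which agrees with the reported result.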

Portsmith RPP diagnostic is now complete:

Report: reports/pillar1-portsmith-rpp-diagnostic-2026-05-05.md
JSON: reports/pillar1-portsmith-rpp-diagnostic-2026-05-05.json
Subject: 111 Hartley Street, Portsmith QLD 4870
Valuation ID: 70ff1ed2-e92c-4acb-8e02-29666fb7ec14
Baseline pipeline/search: 090f97f9-7e32-4e9d-875a-43e1bc1ba6e6 / d9de28c6-565e-49ff-929c-627ba5e1cb6d
RPP direct-ID found all five missed CLIDs, so again this is not provider data absence.
Exact historical search pages returned only 6904717; it was then absent from the persisted candidate pool, proving a post-fetch/client-filter/persistence loss for that row.
Root causes by CLID:
- 6904717: returned by exact RPP search, lost after fetch
- 6905209: hidden floorAreaMin=100 exclusion; recovered when floor filter removed
- 6905479: exact date-window exclusion; recovered when date_to extended to 2025-12-26
- 5762656: current RPP type is HOUSE; recovered in HOUSE page 1, not C/I search
- 16076306: direct ID exists but lacks sale date/price in direct detail
Portsmith is not a single type-broadening fix. It is the next persistence and explicit gross-vs-retained telemetry case.
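
One way to keep the per-CLID root causes machine-readable is a small miss-bucket table. The CLIDs come from the list above; the bucket labels and the countByBucket helper are hypothetical names chosen for this sketch, not existing report fields.

```javascript
// CLIDs from the Portsmith root-cause list; bucket labels are hypothetical.
const portsmithRootCauses = [
  { clid: '6904717', bucket: 'post_fetch_loss' },
  { clid: '6905209', bucket: 'hidden_floor_area_filter' },
  { clid: '6905479', bucket: 'date_window' },
  { clid: '5762656', bucket: 'provider_type_mismatch' },
  { clid: '16076306', bucket: 'missing_sale_fields' },
];

// Counts how many missed comps fall into each bucket.
function countByBucket(rows) {
  const counts = {};
  for (const { bucket } of rows) {
    counts[bucket] = (counts[bucket] ?? 0) + 1;
  }
  return counts;
}

console.log(countByBucket(portsmithRootCauses));
```

Five misses land in five different buckets, which is the point of the Portsmith finding: no single request-shape change recovers all of them.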

1. Current Pillar 1 Truth

Do not run the canonical 50-pilot from today's Lighthouse result.

Latest live Lighthouse A/B truth:

Validity: INVALID
Arm A: 11/20
Arm B: 11/19
Net hit lift: 0
Pool cap breaches: Ascot, Bathurst, Robina, Cleveland
Decision: no 50-pilot, no global cap lift, no broad fetch expansion

Bounded RPP miss verification is complete:

Report: reports/lighthouse-ab-miss-analysis-2026-05-05.md
JSON sidecar: reports/lighthouse-ab-missing-comps-2026-05-05.json
RPP calls made: 9 total
Direct CoreLogic-ID lookups: 8
Exact-address lookups: 0
Radius probes: 0
Result: all 8 residual Arm-B missed comps exist in RPP, but none appeared in Arm-B raw pools, candidate rejection logs, or logged Arm-B response property IDs.

Plain English: RPP has the properties. Our current request shape is not bringing them into the candidate pool.

2. Current Direction

Next useful move: bounded request-shape narrowing, not bigger pools.

Current next-experiment plan:

Report: reports/pillar1-next-request-shape-2026-05-05.md
Target subjects: Robina, Bathurst, Ascot
Keep same radius/date/caps.
No 50-pilot.
No global cap lift.
No unnamed/untraceable exploratory RPP. Broad RPP is allowed when the run is named, instrumented, and tied to Pillar 1 recovery.
Pass only if hit lift improves while every subject stays within the +200 pool-delta gate.
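
The pass rule above can be sketched as a small gate. This is an illustrative sketch, not the harness code: evaluateBoundedAb, the subject field names, and the example numbers are hypothetical; only the +200 cap and the "hit lift must improve" condition come from the plan.

```javascript
// Hypothetical gate; only the +200 cap and the pass conditions come from the plan.
const POOL_DELTA_CAP = 200;

// Passes only when net hit lift is positive and no subject exceeds the pool-delta cap.
function evaluateBoundedAb(subjects, cap = POOL_DELTA_CAP) {
  const breaches = subjects
    .filter((s) => s.armBPool - s.armAPool > cap)
    .map((s) => s.name);
  const hitLift = subjects.reduce((sum, s) => sum + (s.armBHits - s.armAHits), 0);
  return { hitLift, breaches, pass: hitLift > 0 && breaches.length === 0 };
}

// Made-up subjects: the second one grows its pool by 260 and breaches the cap.
const exampleSubjects = [
  { name: 'subject-1', armAPool: 700, armBPool: 850, armAHits: 3, armBHits: 4 },
  { name: 'subject-2', armAPool: 500, armBPool: 760, armAHits: 2, armBHits: 2 },
];
console.log(evaluateBoundedAb(exampleSubjects));
```

A run with positive hit lift still fails if any single subject breaches the cap, so the breach list is reported by subject name rather than as an aggregate.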

Kant implemented the no-RPP preflight layer in /tmp/bellend-lighthouse:

scripts/lighthouse-5subj-ab.js
test/lighthouse-5subj-ab.test.js
New exports: buildPillar1RequestShapeCandidates() and validateNoRppRequestShapePreflight()
Scope locked to Robina, Bathurst, and Ascot.
Gate rejects a proposed shape if it excludes the target missed comp by type, sale price, sale date, radius, or area where bounded.
Verification: node --test test/lighthouse-5subj-ab.test.js passed 9/9.
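
A minimal sketch of the kind of exclusion check the gate describes, under stated assumptions. It is not the implementation of validateNoRppRequestShapePreflight(); the exclusionReasons function, the shape and target field names, the reason labels, and all example values are hypothetical.

```javascript
// Hypothetical sketch; validateNoRppRequestShapePreflight() on the package branch may differ.

// Returns the reasons a proposed request shape would exclude a known missed comp.
function exclusionReasons(shape, target) {
  const reasons = [];
  if (!shape.types.includes(target.type)) reasons.push('type');
  if (target.salePrice < shape.priceMin || target.salePrice > shape.priceMax) {
    reasons.push('sale_price');
  }
  // ISO yyyy-mm-dd dates compare correctly as plain strings.
  if (target.saleDate < shape.dateFrom || target.saleDate > shape.dateTo) {
    reasons.push('sale_date');
  }
  if (target.distanceKm > shape.radiusKm) reasons.push('radius');
  // Area is only checked where the shape actually bounds it.
  const areaBounded = shape.floorAreaMin != null || shape.floorAreaMax != null;
  if (areaBounded) {
    if (target.floorArea == null) {
      reasons.push('area_evidence_missing');
    } else if (
      (shape.floorAreaMin != null && target.floorArea < shape.floorAreaMin) ||
      (shape.floorAreaMax != null && target.floorArea > shape.floorAreaMax)
    ) {
      reasons.push('area');
    }
  }
  return reasons;
}

// Illustrative values only.
const exampleShape = {
  types: ['UNIT'], priceMin: 500000, priceMax: 1500000,
  dateFrom: '2023-01-01', dateTo: '2024-12-31', radiusKm: 4, floorAreaMin: 50,
};
const exampleTarget = {
  type: 'UNIT', salePrice: 900000, saleDate: '2024-03-15', distanceKm: 2.5, floorArea: 71,
};
console.log(exclusionReasons(exampleShape, exampleTarget)); // empty array means the shape keeps the target
```

Returning a list of reasons rather than a boolean lets a preflight report say which dimension excluded the target, and lets missing area evidence be reported separately from a real area exclusion.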

Bacon ran the first no-RPP request-shape preflight, then Peirce resolved the Ascot evidence gap on the Lighthouse package branch:

Report: reports/pillar1-request-shape-preflight-2026-05-05.md
RPP calls made: 0
Robina: PASS for targets 3479221, 7728456
Bathurst: PASS for target 1323918
Ascot: PASS for target 17160415
Rationale: direct-ID miss JSON proves UNIT, sale date, price, and distance; local historical fixture reports/unit-candidates-sample-5k-2026-04-22.jsonl:3946 supplies missing area provenance for the same CoreLogic ID (building_area_sqm=71, land_area_sqm=118).
Helper improvement: validateNoRppRequestShapePreflight() now records evidenceProvenanceByTarget and reports missing area evidence ahead of misleading same-size type failures.
Verification: node --test test/lighthouse-5subj-ab.test.js passed 11/11.
Package branch commit: bb34e97 on origin/karl/clb-2472-lighthouse-package-2026-05-05.
Plain English: the bounded live A/B is now eligible by the no-RPP request-shape gate. Historical note: the old live-spend gate is superseded by the 2026-05-06 open RPP posture; keep it logged.

Chandrasekhar then added the bounded live execution path:

Package branch commit: c08f3b0 on origin/karl/clb-2472-lighthouse-package-2026-05-05
New mode: --bounded-request-shape
Scope: Robina, Bathurst, Ascot only.
Historical harness behavior: default command remains legacy; bounded mode exits unless RPP_BUDGET_APPROVED=1. Under the current 2026-05-06 posture this env var is a mechanical run switch, not a fresh approval requirement.
Guard: POOL_DELTA_CAP=200 still enforced.
Radius/date: request legs inherit baseline date window; legs cannot exceed the baseline radius. Explicit narrower legs remain narrower, so Ascot's UNIT micro-leg stays 4km rather than expanding to 33.6km.
Verification:
- node --test test/lighthouse-5subj-ab.test.js -> 15/15
- node scripts/lighthouse-5subj-ab.js --bounded-request-shape -> exits 2 at budget gate
- node scripts/lighthouse-5subj-ab.js --bounded-request-shape --dry-run -> VALID, subjects only Robina/Bathurst/Ascot, no RPP calls
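
The radius rule for bounded legs can be sketched in a few lines. This is an assumption-labelled illustration, not the harness code: effectiveRadiusKm is a hypothetical helper, and falling back to the baseline radius when a leg sets none is an assumption of the sketch.

```javascript
// Hypothetical helper: a leg may be narrower than the baseline radius, never wider.
function effectiveRadiusKm(baselineRadiusKm, legRadiusKm) {
  // Assumption: a leg without an explicit radius uses the baseline radius.
  if (legRadiusKm == null) return baselineRadiusKm;
  return Math.min(legRadiusKm, baselineRadiusKm);
}

console.log(effectiveRadiusKm(33.6, 4));  // 4: Ascot's UNIT micro-leg stays narrow
console.log(effectiveRadiusKm(33.6, 50)); // 33.6: a wider leg is clamped to the baseline
```

Note that a clamp like this is only as good as the baseline value it is given, so the baseline radius itself should be logged with the run.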

A bounded live attempt was started after approval but is INVALID / partial:

Report: reports/pillar1-bounded-live-partial-2026-05-05.md
Command: RPP_BUDGET_APPROVED=1 POOL_DELTA_CAP=200 node scripts/lighthouse-5subj-ab.js --bounded-request-shape (historical harness switch; not a fresh approval gate under the 2026-05-06 posture)
Robina Arm B: pool 1126 vs Arm A 757, delta +369, cap breach
Bathurst: failed address resolution before report completion
Logging issue: api_call_log rejected rows because the harness had not created the parent pipeline_runs row first; normal DB call counts are therefore unreliable for this partial attempt.
Follow-up fix pushed: f26d355 fix(pillar-1): harden bounded run logging gates
Post-fix verification: node --test test/lighthouse-5subj-ab.test.js -> 16/16
Decision: do not re-run live yet. Analyze/tighten Robina no-RPP first.

Robina breach analysis is complete:

Report: reports/pillar1-robina-breach-analysis-2026-05-05.md
RPP calls made: 0
Root cause: live bounded mode inherited search_runs.radius_km = 270 from cached baseline run ad0f8db8, while the no-RPP preflight assumed Robina 10km.
Effective bad live shape: COMMERCIAL|RETAIL, $2m-$8m, 2022-10-20..2024-10-21, 270km, no area floor.
Local cache evidence: roughly 1424 matches at 270km, but about 108 at 10km; the radius mismatch explains the +369 breach.
Fix pushed: ef270c8 fix(pillar-1): lock Robina bounded radius
Verification: node --test test/lighthouse-5subj-ab.test.js -> 18/18; bounded dry-run remains valid.

3. Search Parameter Drift Review

Senior review completed:

Report: reports/search-parameter-drift-review-2026-05-05.md
No RPP calls.
DB use was read-only SELECT.

Top findings:

Original intent was sound: fixed DB-sourced search envelopes by (property_family, density), calibrated around p95 / 95% comparable coverage.
Drift did occur:
- migration 021 collapsed per-density radii to family-level values;
- runtime code now applies request-shape changes outside search_parameters_master;
- Stage 2 params do not fully show actual RPP request truth.
Current evidence supports narrower request-shape testing, not cap lift or wider search.
Lighthouse is helping as a guardrail: it caught invalid A/B evidence, pool-cap breaches, and blocked the 50-pilot.
Lighthouse does not yet prove L3 or the narrowed shape works.

4. SOP / Process Updates

PIPELINE_SOP.md now reflects the missing-comparable workflow:

Direct CoreLogic/property-ID RPP lookup first.
Exact address/suggestions second.
Tiny bounded radius probe only if still unresolved.
Required machine-readable per-comp evidence fields.
Ordered rpp_calls proving call sequence and spend discipline.
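
The lookup order above could be sketched as an escalation loop that records each call in sequence. This is a hypothetical illustration, not the SOP tooling: resolveMissingComp, the lookup function names, the step labels, and the evidence shape are assumptions, and the lookups are synchronous stubs where real RPP calls would be asynchronous.

```javascript
// Hypothetical sketch of the SOP order; lookups are injected so no real RPP call is made here.
function resolveMissingComp(comp, lookups) {
  const rppCalls = [];
  const steps = [
    ['direct_id', lookups.byId],
    ['exact_address', lookups.byAddress],
    ['bounded_radius_probe', lookups.byRadiusProbe],
  ];
  for (const [step, lookup] of steps) {
    const result = lookup(comp);
    // Record every call in order so the sequence and spend can be audited later.
    rppCalls.push({ order: rppCalls.length + 1, step, found: Boolean(result) });
    if (result) return { clid: comp.clid, resolvedBy: step, rppCalls };
  }
  return { clid: comp.clid, resolvedBy: null, rppCalls };
}

// Stub lookups with made-up behaviour: the direct-ID lookup succeeds immediately.
const stubLookups = {
  byId: (comp) => ({ clid: comp.clid }),
  byAddress: () => null,
  byRadiusProbe: () => null,
};
console.log(resolveMissingComp({ clid: 'example-clid' }, stubLookups));
```

Stopping at the first successful step keeps the radius probe as a last resort, and the ordered rppCalls array doubles as the proof of call sequence.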

5. Git / Branch State

Remote refs have been refreshed with:

git fetch --all --prune --tags

Branch state:

Main worktree /home/jon/work/projects/truemarket: master is aligned with origin/master, but has local uncommitted checkpoint/SOP/report/canvas work.
/tmp/flanders-h6-persist-gap: fast-forwarded to origin/master; no local edits.
/tmp/bellend-lighthouse: package branch created, rebased, tested, and pushed.

Lighthouse worktree status:

Branch: karl/clb-2472-lighthouse-package-2026-05-05
Remote: origin/karl/clb-2472-lighthouse-package-2026-05-05
Based on current origin/master (79f075a)
Package commit: 16f8ddd
Focused verification after rebase:
- node --test test/lighthouse-5subj-ab.test.js test/repository-api-call-log-backfill.test.js
- Result: 12/12 pass
Split-later runtime/auth leftovers are stashed in that worktree as "split: CLB-2472 runtime auth leftovers".

6. Dashboard / Canvas / Barney

Canvas:

Public Canvas is up: https://truemarket-canvas.pages.dev/
It reflects the invalid A/B / no 50-pilot state from commit 79f075a.
Needs one more refresh/deploy if we want the public board to show the direct-ID evidence and search-parameter drift review.

Dashboard:

Local services are active:
- truemarket.service
- truemarket-dashboard.service
Pauli (gpt-5.4-mini) completed dashboard freshness review.
Dashboard copy was stale and is now patched in dashboard/public/funnel/index.html.
Local ports checked by Pauli answered 200 on 3210, 3450, and 3403; :3000 is not listening.
Public https://truemarket.palermostudio.com/ still returns 502 due to Cloudflare/Tailscale routing, not stale dashboard app content.
Banach (gpt-5.4-mini) rechecked the route chain:
- local 127.0.0.1:3403 -> 200
- local nginx 127.0.0.1:8880 with Host: truemarket.palermostudio.com -> 200
- Cloudflare tunnel a090fb4d-3163-435f-806f-d9435e3d48c0 remote ingress only includes studiochimp.palermostudio.com -> http://localhost:8880 plus fallback http_status:404
- truemarket.palermostudio.com is missing from the tunnel config
- this box lacks Cloudflare origin cert management credentials, so cloudflared tunnel route dns cannot safely fix it here
Exact external fix: add Cloudflare tunnel ingress truemarket.palermostudio.com -> http://localhost:8880 above fallback and bind the hostname to tunnel a090fb4d-3163-435f-806f-d9435e3d48c0.

Barney / cc-connect:

cc-connect.service is active.
Barney is mapped to Discord channel 1478214436325163048.
Barney's cc-connect config now uses /home/jon/work/projects/truemarket.

7. Active Agents

Pauli (gpt-5.4-mini): completed dashboard freshness/status check.
Herschel: completed search-parameter drift review.
Kant: completed no-RPP request-shape preflight helpers/tests.
Franklin: completed RPP direct-ID miss analysis.
Dalton: completed SOP patch.
Carver: completed CLB-2472 package plan.

8. Next Actions

Open/review the pushed Lighthouse package branch origin/karl/clb-2472-lighthouse-package-2026-05-05.
Refresh Canvas/source plan with direct-ID evidence + drift review and deploy if the public board should show the newest evidence.
Decide whether to run one more bounded live A/B using the fixed package branch (f26d355 logging gate + ef270c8 Robina radius lock).
If run, use only RPP_BUDGET_APPROVED=1 POOL_DELTA_CAP=200 node scripts/lighthouse-5subj-ab.js --bounded-request-shape. Under the current 2026-05-06 posture, RPP_BUDGET_APPROVED=1 is a harness switch, not a fresh approval gate.
Pass/fail: must complete without execution errors, improve hit lift, and keep every subject inside +200 pool delta.

9. Do Not Do

Do not run the 50-pilot from the invalid Lighthouse result.
Do not lift global caps.
Do not run unnamed or unlogged RPP. RPP itself is fine when it helps, as long as the run is named, logged, and tied to a concrete Pillar 1 question.
Do not treat old Bathurst +15pp framing as sufficient rollout evidence.