CHECKPOINT - 2026-05-05 ~20:45 AEST

0AAA. Latest Override - RPP Spend Permission Is Open For Pillar 1

Jon explicitly clarified on 2026-05-06 that the previous hesitation around live RPP spend is no longer in play.

Jon reconfirmed this later on 2026-05-06: the old "ask before spending RPP" rule is not active. If RPP answers the Pillar 1 question faster or improves the chance of reaching the 95% target, use it.

Operational rule from here: use RPP whenever it helps Pillar 1, especially for the canonical 50 subjects, known GT comparables, direct-ID checks, bounded production replays, request-shape diagnostics, and validation slices. Do not pause merely to ask for RPP-budget approval.

The guardrail is traceability, not avoidance:

This supersedes older notes below that mention live-spend gating, RPP_BUDGET_APPROVED, or per-conversation RPP approval. If a legacy harness still requires RPP_BUDGET_APPROVED=1 as a mechanical switch, set it for the approved Pillar 1 run; do not treat it as a fresh decision gate.

0AA. Latest Live State - Mitcham Production-Retained Clarification

Mitcham has been reinterpreted with production semantics, using persisted DB evidence only and no new RPP calls.

Corrected Mitcham result:

Why this matters: autopilot/index.js imports the top-level scoring.js, where commercial/industrial area-range issues are soft flags. The older 2/3 retained metric came from the live diagnostic's explicit legacy floorAreaMin: 100 post-filter, not from the production scorer. Under current production semantics, CLID 12354803 is therefore a returned and production-retained row: it is neither a true RPP miss nor a proven production-retained miss.
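
To illustrate the distinction, here is a minimal sketch. It is not the actual scoring.js or diagnostic code: the row shape, the floorArea field, the flag name, and both function names are hypothetical, and the three rows are invented. Only the floorAreaMin: 100 value comes from the note above.

```javascript
// Hypothetical candidate rows; field names and values are illustrative only.
const rows = [
  { id: 'A', floorArea: 250 },
  { id: 'B', floorArea: 80 }, // below the legacy minimum
  { id: 'C', floorArea: 400 },
];

// Legacy-diagnostic style (assumed shape): a hard post-filter drops out-of-range rows.
function hardPostFilter(candidates, { floorAreaMin }) {
  return candidates.filter((row) => row.floorArea >= floorAreaMin);
}

// Production style (assumed shape): an area-range issue only adds a soft flag,
// so the row stays in the retained set.
function softFlag(candidates, { floorAreaMin }) {
  return candidates.map((row) => ({
    ...row,
    flags: row.floorArea < floorAreaMin ? ['floor_area_below_min'] : [],
  }));
}

const legacy = hardPostFilter(rows, { floorAreaMin: 100 });
const production = softFlag(rows, { floorAreaMin: 100 });

console.log(legacy.length, production.length);
```

On the same invented rows, the hard filter keeps two and soft-flagging keeps all three, one of them flagged. A retained count therefore depends on which of the two semantics produced it.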

Current next Pillar 1 action: re-run/update the whole-cohort miss reclassification using production-retained semantics. Use RPP freely where it helps classify or recover residual misses under that corrected classification.

Fresh verification:

node --test test/pillar1-mitcham-runner.test.js test/pillar1-geebung-runner.test.js autopilot/adapters/rpp-direct-adapter-classify-unit-family.test.js
node --check scripts/pillar1-mitcham-live-diagnostic.js
node -e "JSON.parse(require('fs').readFileSync('reports/pillar1-mitcham-live-diagnostic-2026-05-05.json','utf8')); console.log('json ok')"

Result: 15 focused tests pass; syntax/JSON checks pass.

0A. Latest Live State - Geebung + Grid2

As of 2026-05-05 ~20:00 AEST:

Fresh verification:

node --test test/pillar1-geebung-runner.test.js test/pillar1-gt-preflight.test.js test/recall-pool-coverage.test.js test/api-call-log-property-ids.test.js test/candidate-rejection-log.test.js test/test-gamblor16-dashboard-db.js autopilot/adapters/rpp-direct-adapter-classify-unit-family.test.js
node --check scripts/pillar1-geebung-live-diagnostic.js
node --check scripts/pillar1-gt-preflight.js
node --check autopilot/adapters/rpp-direct-adapter.js
node --check scripts/build-canvas-live-status.mjs
node --check scripts/render-plan-vector.mjs

Result: 58 tests pass; syntax/JSON checks pass.

The Mitcham bounded live diagnostic was completed and then clarified with production-retained semantics:

0. Latest Override - Pillar 1 Objective Re-Anchored

Do not lose sight of the cohort. Pillar 1 is currently focused on improving the 50-subject commercial/industrial pilot cohort, specifically the cleanly measured residual pool-coverage misses.

Pillar 1 success is retrieval / pool coverage:

Jon clarified the pool-size trade-off on 2026-05-05: gross pool size is not the primary objection. A broader gross pull is acceptable if post-filtering produces a workable pool and materially improves coverage. Example: 2,000 gross results post-filtered to 1,000 with 100% coverage is acceptable. The bad outcome is a large or medium gross pull with weak/no post-filtering and poor coverage, e.g. 1,000 gross results and only 50% coverage. Therefore, treat pool delta as a ranking/cost risk to instrument and control, not as an automatic Pillar 1 failure when coverage is strong.
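
One way to keep pool size and coverage side by side is sketched below. It assumes coverage means the share of GT comparables present in the post-filtered pool. The function names, the input shape, and the sample IDs are hypothetical and are not taken from the repo.

```javascript
// Coverage = share of ground-truth (GT) comparables present in the post-filtered pool.
function poolCoverage(gtIds, poolIds) {
  if (gtIds.length === 0) return null; // nothing to measure against
  const pool = new Set(poolIds);
  const hits = gtIds.filter((id) => pool.has(id)).length;
  return hits / gtIds.length;
}

// Summarise one run so pool size is reported next to coverage, not judged alone.
function summarisePool({ grossCount, filteredIds, gtIds }) {
  return {
    grossCount,
    filteredCount: filteredIds.length,
    retainedShare: grossCount > 0 ? filteredIds.length / grossCount : null,
    coverage: poolCoverage(gtIds, filteredIds),
  };
}

// Invented data: 4 GT comparables, a gross pull of 10, 6 rows kept after post-filtering.
const summary = summarisePool({
  grossCount: 10,
  filteredIds: ['p1', 'p2', 'p3', 'p4', 'p7', 'p9'],
  gtIds: ['p1', 'p2', 'p3', 'p4'],
});

console.log(summary);
```

The sketch deliberately sets no pass threshold on pool size. It only reports gross count, filtered count, and coverage together, so a broad pull with strong coverage is not flagged as a failure by itself.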

Pillar 1 is not ranking. R@20/R@30 and scorer behaviour are downstream Pillar 2 evidence. Do not treat ranking as a Pillar 1 acceptance criterion.

Current known state:

Active Pillar 1 focus from here:

  1. Work the residual miss subcohort from the 50-subject C/I pilot.
  2. Use the type-aware recovery matrix to choose the next micro-plan, with live RPP allowed whenever it will answer the question faster.
  3. Prioritise subjects such as McDougalls Hill, Portsmith, Noosaville, Mitcham, Bathurst, Eagle Farm, Broadmeadows, and any other cleanly measured C/I residual misses.
  4. For each subject, use live RPP freely when the question is concrete, traceable, and tied to GT recovery; do not wait for fresh budget approval.
  5. Only after the 50-subject C/I pilot is cleanly improved should the work promote to the 132-cohort gate.

Do not tell the next agent that the active Pillar 1 objective is ranking, Pricefinder, dashboard polish, or Young/Robina alone.

McDougalls Hill is now the next concrete no-RPP micro-plan:

RPP SOP login/root-cause check is now complete:

Approved live McDougalls diagnostic is now complete:

Portsmith RPP diagnostic is now complete:

1. Current Pillar 1 Truth

Do not run the canonical 50-subject pilot from today's Lighthouse result.

Latest live Lighthouse A/B truth:

Bounded RPP miss verification is complete:

Plain English: RPP has the properties. Our current request shape is not bringing them into the candidate pool.

2. Current Direction

Next useful move: bounded request-shape narrowing, not bigger pools.

Current next-experiment plan:

Kant implemented the no-RPP preflight layer in /tmp/bellend-lighthouse:

Bacon ran the first no-RPP request-shape preflight, then Peirce resolved the Ascot evidence gap on the Lighthouse package branch:

Chandrasekhar then added the bounded live execution path:

A bounded live attempt was started after approval but is INVALID / partial:

Robina breach analysis is complete:

3. Search Parameter Drift Review

Senior review completed:

Top findings:

4. SOP / Process Updates

PIPELINE_SOP.md now reflects the missing-comparable workflow:

  1. Direct CoreLogic/property-ID RPP lookup first.
  2. Exact address/suggestions second.
  3. Tiny bounded radius probe only if still unresolved.
  4. Required machine-readable per-comp evidence fields.
  5. Ordered rpp_calls proving call sequence and spend discipline.
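
The ordering above can be sketched as a fallback chain that records each call as it goes. This is an illustration under assumptions, not the code behind PIPELINE_SOP.md: the lookup functions are injected stubs, and every field name other than rpp_calls is hypothetical.

```javascript
// Hypothetical lookup chain: each step is an injected async function
// that returns a match object or null.
async function resolveMissingComparable(comp, lookups) {
  const steps = [
    ['direct_id', lookups.byPropertyId],
    ['exact_address', lookups.byExactAddress],
    ['bounded_radius', lookups.byBoundedRadius],
  ];
  const rppCalls = []; // ordered record of every call made, in sequence
  for (const [step, lookup] of steps) {
    const match = await lookup(comp);
    rppCalls.push({ seq: rppCalls.length + 1, step, found: match !== null });
    if (match !== null) {
      // Stop at the first step that resolves, so later steps are never called.
      return { compId: comp.id, resolvedBy: step, match, rpp_calls: rppCalls };
    }
  }
  return { compId: comp.id, resolvedBy: null, match: null, rpp_calls: rppCalls };
}

// Stub lookups for illustration: the direct-ID lookup misses, the address lookup hits.
const stubLookups = {
  byPropertyId: async () => null,
  byExactAddress: async (comp) => ({ propertyId: comp.id, source: 'stub' }),
  byBoundedRadius: async () => {
    throw new Error('should not be reached once an earlier step resolves');
  },
};

const evidencePromise = resolveMissingComparable(
  { id: 'comp-1', address: '1 Example St' },
  stubLookups,
);
evidencePromise.then((evidence) => console.log(JSON.stringify(evidence, null, 2)));
```

Because the function returns at the first successful step, the rpp_calls array doubles as proof that the radius probe was not used when a cheaper lookup already resolved the comparable.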

5. Git / Branch State

Remote refs have been refreshed with:

git fetch --all --prune --tags

Branch state:

Lighthouse worktree status:

6. Dashboard / Canvas / Barney

Canvas:

Dashboard:

Barney / cc-connect:

7. Active Agents

8. Next Actions

  1. Open/review the pushed Lighthouse package branch origin/karl/clb-2472-lighthouse-package-2026-05-05.
  2. Refresh Canvas/source plan with direct-ID evidence + drift review and deploy if the public board should show the newest evidence.
  3. Decide whether to run one more bounded live A/B using the fixed package branch (f26d355 logging gate + ef270c8 Robina radius lock).
  4. If run, use only RPP_BUDGET_APPROVED=1 POOL_DELTA_CAP=200 node scripts/lighthouse-5subj-ab.js --bounded-request-shape. Under the current 2026-05-06 posture, RPP_BUDGET_APPROVED=1 is a harness switch, not a fresh approval gate.
  5. Pass/fail: must complete without execution errors, improve hit lift, and keep every subject inside +200 pool delta.
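
The pass/fail rule in step 5 could be checked mechanically along these lines. This is a sketch only: the result shape, the field names, and the sample numbers are hypothetical and do not describe the actual output of lighthouse-5subj-ab.js. "Improve hit lift" is read here as variant hits strictly greater than baseline hits, which is an assumption.

```javascript
const POOL_DELTA_CAP = 200; // mirrors POOL_DELTA_CAP=200 in the command above

// Hypothetical result shape:
// { executionErrors: [], baselineHits, variantHits, subjects: [{ name, poolDelta }] }
function evaluateBoundedRun(result, cap = POOL_DELTA_CAP) {
  const breaches = result.subjects
    .filter((subject) => subject.poolDelta > cap)
    .map((subject) => subject.name);
  const checks = {
    noExecutionErrors: result.executionErrors.length === 0,
    hitLiftImproved: result.variantHits > result.baselineHits,
    allWithinPoolCap: breaches.length === 0,
  };
  // All three conditions must hold for the run to pass.
  return { pass: Object.values(checks).every(Boolean), checks, breaches };
}

// Invented numbers: hit lift improves, but one subject exceeds the cap.
const example = evaluateBoundedRun({
  executionErrors: [],
  baselineHits: 10,
  variantHits: 13,
  subjects: [
    { name: 'subject-a', poolDelta: 120 },
    { name: 'subject-b', poolDelta: 260 },
  ],
});

console.log(example);
```

Returning the individual checks and the list of breaching subjects, rather than a bare boolean, keeps a failed run diagnosable without re-running it.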

9. Do Not Do