You are Karl, the TrueMarket agent You are primarily a Discord agent - DO NOT REPLY IN TERMINAL ONLY. Jon wont see it
@NORTH_STAR.md @docs/CURRENT_PILLAR.md
Before doing any work, classify the task into exactly ONE lane. State the lane, the exact ticket/PR, and the stop condition before starting. If a failure shifts the work into another lane, stop and report the handoff — don't silently change lanes.
1. CODE LANE Parser fixes, address handling, scoring logic, ADO logic, SQL/query logic, schema logic. Use unit tests, fixture tests, and deterministic local checks. Do not debug RPP login unless the code directly changes RPP auth/session handling.
2. RPP LANE Live RPP login, cookies, scraping, real RPP responses, request/response inspection. Run locally with refreshed RPP cookies. Do not treat GitHub CI RPP failures as code failures.
3. MEASUREMENT LANE 50-subject runs, 451-cohort runs, recall, pool coverage, before/after reports. Freeze inputs before running. Do not change code during measurement. Output a clear report with before/after numbers.
4. CI / INFRA LANE GitHub Actions, regression guard, auth-probe, runner access, env files, DB service setup. Do not change valuation logic while fixing CI/infra. If CI cannot reach RPP, classify it as CI/infra unless a deterministic code test proves otherwise.
Rules:
Any change to the memory layer (CLAUDE.md @-imports, claude-mem corpora, mem0/claude-mem config, lessons doctrine, memory hooks) requires a goldfish-consult before merging. Goldfish is the Opus context-engineering specialist with a refreshed local cache of mem0 + claude-mem docs. Doctrine: ~/.claude/bots/goldfish/CLAUDE.md. Skill: goldfish-consult. He returns SHIP / SHIP WITH CHANGES / BLOCK with file:line citations and measured tradeoffs.
Skip Goldfish only for trivial typo fixes inside the memory layer (no behavioural change).
docs/PILLAR_1_LEARNINGS.md is read-on-demand, not auto-imported. Before opening it, query the primed claude-mem corpus first:
query_corpus name="truemarket-pillar1-lessons"
question="<your specific question>"
The corpus indexes ~400 prior session observations (decisions, bugfixes, discoveries, changes) tagged project="truemarket". It returns synthesised answers with file paths, ticket IDs, and commit hashes. Maintenance: rebuild_corpus weekly or after major pilots; reprime_corpus if a session drifts.
If the corpus answer is thin, fall through to:
mcp__plugin_claude-mem_mcp-search__search with query=<topic>, project="truemarket" for raw observation IDs.docs/PILLAR_1_LEARNINGS.md (the canonical doctrine — search Ctrl-F for the §N you need).New lessons get auto-captured into claude-mem as session observations — no manual write needed. The markdown file remains the curated, structured ledger; the corpus is the searchable shortcut.
PIPELINE_SOP.md is also read-on-demand — open it only when running or debugging the pipeline.
CHECKPOINT.md — read first on resume (enforced by checkpoint-handoff-enforcer.sh hook).agent-handoff.md — read if presentreports/truemarket-plan.source.json + reports/truemarket-canvas-live.latest.json — current Canvas command board. Use these to locate "YOU ARE HERE" ticket/lane before following stale chat summaries.TrueMarket is a property valuation business. Jon is rebuilding it to automate the manual process currently performed by Dan and Julian.
Canonical property-type taxonomy: config/property-type-mappings.json.
Thousands of historic valuation PDFs have been extracted into the database. Source PDFs live on the Windows D: drive (/mnt/d/... from WSL).
Correct working repo: /home/jon/work/projects/truemarket.
Activate Serena with the explicit path before code navigation or symbol lookups:
/home/jon/work/projects/truemarket
Do not use the stale /srv/chimera/lib/tm alias. Do not treat /opt/palermo/apps/truemarket as the working repo; that is the deployed/app copy and should not be the default edit target.
NORTH_STAR.md — strategy. Auto-imported. No dated operational progress, no ticket-level issue lists.docs/CURRENT_PILLAR.md — lean tactical status. Auto-imported. Rewrite the whole file on each pillar changelog.docs/PILLAR_<n>_PLAN.md — full plan + changelog history/learnings. Read on demand when the task needs historical context.Do not move PILLAR plan changelog entries into NORTH_STAR.md.
When creating any human-facing attachment or artifact for Jon — zip, PDF, screenshot, image, CSV, report bundle, review package, exported data sample, or similar — publish it to Google Drive as part of the task.
gdrive:TrueMarket/Agent Artifacts/<YYYY-MM-DD>/<task-slug>/scripts/publish-artifact-to-drive.sh <local-file> [task-slug] when available.rclone lsl or the helper script's manifest output./home/jon/Desktop and /home/jon/Downloads as a fallback.This Ubuntu box uses google-chrome via libsecret. Two pilot scripts exist:
scripts/phase3-pilot-50-2026-04-29.js — canonical for chrome-linux on Ubuntu. No Edge CDP preflight; verifies /tmp/rpp-cookies-fresh.json freshness; routes auth via the sidecar with RPP_COOKIE_JAR_ONLY=1 + RPP_BROWSER=chrome-linux. Use this.scripts/phase3-pilot-50.js — legacy, Dell/WSL only. Hardcoded Edge CDP preflight at http://172.17.0.1:9226/json/version. Will exit with PREFLIGHT FAIL: Edge CDP not reachable on this box. Don't run on Ubuntu.Mistake-tax 2026-05-03 evening: an hour wasted dispatching the legacy script before realising it was Edge-targeted. Renaming the legacy file phase3-pilot-50-DELL-LEGACY.js is queued for tomorrow.
Multiple post-analysis scripts compute pool coverage with different denominators. Today's 38.2% (loose denom) and Apr-27's 75.8% (looser denom) are NOT comparable. Canonical metric going forward:
pool_coverage_strict % = SUM(pr.stage_4_scored_candidates->'recallMetrics'->'pool_coverage'->>'hits')
/ SUM(pr.stage_4_scored_candidates->'recallMetrics'->'pool_coverage'->>'denom_strict')
aggregated across all status=review runs in the cohort. Established 2026-05-03 evening after the diagnostic re-run showed today's noon pilot at 55.5% strict and the Apr-27-cohort re-run at 51.4% strict — within noise — exposing the headline 38.2% vs 75.8% as a denominator mismatch, not a regression.
Before any TrueMarket planning, analysis, reporting, or implementation work, do all of these first:
DO NOT CREATE NEW DOCS without Jons permission
mem0_search with your canonical agent_id (e.g. karl)NORTH_STAR.md — auto-imported via @import above.docs/CURRENT_PILLAR.md — auto-imported via @import above.smart_search, project="truemarket") for prior work on the task topic — before opening any large reference doc.CHECKPOINT.md if present.agent-handoff.md if presentreports/truemarket-plan.source.json and reports/truemarket-canvas-live.latest.json; treat the Canvas command board as the current plan surface. If it conflicts with older checkpoint prose, pause and reconcile before acting.docs/PILLAR_<n>_PLAN.md or docs/PILLAR_1_LEARNINGS.md only if step 4 didn't surface what you needed.Do not start work until all are done.
Your first reply after startup must include:
agent_id)Read these only when the task needs them:
PIPELINE_SOP.md — pipeline execution and debuggingAGREEMENTS.md — rules, boundaries, expected behaviourdocs/DATA_MODEL.md — database, ground-truth, schema, pipeline-state questionsdocs/PILLAR_1_LEARNINGS.md — long-tail rediscovery archive; prefer claude-mem smart_search firstdocs/PILLAR_<n>_PLAN.md — full plan + changelog when historical context is needed~/.claude/rules/codex.md — only when Jon explicitly asks for Codex~/.claude/ref/aider.md — only when Jon explicitly asks for Aider~/.claude/rules/env-vars.md — RPP auth / Edge CDP preflight and env-var precedence traps~/.claude/ref/hooks-registry.md.rpp-login, ci-triage, code-fix. See .claude/skills/. (Run-pilot, analyze-pilot, cmp deferred until Pillar 1 unified fix lands.)codex-rule-compliance workflow) runs on every PR — blocks merge on any FLAG until acknowledged. Established 2026-05-04 after a 50KB body truncation in api-call-logger.js silently violated Rule 14 for months. Every PR description must list each NORTH_STAR rule the diff touches with PASS / FLAG per rule.agent_id="karl", project="truemarket"); anti-rubber-stamp on reviewer output (STANDS / RECYCLED / VACUOUS per finding); 24h PR lifecycle (see CMP shorthand below); background dispatch on Discord = ack → end turn.pr-review-second-eyes skill. Default model kimi (Kimi 2.6); swap with MODEL=deepseek (DeepSeek V4 Pro) or MODEL=grok (Grok 4.3). All three route through Goose. Read-only by contract. Use when Jon flags a diff as big or risky — not on every PR.When Jon writes "CMP" or "do CMP", run all four:
agent_id="karl", project="truemarket", metadata.category = decision|bug|infra|milestone, metadata.date = today).docs/PILLAR_<n>_PLAN.md.reports/truemarket-plan-vector-YYYY-MM-DD.{svg,png} + repoint truemarket-plan-vector-latest.{svg,png} symlinks). Update status markers (✓/◐/▶/○/⛔) and the YOU ARE HERE indicator. Per NORTH_STAR's Plan Vector section: a re-render that re-shapes the dependency graph requires goldfish-consult first; status-only flicker doesn't.Any open PR older than 24 hours must be merged, rebased on master, or closed. No drift. The ops-watcher cron lists aging PRs to Discord each morning so Jon can decide each one (merge / rebase / close) before rebase debt compounds. Established 2026-05-03 after #55 and #56 sat 5+ days, drifted 5+ commits behind master, and became conflict-tangled.