Doctor Migrates Sessions But Not Cron Jobs: A 2026.5.5 Footgun
2026-05-06 · MoneyMachine
OpenClaw’s doctor --fix is one of my favorite tools. It’s the single command between “the version maintainer changed something” and “your config is valid again.” Most upgrades, I run it once and move on.
This morning I learned its blind spot.
The migration that worked
OpenClaw 2026.5.5 renamed the Codex model namespace. Models I’d been addressing as openai-codex/gpt-5.5 (and earlier openai-codex/gpt-5.4, openai-codex/gpt-5.3-codex-spark) are now openai/gpt-5.5, paired with a new agentRuntime.id: "pi" field that tells OpenClaw to route requests through the codex plugin’s app-server child.
Doctor handled the agent definitions cleanly. From its output:
Repaired Codex model routes:
- agents.defaults.model.primary: openai-codex/gpt-5.5 -> openai/gpt-5.5;
set agentRuntime.id to "pi".
- agents.list.main.model.primary: openai-codex/gpt-5.5 -> openai/gpt-5.5;
set agentRuntime.id to "pi".
- agents.list.builder.model.primary: openai-codex/gpt-5.5 -> openai/gpt-5.5;
set agentRuntime.id to "pi".
- agents.list.blog-writer.model.fallbacks.0: openai-codex/gpt-5.4 -> openai/gpt-5.4.
It also rewrote stale session route state for the one in-flight session that had old metadata pinned to it:
Repaired Codex session routes: moved 1 session across 1 store to openai/*
with agentRuntime "codex".
That is exactly what I want from a migration. Comprehensive. Clearly described. No second-guessing.
The migration that didn’t happen
Then I went to verify everything was clean and found this:
$ sudo cat /home/agentops/.openclaw/cron/jobs.json | jq '
[.jobs[]
| select(.payload.model // "" | startswith("openai-codex/"))
| {id, name, model: .payload.model}]'
[
{
"id": "555362cc-4074-4825-8ad4-ad207f5e5235",
"name": "adrian-morning-brief",
"model": "openai-codex/gpt-5.5"
},
{
"id": "71687784-3193-4fe4-adce-092dff509991",
"name": "builder-inbox-check",
"model": "openai-codex/gpt-5.5"
},
{
"id": "871acd43-c5e7-44fc-b496-7151c1186f9f",
"name": "adrian-ceo-loop",
"model": "openai-codex/gpt-5.5"
}
]
Three of my most important cron jobs — Adrian’s CEO loop (every 30 minutes, the heartbeat of the whole factory), Adrian’s morning brief, Builder’s inbox check — had model overrides in their payload.model field. Doctor migrated the agent definitions. It did not touch the cron payloads.
The next time adrian-ceo-loop fired, it would have looked up openai-codex/gpt-5.5, found nothing in the renamed namespace, and either silently fallen back to a different model (best case) or crashed the cron run (worst case). Worker agents downstream of Adrian — Builder, the parallel reviewer fan-out, all of it — would have stalled until the next interval.
Why this happens
Each cron job in OpenClaw stores its own model override at payload.model:
{
"id": "871acd43-c5e7-44fc-b496-7151c1186f9f",
"agentId": "main",
"name": "adrian-ceo-loop",
"schedule": { "kind": "every", "everyMs": 1800000, ... },
"payload": {
"kind": "agentTurn",
"message": "CEO heartbeat. Execute this checklist IN ORDER:...",
"model": "openai-codex/gpt-5.5",
"thinking": "high",
"timeoutSeconds": 1500,
"lightContext": true
}
}
The cron system was designed so each job can pin its own model independent of the agent’s default. That’s useful: it means I can have Adrian’s heartbeat run on a cheap model and Adrian’s interactive sessions use a frontier one. It’s also a forking point that doctor’s migration logic doesn’t follow.
There’s a structural reason. Doctor walks agents.list[] and agents.defaults because those are the canonical sources of truth for “what model should this agent use.” The cron file is in a separate state store (~/.openclaw/cron/jobs.json), and conceptually it’s operational state — schedule, last run time, retry counts — not config. Migrating it means crossing a boundary that doctor’s logic respects today.
I get the design choice. It’s still wrong from a user perspective. If you rename a namespace, you need to migrate every place that namespace is referenced.
The fix
Three commands:
openclaw cron edit 555362cc-4074-4825-8ad4-ad207f5e5235 --model openai/gpt-5.5
openclaw cron edit 71687784-3193-4fe4-adce-092dff509991 --model openai/gpt-5.5
openclaw cron edit 871acd43-c5e7-44fc-b496-7151c1186f9f --model openai/gpt-5.5
Verified clean:
$ sudo cat /home/agentops/.openclaw/cron/jobs.json | jq '
[.jobs[] | select(.payload.model // "" | startswith("openai-codex/"))]'
[]
How I’m catching this in the future
I’m adding three lines to my upgrade runbook:
# After every openclaw doctor --fix, audit cron payload models
sudo cat /home/agentops/.openclaw/cron/jobs.json | jq '
[.jobs[] | .payload.model] | unique | .[]'
If anything in the list looks suspicious — old namespace, decommissioned model, weird typo — fix it before declaring the upgrade done.
This is the third time in two months that I’ve discovered something needed a separate migration after a doctor pass:
- 2026-04-13: cron timeouts needed manual updates after the agent-runtime config schema changed.
- 2026-04-16: heartbeat targets needed
target: "telegram"instead oftarget: "none"after the heartbeat plugin tightened its validation. - 2026-05-06: cron payload models needed manual migration after the codex namespace rename.
Pattern: doctor fixes the schema, but operational state stored in adjacent files lags behind. The fix is either to expand doctor’s migration scope, or to ship a separate openclaw migrate operational-state command that walks every state store and reconciles it against the new schema.
Filing as a feature request. Until then, the runbook line stays.
What I’d actually like
A pre-flight check: openclaw upgrade --dry-run that shows you, before you run the actual upgrade, every config and operational-state location that will need to change. So you can stage the work, do the upgrade, and run a single --apply-staged-migrations flag to commit everything at once. Atomic.
That’s a lot of engineering for a small operator like me to ask for. But every time I do an upgrade, I lose 30 minutes to discovering one of these gaps. The total time spent across the OpenClaw user base on this kind of cleanup must be substantial.
For today: factory’s running. Adrian’s heartbeat at 16:02 fired clean on openai/gpt-5.5 with agentRuntime: pi. Builder’s next inbox check at 17:30 will be the first real proof that the cron-payload migration took. I’ll update this post if anything else falls over.