
Day 11: Cutting Edge — strict-agentic, GBrain, Active-Memory, and the Road to Phase 3

2026-04-12 · MoneyMachine

Date: 2026-04-12
Author: Jeff (written with AI assistance from Claude Opus 4.6)
Phase: v5.1 Architecture Upgrades


The GPT Laziness Problem

Peter Steinberger (OpenClaw’s creator, now at OpenAI) posted this morning about two experiments to address “GPT is lazy” behavior. The first is a new execution contract setting:

agents.defaults.embeddedPi.executionContract = "strict-agentic"

This tells GPT-5.x to keep working: read more code, call tools, make changes, or return a real blocker instead of stopping at “here’s the plan.”

I’d been seeing this constantly. Adrian would respond with a beautifully structured plan — and then stop. No tool calls. No file reads. No actions. Just a plan and “let me know if you’d like me to proceed.”

The strict-agentic execution contract changes what counts as progress: if a tool action is available, a planning-only turn is no longer treated as a successful step. The system auto-retries with an act-now directive. It also enables update_plan for structured multi-step work tracking.

One config change. Immediate behavioral difference.

{
  "agents": {
    "defaults": {
      "embeddedPi": {
        "executionContract": "strict-agentic"
      }
    }
  }
}

Caveat: This only applies to OpenAI/Codex GPT-5 family models. Our Scout (qwen3:8b), Marketer (GLM-5.1), and other non-GPT agents don’t benefit. But Adrian and Builder, our two most important agents, are both on gpt-5.3-codex-spark.
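Here’s roughly how I picture the retry behavior described above. This is a toy sketch, not OpenClaw’s implementation: `Turn`, `run_strict`, `ACT_NOW`, `lazy_model`, and the retry budget are all hypothetical names and values I made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    """One model response (hypothetical shape, not OpenClaw's real type)."""
    text: str
    tool_calls: List[str] = field(default_factory=list)
    blocker: str = ""  # non-empty when the model reports a concrete blocker


# Hypothetical nudge appended to the prompt on a retry.
ACT_NOW = "A tool is available. Act now or report a concrete blocker."


def run_strict(model: Callable[[str], Turn], prompt: str,
               tools_available: bool = True, max_retries: int = 2) -> Turn:
    """Retry planning-only turns until the model acts, reports a blocker,
    or the retry budget runs out."""
    turn = model(prompt)
    for _ in range(max_retries):
        made_progress = bool(turn.tool_calls) or bool(turn.blocker)
        if made_progress or not tools_available:
            break
        # Planning-only turn: do not count it as progress, nudge and retry.
        turn = model(prompt + "\n\n" + ACT_NOW)
    return turn


def lazy_model(prompt: str) -> Turn:
    """Stub model: plans by default, acts only once it sees the nudge."""
    if ACT_NOW in prompt:
        return Turn("Reading the file.", tool_calls=["read_file"])
    return Turn("Here is the plan. Let me know if you'd like me to proceed.")
```

With the stub above, `run_strict(lazy_model, "Fix the failing test")` comes back with a `read_file` tool call on the second attempt instead of stopping at the plan.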

GBrain: The Knowledge Layer We Were Missing

Garry Tan (Y Combinator CEO) open-sourced GBrain this week — a personal AI knowledge system with 5.5k GitHub stars in days. The architecture philosophy is compelling:

Thin harness, fat skills: The CLI is ~200 lines. All the intelligence lives in markdown skill files. Skills are parameterized procedures — the same skill invoked differently produces radically different output. Push intelligence UP into skills, execution DOWN into deterministic tools.

The self-improving loop: Signal arrives, agent detects entities, reads brain first, responds with context, writes updates back, syncs. Every cycle compounds. An agent with a brain gets smarter every conversation.
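To make that loop concrete, here’s a minimal sketch with an in-memory dict standing in for the brain. None of this is GBrain’s code or API: `Brain`, `detect_entities`, and `handle_signal` are hypothetical, and the substring-based entity detection is a deliberate oversimplification.

```python
from typing import Dict, List

# Hypothetical stand-in for the brain: entity name -> notes seen so far.
Brain = Dict[str, List[str]]


def detect_entities(signal: str, known: List[str]) -> List[str]:
    """Naive entity detection: case-insensitive match against known names."""
    lowered = signal.lower()
    return [name for name in known if name.lower() in lowered]


def handle_signal(brain: Brain, signal: str, known: List[str]) -> str:
    entities = detect_entities(signal, known)
    # Read the brain first so the reply is grounded in prior context.
    context = [note for e in entities for note in brain.get(e, [])]
    subject = ", ".join(entities) or "nothing known"
    reply = f"{len(context)} prior note(s) on {subject}"
    # Write the new signal back so the next cycle starts with more context.
    for e in entities:
        brain.setdefault(e, []).append(signal)
    return reply
```

Calling `handle_signal` twice with signals that mention the same entity shows the compounding: the first reply has zero prior notes to draw on, the second has one.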

We installed GBrain with PGLite (embedded Postgres, zero config) and imported 94 pages of demand signal summaries, scan reports, and agent communications. It’s connected to OpenClaw as an MCP server, exposing 29 tools (search, query, get_page, put_page, link, timeline, etc.).

This is the knowledge persistence layer our system was missing. Scout scrapes HN and produces signals, but that data lived in flat files. Now it lives in a searchable, interconnected brain that agents can query before making decisions and write to after learning.

QMD + Active-Memory: The Dynamic Duo

Ben Badejo’s tweet that Steinberger reposted says it all:

“It’s like firing an intern and re-hiring the former CEO’s secretary who knew where all the bodies were buried.”

QMD (by Tobi Lütke) is a local-first search sidecar: BM25 + vector search + reranking in a single binary. It indexes workspace memory files, project docs, and session transcripts. Fully local, with no API keys.
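If you’re wondering how a keyword ranking and a vector ranking get merged into one result list, reciprocal rank fusion is one common approach. I’m using it here purely as an illustration; I haven’t confirmed how QMD actually combines its BM25 and vector results, and the function name and `k=60` default are my own choices.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one.

    Each document earns 1 / (k + rank) from every list it appears in,
    with rank starting at 1; higher totals rank first.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by ID for a deterministic tie-break.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]    # hypothetical keyword ranking
vector_hits = ["b", "d", "a"]  # hypothetical embedding ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
# fused == ["b", "a", "d", "c"]
```

Documents that show up in both lists ("a" and "b") float to the top, which is the behavior you want from a hybrid search.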

Active-Memory is an OpenClaw plugin that runs a blocking memory sub-agent before every reply. Instead of waiting for Adrian to remember to search memory, the system proactively surfaces relevant context. The agent’s reply feels natural because it already knows what matters.

We configured QMD to index:

  • Scout’s daily demand signal reports
  • Project docs (PRD, architecture, changelog)
  • Session transcripts (recall earlier conversations)

We set Active-Memory to run for Adrian’s Telegram DM sessions with a balanced prompt style, a 15-second timeout, and logging turned on for tuning.
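The “blocking lookup with a timeout” idea is worth spelling out, because the timeout is what keeps a slow memory search from stalling the reply. Here’s a sketch of the pattern in plain Python. It is not the plugin’s code: `reply_with_memory`, `recall`, `respond`, and `slow_recall` are hypothetical names, and only the 15-second default comes from our config.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, List


def reply_with_memory(message: str,
                      recall: Callable[[str], List[str]],
                      respond: Callable[[str, List[str]], str],
                      timeout_s: float = 15.0) -> str:
    """Run the memory lookup before replying, but never wait past timeout_s."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(recall, message)
    try:
        context = future.result(timeout=timeout_s)
    except FutureTimeout:
        context = []  # lookup too slow: answer without recalled context
    finally:
        # Do not block on a slow lookup; the worker finishes in the background.
        pool.shutdown(wait=False)
    return respond(message, context)


def slow_recall(message: str) -> List[str]:
    """Hypothetical lookup that takes longer than a tight timeout allows."""
    time.sleep(0.5)
    return ["late result"]
```

With `timeout_s=0.05`, a call using `slow_recall` falls back to an empty context rather than holding up the reply; a fast lookup passes its results straight through to `respond`.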

Both were already bundled in OpenClaw 2026.4.11 — just disabled. Two config changes to enable.

Phase 3: The Codex Harness (Coming Next)

Steinberger’s second experiment is the native Codex harness:

{
  "model": "codex/gpt-5.4",
  "embeddedHarness": {
    "runtime": "codex",
    "fallback": "none"
  }
}

Instead of OpenClaw’s built-in PI harness running the agent loop, the Codex app-server owns thread management, resume, compaction, and execution. OpenClaw still owns channels, tools, approvals, and delivery.

The key difference: openai-codex/gpt-5.3-codex-spark (what we use now) runs through Codex OAuth but uses PI. codex/gpt-5.4 runs through the native Codex app-server. Steinberger says it “should increase agentic mode to keep the agent working on longer-horizon tasks.”

We’ve already upgraded the Codex CLI from 0.111.0 to 0.120.0 (requirement is 0.118.0+). The plan:

  • Switch Builder only to codex/gpt-5.4 with native harness first
  • Keep Adrian on the current setup (personality matters for Telegram interactions)
  • If Builder shows improvement, consider Adrian too
  • Possibly test on a second VPS to isolate risk
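One small gotcha on the version requirement mentioned above: compare versions numerically, not as strings. A quick sketch (the helper names are mine, and it assumes plain dotted numeric versions with no suffixes):

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.120.0' into (0, 120, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """True when the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)


ok_now = meets_minimum("0.120.0", "0.118.0")     # True: our upgraded CLI
ok_before = meets_minimum("0.111.0", "0.118.0")  # False: the old CLI
```

A plain string comparison would rank "0.9.0" above "0.118.0", which is why the tuple conversion matters.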

TurboQuant: Watching and Waiting

Google Research published TurboQuant at ICLR 2026, reporting 6x KV cache compression at 3-bit keys with zero accuracy loss. A community implementation exists at 0xSero/turboquant with vLLM integration.

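For our Mac Mini M4 (16GB), this would be a big deal: the KV cache grows with context length, so a 6x smaller cache frees RAM for much longer contexts or for a somewhat larger model. It compresses the cache, not the model weights, so it won’t make a 6x larger model fit. It’s also not in llama.cpp or Ollama yet; there’s an active discussion about integration.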

We’re monitoring. When llama.cpp integrates it, Ollama will follow, and our Mac Mini becomes significantly more capable overnight.
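For a sense of scale, here’s the back-of-the-envelope math on KV cache size. The model dimensions below are hypothetical values I picked for an 8B-class model, not measurements from our setup, and the 6x factor is simply the headline figure applied as-is. Note that this covers the cache only; model weights are a separate cost.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: float = 2.0) -> float:
    """Size of the key/value cache for one sequence.

    The leading 2 counts keys and values; 2.0 bytes per value is fp16.
    """
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_value


GIB = 1024 ** 3

# Hypothetical 8B-class model at a 32k context, fp16 cache.
baseline = kv_cache_bytes(layers=36, kv_heads=8, head_dim=128, context_len=32_768)
compressed = baseline / 6  # the quoted 6x compression, applied as-is

print(f"{baseline / GIB:.2f} GiB -> {compressed / GIB:.2f} GiB")
# prints "4.50 GiB -> 0.75 GiB"
```

On a 16GB machine, getting back several GiB at long contexts is the difference between a model fitting comfortably and swapping.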

Today’s Full Changelog

  1. OpenClaw 2026.4.8 → 2026.4.11
  2. strict-agentic execution contract (Adrian + Builder)
  3. planTool enabled
  4. Sandbox fixed: Adrian exempt, stale sandbox state cleared
  5. All agent SOUL.md rewritten for v3/v5
  6. All agent DIRECTIVES.md created/updated
  7. Adrian’s MEMORY.md rewritten with current reality
  8. GBrain v0.9.0 installed, 94 pages imported, MCP connected
  9. QMD v2.1.0 installed, indexing scout signals + project docs + sessions
  10. Active-Memory plugin enabled for Adrian
  11. Codex CLI 0.111.0 → 0.120.0
  12. 41 system packages patched, kernel updated, server rebooted
  13. Gateway auto-starts on boot via systemd
  14. Mac Mini Ollama auto-starts on boot via LaunchDaemon

Revenue Update

Still $0 in revenue. But for the first time, the pipeline is flowing:

  • Scout: 960+ signals collected, 3 opportunities routed to Builder
  • Builder: 2 approved products (Scholarship Toolkit + Expat Visa Kit) authorized for deployment
  • Marketer: 53 content pieces ready for review
  • Adrian: Finally reviewing, scoring, and routing — actively managing the team

The machine is running. Now it needs to ship.

What’s Next

  1. Builder deploys the two approved products with Stripe checkout links
  2. Phase 3 experiment with Codex native harness on Builder
  3. GBrain compounds as agents read and write to the brain
  4. Active-Memory gets tuned based on Adrian’s Telegram interactions
  5. TurboQuant watch — the moment llama.cpp integrates it, we upgrade
