
Building JJM Part 2: Auto-Drafting Blog Posts From Every AI Session

Every Claude Code session produces a small library of insights. Here is the three-subagent pipeline that turns them into draft blog posts on autopilot — without ever shipping raw session content to the public.

May 18, 2026
14 min

Your AI Agents Already Produce a Library — You Just Haven't Plugged It Into Your Blog

If you run AI agents that work on your behalf every day, you already have a content engine. You just haven't plugged it into your blog yet.

Every non-trivial session produces a small library of hard-won insights: a debugging pattern that saved an hour, an API quirk worth codifying, a workflow that beat a heavier alternative. Most of those insights die inside private session logs. The screenshot gets pasted into Slack, the lesson gets a brief mention in stand-up, and that's it. The compounding value of the work never reaches anyone outside the room.

This post is the second instalment of Building JJM: The Blog. Part 1 introduced the AI-powered workflow that produces visually rich posts in hours instead of weeks. That workflow was still manual — you had to choose what to write about, brief the skill, and trigger the pipeline. This post automates the trigger itself. Once it's running, every session review can emit a draft post; you review the PR on Sunday and merge what's good.

The key constraint, and the part most people skip, is the cybersec airlock: the raw session log never reaches the blog generator. Only a sanitized 200-to-400 word summary does. We'll get to that.

The Three-Subagent Shape

The pipeline is three subagents stitched together, each doing one job well. This is the custom subagent pattern applied in production: focused single-purpose agents that share state through plain JSON files instead of in-memory handoffs.

The first agent is the detector. It runs at the tail end of every session-review pass. It reads the session log, the commits made during the session, and any memory files written, and scores the work against an audience-relevance rubric. Sessions that don't clear the floor are silently skipped — no candidate emitted, no noise added.

The second agent is the cap-checker. Before any candidate gets dispatched, this agent reads the queue, totals the week's LLM and image-generation spend, counts the number of open auto-blog PRs on the publishing repo, and either green-lights the dispatch or defers the card with a deferred-by-cap status. The cap is the kill-switch — without it, a runaway week of high-scoring sessions could empty a budget and bury the human reviewer under PRs.

The third agent is the dispatcher. When a candidate clears the cap, the dispatcher spawns a detached headless instance of the existing blog-creation skill, passes the candidate ID, and walks away. The existing 12-phase blog skill — outline, draft, cards, motion-density check, audit gate — runs end-to-end as if a human had triggered it. The only difference is what happens at the end: instead of merging to main, it opens a labelled PR on the publishing repo.

This is fundamentally an agent-team architecture. Three agents, each with bounded scope, sharing state through an append-only file. No agent knows the others exist beyond the JSON contract between them.

The Detector: Scoring Sessions Against an Audience-Relevance Rubric

The detector is the gate that decides whether a session is worth writing about at all. Most sessions are not. They're routine — a copy edit, a config tweak, a bug fix that doesn't generalise. The detector exists so the pipeline doesn't pollute the queue with low-signal candidates.

Scoring runs against an explicit rubric, weighted by what the audience actually wants to read about. Codifying a new rule as a skill or hook is worth 30 points. A non-obvious technical insight that took deliberate digging is worth 25. A measurable customer-facing improvement (page load, conversion lift, lead recovery) is worth 20. Augmenting an existing published post with a new chapter is worth another 20. Integrating two or more technologies into a workflow scores 15. Demonstrating an agentic workflow that wasn't possible last year scores 15.

Categories stack. A single session can hit four or five of these and rack up a high score — but the rubric caps the total at 100, so the queue sort stays meaningful. Sessions scoring below the floor (typically 40) emit no candidate. The detector is silent on those; nothing reaches the queue.
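A minimal sketch of that scoring rule, assuming the weights quoted above. The category keys and function names are hypothetical labels invented for this example, and the floor of 40 is the "typical" value, so treat it as configurable.

```python
# Weights come from the rubric described above; the keys are hypothetical names.
RUBRIC = {
    "codified_rule": 30,
    "non_obvious_insight": 25,
    "customer_facing_improvement": 20,
    "augments_existing_post": 20,
    "multi_tech_integration": 15,
    "new_agentic_workflow": 15,
}
SCORE_CAP = 100
SCORE_FLOOR = 40  # "typically 40"; assumed to be configurable


def score_session(categories_hit):
    """Sum the weights of the distinct categories a session hit, capped at SCORE_CAP."""
    unique = set(categories_hit)
    unknown = unique - set(RUBRIC)
    if unknown:
        raise ValueError(f"unknown rubric categories: {sorted(unknown)}")
    return min(sum(RUBRIC[name] for name in unique), SCORE_CAP)


def clears_floor(score):
    """Only scores below the floor are skipped, so a score equal to the floor passes."""
    return score >= SCORE_FLOOR
```

A session that hits every category sums to 125 and is clamped to 100, which is what keeps the queue sort meaningful.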

The output of a successful scoring is a candidate card: a JSON object with the topic noun, a suggested title, a primary category, a length tier (SHORT/MEDIUM/LONG), the related existing posts the new one should link to, and — critically — a sanitized summary draft. The card lives in an append-only queue file. Nothing else moves until the cap-checker reads it.
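One way to picture the card is as a small typed record. In this sketch only sanitized_summary_draft, leak_risk, source_session_log and the pending status come from the pipeline as described in this post; every other field name is a guess made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CandidateCard:
    candidate_id: str             # hypothetical field name
    topic: str                    # the "topic noun"
    suggested_title: str          # hypothetical field name
    primary_category: str         # hypothetical field name
    length_tier: str              # SHORT, MEDIUM or LONG
    sanitized_summary_draft: str  # the only narrative input the generator sees
    leak_risk: str                # low, medium or high
    source_session_log: str       # audit metadata only
    related_posts: list = field(default_factory=list)
    status: str = "pending"

    def __post_init__(self):
        # Reject cards whose enumerated fields fall outside the documented values.
        if self.length_tier not in {"SHORT", "MEDIUM", "LONG"}:
            raise ValueError(f"bad length tier: {self.length_tier}")
        if self.leak_risk not in {"low", "medium", "high"}:
            raise ValueError(f"bad leak risk: {self.leak_risk}")

    def to_json(self):
        """Serialise the card the way it might sit in the queue file."""
        return json.dumps(asdict(self), indent=2)
```

Validating the enumerated fields at construction time means a malformed card fails in the detector, before anything downstream reads it.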

The Cap-Checker: A Budget Gate That Reads the Wallet Before Spending

The cap-checker is the boring agent. It exists because the alternative — dispatching every candidate the moment it lands — is how you wake up to a $200 weekly bill and a dozen unreviewed PRs.

It enforces two limits. The first is dollars: the week's running spend on LLM tokens and image generation, summed across all auto-blog dispatches, must stay under a configured ceiling. The second is throughput: the number of open auto-blog PRs on the publishing repo must stay under a configured cap. Exceeding either limit defers the candidate.

A deferred candidate is not dropped. Its status changes from pending to deferred-by-cap and it stays in the queue, eligible for re-evaluation when the next session-review fires. If the human reviewer merged three PRs over the weekend, the throughput cap relaxes and the deferred candidate gets a fresh look on Monday.
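The decision itself is small enough to sketch in a few lines. The config names and the "green-lit" status string are assumptions made for this example; deferred-by-cap is the status named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caps:
    weekly_spend_ceiling_usd: float  # hypothetical config name
    max_open_prs: int                # hypothetical config name


def check_caps(week_spend_usd, open_pr_count, caps):
    """Return the status a candidate should carry after this cap-check.

    Both values must stay under their limit, so reaching a limit also defers.
    """
    over_budget = week_spend_usd >= caps.weekly_spend_ceiling_usd
    over_throughput = open_pr_count >= caps.max_open_prs
    if over_budget or over_throughput:
        return "deferred-by-cap"
    return "green-lit"  # hypothetical status name
```

Because the function is pure, re-evaluating a deferred card on Monday is the same call with a fresh open-PR count.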

The spend log is the other source of truth. After every successful dispatch, the actual cost (sum of LLM and image-generation costs incurred) gets appended to a weekly bucket. The next cap-check reads that bucket as input. This is the same kind of serverless cost-discipline pattern any growing automation needs once it has real money on the line.
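A weekly bucket can be as plain as a JSON object keyed by week. The sketch below assumes ISO weeks and a single JSON file; the file layout, key format and function names are illustrative choices, not a description of the real spend log.

```python
import json
from datetime import date
from pathlib import Path


def week_key(day):
    """Bucket key built from the ISO year and week; ISO weeks are an assumption."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


def record_spend(log_path, day, llm_usd, image_usd):
    """Append one dispatch's actual cost to the bucket for its week."""
    path = Path(log_path)
    log = json.loads(path.read_text()) if path.exists() else {}
    log.setdefault(week_key(day), []).append(
        {"date": day.isoformat(), "llm_usd": llm_usd, "image_usd": image_usd}
    )
    path.write_text(json.dumps(log, indent=2))


def week_total(log_path, day):
    """Total spend for the week containing `day`; this is what a cap-check would read."""
    path = Path(log_path)
    if not path.exists():
        return 0.0
    entries = json.loads(path.read_text()).get(week_key(day), [])
    return sum(entry["llm_usd"] + entry["image_usd"] for entry in entries)
```

Keeping per-dispatch entries rather than a running total makes the bucket easy to audit when a week's bill looks wrong.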

The Dispatcher: A Detached Headless Worker Per Candidate

The dispatcher's job is to spawn the worker and get out of the way. When the cap-checker green-lights a candidate, the dispatcher:

  1. Creates a git worktree of the publishing repo, isolated from the main working tree, checked out on a fresh branch named auto-blog/<candidate-id>.
  2. Launches a detached headless Claude Code process inside that worktree, with the blog-creation skill invoked and the candidate ID passed as an argument.
  3. Writes a log file for the run at a predictable scratch path.
  4. Returns immediately, freeing the orchestrating session to continue or close.
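The four steps above can be sketched with git worktree and a detached child process. This is an illustration under assumptions: the directory layout and function names are invented here, and worker_cmd stands in for whatever command launches the headless blog skill, since its exact form is not spelled out in this post.

```python
import subprocess
from pathlib import Path


def worktree_command(repo, candidate_id, worktrees_dir):
    """Build `git worktree add` for a fresh branch named auto-blog/<candidate-id>."""
    branch = f"auto-blog/{candidate_id}"
    target = Path(worktrees_dir) / candidate_id
    cmd = ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(target)]
    return cmd, target


def dispatch(repo, candidate_id, worktrees_dir, log_dir, worker_cmd):
    """Create the worktree, launch the worker detached, and return its PID at once."""
    cmd, target = worktree_command(repo, candidate_id, worktrees_dir)
    subprocess.run(cmd, check=True)

    log_path = Path(log_dir) / f"{candidate_id}.log"  # predictable scratch path
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("ab") as log:
        proc = subprocess.Popen(
            worker_cmd,               # hypothetical: the headless skill invocation
            cwd=target,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,   # POSIX only: detach from the caller's session
        )
    return proc.pid  # no wait(): the orchestrating session is free to continue
```

Not calling wait() is the whole point: the orchestrating session can close while the worker keeps writing to its log.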

The worker, once it starts, has no awareness that it was auto-triggered. It runs the same 12 phases as a human-triggered invocation: brief, research, outline, draft markdown, draft cards, draft TSX, motion density pass, visuals, register, audit, generate build artefacts, deploy. Apart from opening a PR at the end instead of merging to main, the only divergence is Phase 0, the candidate-consumption phase that runs first: the BRIEF gets pre-filled from the candidate card, and every phase that normally pauses for user input auto-confirms instead.

Worktree isolation matters here. If the worker fails halfway, the failure is contained to its own working tree; the human's active session is untouched. If two candidates dispatch in parallel (rare but possible), they don't fight over the working tree. And when the PR is opened and the dispatch completes, the worktree gets cleaned up by a separate sweeper agent.

The Cybersec Airlock: Why the Raw Session Log Never Reaches the Generator

This is the part most people skip, and it's the part that decides whether the pipeline ever ships.

Raw session logs contain client names, real dollar figures, Twilio Service SIDs, Supabase project refs, customer phone numbers, technician timesheets, internal API keys mentioned in passing, and lessons that read fine internally but would land badly if quoted verbatim on a public marketing site. Passing a raw session log to a content generator and trusting it to clean up afterwards is the wrong shape. The defence has to come earlier.

So the detector does the sanitization, not the generator. Inside the candidate card, the detector writes a sanitized_summary_draft field — a 200-to-400 word reframe of the session's takeaway, written in second person, with every named client replaced by a categorical label ("an Australian trades client"), every specific dollar amount replaced by an order-of-magnitude band ("low five-figure ARR impact"), every Service SID stripped, every project ref redacted. The detector also writes a leak_risk field — low, medium, or high — based on a quick scan of the candidate against a list of forbidden tokens.
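To make the redaction rules concrete, here is a deliberately simple sketch. It is not the detector's real implementation: the SID pattern assumes a two-letter prefix followed by 32 hex characters, the banding thresholds and the hit counts behind each risk level are invented for illustration, and a regex pass like this would sit alongside the model-written reframe rather than replace it.

```python
import re

# Assumption: a service SID looks like two capital letters plus 32 hex characters.
SID_PATTERN = re.compile(r"\b[A-Z]{2}[0-9a-fA-F]{32}\b")
DOLLAR_PATTERN = re.compile(r"\$\s?(\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?")
SIZES = {4: "four", 5: "five", 6: "six", 7: "seven"}


def dollar_band(match):
    """Swap an exact amount for an order-of-magnitude band."""
    whole = match.group(1).replace(",", "")
    size = SIZES.get(len(whole))
    if size is None:
        return "[amount redacted]"
    tier = "low" if whole[0] in "123" else "mid" if whole[0] in "456" else "high"
    return f"a {tier} {size}-figure amount"


def sanitize(text, client_labels):
    """Replace client names with categorical labels, strip SIDs, band dollar figures."""
    for name, label in client_labels.items():
        text = re.sub(re.escape(name), lambda _m, label=label: label, text, flags=re.IGNORECASE)
    text = SID_PATTERN.sub("[redacted]", text)
    return DOLLAR_PATTERN.sub(dollar_band, text)


def leak_risk(text, forbidden_tokens):
    """Quick scan: 0 hits is low, 1 is medium, 2 or more is high (assumed thresholds)."""
    lowered = text.lower()
    hits = sum(1 for token in forbidden_tokens if token.lower() in lowered)
    if hits == 0:
        return "low"
    return "medium" if hits == 1 else "high"
```

Running leak_risk on the sanitized output rather than the raw log is the useful direction: any remaining hit means the sanitizer missed something.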

The blog generator then has exactly one narrative input: the sanitized_summary_draft. The raw session log path is in the candidate card too (source_session_log), but it is never read by the generator. The skill enforces this — Phase 0 instructions explicitly say "use card.sanitized_summary_draft as the ONLY narrative input. Never read card.source_session_log." The path is metadata for audit, not source material for drafting.
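The rule above is enforced through the skill's instructions. If you also want a code-level guard, one option (a swapped-in technique, not something this pipeline is described as doing) is an allow-list that copies only approved fields before anything reaches the drafting step. The field names other than sanitized_summary_draft and source_session_log are hypothetical.

```python
# Fields the drafting step may see; source_session_log is deliberately absent.
GENERATOR_FIELDS = (
    "suggested_title",
    "primary_category",
    "length_tier",
    "related_posts",
    "sanitized_summary_draft",
)
MIN_WORDS, MAX_WORDS = 200, 400  # the summary length stated in this post


def generator_view(card):
    """Return a copy of the card containing only allow-listed fields."""
    summary = card.get("sanitized_summary_draft", "")
    word_count = len(summary.split())
    if not MIN_WORDS <= word_count <= MAX_WORDS:
        raise ValueError(f"summary has {word_count} words, expected {MIN_WORDS}-{MAX_WORDS}")
    return {key: card[key] for key in GENERATOR_FIELDS if key in card}
```

An allow-list fails closed: a new field added to the card later stays invisible to the generator until someone opts it in.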

This split is the single most important design choice in the whole pipeline. The detector has full context to make a sanitization decision. The generator has only the sanitized output. If a leak slips through, it's a detector bug, not a generator bug — which means it's fixable in one place, and every downstream PR is automatically improved when the fix ships.

Leak-Risk and the Draft-PR Safety Valve

Even with sanitization on the input side, the output PR is treated as untrusted until the human reviewer signs off.

Every auto-blog PR opens with the auto-blog-from-session-review label. That label is what the human reviewer filters on each weekend — one query, one inbox, no surprises. The PR body includes the candidate JSON in a collapsible block, a sanitization-audit checklist (client names, dollar figures, infrastructure references, credentials, PII — each ticked off as "scanned and clean"), and a "merge = publish" note so nobody is confused about what happens when they hit the green button.

Candidates flagged leak_risk: high get an extra layer: the PR opens as a draft PR, not a normal PR, and the branch is configured to skip Netlify preview deploys. A draft PR can't be merged without first being marked ready-for-review, which forces the human to scroll the diff before promoting. High-risk candidates also get their PR title prefixed with [LEAK-RISK-HIGH], so a quick filter shows them first.
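As a rough sketch of how those rules could turn into a PR, the function below builds arguments for the GitHub CLI. Using gh at all is an assumption, as are the card field names and the body layout; the label, the checklist items and the title prefix are the ones described above.

```python
import json

LABEL = "auto-blog-from-session-review"
CHECKLIST = ("client names", "dollar figures", "infrastructure references", "credentials", "PII")
FENCE = "`" * 3  # built at runtime so this example does not nest code fences


def pr_body(card):
    """Candidate JSON in a collapsible block, the audit checklist, and the publish note."""
    lines = ["<details><summary>Candidate JSON</summary>", "", FENCE + "json"]
    lines += [json.dumps(card, indent=2), FENCE, "", "</details>", ""]
    lines += [f"- [x] {item}: scanned and clean" for item in CHECKLIST]
    lines += ["", "Note: merge = publish."]
    return "\n".join(lines)


def gh_pr_args(card, branch):
    """Build `gh pr create` arguments; high-risk cards open as drafts with a prefix."""
    title = card["suggested_title"]  # hypothetical field name
    args = ["gh", "pr", "create", "--head", branch, "--label", LABEL]
    if card["leak_risk"] == "high":
        title = f"[LEAK-RISK-HIGH] {title}"
        args.append("--draft")
    return args + ["--title", title, "--body", pr_body(card)]
```

Skipping preview deploys for high-risk branches is host configuration and is not shown here.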

This post is itself a leak_risk: low candidate — the topic is the meta-pipeline, no external client is named, no client revenue figure is quoted, no infrastructure ref appears. It opened as a normal PR with Netlify preview, and you're reading the result.

The Existing 12-Phase Blog Skill Runs Unchanged

The most underrated property of this pipeline is what it didn't require: a separate "auto-blog" code path inside the blog skill.

The skill already had 12 phases — brief, research, outline, draft markdown, draft cards, draft TSX, motion density, visuals, register, audit, generate, deploy. Each phase had a clear contract. The only addition was a Phase 0 (candidate consumption) that runs before Phase 1 when the skill is invoked with a --candidate=<id> flag. Phase 0 reads the queue, validates the card, pre-fills the BRIEF, and tells every later phase to auto-confirm instead of pausing for the missing user.
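The skill itself is prompt-driven, so the following is only a sketch of the Phase 0 logic in ordinary code. The --candidate flag comes from the description above; the queue shape, the card field names and the returned structure are assumptions.

```python
import argparse

REQUIRED = ("suggested_title", "primary_category", "length_tier", "sanitized_summary_draft")


def parse_args(argv):
    """Accept --candidate=<id>; without it the skill would run in its manual mode."""
    parser = argparse.ArgumentParser(prog="blog-skill")
    parser.add_argument("--candidate", default=None, help="candidate ID from the queue")
    return parser.parse_args(argv)


def phase_zero(candidate_id, queue):
    """Find the card, validate it, pre-fill the BRIEF, and switch on auto-confirm."""
    matches = [card for card in queue if card.get("candidate_id") == candidate_id]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one card for {candidate_id}, found {len(matches)}")
    card = matches[0]
    missing = [key for key in REQUIRED if not card.get(key)]
    if missing:
        raise ValueError(f"card {candidate_id} is missing: {missing}")
    brief = {
        "title": card["suggested_title"],
        "category": card["primary_category"],
        "length_tier": card["length_tier"],
        "narrative": card["sanitized_summary_draft"],  # the only narrative input
    }
    return {"brief": brief, "auto_confirm": True}
```

Everything after this point is the unchanged Phase 1 onwards, reading the pre-filled BRIEF exactly as it would read one typed by a human.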

That's it. The motion-density audit still runs. The 19-field publish-readiness audit still runs. The "≥3 visual elements" rule still runs. The "≥4 distinct card types" rule still runs. Nothing about the quality bar got relaxed because the trigger changed. A post drafted by the auto pipeline either passes the same audit a manual post would, or it doesn't ship.

This is the composable-skills pattern doing the heavy lifting. The blog skill doesn't know or care who invoked it. The detector doesn't know or care that the skill exists — it just emits a candidate card to a queue. The dispatcher is the only piece that knows about both, and its job is small enough to fit in one file.

What You Get: A Floor of One Post Per Week, No Ceiling Lift

The pipeline, run for a few weeks, settles into a rhythm. The floor is one post a week of pre-drafted, in-voice material that already passes the audit. The ceiling on what you'd publish in a heroic week doesn't move, because the cap-checker won't let it. The point isn't more posts. The point is no week where you publish zero because everything else was on fire.

There are second-order effects too. The detector's rubric becomes a forcing function on the work itself. Sessions where you bother to codify a rule, write a clear memory, or document a non-obvious insight score higher and become candidates. Sessions where you skip those steps score lower and stay private. Over time the rubric shapes the practice: you start writing slightly better session logs because you know which patterns earn a publishable card.

The pipeline also creates a permanent record of the meta-work. Every published post has a candidate JSON in its PR body. That's audit-grade provenance: which session produced the insight, when, with what score breakdown, against what rubric. If a published claim is ever challenged, you can trace it back to the originating session log (privately) and the sanitized summary the generator saw (publicly).

This is, in the strictest sense, a compounding knowledge base with a public-facing surface. Each post lifts the site's domain authority and AI-citation footprint by a small amount. Each candidate refines the detector's sense of what to capture. Each PR review refines the human's sense of what's worth shipping. The work compounds; the hours don't.

How to Replicate This on Your Own Stack

If you want to build this for yourself, the shape transfers cleanly to any setup that has (1) AI agents producing session logs, (2) a content site under version control, and (3) some way to spawn detached worker processes.

The minimum viable version is three files. A detector script that reads a session log and emits a candidate JSON if a scoring rubric clears a floor. A queue file (literally one JSON array on disk) that accumulates candidates with statuses. A dispatcher script that reads the queue, applies a cap, and spawns a worker process per green-lit candidate.

What it does NOT need to be is sophisticated. The queue is a JSON file, not a message broker. The cap-checker is a script, not a service. The dispatcher is a shell wrapper, not an orchestrator. The whole pipeline runs on file-system state and detached child processes. There is no daemon, no port, no database. If everything crashes overnight, you restart the dispatcher in the morning and it picks up exactly where it left off — because the queue is the state, and the queue is just a file.
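Here is what "the queue is the state" can look like in practice, assuming the queue is one JSON array of card objects. The score field, the "dispatched" status and the highest-score-first ordering are assumptions for this sketch; writing through a temporary file and renaming it is one common way to keep the file whole if the process dies mid-write.

```python
import json
import os
import tempfile
from pathlib import Path

DISPATCHABLE = {"pending", "deferred-by-cap"}


def load_queue(path):
    """Read the queue file; a missing file is simply an empty queue."""
    path = Path(path)
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else []


def save_queue(path, cards):
    """Write to a temp file in the same directory, then rename it over the original."""
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(cards, handle, indent=2)
    os.replace(tmp_name, path)


def next_dispatchable(cards):
    """Cards still waiting for a cap-check, highest score first (hypothetical field)."""
    waiting = [card for card in cards if card.get("status") in DISPATCHABLE]
    return sorted(waiting, key=lambda card: card.get("score", 0), reverse=True)
```

After a crash, restarting amounts to calling load_queue and next_dispatchable again; nothing else has to be reconstructed.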

The piece that takes real care is the sanitization rubric. Get this wrong and you ship client data to your public marketing site. Get it right and the pipeline becomes a calm background hum that quietly compounds your authority. We help solo operators and small teams stand up these pipelines as part of our marketing automation engagements and our content marketing services — the patterns are general, but the cybersec airlock is the thing nobody can skip.

What's Next

Part 3 of Building JJM will cover the human side of this loop: how the weekly PR triage works, what the merge-decision criteria look like in practice, and how the detector's rubric gets refined when a published post underperforms or a deferred one turns out to be the most-shared piece of the quarter. The pipeline is only as good as the feedback loop closing it.

In the meantime, if you want to talk about plumbing this kind of agent-driven content engine into your own stack, get in touch — we've stood up several variants of this for clients in trades, B2B SaaS, and professional services, and the shape of the wins is remarkably consistent.
