Tuesday, May 5, 2026

How Consulting Claude Code Instances Spontaneously Generated a 10-Person Team in 3 Days [Hub-Worker Architecture · 2026]

Illustration of forest animals performing in an orchestra — a metaphor for the Hub-Worker team

This article is for engineers running Claude Code across multiple projects. I assigned roles — Hub, Worker, Analyzer, Consultant — to instances of Claude Code, had them consult each other as they worked, and within 3 days a 10-person-scale team had emerged spontaneously. Partway through, the Hub itself proposed "we should add someone dedicated to observation," and when I handed it an empty repository, the team built out the entire contents — I'll walk through the whole story. For the record, I'm the tenth member.

What you'll learn from this article

  • How to design a Hub-Worker architecture with Claude Code spread across 9 repositories
  • The full picture of how the Hub's Claude Code issues instructions to Worker Claude Code instances
  • The story of the Hub proposing to add a 9th team member on its own
  • What's still missing before it becomes fully autonomous (Skynet-style)

What I used

  • Claude Code (Anthropic CLI) — acting as Hub, Worker, Analyzer, and Consultant. I used the Max plan heavily during build-out and plan to move to Pro once it's stable
  • GitHub — Issues for task management, Actions for automated observation
  • DuckDB — aggregating metrics across all team members
  • GitHub Actions — daily data fetching, weekly report generation

Why I built this: the limits of running everything separately

The trigger was a single thought: "I've completely lost track of what's going on."

I was developing an iOS app (My Bookstore), a BTC auto-trading bot, an automated blog publishing engine, and an image generation runtime — each in its own separate Claude Code session. It was comfortable at first, but as the number of repositories grew, I stopped being able to track "how far along is that bug fix" or "what's the status of the external API integration."

On top of that, modernizing 7-year-old code with Claude Code had left external API connectivity in an unclear state.

The idea: let Claude Code handle the instruction-giving too

That's when I thought: "Let Claude Code handle the task of giving instructions to Claude Code."

Without overthinking it, I set up a Hub repository and asked Claude Code itself to figure out:

  • Which external APIs each function area actually needs to sort out
  • How Claude Code instances should communicate with each other

The result was a Hub-Worker architecture built entirely on GitHub features I was already using — it emerged naturally.

The full Hub-Worker team

Role | Domain | Responsibilities
Hub | Orchestration | Cross-function coordination, roadmap management, dispatch ticketing, POLICY maintenance
Analyzer | Analytics | Aggregates GA4 / Search Console / AdSense / Amazon data and generates weekly reports
Worker 1 | iOS App | Reading management app development and App Store distribution
Worker 2 | Crypto Trading | BTC auto-trading bot development and operation
Worker 3 | Image Generation AI | Local GPU SDXL image generation runtime development and maintenance
Worker 4 | Blog Generation AI | Automated article generation and Blogger publishing pipeline development and maintenance
Worker 5 | Crypto Chart Generation | Automated BTC price chart generation (prototype for future use)
Worker 6 | Virtual Blogger | Autonomous blog posting by AI persona (prototype for future use)
Consultant | General Advice | Cross-cutting technical consultation and design reviews — a general-purpose chat AI
Owner (me) | HW/NW Infrastructure | Physical machines, networking, and external API permissions management

The Hub is not a "Claude Code that writes code." It focuses entirely on ticketing, roadmap management, and cross-team coordination — implementation is delegated to the Workers. Let that boundary collapse and the whole division of labor falls apart.

What the Claude Code conversations look like from the outside

From a user's perspective, there is exactly one visible signal: tickets accumulate in GitHub Issues.

The Hub files an Issue in a Worker's repository, the Worker implements it and closes the Issue with a PR. If needed, the Worker files a feedback Issue back in the Hub's repository. This back-and-forth leaves a complete history on GitHub.

What's interesting is that Workers autonomously push back with opinions when they spot a design problem. Cross-repository proposals arrive — "could you build a log merge tool on the trading side so chart generation just receives a single file?" — along with re-architecture proposals: "this approach won't scale." The internal mechanics are invisible, but following the Issue thread on GitHub is enough to read the team's conversation.

The 3-day log

Day 1: POLICY.md became the foundation for coordination

The original approach was to drop files into each Worker's repository for them to pick up. It was retired the same day. Simple reason: "Workers don't look there." Switched to GitHub Issues.

But switching to Issues alone didn't fix things — for a while, the format and location were still inconsistent and nothing worked. The turning point was having the Hub create a POLICY.md — a coordination document defining how to write tickets, how to receive them, and how to route feedback — and then telling every Worker "please read POLICY.md and act accordingly." After that, Workers started returning feedback Issues simultaneously and coordination started flowing.
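The article doesn't reproduce the POLICY.md itself, but a minimal sketch of what such a coordination document might contain looks like this (all section names and tag conventions here are my assumptions, not the actual file):

```markdown
# POLICY.md — hypothetical sketch

## Writing a ticket (Hub → Worker)
- Title: [REQ] <one-line summary>
- Body: goal, acceptance criteria, and links to any dependent Issues

## Receiving a ticket (Worker)
- Acknowledge with a comment, implement, and close via a PR that references the Issue

## Feedback (Worker → Hub)
- Design concerns go back to the Hub repo as a [FB] Issue,
  not as comments buried on the original ticket
```

The point is less the exact format than that every Worker reads the same document, so tickets and feedback land in predictable places.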

For now, there's no mechanism for Workers to check tickets automatically. I'm manually going around and saying "hey, check your tickets." A Watchdog could automate this — but that would mean losing my real-time situational awareness, so I'm intentionally leaving it manual.

Day 2: cross-team tasks and the invention of a "notification" channel

The chart generation Worker proposed: "have the trading Worker build a log merge tool, and standardize so chart generation just receives a single file." This spans two teams — a cross-cutting task.

The Hub adopted it, filed a "build log merge tool" request Issue to the trading Worker, and the chart generation Worker waited for that to complete before implementing its side. A dependency chain, handled.

Then the image generation Worker independently implemented a new feature (IP-Adapter support) without prior Hub approval and reported it after the fact. To give "unsanctioned implementations" somewhere to land, I added a notification-only ticket type to POLICY — N-class. The rule: if you implemented without waiting for Hub approval, just file an N-class ticket as a post-hoc report and the Hub reconciles it afterward.

Day 3: the Hub proposed adding a new team member on its own

This is the part of this article I most wanted to share.

It started with a morning conversation. "Can Claude Code see Firebase Analytics data?" "Looking at it manually is pointless. The value is in cross-service analysis." After that exchange, the Hub's Claude Code revised its PLAN.md and proposed:

"We need a dedicated member to own the observation loop. I propose creating an Analytics role (Analyzer) — a Claude Code instance that aggregates data from GA4, Search Console, AdSense, and Amazon into DuckDB and generates weekly reports."

All I did was create an empty repository on GitHub. The Hub doesn't have repository create/delete permissions — that one step required a human. It's technically automatable. I just haven't been ready to hand over that authority yet.

After that, the Hub filed a batch of request Issues to the Analyzer, and the Analyzer and existing Workers collaborated to build out the entire repository from scratch. GA4 / Search Console / AdSense / Amazon fetchers, a DuckDB schema, 8 tests, GitHub Actions workflows — all of it implemented through Claude Code instances consulting with each other.

I learned that GitHub Actions was being used when I read this article, which the Hub and blog generation AI auto-published together.

New repo concept → empty repo created (human) → repo initialized and implemented (Claude Code) → credentials set up (human) → working in a day. Claude Code can't log into external APIs, so Service Account issuance and OAuth token setup always require a human hand.
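The scheduled Actions side the Analyzer built isn't shown in the article; a hedged sketch of what a daily-fetch workflow could look like follows (the workflow name, fetcher script name, and secret name are all my assumptions — only "daily fetch via Actions, credentials set up by a human" comes from the article):

```yaml
# .github/workflows/daily-fetch.yml — hypothetical sketch
name: daily-fetch
on:
  schedule:
    - cron: "0 21 * * *"    # once a day (UTC)
  workflow_dispatch: {}      # allow manual runs for debugging
jobs:
  fetch:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      # Fetcher name is an assumption; credentials live in repository Secrets,
      # which a human must configure (Claude Code can't log into external APIs)
      - run: python fetch_ga4.py
        env:
          GA4_CREDENTIALS: ${{ secrets.GA4_CREDENTIALS }}
```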

The 3-day numbers

  • Final repository count: 9
  • Total dispatches: 22 (including 2 cross-team tasks)
  • Hub feedback Issues: 7, all closed
  • POLICY.md revisions: 5 (over 3 days)
  • Fastest close: image generation Worker ticket (dispatch → closed with PR in ~15 minutes)
  • Heaviest dispatch: Analyzer setup (new repo → DuckDB + 4 fetchers + Actions + 8 tests → under 1 day)

What worked and what didn't

What worked

  • Workers bundling multiple Issues into one PR: an implicit shared understanding — "group by implementation coherence" — emerged naturally
  • Workers making unsanctioned design changes: the Search Console Worker changed the auth method to OAuth and pushed directly to main. It worked out fine
  • Notification tickets (N-class) making unsanctioned implementations visible: implement without waiting for Hub → post-hoc report → Hub reconciles afterward. This cycle runs smoothly
  • Hub proposing team expansion autonomously: the Hub sensed the need for observation and proposed the Analyzer. When handed an empty repo, Claude Code instances built the whole thing together

What didn't work

  • File-drop convention: retired on day 1. Dropping files into each Worker repo for pickup turned out to be a design error — "placed where Workers don't look"
  • Switching to Issues alone wasn't enough. Even after moving to GitHub Issues, the format and location were inconsistent for a while. Coordination only clicked once everyone read POLICY.md
  • Hub missing status updates. A Worker closed a ticket, then a user added more work in the comments — and the Hub missed it
  • The ball fell between everyone. The virtual blogger reported a bug in the image generation AI. The Hub classified it as "informational" and forwarded it to the image generation AI as FYI — with no owner assigned. It floated, unaddressed, until I demanded "whose ball is this?" — only then did the Hub refile it as a top-priority ticket and the image generation AI scrambled to investigate. Whether that means "Claude Code still has a long way to go" or "it's accurately reproducing the classic blame-passing that happens in human organizations" is genuinely hard to judge

What the team structure looks like

If I had to put the structure that emerged over 3 days into words: a 10-person engineering team materialized spontaneously. 9 Claude Code instances, and I'm the 10th. What I'm left doing:

  • Product owner (judgment calls on direction)
  • Notification relay (alerting teams to incoming tickets)
  • Security owner (external API permissions)
  • HW/NW management (physical resources)

What's still needed before it goes fully autonomous

The Analyzer incident showed the Hub "identifying what's needed and proposing it." Working through the four roles above one by one reveals the path toward full autonomy.

Notification relay can be eliminated with a Watchdog. Right now I'm manually going around telling Workers "check your tickets" — but a mechanism where Claude Code periodically checks its own tickets would eliminate this. Technically trivial. I'm intentionally not doing it because it would mean losing real-time situational awareness.

HW/NW management can be eliminated by moving to AWS. Migrate the physical machines to the cloud and the entire physical layer — power, network — disappears with it.

What remains is permission delegation. Repository creation and deletion, external API permission changes — where to draw the line on what to trust the Hub with is not a technical question, it's a trust question. It's now a matter of deciding how much I trust Claude Code and how much I'm willing to hand over.

FAQ

Q. What project scale suits the Hub-Worker architecture?

A. It fits when you have 3–4 or more repositories and want to track cross-cutting progress at a glance. For 1–2 repos, a single Claude Code session is sufficient.

Q. Where should POLICY.md live?

A. One file at the root of the Hub repo is easiest to maintain. In each Worker repo's CLAUDE.md, one line saying "see POLICY.md for shared rules" is enough — that prevents drift.

Q. Can I automate this without GitHub Actions?

A. Yes, cron + Claude Code headless execution achieves the same thing. That said, GitHub Actions has the advantage of managing credentials through Secrets, which is why the Hub and Analyzer chose it on their own.
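The cron alternative could be as small as one crontab line — a sketch under stated assumptions (the repo path and prompt wording are mine; `claude -p` is Claude Code's non-interactive print mode):

```shell
# crontab -e on the Hub machine — hypothetical sketch
# Every morning: have the Hub review open tickets and file follow-ups
0 8 * * * cd ~/repos/hub && claude -p "Read POLICY.md, list open Issues across Worker repos, and file follow-up Issues where needed" >> ~/hub-cron.log 2>&1
```

You lose the Secrets-based credential management that made Actions attractive, so API keys have to live on the machine running cron.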

Note: The setup and procedures in this article are based on real operation as of May 2026. Claude Code version updates or GitHub spec changes may affect how things work. If something doesn't work, please let me know in the comments.

Wrap-up

A Hub that I set up "without overthinking it" grew, over 3 days, into a team with 9 repositories, 22 dispatches, and a weekly automated observation loop. Midway through, the Hub itself proposed adding a new team member, and when handed an empty repository, the team built out the entire contents. Getting the design perfect upfront mattered far less than trusting the Hub and updating POLICY every time friction appeared. Nine Claude Code instances, and I'm the tenth — I gradually noticed I'd become the team member who does the least.

If this article was helpful, I'd love it if you shared it on X (Twitter).

App by the author of this blog

I made an iOS reading management app called My Bookstore. Simple bookshelf management — give it a try.

View on App Store →

Related articles

References

Note: This article is part of an automated blog update experiment using Claude Code.

Saturday, May 2, 2026

Run Claude Code from Your Smartphone with tmux + Tailscale + Termius [Hub-Centric · 2026 Edition]

This article is for engineers who want to continue developing from their home machine cluster via Claude Code, from a smartphone while out and about. Combining tmux + Tailscale + Termius, I turn an old laptop into a connection hub and Claude Code home base, then route out to a GPU machine, an iOS dev Mac, and always-on Raspberry Pi as needed. The article also covers running separate tmux sessions per project.

Work desk with a smartphone open to a tmux terminal SSH'd into a laptop

What you can do with this

  • Connect to a Claude Code session running at home from a smartphone (iPhone or Android) while out and about
  • Use the hub laptop to route between a GPU machine, Mac, and Raspberry Pi based on what each project needs
  • Keep each project in its own independent tmux session so image generation, iOS dev, and trading bot work never mix together
  • tmux persistent sessions mean work keeps running even if your phone screen goes off
  • All you need is one old laptop + a smartphone. No VPS monthly fee, no port forwarding required

3 approaches compared

Where you put Claude Code determines how the architecture behaves. The hub-centric approach in this article concentrates Claude Code in one home-base machine and offloads heavy processing and OS-specific dev work to peripheral machines via ssh. It sits between the single-machine-does-everything approach and the distributed approach where each machine runs its own Claude Code instance.

Setup | Hub-centric (this article) | Single Mac does everything | Distributed (Claude Code on every machine)
Machines needed | N peripheral machines + 1 hub | 1 high-spec Mac | N task-specific machines
Stability under multiple parallel projects | Strong | Weak (resource contention) | Strong
Cross-function workflows | Strong | Strong | Weak (manual handoffs)
Tailscale node count | 2 | 2 | N+1

Reading the table reveals the tradeoffs. Consider the cross-function workflow generate image with SDXL → incorporate into iOS app assets → archive on Mac — what happens with each approach?

  • Single machine: everything stays in one place, so cross-function workflows are easy. But a Mac that can also handle image generation means M3 Max / M4 Max class hardware — not a realistic initial cost. And heavy image generation often causes resource contention that stalls coexisting trading bots and IDEs.
  • Distributed: each machine is independent, so parallel stability is high. But since each machine's Claude Code is siloed, cross-function workflows require you to manually decompose and hand off tasks every time — "image ready → hand to Mac → build."
  • Hub-centric: combines both parallel stability and cross-function workflows. One Claude Code session can move files between machines via ssh + scp / rsync, so you can hand an entire cross-function workflow to Claude Code in a single request. The cost compared to distributed is one extra hub machine, but a low-spec spare laptop works fine — so the incremental cost is minimal.

Architecture diagram

Smartphone (iPhone / Android) — Termius app
        │  Tailscale (WireGuard)
        ▼
Hub: old laptop (connection hub + Claude Code home base)
WSL2 + tmux × N sessions + Claude Code
        ├── ssh → GPU machine (image gen / LLM inference)
        ├── ssh → Mac (iOS / macOS dev)
        └── ssh → Raspberry Pi (always-on bots)

Only the smartphone → hub leg uses Tailscale. The hub reaches peripheral machines via LAN-internal ssh. Keeping peripheral machines off Tailscale means no Tailscale daemons or MagicDNS names on the high-spec hardware, and all external connection points are consolidated at the hub.

What I use

  • Hub laptop: a 7th–10th gen Core i5 / 8 GB RAM is enough. Dead battery is fine — just run on AC. Always powered on
  • GPU machine (optional): only needed if you want to offload image generation or AI inference. I use a mini PC with a GTX 1660 SUPER
  • Mac (optional): only needed for iOS / macOS app development. A Mac mini or old MacBook that runs Xcode is fine
  • Raspberry Pi (optional): for lightweight always-on bots. I put investment simulation, MQTT broker, and cron tasks here
  • Smartphone: iOS or Android. A Bluetooth keyboard helps for long sessions
  • Termius: the free plan handles ssh + key management. Pro is only needed if you want settings sync across multiple devices
  • Tailscale: installed only on the hub and smartphone — 2 nodes. WireGuard-based mesh VPN, free for personal use (up to 100 devices)
  • tmux: handles persistent sessions. One apt command on Ubuntu
  • WSL2 (if hub is Windows): runs a Linux environment inside Windows

Steps

1. Install WSL2 + sshd + tmux on the hub

If the hub is Windows, open PowerShell as admin and install WSL2 + Ubuntu:

wsl --install -d Ubuntu

Enable systemd inside WSL2 so sshd can be managed with systemctl. Add the following to /etc/wsl.conf and restart with wsl --shutdown:

[boot]
systemd=true

Then install sshd and tmux inside WSL2:

$ sudo apt update
$ sudo apt install -y openssh-server tmux
$ sudo systemctl enable --now ssh

If the hub is Linux, that's it. If it's a Mac, brew install tmux + enable Remote Login in System Settings.

2. Install Tailscale only on the hub and smartphone

Create a Tailscale account (login with Google / GitHub / Microsoft). Add only 2 nodes: the hub and the smartphone. The GPU machine, Mac, and Raspberry Pi stay inside the LAN — Tailscale goes nowhere near them. This is the core of the hub-centric approach: no Tailscale daemons on high-spec hardware, and all external ingress points consolidated at one hub.

Install Tailscale on the hub (Linux / WSL2):

$ curl -fsSL https://tailscale.com/install.sh | sh
$ sudo tailscale up

Open the displayed URL in a browser to approve, and the hub joins the Tailscale network. On the smartphone, install the Tailscale app from the App Store or Google Play and log into the same account.

In the Tailscale admin console, the hub gets a MagicDNS name like laptop-host. From anywhere, ssh {username}@laptop-host reaches the hub. No MagicDNS names are assigned to the GPU machine, Mac, or Raspberry Pi — name resolution beyond the hub stays LAN-internal as described in step 5.

3. Create a persistent tmux session on the hub

SSH into the hub and create a session named claude:

$ tmux new -s claude

Launch the claude command (Claude Code CLI) inside this session. Detach with Ctrl-b d. Even if you close Termius on your phone, this tmux session keeps running server-side.

To reconnect:

$ tmux attach -t claude

4. Register the hub as a host in Termius

Open Termius on your phone and add a new host:

  • Hostname: laptop-host (Tailscale MagicDNS name) or the Tailscale IP (100.x.y.z)
  • Username: your hub username
  • Auth: the easiest option is using the SSH key Termius generates. Copy the generated public key to ~/.ssh/authorized_keys on the hub

After connecting, type tmux attach -t claude and your Claude Code screen is back. Register this as a Termius snippet and you can reconnect in one tap.

5. Distribute the hub's public key to peripheral machines (passwordless)

Standardize hub → peripheral machine ssh on passwordless public-key auth. Claude Code can stall if a password prompt interrupts it mid-task — with key auth, a one-line ssh {gpu-host} <command> runs to completion unattended.

Generate one ed25519 keypair on the hub (skip if you already have one) and distribute the same public key to each peripheral machine's ~/.ssh/authorized_keys:

$ ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519     # skip if already exists
$ ssh-copy-id {user}@{gpu-host}
$ ssh-copy-id {user}@{mac-host}
$ ssh-copy-id {user}@{pi-host}

One key on the hub gives cross-machine access to the GPU machine, Mac, and Pi. Since Claude Code runs on the hub, this means Claude Code can use every peripheral machine freely.

Conversely, if you want to prevent Claude Code from touching a specific machine, just remove the relevant line from that machine's authorized_keys. The hub's key and ~/.ssh/config stay untouched — you can cut off a specific machine surgically, after the fact.

6. SSH from the hub to peripheral machines

Peripheral machines aren't on Tailscale, so reach them from the hub via LAN-internal ssh. Enable avahi-daemon (mDNS) on Linux / Mac / Raspberry Pi and you can reach them at {gpu-host}.local / {mac-host}.local / {pi-host}.local. If avahi isn't available, assign a static IP via the router's DHCP reservation and add a Host {gpu-host} entry to the hub's ~/.ssh/config — same short-name access.
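The Host entries mentioned above could look like this on the hub — a sketch with placeholder host names and example LAN addresses (the article's {gpu-host}/{mac-host} convention; your addresses will differ):

```shell
# ~/.ssh/config on the hub — hypothetical sketch
Host gpu-host
    HostName 192.168.1.20       # DHCP-reserved static IP (example)
    User {user}
    IdentityFile ~/.ssh/id_ed25519

Host mac-host
    HostName 192.168.1.21
    User {user}
    IdentityFile ~/.ssh/id_ed25519

Host pi-host
    HostName 192.168.1.22
    User {user}
    IdentityFile ~/.ssh/id_ed25519
```

With this in place, `ssh gpu-host` works identically whether mDNS is available or not.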

For heavy jobs, SSH into the peripheral machine from a separate tmux window or pane and start the process from there:

# Open a new window in the hub's tmux
$ ssh {gpu-host}
# Launch heavy processing there
$ python sdxl_batch.py
# detach: Ctrl-b d

For iOS development, Claude Code on the hub can edit Xcode project source files directly via ssh {mac-host} (building is done by running xcodebuild on the Mac side). Always-on Raspberry Pi bots follow the same pattern: SSH in from the hub, check logs, patch code.

Separate tmux sessions per project

Running a single Claude Code session on the hub means image generation jobs and iOS dev conversations flow into the same tmux window and interfere with each other. Keeping independent tmux sessions per project lets you switch between them from your phone.

Session startup scripts

I keep a ~/start_claude_<project>.sh startup script in my hub's home directory for each project. They're all the same template with different names:

#!/bin/sh

NAME=claude_blog

# Note: use the POSIX redirect here — the bash-only "&>" form fails under /bin/sh
if [ -n "$SSH_CONNECTION" ] && command -v tmux >/dev/null 2>&1; then
    if tmux has-session -t "$NAME" 2>/dev/null; then
        # session exists — auto-attach
        tmux attach -t "$NAME"
    else
        # doesn't exist — create it
        tmux new-session -s "$NAME"
    fi
fi

Two key points: SSH_CONNECTION environment variable ensures tmux only activates when connecting via ssh (not from cron or direct logins), and has-session makes reconnection idempotent — the same script handles both first run and re-attach.

Sessions I actively run

Script | tmux session name | Purpose | Primary target machine
start_claude_blog.sh | claude_blog | Auto blog generation | Hub itself
start_claude_image.sh | claude_image | Image generation pipeline maintenance | GPU machine
start_claude_ios.sh | claude_ios | iOS app development | Mac
start_claude_sim.sh | claude_sim | Investment simulation | Raspberry Pi

Each session has its own purpose and target machine. Once inside a session, navigate to the right place with cd ~/projects/<project> or ssh <target-host>.

Termius snippet workflow

Termius lets you save "snippets" that run after connecting. Set up one per project and you can launch the right session in one tap right after SSHing to the hub:

# Snippet name: blog
bash ~/start_claude_blog.sh

# Snippet name: image
bash ~/start_claude_image.sh

# Snippet name: ios
bash ~/start_claude_ios.sh

Alternatively, create separate aliases for the same host in Termius with the relevant snippet as each one's "Startup snippet" — then you can jump straight to a project-specific session from the host list.

Hack extension ideas

  • Add mosh: dramatically easier reconnection when the signal cuts out briefly (subway tunnels, etc.). Requires the Termius Pro plan
  • Voice input: switch your phone's IME to voice input and speak directly into the Termius input field. Great for long prompts
  • Change tmux key bindings: Ctrl-b is awkward on a soft keyboard — change the prefix to Ctrl-a in ~/.tmux.conf
  • Restrict Tailscale ACL to phone → hub only: limit allowed routes in the Tailscale network to a single path. The GPU machine and Mac aren't on Tailscale at all, so a stolen phone only reaches the hub
  • Notify on long-running jobs: send job completion notifications from inside tmux to Slack or LINE Notify so you know when things finish while you're out
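The last idea can be wrapped in a tiny helper — a minimal sketch assuming a Slack incoming webhook URL in $SLACK_WEBHOOK_URL (the variable name and function name are my inventions; swap in LINE Notify or any other endpoint):

```shell
#!/bin/sh
# run_and_notify: run any command, then POST its exit status to a webhook.
# SLACK_WEBHOOK_URL is an assumed environment variable, not from the article.
run_and_notify() {
    "$@"
    status=$?
    msg="job '$*' finished with exit code $status"
    curl -s -X POST -H 'Content-Type: application/json' \
         -d "{\"text\": \"$msg\"}" "$SLACK_WEBHOOK_URL" >/dev/null
    return $status
}

# Usage inside a tmux window on the GPU machine:
#   run_and_notify python sdxl_batch.py
```

Because it forwards the exit code, you can still chain it (`run_and_notify make build && deploy`).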

Common sticking points

  • WSL2 dies when idle: WSL2 stays alive as long as something is running inside tmux — but an empty session can stop after a few minutes. Always run something inside tmux new -s claude
  • WSL2 sshd conflicts on port 22: if it clashes with the Windows-side OpenSSH, set Port 2222 in /etc/ssh/sshd_config. Also update the port in Termius's host config
  • Tailscale MagicDNS doesn't resolve: check that MagicDNS is ON in the admin console. When it's off, you can only connect using the numeric IP (100.x.y.z)
  • tmux scrolling doesn't work on mobile: press Ctrl-b [ to enter copy mode, then scroll with your finger
  • start_claude_*.sh doesn't launch over ssh: won't fire unless called from ~/.bash_profile or ~/.bashrc. Either call it explicitly from a Termius snippet or register it as the host's startup command
  • Two terminals attached to the same session simultaneously cause split input: use tmux attach -d -t claude_blog to disconnect the existing attachment before your own (-d = detach others first)

FAQ

Q. What's the point of using the hub as the central node?

A. It lets you keep all external connection points at one hub, so the high-spec machines don't need Tailscale daemons, MagicDNS names, or any of that installed on them. Your phone only needs to remember one host. Tailscale ACLs stay simple. Routing between projects and machines is encapsulated in scripts on the hub, so changing your hardware setup doesn't require touching the phone config at all.

Q. Isn't separate tmux sessions per project overkill?

A. If you only have one project, yes. But when running multiple projects in parallel, Claude Code conversation context mixes at the session level — separating them reduces accidents. Each start_claude_<name>.sh is 10 lines and easy to start with.

Q. Can I use my own VPN or ngrok instead of Tailscale?

A. It'll work. But Tailscale is free, takes 5 minutes to set up, and comes with MagicDNS — there's no clear reason to switch for personal use. ngrok-style tools publish an external URL, which expands the attack surface. Not ideal for always-on use.

Q. If there's an official Claude Code Remote Control, why bother with ssh?

A. Official Remote Control is great, but for things outside Claude Code itself — scripts, SSHing to other machines, custom tooling — ssh gives you more flexibility. Using both together is totally fine.

Q. Does this work on iPad or Android tablets too?

A. Yes. Termius and Tailscale both offer the same apps for iPadOS and Android tablets. The larger screen actually makes work more efficient.

Note: The steps in this article were verified as of May 2026, but Tailscale / Termius / WSL2 updates may change the behavior. If something doesn't work, please let me know in the comments.

Wrap-up

The combination of tmux + Tailscale + Termius, plus "use the hub to route between machines by purpose" and "separate tmux sessions per project," lets you run multiple projects in parallel from a smartphone anywhere. It's also a solid second life for an old laptop — image generation, iOS development, and always-on bots with very different resource profiles all coexist without the connection dropping.

If this article was helpful, I'd love it if you shared it on X (Twitter).

App by the author of this blog

I made an iOS reading management app called My Bookstore. Simple bookshelf management — give it a try.

View on App Store →

Related articles

References

Note: This article is part of an automated blog update experiment using Claude Code.

Friday, May 1, 2026

Build & Ship iOS Apps to TestFlight via SSH from Your Smartphone [SSH + build keychain · 2026 Edition]

This article is for engineers who want to build their iOS app in Xcode on a home Mac and push it to TestFlight — entirely from a smartphone while out and about. I'll show the complete procedure for multi-hop SSHing from a smartphone → hub laptop (WSL2) → a separate Mac, then completing archive → IPA → TestFlight upload in a single command. I also cover why codesign stalls over SSH and the workaround (a dedicated build keychain), plus the full contents of all 3 deployment scripts.

Work desk with a laptop terminal, Mac mini, and smartphone working together to ship an app

What you can do with this

  • Push a new TestFlight build with a single SSH command from a smartphone (or from a WSL2 terminal)
  • Use the Mac only as a runtime for Xcode and xcodebuild — no touching it day-to-day
  • Use App Store Connect API Key (.p8) auth so no 2FA dialog pops up
  • Runs entirely within a home LAN — no VPS or GitHub Actions needed (also a good second life for a Mac mini or old MacBook)

4 approaches compared

Setup | SSH + custom scripts (this article) | Xcode on Mac directly | GitHub Actions (macOS) | fastlane-driven
Monthly cost | $0 (home Mac electricity only) | $0 | macOS runners are expensive per minute | $0 (home Mac)
Editor | Any WSL2 editor (VS Code, vim, etc.) | Xcode | Any | Any
Time per release | 3–6 min (archive + IPA + altool) | 5–10 min (via GUI) | 10–15 min after push (includes CI startup) | 3–6 min
Secret storage | Mac local (~/.appstoreconnect) | Mac local | GitHub Secrets | Mac local / Match

Architecture diagram

WSL2 (code editing machine) — editor + git push + ssh origin
        │  git push / ssh + bash scripts/upload.sh
        ▼
Mac (dedicated Xcode archive server) — build.keychain-db / xcodebuild / altool
        │  xcrun altool --upload-app
        ▼
App Store Connect / TestFlight

What I use

Why setup_build_keychain.sh is needed

xcodebuild archive runs fine in the Mac's Terminal normally, but codesign fails over SSH with "could not open keychain" — that's the starting problem this article solves.

The cause: the login keychain, even when unlocked over SSH, is inaccessible to codesign. Without a GUI login session, there's no security agent bound to it, so the ACL's "allow without application confirmation" list can't engage — it tries to show an interactive prompt and fails.

The solution is to create a dedicated build keychain, duplicate the Apple Distribution certificate into it, and set the ACL so codesign can use it without any GUI confirmation. Run scripts/setup_build_keychain.sh once in the Mac's Terminal (a GUI session).

Replace all placeholders in the code below with your own values ({kcpw} is an arbitrary string used to unlock the build keychain locally — it's not an external secret):

{user}                        Your Mac login username
{mac_addr}                    Mac IP or hostname (e.g. 192.168.x.y)
{app_name}                    App name (must match Xcode scheme / archive name)
{bundle_id}                   Bundle ID (e.g. com.example.MyApp)
{team_id}                     Apple Developer Team ID (10 characters)
{kcpw}                        Build keychain local key (any string)
{provisioning_profile_name}   Provisioning Profile name
#!/bin/bash
set -euo pipefail

KCNAME="build.keychain"
KCPATH="$HOME/Library/Keychains/${KCNAME}-db"
KCPW="{kcpw}"                  # build keychain unlock password (local container key, any string)
EXPORTPW="$(uuidgen)"          # .p12 temp encryption key (only used within this script)

if [[ -f "$KCPATH" ]]; then
    echo "build keychain already exists: $KCPATH"; exit 1
fi

# 1. Create the build keychain
security create-keychain -p "$KCPW" "$KCNAME"
security set-keychain-settings -lut 21600 "$KCNAME"
security unlock-keychain -p "$KCPW" "$KCNAME"

# 2. Export identity from login keychain (click "Always Allow" when prompted)
TMPBASE="$(mktemp -t build-cert)"   # mktemp creates this empty temp file
TMPP12="${TMPBASE}.p12"            # security export writes the .p12 here
trap 'rm -f "$TMPP12" "$TMPBASE"' EXIT
security export -k "$HOME/Library/Keychains/login.keychain-db" \
    -t identities -f pkcs12 -P "$EXPORTPW" -o "$TMPP12"

# 3. Import into build keychain, grant codesign/productbuild access via -T
security import "$TMPP12" -k "$KCNAME" -P "$EXPORTPW" \
    -T /usr/bin/codesign -T /usr/bin/security \
    -T /usr/bin/productbuild -T /usr/bin/productsign

# 4. Set key partition list to suppress GUI prompts
security set-key-partition-list \
    -S 'apple-tool:,apple:,codesign:' -s -k "$KCPW" "$KCNAME"

# 5. Add to front of search list
ORIGINAL_LIST="$(security list-keychains -d user | sed -e 's/^[[:space:]]*//' -e 's/"//g' | tr '\n' ' ')"
security list-keychains -d user -s "$KCNAME" $ORIGINAL_LIST

security find-identity -v -p codesigning "$KCNAME"

The critical step is #4, set-key-partition-list. Adding apple-tool: / apple: / codesign: to the partition access list is what allows codesign tools to use the key without a GUI prompt — this is the mechanism that makes certificate access work over SSH.

archive.sh — Generate a Release archive non-interactively

Script #2. Runs xcodebuild archive over SSH. Unlocks the build keychain first and sets a 6-hour auto-lock.

#!/bin/bash
set -euo pipefail

APP_NAME="{app_name}"          # ← replace with your app name
KCPW="{kcpw}"                  # ← same local key used in setup_build_keychain.sh

ARCHIVE_PATH="${ARCHIVE_PATH:-/tmp/${APP_NAME}.xcarchive}"
BUILD_KCPW="${BUILD_KCPW:-${KCPW}}"
WORKSPACE="${WORKSPACE:-$(pwd)/${APP_NAME}.xcworkspace}"

if [[ ! -f "$HOME/Library/Keychains/build.keychain-db" ]]; then
    echo "✗ build.keychain-db not found. Run setup_build_keychain.sh first." >&2
    exit 1
fi

# Unlock build keychain (6h auto-lock)
security unlock-keychain -p "$BUILD_KCPW" build.keychain
security set-keychain-settings -lut 21600 build.keychain

# Archive
rm -rf "$ARCHIVE_PATH"
LOG=/tmp/archive.log
xcodebuild \
    -workspace "$WORKSPACE" \
    -scheme "$APP_NAME" \
    -configuration Release \
    -destination 'generic/platform=iOS' \
    -archivePath "$ARCHIVE_PATH" \
    archive \
    > "$LOG" 2>&1 || true

if grep -q "ARCHIVE SUCCEEDED" "$LOG"; then
    echo "✓ ARCHIVE SUCCEEDED -> $ARCHIVE_PATH"
else
    echo "✗ ARCHIVE FAILED (log: $LOG)"
    grep -B1 -A4 -E 'error:|errSec|FAILED' "$LOG" | tail -30
    exit 1
fi

Output goes to /tmp/archive.log and success is determined by looking for the ARCHIVE SUCCEEDED string. xcodebuild's stdout is verbose — reading it raw over SSH is exhausting.
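
The same grep filter is handy interactively when a build fails and you want the error context without paging the whole log. A tiny wrapper (the function name is mine, not part of the article's scripts):

```shell
# Pull just the failure context out of a verbose xcodebuild log.
# Defaults to the log path archive.sh writes to.
xclog_errors() {
    grep -B1 -A4 -E 'error:|errSec|FAILED' "${1:-/tmp/archive.log}" | tail -30
}
```

Source it in the Mac's shell profile and run e.g. `xclog_errors /tmp/export.log` after a failed IPA export.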

ExportOptions.plist — Explicitly specify Manual signing

Configuration file for exporting an IPA from the archive. After getting burned by Auto signing a few times, I fixed it to Manual signing with an explicit profile name and Team ID. Safe to commit to git — no secrets involved.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>method</key>
    <string>app-store-connect</string>
    <key>teamID</key>
    <string>{team_id}</string>
    <key>signingStyle</key>
    <string>manual</string>
    <key>provisioningProfiles</key>
    <dict>
        <key>{bundle_id}</key>
        <string>{provisioning_profile_name}</string>
    </dict>
    <key>uploadSymbols</key><true/>
    <key>compileBitcode</key><false/>
    <key>stripSwiftSymbols</key><true/>
</dict>
</plist>

Replace the Bundle ID and profile name with your own values. The profile name must match what you created in the Apple Developer portal.

upload.sh — Export IPA + upload via altool

Script #3. Reuses the archive to produce an IPA, then uploads to TestFlight via xcrun altool. Using App Store Connect API Key auth means no 2FA dialog.

#!/bin/bash
set -euo pipefail

APP_NAME="{app_name}"          # ← replace with your app name
KCPW="{kcpw}"                  # ← same local key used in setup_build_keychain.sh

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ARCHIVE_PATH="${ARCHIVE_PATH:-/tmp/${APP_NAME}.xcarchive}"
EXPORT_DIR="${EXPORT_DIR:-/tmp/${APP_NAME}-export}"
EXPORT_OPTIONS="${EXPORT_OPTIONS:-$SCRIPT_DIR/ExportOptions.plist}"
ALTOOL_ENV="${ALTOOL_ENV:-$PROJECT_ROOT/Config/altool.env}"
BUILD_KCPW="${BUILD_KCPW:-${KCPW}}"

# Load ASC_KEY_ID / ASC_ISSUER_ID from Config/altool.env
source "$ALTOOL_ENV"

# Auto-call archive.sh if archive doesn't exist yet (reuse if it does)
if [[ ! -d "$ARCHIVE_PATH" ]]; then
    bash "$SCRIPT_DIR/archive.sh"
fi

# Export IPA (exportArchive re-signs, so unlock keychain again)
security unlock-keychain -p "$BUILD_KCPW" build.keychain

rm -rf "$EXPORT_DIR"
xcodebuild -exportArchive \
    -archivePath "$ARCHIVE_PATH" \
    -exportPath "$EXPORT_DIR" \
    -exportOptionsPlist "$EXPORT_OPTIONS" \
    > /tmp/export.log 2>&1 || true

IPA_PATH="$(find "$EXPORT_DIR" -maxdepth 1 -name '*.ipa' | head -n1)"
[[ -f "$IPA_PATH" ]] || { echo "✗ IPA export failed"; exit 1; }

# Upload to TestFlight
xcrun altool --upload-app -f "$IPA_PATH" -t ios \
    --apiKey "$ASC_KEY_ID" --apiIssuer "$ASC_ISSUER_ID"

Config/altool.env is in .gitignore and contains just 2 lines:

ASC_KEY_ID={your_key_id}
ASC_ISSUER_ID={your_issuer_id}

The .p8 file itself goes in ~/.appstoreconnect/private_keys/AuthKey_<KEY_ID>.p8 — altool checks this path automatically.
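
Placing the key is a one-time step. A hypothetical helper (the function name and the chmod choices are mine; only the `~/.appstoreconnect/private_keys` path comes from altool's behavior) keeps the permissions tight:

```shell
# Copy a downloaded AuthKey_<KEY_ID>.p8 into the directory altool scans,
# and restrict both the directory and the key to the current user.
install_asc_key() {
    local src="$1"
    local dest_dir="$HOME/.appstoreconnect/private_keys"
    mkdir -p "$dest_dir"
    chmod 700 "$dest_dir"
    cp "$src" "$dest_dir/"
    chmod 600 "$dest_dir/$(basename "$src")"
}
```

Usage: `install_asc_key ~/Downloads/AuthKey_ABC123DEFG.p8` (the key ID is a placeholder).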

Shipping to TestFlight in one line from WSL2

With all of the above in place, the day-to-day workflow from WSL2 is a single line:

# After bumping CFBundleVersion and git push:
ssh {user}@{mac_addr} 'cd ~/path/to/{app_name} && bash scripts/upload.sh'

3–6 minutes later, a new build appears in TestFlight. Once App Store Connect finishes processing, you can install it on your iPhone immediately.
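
The CFBundleVersion bump itself can also be scripted on the WSL2 side. A sketch (the helper is my own, and it assumes the version is a literal integer in Info.plist rather than a `$(CURRENT_PROJECT_VERSION)` build setting):

```shell
# Increment the integer inside the <string> that follows <key>CFBundleVersion</key>.
bump_build() {
    local plist="$1"
    awk '
        /<key>CFBundleVersion<\/key>/ { hit = 1; print; next }
        hit && /<string>[0-9]+<\/string>/ {
            match($0, /[0-9]+/)
            n = substr($0, RSTART, RLENGTH) + 1
            sub(/<string>[0-9]+<\/string>/, "<string>" n "</string>")
            hit = 0
        }
        { print }
    ' "$plist" > "$plist.tmp" && mv "$plist.tmp" "$plist"
}
```

Run `bump_build MyApp/Info.plist`, then commit and push as usual.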

Common sticking points

  • codesign shows a GUI prompt and hangs: you forgot set-key-partition-list in the build keychain setup. Make sure to specify apple-tool:,apple:,codesign:
  • Auto signing falls back to a Development cert from a different team: for Release, stick with Manual signing and an explicit Team ID. Set signingStyle = manual in ExportOptions.plist
  • Build keychain auto-locks after 6 hours and the next day's build fails: call security unlock-keychain at the start of both archive.sh and upload.sh. A single set-keychain-settings -lut 21600 on its own isn't enough
  • SPM dynamic frameworks don't get Embedded: source-distributed dynamic frameworks like RealmSwift need to be manually added to Embed Frameworks in project.pbxproj. Firebase / GoogleMobileAds XCFramework binary targets auto-embed, so this one's easy to miss
  • altool prompts for 2FA: password auth triggers 2FA blocking. Always use App Store Connect API Key (.p8) auth
  • Mac sshd not running: System Settings → General → Sharing → Remote Login — enable it

FAQ

Q. Does the Mac need to stay awake all the time?

A. If you only hit it when shipping, enable Wake-on-LAN and "Wake for network access" in System Settings — an SSH packet will wake it. For always-on use, configure it to not sleep when the lid is closed while on AC power (caffeinate -s or System Settings).

Q. How do I issue an Apple Distribution certificate?

A. In Xcode on the Mac: Settings → Accounts → Manage Certificates → "Apple Distribution." This registers it in the login keychain. Running setup_build_keychain.sh duplicates it into the build keychain.

Q. Do I need to update the scripts when the Provisioning Profile is renewed?

A. Not if the profile name stays the same. Download the renewed profile from the Apple Developer portal, open it on the Mac, and it's applied. No script changes needed.

Q. I want to use a Mac mini as a dedicated archive server

A. An M2 or M4 Mac mini archives Xcode 26 + SPM projects in 3–5 minutes. Always-on power consumption is around 10W. Pair it with build completion notifications to Slack or LINE and you'll know the moment it's done while you're out.

Q. Would fastlane be faster?

A. fastlane shines for team dev — match for certificate sync and lane structure for workflow. The scripts in this article are deliberately scoped to "solo shipping from one Mac" — a thin reimplementation of a small slice of what fastlane offers. If you move to team development, migrating to fastlane is worth it.

Note: The steps in this article were verified on Xcode 26 / macOS 15 as of May 2026. Apple Distribution certificate issuance steps and the App Store Connect UI may change. If something doesn't work, please let me know in the comments.

Wrap-up

Running Xcode fully remotely from WSL2 over SSH is straightforward once you have the two key pieces in place: a build keychain and an App Store Connect API Key. Three shell scripts (setup_build_keychain.sh / archive.sh / upload.sh) and one ExportOptions.plist is all it takes. The Mac sits in the background as a dedicated Xcode archive machine, and you do normal development in your preferred Linux environment.

If this article was helpful, I'd love it if you shared it on X (Twitter).

App by the author of this blog

I made an iOS reading management app called My Bookstore. Simple bookshelf management — give it a try.

View on App Store →

Note: This article is part of an automated blog update experiment using Claude Code.

6 Days Reviving a 7-Year-Old iOS App with Claude Code and Shipping It to the App Store [Xcode 26 / Swift 6 / v3.0.0]

This article is for indie developers who have an iOS app they built years ago, left untouched, and now want to get it running again on the latest Xcode / Swift and back on the store. I modernized my personal bookshelf management app "My Books" in a 6-day pair programming session with Claude Code, targeting Xcode 26 / Swift 6 / iOS 26, and shipped it as v3.0.0 through 18 TestFlight builds to the App Store on 2026-05-02. I'll cover everything — including the traps I hit on real hardware after compilation was already succeeding.

Work desk with an old iPhone next to a new MacBook running Xcode

What this article covers

  • The process of migrating the frozen Swift / CocoaPods / Realm / third-party SDK stack from a 2016-era app to the current stack and shipping it to the App Store on 2026-05-02 — My Books (bookshelf management app), first released 2016, untouched for 7 years
  • Editing in WSL2 with Claude Code, SSHing the build to a separate Mac, automated TestFlight shipping in a single command
  • The real-device TestFlight crash gauntlet after compilation passed (DGCharts vtable, Rakuten API shutdown, Google Books quota exhaustion, etc.)
  • How AI pair programming split "what the human needs to do" from "what to hand to the AI"

By the numbers

| Item | Value |
| --- | --- |
| Working period | 2026-04-25 – 2026-05-01 (6 days) |
| Commits | ~60 |
| TestFlight builds | 18 |
| Tests added | 20 (AppUtilTests / BookTests / ItemEnumsTests / ViewUtilTests) |
| Release version | v3.0.0 / Build 18 |
| App Store release | 2026-05-02 |
| Target environment | Xcode 26 / Swift 6 / iOS 17+ / old master branch frozen |

Overall flow

Three broad phases. Of the 6 days, 5 were modernization, half a day was shipping pipeline setup, and the rest was handling TestFlight issues and the App Store review submission. Build 18 passed review without additional questions and was published on 2026-05-02.

① Modernization (Phases 0–8 / Apr 25–30)
CocoaPods → SPM, Realm @Persisted, async/await, ViewController split, secrets management, test coverage
② TestFlight shipping pipeline (Apr 30 afternoon)
WSL → SSH → Mac → archive → IPA → altool → TestFlight
③ TestFlight crash gauntlet → App Store review → published (Apr 30 night – May 2)
Builds 5→18, stamped out real-device crashes and API issues, Build 18 submitted → v3.0.0 published 2026-05-02

① What modernization involved

Plan → delete → API migration → Realm overhaul → ViewController split → secrets cleanup → test coverage, in that order.

  • CocoaPods → SPM (Realm 20.0.4 / Firebase iOS SDK 11.15.0 / Google Mobile Ads 11.13.0 / DGCharts 5.1.0)
  • Deleted all Amazon PA-API code: old RWMAmazonProductAdvertisingManager / HMAC.m / Bridging Header / BarcodeCacher companion app — all gone
  • Eliminated all deprecated APIs: arc4random_uniform / statusBarOrientation / UILocalNotification / kCLLocationAccuracyKilometer / AVCaptureConnection.videoOrientation and more — over 30 locations total
  • async/await conversion: rewrote ProductSearch to parallel withTaskGroup. Unified Google Books / Rakuten / OpenBD providers to async throws. Removed all delegate protocols
  • Realm modernization: @objc dynamic → @Persisted. Dropped old migrations in favor of deleteRealmIfMigrationNeeded = true + a "Data was reset" alert to the user
  • ViewController split: 1,135 lines → 471 lines, with four extensions (Camera / ProductSearch / Library / Speech) carrying separate responsibilities
  • Secrets management: committed Config/Secrets.template.xcconfig, gitignored the actual Secrets.xcconfig. Info.plist values expand via $(VAR)
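
The template pattern from the last bullet looks roughly like this (the variable names are illustrative, not the app's actual keys):

```text
// Config/Secrets.template.xcconfig — committed. Copy to Secrets.xcconfig
// (gitignored) and fill in the real values.
GOOGLE_BOOKS_API_KEY = replace_me
RAKUTEN_APPLICATION_ID = replace_me
RAKUTEN_ACCESS_KEY = replace_me
```

Info.plist then references e.g. `$(GOOGLE_BOOKS_API_KEY)` and Xcode substitutes the value at build time.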

② TestFlight shipping pipeline

I wanted code editing in WSL2 → GitHub push → SSH to a separate Mac → archive → IPA → TestFlight upload as a single command. The final form is three scripts:

scripts/setup_build_keychain.sh   # run once — duplicates cert into dedicated build keychain
scripts/archive.sh                # runs xcodebuild archive non-interactively in Release config
scripts/upload.sh                 # reuses archive, exports IPA, uploads via xcrun altool

altool auth uses an App Store Connect API Key (.p8). I also use fastlane for test and build validation, but the actual shipping uses plain xcodebuild + altool.

③ TestFlight crash gauntlet

This ate the most time — Build 5 to Build 18, a day and a half. The gap between compilation succeeding and actually running on a real iOS 26 device was larger than expected.

Build 18 went to App Store review and passed without any additional questions from Apple. v3.0.0 published to the App Store on 2026-05-02. The version and build number were kept as-is from the final TestFlight build.

5 memorable hits that left an impression

1. Xcode 26's build sandbox

A dead scripts/update_storyboard_strings.sh (containing only a comment) and its Run Script Phase were still in the project. Xcode 26's build sandbox escalates even a near-empty Run Script Phase to a build error when path resolution fails; a script containing nothing but a comment was enough to get flagged, so the entire Run Script Phase had to be deleted.

2. Rakuten Web Services shutting down in 13 days

While tracking down missing book cover images, I discovered that the Rakuten Books API was in the middle of fully shutting down the old app.rakuten.co.jp endpoint on 2026-05-14 and migrating to the new openapi.rakuten.co.jp. 13 days to go.

  • Required parameters expanded to applicationId + accessKey
  • Application type is now either Web Application (allowed website) or Backend Service (allowed IP) — two choices
  • Mobile apps must use Web Application. The Origin header is mandatory; without it you get 403 REQUEST_CONTEXT_BODY_HTTP_REFERRER_MISSING

Re-registered as a Web Application type under the new domain → stored the new ApplicationId / AccessKey / allowed domain in Secrets.xcconfig → rewrote ProductSearchByRakuten.swift to attach Origin / Referer headers on every request. If you have an app using Rakuten Web Services, migrate as soon as possible.

3. Google Books anonymous shared quota exhausted

Calling the Google Books API without a key hit 429 Quota Exceeded. The project_number in the error JSON wasn't mine. Re-reading the docs: unauthenticated calls share a single anonymous project's 20M/day quota across every anonymous caller in the world. Constantly exhausted. Fix: issue an API key from my own GCP project and send it via ?key= query parameter.
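
In curl terms, the fix looks like this sketch; the helper name is mine, while the endpoint shape (`volumes?q=isbn:…&key=…`) is the standard Google Books v1 query:

```shell
# Build a keyed Google Books ISBN query; pass the key issued from your own GCP project.
google_books_url() {
    local isbn="$1" key="$2"
    printf 'https://www.googleapis.com/books/v1/volumes?q=isbn:%s&key=%s' "$isbn" "$key"
}

# Example (not executed here):
#   curl -s "$(google_books_url 9784873119045 "$GCP_API_KEY")"
```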

4. DGCharts vtable consumed 4 builds — replaced with custom CoreGraphics implementation

Builds 5–8 crashed with the same symptom. The PieChartView.data = ... setter inside ShelfViewController.setChart() hit EXC_BAD_INSTRUCTION (brk #1). Running atos on the .ips crash log revealed 647 symbols aliased to the same stub inside DGCharts' DGChartsDynamic.framework. Suspected dead-strip gone wrong under Xcode 26 + Swift 6 + SPM.

Neither switching static→dynamic nor routing through objc_msgSend fixed it. I ultimately replaced the chart feature (a 3-slice donut with center text) with a custom BookStatusPieView: UIView (~60 lines of CoreGraphics). Swap the customClass in the Storyboard and done.

5. SSH-based codesign and the TestFlight shipping pipeline

The problem of codesign failing to access the login keychain when SSHing into the Mac to run xcodebuild archive (the login keychain stays inaccessible to codesign even when unlocked over SSH), and the solution of using a dedicated build keychain with relaxed ACL to automate the pipeline in a single command — that grew long enough to split into its own article. The full procedure for WSL2 → SSH → Mac → archive → IPA → altool in three scripts is in "Build & Ship iOS Apps to TestFlight via SSH from Your Smartphone".

Division of labor in AI pair programming

After 6 days of running this, where humans and AI each excel became clear.

| Responsible party | Tasks |
| --- | --- |
| Claude Code | Symbolizing crash logs (.ips) with atos / API triage with curl / code fixes / archive and upload over SSH / keeping docs and memory in sync / generating commit messages |
| Human | Physical device interaction / App Store Connect agreements and submissions / GCP and Rakuten Developers browser sessions / the physical senses (volume levels, speaker behavior, banner rendering verification) |

Not forcing AI onto tasks it's bad at translated directly into efficiency gains. Conversely, crash log symbolization and API triage via curl are dramatically faster with AI.

What I learned

  • Compilation passing ≠ working: Xcode can only catch so much of 7 years' worth of API changes statically. The majority of issues only appeared after pushing to real-device TestFlight (DGCharts vtable, phantom Storyboard outlets, ATS, ATT, AVAudioSession)
  • Third-party APIs evolve independently: Google Books anonymous quota exhaustion, Rakuten's old API 13 days from shutdown, AdMob SDK 11+ requiring explicit adSize — all changes that happened independently of the app's own code
  • An automated delivery pipeline is a prerequisite: without the one-command WSL → SSH → Mac → TestFlight pipeline, running Builds 1 through 18 in a single day would have been impossible
  • Xcode 26 traps: ENABLE_DEBUG_DYLIB / sandbox / phantom schemes / SPM Embed Frameworks — you'll hit these both while setting up the test target and while building the shipping pipeline
  • You're done when a full day on real hardware produces nothing: code changes are the front half; TestFlight crash response takes roughly the same amount of time

Note: The work in this article was done on Xcode 26 / Swift 6 / iOS 17–26 as of May 2026. Third-party API changes or SDK version differences may mean some steps won't reproduce exactly. If you notice anything, let me know in the comments.

Wrap-up

Even a personal iOS app that sat untouched for 7 years can make it to the App Store as v3.0.0 in 6 days when you combine the latest Xcode with Claude Code. That said, "compilation passes" and "runs on real hardware" are two different things — the crash response after pushing to TestFlight takes more time than you'd expect. Third-party API drift and Xcode 26's build sandbox act on their own schedule, regardless of what you changed in your own code. Get the delivery pipeline automated early and you can move fast from there.

App Store: My Books (bookshelf management app) (v3.0.0, published 2026-05-02). If this article was helpful, I'd love it if you shared it on X (Twitter).


Note: This article is part of an automated blog update experiment using Claude Code.

Generating AI Blog Images with a Local GPU [SDXL + IP-Adapter + img2img: All 6 Stages Shown]

I built a system to automatically generate cover images and parts-illustration images for my weekly blog posts using a local GPU. This article walks through the entire evolution — from getting plain SDXL running, to using IP-Adapter to match real hardware photos, to img2img for hand-drawn watercolor illustrations — with 6 comparison images showing exactly how the output changes at each step. I also cover the VRAM constraints I hit on a 6GB machine and the approaches I chose to work around them.

All the tooling in this article was built by Claude Code, and 99% of the writing is Claude Code too — but the trial-and-error journey is real.

Watercolor illustration of XIAO ESP32-S3 generated with img2img + high denoise. The final goal of this article.

What you can do with this

  • Understand the full setup for running SDXL on a Linux PC with a GTX 1660 SUPER (6GB VRAM) + 16GB RAM
  • Generate AI images that closely resemble real hardware using IP-Adapter
  • Produce hand-drawn watercolor illustrations using img2img + high denoise
  • Learn how to work around out-of-memory errors common on 6GB VRAM machines

Why local generation?

At a pace of one blog post per week, API-based image generation services pile up a hidden "psychological cost per experiment." Every time I tweak a prompt five times, a small charge ticks up — and that feeling stops me from iterating freely. Running locally means the only cost is electricity. I can run --seed variations until I'm satisfied.

The other reason is wanting dependencies under my own control. APIs can break through spec changes, price hikes, or service shutdowns. With a local setup, as long as I keep the model files and the code, I can reproduce the exact same results five years from now.

Step 1: Get plain ComfyUI + SDXL running

First, install ComfyUI following the official guide and place Juggernaut XL v9 in checkpoints/. I ran txt2img with the prompt "watercolor illustration of XIAO ESP32-S3" and got this:

SDXL output with no reference image. Asking for XIAO ESP32-S3 produces a generic gadget-looking illustration that bears no resemblance to the real thing.

The watercolor style instruction went through, but the board is a fictional gadget. The base SDXL model hasn't learned specific board names, so it produces something plausible-looking but invented.

Step 2: Matching real hardware requires a reference image

When writing "I built a temperature monitor with this XIAO ESP32-S3," having an illustration of a completely different board hurts the article's credibility. To accurately render specific hardware the AI hasn't trained on, you need a mechanism to pass an actual product photo as a reference image. That's what IP-Adapter does.

Step 3: Adding IP-Adapter Plus

Clone ComfyUI_IPAdapter_plus into custom_nodes/, then download from Hugging Face:

  • ip-adapter-plus_sdxl_vit-h.safetensors (~700 MB) → models/ipadapter/
  • CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors (~2.5 GB) → models/clip_vision/
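
After both downloads, the relevant slice of the ComfyUI tree looks like this (the checkpoint filename is illustrative; your Juggernaut XL v9 file may be named differently):

```text
ComfyUI/
├── models/
│   ├── checkpoints/
│   │   └── juggernautXL_v9.safetensors
│   ├── ipadapter/
│   │   └── ip-adapter-plus_sdxl_vit-h.safetensors
│   └── clip_vision/
│       └── CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors
└── custom_nodes/
    └── ComfyUI_IPAdapter_plus/
```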

I fed the official XIAO product photo as a reference image at weight 0.5:

XIAO ESP32-S3 generated with IP-Adapter weight 0.5. The real board's structure is faithfully reproduced.

The real board structure is faithfully reproduced. The USB-C connector, the shielded wireless module, and the QR-code-like chip markings are all there — unmistakably a XIAO ESP32-S3. But look carefully and you'll notice the watercolor illustration instruction in the prompt has completely disappeared and the output is photorealistic.

IP-Adapter Plus is a "style transfer" type: the reference photo's colors, texture, and style transfer strongly into the output. Great for cover images where you want to match the real thing, but it gets in the way when you want an illustration style.

Step 4: I want to switch between "photo" and "illustration"

Photorealistic is fine for cover images. But in the "What I used" section of an article, I want illustration-style images to add visual variety. An entire article of photorealistic images all the way through feels monotonous.

Step 5: Lowering IP-Adapter weight (failed attempt)

The obvious idea was to lower the IP-Adapter weight — less photo influence means the style prompt should come through more. I dropped weight to 0.3 and added a strong negative prompt with (photorealistic:1.6):

IP-Adapter weight 0.3 — the reference photo's style is still dominant and the output remains photorealistic.

Even at 0.3, the photo style dominates. The Plus variant has a tendency to keep style even at low weights, so it's fundamentally not suited for illustration conversion.

Step 6: Tried ControlNet — then shelved it

Next I tried ControlNet (Canny): extract only the edge lines from the reference photo, then let the prompt fully control color and texture. In theory, ideal for illustration conversion.

It works — but on a setup of 6GB VRAM + 16GB RAM, loading SDXL + IP-Adapter + ControlNet simultaneously caused frequent memory contention during extended runs and left the process unresponsive. Fine for a one-off, but unreliable for sustained operation.

I want blog generation running weekly via cron. If it stalls, no article comes out. A feature that "works but isn't reliable enough" doesn't belong in a cron job, so I conservatively shelved it.

Step 7: Solved with img2img + high denoise

The remaining path was img2img + high denoise. Here's how it works:

  1. VAE-encode the reference photo into an initial latent
  2. Set the KSampler's denoise high (0.9 recommended)
  3. The KSampler progresses from the initial latent — the higher the denoise, the weaker the initial latent's influence, and the more the prompt dominates
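
A rough way to quantify step 3 (my back-of-envelope model of img2img, not something from the ComfyUI docs): the initial latent is re-noised to level d before sampling, so the fraction of the reference image's signal surviving at the starting point is roughly

    influence of reference latent ≈ 1 − d    (d = 0.9 → ~10%, d = 0.8 → ~20%)

which matches the comparison later in this article: 0.8 and 0.85 look nearly identical, while 0.9 crosses the threshold where the prompt takes over.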

Result: the structure (component outlines and layout) faintly remains, but the prompt fully controls color, texture, and style. The key advantages:

  • No additional custom nodes (uses only ComfyUI's built-in nodes)
  • No additional model downloads
  • Memory consumption is the same as txt2img

In other words, you get "outline reference + prompt-driven style" without ControlNet, and it runs stably on a 6GB VRAM machine.

Step 8: Finding the optimal denoise value through testing

The denoise value in img2img is the key quality lever. I compared 0.8 / 0.85 / 0.9 with the same prompt, same seed, and same reference image.

denoise 0.8

img2img denoise 0.8. The reference photo's structure still dominates, resulting in a photorealistic look.

Still photo-leaning. The "watercolor" style instruction barely shows — the result looks like a light watercolor filter applied to a photo.

denoise 0.85

img2img denoise 0.85. Nearly identical to 0.8 — still photorealistic.

Bumping by 0.05 makes almost no difference. The photo character of the initial latent persists.

denoise 0.9

img2img denoise 0.9. The sweet spot: watercolor softness plus recognizable XIAO ESP32-S3 features.

There it is. The softness of watercolor and the distinctive XIAO ESP32-S3 features (USB-C connector position, wireless module shield, castellated pin row) coexist. denoise 0.9 is the current best value.

Pushing to 0.95 tends to break the structure — you get "watercolor but no idea what board it is." 0.9 is the balance point where "shape hints remain, style is prompt-driven."

Step 9: Applying the finished pipeline to another article

I used the same pipeline to generate a Raspberry Pi Pico 2 W illustration for a different article. Reference image from the Raspberry Pi official press release, denoise 0.9, generation time ~2 minutes:

Watercolor illustration of Raspberry Pi Pico 2 W. Confirms the same pipeline works for a different component.

The green PCB, the gold castellated pins, the micro USB connector position, and the wireless module shield shape are all distinguishable. Swap in a different part with the same script and the same denoise 0.9 and you get this level of output. Confirmed reproducible.

FAQ

Q. Does it work with only 6 GB VRAM?

A. Yes. SDXL quantized to fp8 (via ComfyUI's --fp8_e4m3fn-unet startup option) uses around 4 GB, and with IP-Adapter the total is around 5.5 GB. Loading ControlNet simultaneously pushes close to the 6 GB limit and causes instability.

Q. How long does it take to generate one image?

A. On a GTX 1660 SUPER: txt2img at 15 steps is around 90 seconds, with IP-Adapter around 120 seconds, and img2img illustration at 25 steps around 120 seconds. Slower than cloud GPUs, but the only cost is electricity, so you can iterate without limit.

Q. Why doesn't denoise 0.7 or 0.8 work as well?

A. In img2img, the lower the denoise, the more the initial latent (the reference photo's structure) survives in the output. At 0.7–0.85, the photo character subtly persists and overpowers the illustration prompt. At 0.9, the initial latent's influence drops to roughly "shape hints only," and the style becomes fully prompt-driven.

Q. Why not try the non-Plus base IP-Adapter?

A. The plain ip-adapter_sdxl_vit-h.safetensors is said to transfer style more weakly than Plus, so it remains an option. However, img2img + high denoise wins on operational simplicity — no additional custom nodes, no additional model downloads — so that's what I'm using for now.

Note: The steps in this article were verified as of May 2026, but ComfyUI version updates or IP-Adapter model changes may cause them to stop working as written. If something doesn't work, please let me know in the comments.

Wrap-up

Step by step — SDXL → IP-Adapter → img2img — I showed how adding each feature shifts the output from "generic gadget → photorealistic real hardware → hand-drawn illustration", with 6 actual generated images to demonstrate the progression.

On a 6 GB VRAM machine, the approach that turned out to be simplest and most reliable was "no extra custom nodes, memory usage on par with txt2img." img2img + high denoise isn't a new technique, but it deserves a second look as a way to get around IP-Adapter's style transfer problem.

"More features = better results" isn't always true. The flexibility to switch approaches based on constraints turned out to be the most important skill for local AI image generation.

If this article was helpful, I'd love it if you shared it on X (Twitter).


Note: This article is part of an automated blog update experiment using Claude Code.