Running Parallel Autonomous Claude Code Agents with Git

archyn
February 10, 2026

What This Post Covers

This is a practical guide to running multiple Claude Code instances in parallel on a shared codebase, fully autonomous, with git as the synchronization layer. You’ll get:

  • A loop that keeps Claude Code running autonomously
  • Docker containers for isolation
  • Git-based synchronization between agents
  • A simple task-locking mechanism so agents don’t step on each other
  • Scripts to manage the whole thing

The approach is intentionally minimal. No orchestration framework, no custom tooling beyond bash scripts and Docker.

How It Works

The architecture has four parts:

  1. A bash loop restarts Claude Code after every session. Each session picks up one task, completes it (or documents why it couldn’t), and exits. The loop gives it a fresh context window for the next task.
  2. Docker containers isolate each agent. Every agent gets its own container with its own clone of the repo, so no two agents share a working tree or overwrite each other’s files.
  3. A bare git repo acts as the synchronization layer. Agents push and pull just like developers would. Every container mounts this repo as a volume.
  4. Lock files in git prevent two agents from working on the same task. Before starting work, an agent commits a lock file. If another agent already claimed it, the push gets rejected.

Project Structure

Here’s everything you’ll create. The numbered steps below walk through each file.

your-project/
├── run-agent.sh # Step 1: Agent loop with backoff
├── AGENT_PROMPT.md # Step 2: What Claude does each session
├── TODO.md # Task list agents read and update
├── current_tasks/ # Lock files for active tasks
│ └── .gitkeep
├── session_logs/ # Per-session summaries
│ └── .gitkeep
├── Dockerfile # Step 3: Container image
├── entrypoint.sh # Step 3: Container startup
├── setup-upstream.sh # Step 4: Bare repo initialization
├── .env # Step 5: API key
└── docker-compose.yml # Step 5: Multi-agent orchestration

Step 1: The Agent Loop

Create run-agent.sh in the root of your project. This is the foundation — a bash loop that restarts Claude Code every time it finishes.

run-agent.sh
#!/bin/bash
PROMPT_FILE="${PROMPT_FILE:-AGENT_PROMPT.md}"
BACKOFF=0
MAX_BACKOFF=300
while true; do
    if [ "$BACKOFF" -gt 0 ]; then
        echo "[$(date)] Waiting ${BACKOFF}s before retry..." >> agent_logs/loop.log
        sleep "$BACKOFF"
    fi
    COMMIT=$(git rev-parse --short=6 HEAD)
    TIMESTAMP=$(date +%Y%m%d-%H%M%S)
    LOGFILE="agent_logs/agent_${TIMESTAMP}_${COMMIT}.log"
    mkdir -p agent_logs
    claude --dangerously-skip-permissions \
        -p "$(cat "$PROMPT_FILE")" \
        --model claude-opus-4-6 &> "$LOGFILE"
    EXIT_CODE=$?
    echo "[$(date)] Session ended with code $EXIT_CODE" >> agent_logs/loop.log
    if [ $EXIT_CODE -ne 0 ]; then
        BACKOFF=$(( BACKOFF == 0 ? 5 : BACKOFF * 2 ))
        BACKOFF=$(( BACKOFF > MAX_BACKOFF ? MAX_BACKOFF : BACKOFF ))
    else
        BACKOFF=0
    fi
done
Terminal window
$ chmod +x run-agent.sh

--dangerously-skip-permissions is what makes it autonomous — Claude won’t stop to ask for confirmation on file edits, shell commands, or anything else. This is why you run it in a container and not on your actual machine.

The --model flag is optional; without it, Claude Code uses whichever model you have configured. Opus is the most capable for complex tasks but also the most expensive. Sonnet works fine for smaller, tightly scoped work.

Each session gets logged with a timestamp and the current commit hash, so you can trace what Claude did and when.

The backoff logic handles API outages and rate limits gracefully. If Claude exits with an error, the loop waits 5 seconds, then 10, 20, 40… up to 5 minutes. On a successful session, the delay resets to zero. Without this, a loop can burn through hundreds of failed sessions in minutes during an outage.
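The doubling-with-cap arithmetic is easy to check in isolation. The sketch below pulls it into two hypothetical helpers, next_backoff and backoff_sequence (neither is part of run-agent.sh), assuming the same 5-second start and 300-second cap as the loop above.

```shell
#!/bin/bash
# Hypothetical helpers mirroring the backoff arithmetic in run-agent.sh.
MAX_BACKOFF=300

# Given the current delay, print the delay to use after one more failure.
next_backoff() {
    local current="$1"
    # First failure waits 5s; each later failure doubles the wait.
    local next=$(( current == 0 ? 5 : current * 2 ))
    # Never wait longer than the cap.
    echo $(( next > MAX_BACKOFF ? MAX_BACKOFF : next ))
}

# Print the delays produced by N consecutive failures, space-separated.
backoff_sequence() {
    local failures="$1" delay=0 out="" i
    for (( i = 0; i < failures; i++ )); do
        delay=$(next_backoff "$delay")
        out+="$delay "
    done
    echo "${out% }"
}

backoff_sequence 7   # prints: 5 10 20 40 80 160 300
```

Seven consecutive failures are enough to reach the cap; every failure after that waits the full five minutes until a session succeeds.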

Step 2: The Agent Prompt

Create AGENT_PROMPT.md in the root of your project. This is the prompt Claude receives at the start of every session.

AGENT_PROMPT.md
# Task
You are an autonomous agent working on [project description].
## Current State
Read TODO.md for the list of remaining tasks. Check session_logs/
for recent session summaries to understand what other agents have
been doing and what the current state of the project looks like.
## Instructions
1. Pull the latest changes from upstream
2. Read TODO.md and recent session logs
3. Pick ONE task that isn't locked by another agent (check current_tasks/)
4. Create a lock file: current_tasks/your-task-name.txt with your agent ID and timestamp
5. Commit and push the lock file immediately so other agents can see it
6. Work on the task until it's done or you're stuck
7. Run the test suite to make sure nothing is broken
8. Update TODO.md
9. Write a session summary to session_logs/ (see below)
10. Remove your lock file
11. Commit your changes, pull from upstream, resolve any conflicts, push, then stop
## Session Summary
Before finishing, write a short summary to
session_logs/YYYY-MM-DD-HHMMSS-agent-id.md containing:
- What task you worked on
- What you changed
- Whether it was completed or if it's still in progress
- Any issues you ran into or things the next agent should know
## Rules
- One task per session. Complete it or document why you couldn't, then stop.
- Do not break existing tests
- Keep commits small and focused
- If you're stuck on something for more than 3 attempts, document
the issue in STUCK.md and move on
- Always pull before pushing

Each session handles exactly one task. Claude picks it up, works on it, writes a summary, and exits. The loop restarts it with a fresh context window for the next task. This prevents context degradation — Claude gets less effective the longer a session runs, so short focused sessions produce better results than long ones.

The key elements:

  • TODO.md tracks what needs to be done. Every session starts by reading it, so each Claude instance knows what’s left.
  • session_logs/ captures what happened in each session — including failed attempts and current project state. The next agent reads recent logs to understand context, learn from what didn’t work, and see how the project has evolved.
  • current_tasks/ is the locking mechanism. Before starting a task, an agent creates a file here and pushes it. If two agents try to lock the same task simultaneously, the second agent’s push gets rejected — it pulls, sees the lock, and picks something else.
  • The instruction to document failures in STUCK.md prevents agents from spinning on the same problem forever.

You’ll want to tune this prompt for your specific project. The more concrete your instructions, the better Claude stays on track.

Now create the directories agents expect and commit them along with the loop script and the prompt:

Terminal window
$ mkdir -p current_tasks session_logs
$ touch current_tasks/.gitkeep session_logs/.gitkeep
$ git add run-agent.sh AGENT_PROMPT.md current_tasks/.gitkeep session_logs/.gitkeep
$ git commit -m "init: add agent loop, prompt, and coordination directories"

Create a TODO.md with tasks for the agents to work on:

TODO.md
# TODO
- [ ] Task one description
- [ ] Task two description
- [ ] Task three description
Terminal window
$ git add TODO.md
$ git commit -m "init: add task list"

Step 3: Docker Setup

Each agent runs in its own container with a clone of the repo. Create the Dockerfile:

Dockerfile
FROM node:22-bookworm
RUN apt-get update && apt-get install -y \
    git \
    curl \
    build-essential \
    && rm -rf /var/lib/apt/lists/*
# Install Claude Code
RUN curl -fsSL https://claude.ai/install.sh | bash
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

And the entrypoint.sh that clones the repo and starts the loop:

entrypoint.sh
#!/bin/bash
AGENT_ID="${AGENT_ID:-agent-$(hostname | head -c 8)}"
# Clone from the shared bare repo (skipped when a restarted container already has /work)
if [ ! -d /work/.git ]; then
    git clone /upstream /work || exit 1
fi
cd /work || exit 1
git config user.name "$AGENT_ID"
git config user.email "${AGENT_ID}@agents.local"
echo "[$AGENT_ID] Starting agent loop..."
exec ./run-agent.sh
Terminal window
$ chmod +x entrypoint.sh

The base image is node:22-bookworm because Claude Code requires Node.js. The entrypoint clones from a shared bare git repo (mounted at /upstream) into /work, configures git identity using the agent ID, and hands off to the loop.

Step 4: The Shared Repository

The bare git repo is how agents share code. Every container mounts it as a volume. Create setup-upstream.sh:

setup-upstream.sh
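#!/bin/bash
UPSTREAM_DIR="./upstream.git"
if [ ! -d "$UPSTREAM_DIR" ]; then
    git init --bare "$UPSTREAM_DIR"
    # Point HEAD at main so clones check out the pushed branch,
    # even when git's default initial branch is something else
    git --git-dir="$UPSTREAM_DIR" symbolic-ref HEAD refs/heads/main
    echo "Created bare repo at $UPSTREAM_DIR"
fi
# Push your project into it
git remote add upstream "$UPSTREAM_DIR" 2>/dev/null
git push upstream main
echo "Pushed main branch to upstream"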
Terminal window
$ chmod +x setup-upstream.sh

This creates a bare repository and pushes your current project (including run-agent.sh, AGENT_PROMPT.md, TODO.md, and the coordination directories) into it. Every container will clone from this repo, and every push goes back to it.

Step 5: Docker Compose

Create a .env file with your Anthropic API key:

.env
ANTHROPIC_API_KEY=sk-ant-...

Then the docker-compose.yml that ties everything together:

docker-compose.yml
services:
  agent-1:
    build: .
    environment:
      - AGENT_ID=agent-1
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
  agent-2:
    build: .
    environment:
      - AGENT_ID=agent-2
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
  agent-3:
    build: .
    environment:
      - AGENT_ID=agent-3
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
  agent-4:
    build: .
    environment:
      - AGENT_ID=agent-4
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

Each service gets its own container, its own agent ID, your API key, and a shared mount to the bare repo.

Task Locking

Worth understanding how the locking works before you run this. When an agent picks a task, it creates a file in current_tasks/ and pushes it immediately:

current_tasks/
├── fix-parser-error.txt # locked by agent-1
├── add-type-checking.txt # locked by agent-2
└── .gitkeep

Each lock file contains the agent ID and a timestamp:

current_tasks/fix-parser-error.txt
agent: agent-1
started: 2026-02-16T14:30:00Z
description: Fixing the parser error on nested function calls

When the agent finishes, it removes the lock file, commits, and pushes. The next agent that pulls will see the task is available again if it wasn’t completed, or gone from TODO.md if it was.

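This isn’t bulletproof. If two agents try to lock the same task simultaneously, the second agent’s push gets rejected by git because the upstream branch has moved. It then has to pull, at which point it sees the lock and picks a different task. This only works if both agents derive the same lock file name from the task: two differently named lock files for the same task merge cleanly and go unnoticed. Git’s push rejection handles most cases, and Claude can usually sort out the rest from the contents of current_tasks/.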
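In the setup above, Claude performs the claim itself by following the prompt. For readers who want to see the sequence spelled out, here is a minimal sketch of the same steps as a shell function. The name claim_task is hypothetical, and the sketch assumes the clone's remote is called origin, the shared branch is main, and the working tree is clean when it runs.

```shell
#!/bin/bash
# Hypothetical helper: try to claim a task by pushing a lock file.
# Assumes remote "origin", branch "main", and a clean working tree.
# Returns 0 if this agent now holds the lock, 1 if it should pick another task.
claim_task() {
    local task="$1" agent="$2"
    local lock="current_tasks/${task}.txt"

    # Start from the latest upstream state so existing locks are visible.
    git pull --quiet --ff-only origin main || return 1
    if [ -e "$lock" ]; then
        return 1    # another agent already holds this task
    fi

    mkdir -p current_tasks
    printf 'agent: %s\nstarted: %s\n' "$agent" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$lock"
    git add "$lock"
    git commit --quiet -m "lock: ${task} (${agent})" || return 1

    # The push is the atomic step: it only succeeds if nobody pushed in between.
    if ! git push --quiet origin main 2>/dev/null; then
        # Upstream moved (possibly a competing lock): drop the local lock commit.
        git reset --quiet --hard HEAD~1
        return 1
    fi
}
```

Called as `claim_task fix-parser-error "$AGENT_ID"` from inside a clone, it either pushes the lock or leaves the clone as it was after the pull. A rejected push does not prove that the same task was taken, only that upstream moved, so a caller could simply retry and let the `-e` check decide.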

Running It

With all the files in place, the full startup is three commands:

Terminal window
$ ./setup-upstream.sh
$ docker compose build
$ docker compose up

To watch what’s happening:

Terminal window
# Follow a specific agent's logs
$ docker compose logs -f agent-1
# See what tasks are currently locked
$ git --git-dir=upstream.git show HEAD:current_tasks/
# Check the latest session log
$ git --git-dir=upstream.git log -1 --oneline -- session_logs/

Scale down to fewer agents by commenting out services in the compose file or by naming the services to start (docker compose up agent-1 agent-2), or scale up with the script in the next section.


Everything below this line is optional. The core setup above is complete — you have agents running, coordinating, and pushing code. The sections that follow cover scaling, specialized roles, testing strategies, guardrails, and monitoring.


Scaling with a Script

If you want more than a handful of agents, generating the compose file by hand gets tedious. A script handles it:

spawn-agents.sh
#!/bin/bash
NUM_AGENTS="${1:-4}"
COMPOSE_FILE="docker-compose.yml"
cat > "$COMPOSE_FILE" <<EOF
services:
EOF
for i in $(seq 1 "$NUM_AGENTS"); do
    cat >> "$COMPOSE_FILE" <<EOF
  agent-${i}:
    build: .
    environment:
      - AGENT_ID=agent-${i}
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
EOF
done
echo "Generated $COMPOSE_FILE with $NUM_AGENTS agents"
echo "Run: docker compose up --build"
Terminal window
$ chmod +x spawn-agents.sh
$ ./spawn-agents.sh 16
Generated docker-compose.yml with 16 agents
Run: docker compose up --build

Specialized Agents

Not every agent needs the same prompt. You can dedicate agents to different roles — some fixing bugs, others refactoring, one maintaining documentation — using different prompt files. The PROMPT_FILE environment variable in run-agent.sh controls which prompt each agent uses.

spawn-specialized.sh
#!/bin/bash
# Generate a compose file with specialized roles
cat > docker-compose.yml <<EOF
services:
EOF
# Main workers
for i in $(seq 1 8); do
    cat >> docker-compose.yml <<EOF
  worker-${i}:
    build: .
    environment:
      - AGENT_ID=worker-${i}
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
EOF
done
# Specialist agents with custom prompts
cat >> docker-compose.yml <<EOF
  quality:
    build: .
    environment:
      - AGENT_ID=quality
      - PROMPT_FILE=prompts/quality.md
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
  docs:
    build: .
    environment:
      - AGENT_ID=docs
      - PROMPT_FILE=prompts/docs.md
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
  tests:
    build: .
    environment:
      - AGENT_ID=tests
      - PROMPT_FILE=prompts/tests.md
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
EOF
echo "Generated docker-compose.yml with 8 workers + 3 specialists"
echo "Run: docker compose up --build"

A quality agent prompt might look like:

prompts/quality.md
# Task
You are a code quality agent. Your job is to review recent commits
and improve code quality.
## Instructions
1. Pull latest changes
2. Look at the last 10 commits for code smells, duplication, or bugs
3. If you find issues, fix them
4. Do not change functionality — only improve structure and clarity
5. Run tests to make sure nothing breaks
6. Commit and push

Commit the prompt files to your repo so agents can access them after cloning.

Writing Good Tests

This is the most important part. Autonomous agents will solve whatever the tests tell them to solve. If your tests are wrong or incomplete, Claude will confidently produce the wrong thing.

Some things that help:

Keep test output short. Claude’s context window is finite. A test suite that dumps 10,000 lines of output will drown out useful information. Print a summary, log details to a file.

run-tests.sh
#!/bin/bash
LOGFILE="test_results/$(date +%Y%m%d-%H%M%S).log"
mkdir -p test_results
# Run tests, capture full output to log
./test-suite.sh > "$LOGFILE" 2>&1
# Print only the summary
TOTAL=$(grep -c "^TEST" "$LOGFILE")
PASSED=$(grep -c "^TEST.*PASS" "$LOGFILE")
FAILED=$(grep -c "^TEST.*FAIL" "$LOGFILE")
echo "Tests: $PASSED/$TOTAL passed, $FAILED failed"
if [ "$FAILED" -gt 0 ]; then
    echo ""
    echo "Failures:"
    grep "^TEST.*FAIL" "$LOGFILE" | head -20
    echo ""
    echo "Full log: $LOGFILE"
    # Exit non-zero so callers (agents, git hooks) can detect the failure
    exit 1
fi

Include a fast mode. A full test suite might take 30 minutes. Claude doesn’t need to run the whole thing every iteration. A --fast flag that runs a random 10% sample keeps things moving while still catching regressions.

Use deterministic sampling. Each agent should run the same subset consistently (so it can identify regressions), but different agents should cover different subsets. Seed the random sample with the agent ID.
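One way to get a sample that is stable per agent but differs between agents is to hash the agent ID together with each test name and keep the tests whose hash falls in the lowest buckets. The sketch below is an assumption about how that could look, not part of the scripts above: in_fast_sample and fast_sample are hypothetical names, and cksum stands in for any stable hash.

```shell
#!/bin/bash
# Hypothetical helper: decide whether this agent runs a given test in fast mode.
# Hashing "agent:test" with cksum gives a stable bucket from 0 to 99, so an agent
# always picks the same tests while other agents mostly pick different ones.
in_fast_sample() {
    local agent_id="$1" test_name="$2" percent="${3:-10}"
    local hash
    hash=$(printf '%s:%s' "$agent_id" "$test_name" | cksum | cut -d ' ' -f 1)
    [ $(( hash % 100 )) -lt "$percent" ]
}

# Filter test names (one per line on stdin) down to this agent's sample.
fast_sample() {
    local agent_id="$1" percent="${2:-10}" name
    while IFS= read -r name; do
        if in_fast_sample "$agent_id" "$name" "$percent"; then
            echo "$name"
        fi
    done
}
```

A fast mode could then pipe its list of test names through `fast_sample "$AGENT_ID" 10`. The sample is only approximately 10% of the suite, since each test is included independently, but adding or removing a test never reshuffles which of the other tests an agent runs.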

Prefix errors consistently. ERROR: description on a single line makes it easy for Claude to grep for problems. Don’t scatter error information across multiple lines or formats.

Guarding the Repo

With multiple agents pushing code autonomously, it’s easy for one bad commit to cascade. A pre-receive hook on the bare repo runs the fast test suite before accepting any push. If tests fail, the push is rejected and the agent has to fix it first.

upstream.git/hooks/pre-receive
#!/bin/bash
WORK_DIR=$(mktemp -d)
TEST_LOG=$(mktemp)
trap 'rm -rf "$WORK_DIR" "$TEST_LOG"' EXIT
while read -r oldrev newrev refname; do
    # Branch deletions (all-zero newrev) have nothing to test
    [[ "$newrev" =~ ^0+$ ]] && continue
    # Export the incoming commit without touching the bare repo's HEAD or index
    rm -rf "$WORK_DIR" && mkdir -p "$WORK_DIR"
    git archive "$newrev" | tar -x -C "$WORK_DIR"
    # Run the fast test suite, if the project ships one
    if [ -f "$WORK_DIR/run-tests.sh" ]; then
        if ! (cd "$WORK_DIR" && ./run-tests.sh --fast) > "$TEST_LOG" 2>&1; then
            echo "ERROR: Tests failed for $refname. Push rejected."
            echo ""
            tail -20 "$TEST_LOG"
            exit 1
        fi
    fi
done
Terminal window
$ chmod +x upstream.git/hooks/pre-receive

This is the single highest-value addition to the whole setup. Without it, an agent pushes broken code, three other agents pull it, build on top of it, and now you have four agents all working on a broken foundation.

Stale Lock Cleanup

If an agent crashes mid-task — container dies, API error, whatever — its lock file stays in current_tasks/ forever. Other agents see the lock and avoid the task, so it never gets picked up again.

A simple cleanup script removes locks older than a threshold:

cleanup-locks.sh
#!/bin/bash
MAX_AGE_HOURS="${1:-2}"
REPO_DIR="${2:-./upstream.git}"
WORK_DIR=$(mktemp -d)
trap 'rm -rf "$WORK_DIR"' EXIT
# Bail out if the clone fails, so git rm never runs in the wrong directory
git clone "$REPO_DIR" "$WORK_DIR" 2>/dev/null || exit 1
cd "$WORK_DIR" || exit 1
CLEANED=0
for lock in current_tasks/*.txt; do
    [ -f "$lock" ] || continue
    # Extract the timestamp from the lock file
    STARTED=$(grep "^started:" "$lock" | awk '{print $2}')
    if [ -z "$STARTED" ]; then
        continue
    fi
    # Compare with current time (GNU date); skip locks whose timestamp doesn't parse
    LOCK_EPOCH=$(date -d "$STARTED" +%s 2>/dev/null) || continue
    NOW_EPOCH=$(date +%s)
    AGE_HOURS=$(( (NOW_EPOCH - LOCK_EPOCH) / 3600 ))
    if [ "$AGE_HOURS" -ge "$MAX_AGE_HOURS" ]; then
        echo "Removing stale lock: $lock (${AGE_HOURS}h old)"
        git rm "$lock"
        CLEANED=$((CLEANED + 1))
    fi
done
if [ "$CLEANED" -gt 0 ]; then
    git commit -m "cleanup: remove $CLEANED stale lock(s)"
    git push origin main
fi

This script uses GNU date -d, so run it on a Linux host or inside a container. Schedule it with cron every 30 minutes or so:

Terminal window
$ crontab -e
*/30 * * * * /path/to/cleanup-locks.sh 2 /path/to/upstream.git

Two hours is a reasonable default. Most tasks finish well within that. If you have longer-running tasks, bump the threshold.

Slack Notifications

A post-receive hook on the bare repo posts to Slack whenever an agent pushes. Lets you passively monitor progress from your phone without watching logs.

upstream.git/hooks/post-receive
#!/bin/bash
SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
REPO_NAME=$(basename "$(pwd)" .git)
while read -r oldrev newrev refname; do
    BRANCH=$(echo "$refname" | sed 's|refs/heads/||')
    AUTHOR=$(git log -1 --format="%an" "$newrev")
    MESSAGE=$(git log -1 --format="%s" "$newrev")
    CHANGED=$(git diff --stat "$oldrev" "$newrev" 2>/dev/null | tail -1)
    PAYLOAD=$(cat <<EOF
{
"text": "*${REPO_NAME}* — ${AUTHOR} pushed to \`${BRANCH}\`\n>${MESSAGE}\n\`\`\`${CHANGED}\`\`\`"
}
EOF
)
    curl -s -X POST -H "Content-Type: application/json" \
        -d "$PAYLOAD" "$SLACK_WEBHOOK" > /dev/null
done
Terminal window
$ chmod +x upstream.git/hooks/post-receive

You get a message per push with the agent name, commit message, and a quick diffstat. Enough to know things are moving without opening a terminal.
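One caveat with building the payload from a heredoc: a commit message containing a double quote or a backslash produces invalid JSON, and the notification is lost. A minimal sketch of an escaping step, assuming a hypothetical json_escape helper that the hook would apply to AUTHOR, MESSAGE, and CHANGED before interpolating them:

```shell
#!/bin/bash
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Covers backslash, double quote, newline, carriage return, and tab;
# other control characters are not handled.
json_escape() {
    local s="$1"
    s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
    s=${s//\"/\\\"}
    s=${s//$'\n'/\\n}
    s=${s//$'\r'/\\r}
    s=${s//$'\t'/\\t}
    printf '%s' "$s"
}

MESSAGE='fix "quoted" path C:\tmp'
printf '{"text": "%s"}\n' "$(json_escape "$MESSAGE")"
# prints: {"text": "fix \"quoted\" path C:\\tmp"}
```

If jq is available on the host, letting it build the payload is the more robust choice. The pure-bash version is shown only because it adds no dependency to the hook.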

Monitoring

A monitoring script gives you the full picture at a glance:

monitor.sh
#!/bin/bash
UPSTREAM="upstream.git"
echo "=== Agent Status ==="
echo ""
for container in $(docker compose ps -q); do
    NAME=$(docker inspect --format '{{.Name}}' "$container" | sed 's/^\/\+//')
    STATUS=$(docker inspect --format '{{.State.Status}}' "$container")
    LAST_LOG=$(docker logs --tail 1 "$container" 2>&1)
    echo "$NAME [$STATUS]: $LAST_LOG"
done
echo ""
echo "=== Git Status ==="
echo "Commits in last hour: $(git --git-dir="$UPSTREAM" log --since='1 hour ago' --oneline | wc -l | tr -d ' ')"
echo "Latest commit: $(git --git-dir="$UPSTREAM" log -1 --oneline)"
echo ""
echo "=== Active Tasks ==="
git --git-dir="$UPSTREAM" show HEAD:current_tasks/ 2>/dev/null \
    | grep -v .gitkeep || echo "None"

Run it in a watch loop:

Terminal window
$ watch -n 30 ./monitor.sh

Cost

This burns through API credits fast. Some rough numbers based on Opus 4.6 pricing:

  • A single agent session runs maybe 10-30 minutes depending on the task
  • Each session uses roughly 50-200k input tokens and 5-20k output tokens
  • With 4 agents running continuously for 8 hours, expect $200-800 depending on task complexity

Sonnet is significantly cheaper if your tasks don’t need Opus-level reasoning. For straightforward bug fixes, refactoring, or test writing, Sonnet handles it fine at a fraction of the cost.
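To adapt the estimate to your own workload, the arithmetic is session count times per-session token cost. The sketch below uses a hypothetical estimate_cost helper. The per-million-token prices in the example call are placeholders chosen for round numbers, not actual pricing for any model, so substitute current rates.

```shell
#!/bin/bash
# Hypothetical helper: rough cost estimate for a fleet of agents.
# Usage: estimate_cost AGENTS HOURS MIN_PER_SESSION IN_TOKENS OUT_TOKENS IN_PRICE OUT_PRICE
# Token counts are per session; prices are USD per million tokens.
estimate_cost() {
    awk -v agents="$1" -v hours="$2" -v minutes="$3" \
        -v in_tok="$4" -v out_tok="$5" -v in_price="$6" -v out_price="$7" 'BEGIN {
        sessions = agents * hours * 60 / minutes
        per_session = (in_tok * in_price + out_tok * out_price) / 1000000
        printf "%d sessions, $%.2f each, $%.2f total\n", sessions, per_session, sessions * per_session
    }'
}

# 4 agents, 8 hours, 20-minute sessions, 100k input and 10k output tokens per
# session, at placeholder prices of $10 and $50 per million tokens.
estimate_cost 4 8 20 100000 10000 10 50
# prints: 96 sessions, $1.50 each, $144.00 total
```

The result is only as good as the per-session token figures, which vary widely with task size, so treat it as an order-of-magnitude check rather than a budget.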

Limitations

Worth being honest about where this falls apart:

  • Merge conflicts get messy. Claude handles simple conflicts fine. Complex ones — especially in the same function — can produce broken merges. More agents means more conflicts.
  • No real communication between agents. The shared files (TODO.md, session logs) are a rough approximation. Agents can’t discuss a design decision or coordinate on an approach.
  • Claude can get stuck in loops. Without a human to course-correct, an agent might spend an entire session trying the same failing approach repeatedly. The STUCK.md convention helps, but doesn’t fully solve this.
  • Tests are the bottleneck. The quality of autonomous output is directly limited by the quality of your test suite. No tests, no guardrails.
  • Context window resets. Each new session starts fresh. Claude has to re-read project files every time, which wastes tokens and time. Session logs mitigate this but don’t eliminate it.

Wrapping Up

The core setup is a handful of files: an agent loop, a prompt, a Dockerfile with its entrypoint, a setup script, and a compose file. Everything else (scaling scripts, specialized agents, pre-receive hooks, monitoring) builds on top of that foundation.

The hard part isn’t the infrastructure — it’s writing good tests and prompts that keep agents productive without supervision. Most of your time should go into the test harness and the agent prompt, not the scaffolding.

Start small. Run one agent on a well-tested project and watch what it does. Add more agents once you’re confident the tests catch regressions. Scale up the ambition from there.
