Running Parallel Autonomous Claude Code Agents with Git

What This Post Covers

This is a practical guide to running multiple Claude Code instances in parallel on a shared codebase, fully autonomous, with git as the synchronization layer. You’ll get:

A loop that keeps Claude Code running autonomously
Docker containers for isolation
Git-based synchronization between agents
A simple task-locking mechanism so agents don’t step on each other
Scripts to manage the whole thing

The approach is intentionally minimal. No orchestration framework, no custom tooling beyond bash scripts and Docker.

How It Works

The architecture has four parts:

A bash loop restarts Claude Code after every session. Each session picks up one task, completes it (or documents why it couldn’t), and exits. The loop gives it a fresh context window for the next task.
Docker containers isolate each agent. Every agent gets its own container with its own clone of the repo. No shared filesystem, no stepping on each other’s files.
A bare git repo acts as the synchronization layer. Agents push and pull just like developers would. Every container mounts this repo as a volume.
Lock files in git prevent two agents from working on the same task. Before starting work, an agent commits a lock file and pushes it. If another agent pushed first, git rejects the push; the agent then pulls, sees the existing lock, and picks a different task.

Project Structure

Here’s everything you’ll create. The numbered steps below walk through each file.

your-project/
├── run-agent.sh          # Step 1: Agent loop with backoff
├── AGENT_PROMPT.md       # Step 2: What Claude does each session
├── TODO.md               # Task list agents read and update
├── current_tasks/        # Lock files for active tasks
│   └── .gitkeep
├── session_logs/         # Per-session summaries
│   └── .gitkeep
├── Dockerfile            # Step 3: Container image
├── entrypoint.sh         # Step 3: Container startup
├── setup-upstream.sh     # Step 4: Bare repo initialization
├── .env                  # Step 5: API key
└── docker-compose.yml    # Step 5: Multi-agent orchestration

Step 1: The Agent Loop

Create run-agent.sh in the root of your project. This is the foundation — a bash loop that restarts Claude Code every time it finishes.

#!/bin/bash

PROMPT_FILE="${PROMPT_FILE:-AGENT_PROMPT.md}"
BACKOFF=0
MAX_BACKOFF=300

while true; do
    if [ "$BACKOFF" -gt 0 ]; then
        echo "[$(date)] Waiting ${BACKOFF}s before retry..." >> agent_logs/loop.log
        sleep "$BACKOFF"
    fi

    COMMIT=$(git rev-parse --short=6 HEAD)
    TIMESTAMP=$(date +%Y%m%d-%H%M%S)
    LOGFILE="agent_logs/agent_${TIMESTAMP}_${COMMIT}.log"

    mkdir -p agent_logs

    claude --dangerously-skip-permissions \
           -p "$(cat "$PROMPT_FILE")" \
           --model claude-opus-4-6 &> "$LOGFILE"

    EXIT_CODE=$?
    echo "[$(date)] Session ended with code $EXIT_CODE" >> agent_logs/loop.log

    if [ $EXIT_CODE -ne 0 ]; then
        BACKOFF=$(( BACKOFF == 0 ? 5 : BACKOFF * 2 ))
        BACKOFF=$(( BACKOFF > MAX_BACKOFF ? MAX_BACKOFF : BACKOFF ))
    else
        BACKOFF=0
    fi
done

$ chmod +x run-agent.sh

--dangerously-skip-permissions is what makes it autonomous — Claude won’t stop to ask for confirmation on file edits, shell commands, or anything else. This is why you run it in a container and not on your actual machine.

The --model flag is optional. It defaults to whatever you have configured. Opus is the most capable for complex tasks but also the most expensive. Sonnet works fine for smaller scoped work.

Each session gets logged with a timestamp and the current commit hash, so you can trace what Claude did and when.

The backoff logic handles API outages and rate limits gracefully. If Claude exits with an error, the loop waits 5 seconds, then 10, 20, 40… up to 5 minutes. On a successful session, the delay resets to zero. Without this, a loop can burn through hundreds of failed sessions in minutes during an outage.
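To see the delay schedule on its own, the arithmetic from the loop can be pulled into a small function. This is only a sketch: next_backoff is a hypothetical helper name, not something run-agent.sh defines, and the 300-second default simply mirrors MAX_BACKOFF.

```shell
#!/bin/bash

# Hypothetical helper mirroring the two BACKOFF lines in run-agent.sh.
# Usage: next_backoff <current-delay> [max-delay]
next_backoff() {
    local current="$1" max="${2:-300}"
    # First failure starts at 5s, every further failure doubles the delay
    local next=$(( current == 0 ? 5 : current * 2 ))
    # Clamp to the maximum
    echo $(( next > max ? max : next ))
}

# Print the delays for eight consecutive failures
delay=0
for attempt in 1 2 3 4 5 6 7 8; do
    delay=$(next_backoff "$delay")
    echo "failure $attempt: wait ${delay}s"
done
```

From the seventh consecutive failure onward the delay stays at the 300-second cap, so a long outage costs at most one failed session every five minutes.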

Step 2: The Agent Prompt

Create AGENT_PROMPT.md in the root of your project. This is the prompt Claude receives at the start of every session.

# Task

You are an autonomous agent working on [project description].

## Current State

Read TODO.md for the list of remaining tasks. Check session_logs/
for recent session summaries to understand what other agents have
been doing and what the current state of the project looks like.

## Instructions

1. Pull the latest changes from upstream
2. Read TODO.md and recent session logs
3. Pick ONE task that isn't locked by another agent (check current_tasks/)
4. Create a lock file: current_tasks/your-task-name.txt with your agent ID and timestamp
5. Commit and push the lock file immediately so other agents can see it
6. Work on the task until it's done or you're stuck
7. Run the test suite to make sure nothing is broken
8. Update TODO.md
9. Write a session summary to session_logs/ (see below)
10. Remove your lock file
11. Commit your changes (including the lock removal), pull from upstream, resolve any conflicts, push, then stop

## Session Summary

Before finishing, write a short summary to
session_logs/YYYY-MM-DD-HHMMSS-agent-id.md containing:

- What task you worked on
- What you changed
- Whether it was completed or if it's still in progress
- Any issues you ran into or things the next agent should know

## Rules

- One task per session. Complete it or document why you couldn't, then stop.
- Do not break existing tests
- Keep commits small and focused
- If you're stuck on something for more than 3 attempts, document
  the issue in STUCK.md and move on
- Always pull before pushing

Each session handles exactly one task. Claude picks it up, works on it, writes a summary, and exits. The loop restarts it with a fresh context window for the next task. This prevents context degradation — Claude gets less effective the longer a session runs, so short focused sessions produce better results than long ones.

The key elements:

TODO.md tracks what needs to be done. Every session starts by reading it, so each Claude instance knows what’s left.
session_logs/ captures what happened in each session — including failed attempts and current project state. The next agent reads recent logs to understand context, learn from what didn’t work, and see how the project has evolved.
current_tasks/ is the locking mechanism. Before starting a task, an agent creates a file here and pushes it. If two agents try to lock the same task simultaneously, the second agent’s push gets rejected — it pulls, sees the lock, and picks something else.
The instruction to document failures in STUCK.md prevents agents from spinning on the same problem forever.
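Because session summaries are named with a YYYY-MM-DD-HHMMSS prefix, sorting the file names also sorts them by time. A minimal sketch of how the "recent logs" could be listed, assuming that naming is followed; recent_session_logs is a hypothetical helper, since the prompt leaves this step to Claude:

```shell
#!/bin/bash

# Hypothetical helper: print the paths of the N most recent session summaries.
# Relies on the timestamp prefix so that lexical order equals chronological order.
# Usage: recent_session_logs [dir] [count]
recent_session_logs() {
    local dir="${1:-session_logs}" count="${2:-5}"
    # Nothing to list if the directory does not exist yet
    [ -d "$dir" ] || return 0
    find "$dir" -maxdepth 1 -type f -name '*.md' | sort | tail -n "$count"
}
```

Running recent_session_logs session_logs 3 from the repo root prints the three newest summaries, oldest first. The .gitkeep file is skipped because only *.md files match.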

You’ll want to tune this prompt for your specific project. The more concrete your instructions, the better Claude stays on track.

Now create the directories agents expect and commit them together with the loop script and the prompt:

$ mkdir -p current_tasks session_logs
$ touch current_tasks/.gitkeep session_logs/.gitkeep
$ git add run-agent.sh AGENT_PROMPT.md current_tasks/.gitkeep session_logs/.gitkeep
$ git commit -m "init: add agent loop, prompt, and coordination directories"

Create a TODO.md with tasks for the agents to work on:

# TODO

- [ ] Task one description
- [ ] Task two description
- [ ] Task three description

$ git add TODO.md
$ git commit -m "init: add task list"

Step 3: Docker Setup

Each agent runs in its own container with a clone of the repo. Create the Dockerfile:

FROM node:22-bookworm

RUN apt-get update && apt-get install -y \
    git \
    curl \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install Claude Code
RUN curl -fsSL https://claude.ai/install.sh | bash

COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]

And the entrypoint.sh that clones the repo and starts the loop:

#!/bin/bash
set -e

# Export so the Claude session can read its own ID from the environment
export AGENT_ID="${AGENT_ID:-agent-$(hostname | head -c 8)}"

# Clone from the shared bare repo (skipped if the container was restarted)
if [ ! -d /work/.git ]; then
    git clone /upstream /work
fi
cd /work

git config user.name "$AGENT_ID"
git config user.email "${AGENT_ID}@agents.local"

echo "[$AGENT_ID] Starting agent loop..."
exec ./run-agent.sh

$ chmod +x entrypoint.sh

The base image is node:22-bookworm because Claude Code requires Node.js. The entrypoint clones from a shared bare git repo (mounted at /upstream) into /work, configures git identity using the agent ID, and hands off to the loop.

Step 4: The Shared Repository

The bare git repo is how agents share code. Every container mounts it as a volume. Create setup-upstream.sh:

#!/bin/bash

UPSTREAM_DIR="./upstream.git"

if [ ! -d "$UPSTREAM_DIR" ]; then
    git init --bare "$UPSTREAM_DIR"
    echo "Created bare repo at $UPSTREAM_DIR"
fi

# Push your project into it
git remote add upstream "$UPSTREAM_DIR" 2>/dev/null
git push upstream main
echo "Pushed main branch to upstream"

$ chmod +x setup-upstream.sh

This creates a bare repository and pushes your current project (including run-agent.sh, AGENT_PROMPT.md, TODO.md, and the coordination directories) into it. Every container will clone from this repo, and every push goes back to it.
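Since every container clones from the bare repo, anything that was never committed will be missing inside the containers. A quick preflight check along these lines can catch that before starting the agents. It is a sketch only: missing_from_upstream is a hypothetical helper, and the list of paths to check is whatever your project needs.

```shell
#!/bin/bash

# Hypothetical preflight check: report which required paths are missing from
# a branch of the bare repo. Prints one missing path per line and returns 1
# if anything is missing; prints nothing and returns 0 otherwise.
# Usage: missing_from_upstream <git-dir> <branch> <path>...
missing_from_upstream() {
    local git_dir="$1" branch="$2"
    shift 2
    local path missing=0
    for path in "$@"; do
        # cat-file -e exits non-zero when <branch>:<path> does not exist
        if ! git --git-dir="$git_dir" cat-file -e "${branch}:${path}" 2>/dev/null; then
            echo "$path"
            missing=1
        fi
    done
    return "$missing"
}
```

For the files in this guide, the call would look like this:

$ missing_from_upstream upstream.git main run-agent.sh AGENT_PROMPT.md TODO.md current_tasks/.gitkeep session_logs/.gitkeep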

Step 5: Docker Compose

Create a .env file with your Anthropic API key:

ANTHROPIC_API_KEY=sk-ant-...

Then the docker-compose.yml that ties everything together:

services:
  agent-1:
    build: .
    environment:
      - AGENT_ID=agent-1
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

  agent-2:
    build: .
    environment:
      - AGENT_ID=agent-2
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

  agent-3:
    build: .
    environment:
      - AGENT_ID=agent-3
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

  agent-4:
    build: .
    environment:
      - AGENT_ID=agent-4
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

Each service gets its own container, its own agent ID, your API key, and a shared mount to the bare repo. Compose fills in ${ANTHROPIC_API_KEY} from the .env file in the same directory.

Task Locking

Worth understanding how the locking works before you run this. When an agent picks a task, it creates a file in current_tasks/ and pushes it immediately:

current_tasks/
├── fix-parser-error.txt      # locked by agent-1
├── add-type-checking.txt     # locked by agent-2
└── .gitkeep

Each lock file contains the agent ID, a start timestamp, and a short description:

agent: agent-1
started: 2026-02-16T14:30:00Z
description: Fixing the parser error on nested function calls

When the agent finishes, it removes the lock file, commits, and pushes. The next agent that pulls will see the task is available again if it wasn’t completed, or gone from TODO.md if it was.

This isn’t bulletproof. If two agents try to lock the same task simultaneously, git rejects the second agent’s push. That agent then has to pull, at which point it sees the lock and picks a different task. The scheme depends on both agents choosing the same lock file name for the same task; if they name it differently, both pushes can succeed. In practice git’s push rejection handles most cases, and Claude can usually sort out the rest.
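Written out as shell, the claim-and-release cycle looks roughly like the sketch below. In this setup Claude performs these steps itself from the prompt, so claim_task and release_task are hypothetical helpers, not part of the scripts above. The sketch assumes the current directory is an agent’s clone, the remote is origin, and the branch is main.

```shell
#!/bin/bash

# Hypothetical helpers illustrating the lock protocol.
# Assumes: cwd is an agent's clone, remote "origin", branch "main".

# Returns 0 if the lock was pushed, 1 if the task is already locked,
# 2 if the pull failed or the push was rejected (pull and pick again).
claim_task() {
    local task="$1" agent="$2"
    local lock="current_tasks/${task}.txt"

    git pull --quiet --rebase origin main || return 2
    [ -e "$lock" ] && return 1

    printf 'agent: %s\nstarted: %s\n' \
        "$agent" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$lock"
    git add "$lock"
    git commit --quiet -m "lock: ${task} (${agent})"

    if ! git push --quiet origin main 2>/dev/null; then
        # Another agent pushed first: drop the local lock commit
        git reset --quiet --hard HEAD~1
        return 2
    fi
}

# Removes the lock, then pulls and pushes so other agents see the release.
release_task() {
    local task="$1" agent="$2"
    git rm --quiet "current_tasks/${task}.txt"
    git commit --quiet -m "unlock: ${task} (${agent})"
    git pull --quiet --rebase origin main && git push --quiet origin main
}
```

A return code of 2 does not mean the task is taken, only that the upstream moved. The caller should pull and try again, possibly with a different task.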

Running It

With all the files in place, the full startup is three commands:

$ ./setup-upstream.sh
$ docker compose build
$ docker compose up

To watch what’s happening:

# Follow a specific agent's logs
$ docker compose logs -f agent-1

# See what tasks are currently locked
$ git --git-dir=upstream.git show HEAD:current_tasks/

# Check the latest session log
$ git --git-dir=upstream.git log -1 --oneline -- session_logs/

Scale down to fewer agents by commenting out services in the compose file, or scale up with the script in the next section.

Everything below this line is optional. The core setup above is complete — you have agents running, coordinating, and pushing code. The sections that follow cover scaling, specialized roles, testing strategies, guardrails, and monitoring.

Scaling with a Script

If you want more than a handful of agents, writing the compose file by hand gets tedious. A script, saved here as spawn-agents.sh, handles it:

#!/bin/bash

NUM_AGENTS="${1:-4}"
COMPOSE_FILE="docker-compose.yml"

cat > "$COMPOSE_FILE" <<EOF
services:
EOF

for i in $(seq 1 "$NUM_AGENTS"); do
    cat >> "$COMPOSE_FILE" <<EOF
  agent-${i}:
    build: .
    environment:
      - AGENT_ID=agent-${i}
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

EOF
done

echo "Generated $COMPOSE_FILE with $NUM_AGENTS agents"
echo "Run: docker compose up --build"

$ chmod +x spawn-agents.sh
$ ./spawn-agents.sh 16
Generated docker-compose.yml with 16 agents
Run: docker compose up --build

Specialized Agents

Not every agent needs the same prompt. You can dedicate agents to different roles — some fixing bugs, others refactoring, one maintaining documentation — using different prompt files. The PROMPT_FILE environment variable in run-agent.sh controls which prompt each agent uses.

#!/bin/bash

# Generate a compose file with specialized roles
cat > docker-compose.yml <<EOF
services:
EOF

# Main workers
for i in $(seq 1 8); do
    cat >> docker-compose.yml <<EOF
  worker-${i}:
    build: .
    environment:
      - AGENT_ID=worker-${i}
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

EOF
done

# Specialist agents with custom prompts
cat >> docker-compose.yml <<EOF
  quality:
    build: .
    environment:
      - AGENT_ID=quality
      - PROMPT_FILE=prompts/quality.md
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

  docs:
    build: .
    environment:
      - AGENT_ID=docs
      - PROMPT_FILE=prompts/docs.md
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream

  tests:
    build: .
    environment:
      - AGENT_ID=tests
      - PROMPT_FILE=prompts/tests.md
      - ANTHROPIC_API_KEY=\${ANTHROPIC_API_KEY}
    volumes:
      - ./upstream.git:/upstream
EOF

echo "Generated docker-compose.yml with 8 workers + 3 specialists"
echo "Run: docker compose up --build"

A quality agent prompt might look like:

# Task

You are a code quality agent. Your job is to review recent commits
and improve code quality.

## Instructions

1. Pull latest changes
2. Look at the last 10 commits for code smells, duplication, or bugs
3. If you find issues, fix them
4. Do not change functionality — only improve structure and clarity
5. Run tests to make sure nothing breaks
6. Commit and push

Commit the prompt files to your repo so agents can access them after cloning.

Writing Good Tests

This is the most important part. Autonomous agents will solve whatever the tests tell them to solve. If your tests are wrong or incomplete, Claude will confidently produce the wrong thing.

Some things that help:

Keep test output short. Claude’s context window is finite. A test suite that dumps 10,000 lines of output will drown out useful information. Print a summary, log details to a file.

#!/bin/bash

LOGFILE="test_results/$(date +%Y%m%d-%H%M%S).log"
mkdir -p test_results

# Run tests, capture full output to log
./test-suite.sh > "$LOGFILE" 2>&1

# Print only the summary
TOTAL=$(grep -c "^TEST" "$LOGFILE")
PASSED=$(grep -c "^TEST.*PASS" "$LOGFILE")
FAILED=$(grep -c "^TEST.*FAIL" "$LOGFILE")

echo "Tests: $PASSED/$TOTAL passed, $FAILED failed"

if [ "$FAILED" -gt 0 ]; then
    echo ""
    echo "Failures:"
    grep "^TEST.*FAIL" "$LOGFILE" | head -20
    echo ""
    echo "Full log: $LOGFILE"
    # Exit non-zero so callers can tell the run failed
    exit 1
fi

Include a fast mode. A full test suite might take 30 minutes. Claude doesn’t need to run the whole thing every iteration. A --fast flag that runs a random 10% sample keeps things moving while still catching regressions.

Use deterministic sampling. Each agent should run the same subset consistently (so it can identify regressions), but different agents should cover different subsets. Seed the random sample with the agent ID.
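One way to get a stable per-agent sample without storing any state is to hash the agent ID together with each test name and keep the tests whose hash falls under the target percentage. The sketch below does this with cksum. select_sample is a hypothetical helper, and because each test is kept or dropped independently, the result is roughly 10% of the suite rather than exactly 10%.

```shell
#!/bin/bash

# Hypothetical helper for a --fast mode: read test names on stdin and print
# the subset this agent should run. A test is kept when a checksum of
# "<agent-id>:<test-name>" modulo 100 is below the requested percentage, so
# the same agent always selects the same tests.
# Usage: list-tests | select_sample <agent-id> [percent]
select_sample() {
    local agent="$1" percent="${2:-10}"
    local name hash
    while IFS= read -r name; do
        # Skip blank lines
        [ -n "$name" ] || continue
        hash=$(printf '%s:%s' "$agent" "$name" | cksum | cut -d ' ' -f 1)
        if [ $(( hash % 100 )) -lt "$percent" ]; then
            echo "$name"
        fi
    done
}
```

Inside a container, the agent ID is already available as $AGENT_ID, so a fast run could pipe its test list through select_sample "$AGENT_ID" 10. Different IDs hash differently, so agents generally end up covering different parts of the suite.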

Prefix errors consistently. ERROR: description on a single line makes it easy for Claude to grep for problems. Don’t scatter error information across multiple lines or formats.
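With a fixed single-line prefix, pulling the problems out of a large log takes one grep. A small sketch, assuming the test harness writes lines starting with "ERROR: "; summarize_errors is a hypothetical helper name:

```shell
#!/bin/bash

# Hypothetical helper: summarize single-line "ERROR: ..." entries in a log.
# Prints the total count first, then at most <limit> matching lines.
# Usage: summarize_errors <logfile> [limit]
summarize_errors() {
    local logfile="$1" limit="${2:-20}"
    local count
    # grep -c prints 0 but exits non-zero when nothing matches
    count=$(grep -c '^ERROR: ' "$logfile" || true)
    echo "errors: $count"
    grep '^ERROR: ' "$logfile" | head -n "$limit"
}
```

The limit keeps the output short even when a run produces hundreds of errors, which matters for the same context-window reasons as the test summary.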

Guarding the Repo

With multiple agents pushing code autonomously, it’s easy for one bad commit to cascade. A pre-receive hook on the bare repo, saved as upstream.git/hooks/pre-receive, runs the fast test suite before accepting any push. If tests fail, the push is rejected and the agent has to fix it first.

#!/bin/bash

WORK_DIR=$(mktemp -d)
TEST_LOG=$(mktemp)
trap 'rm -rf "$WORK_DIR" "$TEST_LOG"' EXIT

while read -r oldrev newrev refname; do
    # Branch deletions arrive as an all-zero hash and have no code to test
    [ -z "${newrev//0/}" ] && continue

    # Export the incoming commit into a scratch directory.
    # git archive only reads objects, so the bare repo's HEAD is left alone.
    rm -rf "$WORK_DIR" && mkdir -p "$WORK_DIR"
    git archive "$newrev" | tar -x -C "$WORK_DIR"

    # Run the fast test suite
    if [ -f "$WORK_DIR/run-tests.sh" ]; then
        if ! (cd "$WORK_DIR" && ./run-tests.sh --fast) > "$TEST_LOG" 2>&1; then
            echo "ERROR: Tests failed. Push rejected."
            echo ""
            tail -20 "$TEST_LOG"
            exit 1
        fi
    fi
done

$ chmod +x upstream.git/hooks/pre-receive

This is the single highest-value addition to the whole setup. Without it, an agent pushes broken code, three other agents pull it, build on top of it, and now you have four agents all working on a broken foundation.

Stale Lock Cleanup

If an agent crashes mid-task — container dies, API error, whatever — its lock file stays in current_tasks/ forever. Other agents see the lock and avoid the task, so it never gets picked up again.

A simple cleanup script, saved here as cleanup-locks.sh, removes locks older than a threshold:

#!/bin/bash

MAX_AGE_HOURS="${1:-2}"
REPO_DIR="${2:-./upstream.git}"
WORK_DIR=$(mktemp -d)
trap 'rm -rf "$WORK_DIR"' EXIT

git clone "$REPO_DIR" "$WORK_DIR" 2>/dev/null
cd "$WORK_DIR" || exit 1

CLEANED=0

for lock in current_tasks/*.txt; do
    [ -f "$lock" ] || continue

    # Extract the timestamp from the lock file
    STARTED=$(grep "^started:" "$lock" | awk '{print $2}')
    if [ -z "$STARTED" ]; then
        continue
    fi

    # Compare with current time (GNU date); skip timestamps that don't parse
    LOCK_EPOCH=$(date -d "$STARTED" +%s 2>/dev/null) || continue
    NOW_EPOCH=$(date +%s)
    AGE_HOURS=$(( (NOW_EPOCH - LOCK_EPOCH) / 3600 ))

    if [ "$AGE_HOURS" -ge "$MAX_AGE_HOURS" ]; then
        echo "Removing stale lock: $lock (${AGE_HOURS}h old)"
        git rm "$lock"
        CLEANED=$((CLEANED + 1))
    fi
done

if [ "$CLEANED" -gt 0 ]; then
    git commit -m "cleanup: remove $CLEANED stale lock(s)"
    git push origin main
fi

This script uses GNU date -d, so run it on a Linux host or inside a container. Run it on a cron every 30 minutes or so:

$ crontab -e
*/30 * * * * /path/to/cleanup-locks.sh 2 /path/to/upstream.git

Two hours is a reasonable default. Most tasks finish well within that. If you have longer-running tasks, bump the threshold.

Slack Notifications

A post-receive hook on the bare repo posts to Slack whenever an agent pushes. Lets you passively monitor progress from your phone without watching logs.

#!/bin/bash

SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
REPO_NAME=$(basename "$(pwd)" .git)

# Escape backslashes and double quotes so the text is safe inside JSON
json_escape() {
    sed 's/\\/\\\\/g; s/"/\\"/g'
}

while read -r oldrev newrev refname; do
    BRANCH=$(echo "$refname" | sed 's|refs/heads/||')
    AUTHOR=$(git log -1 --format="%an" "$newrev" | json_escape)
    MESSAGE=$(git log -1 --format="%s" "$newrev" | json_escape)
    CHANGED=$(git diff --stat "$oldrev" "$newrev" 2>/dev/null | tail -1)

    PAYLOAD=$(cat <<EOF
{
    "text": "*${REPO_NAME}* - ${AUTHOR} pushed to \`${BRANCH}\`\n>${MESSAGE}\n\`\`\`${CHANGED}\`\`\`"
}
EOF
)

    curl -s -X POST -H "Content-Type: application/json" \
         -d "$PAYLOAD" "$SLACK_WEBHOOK" > /dev/null
done

$ chmod +x upstream.git/hooks/post-receive

You get a message per push with the agent name, commit message, and a quick diffstat. Enough to know things are moving without opening a terminal.

Monitoring

A monitoring script gives you the full picture at a glance:

#!/bin/bash

UPSTREAM="upstream.git"

echo "=== Agent Status ==="
echo ""

for container in $(docker compose ps -q); do
    NAME=$(docker inspect --format '{{.Name}}' "$container" | sed 's/^\/\+//')
    STATUS=$(docker inspect --format '{{.State.Status}}' "$container")
    LAST_LOG=$(docker logs --tail 1 "$container" 2>&1)

    echo "$NAME [$STATUS]: $LAST_LOG"
done

echo ""
echo "=== Git Status ==="
echo "Commits in last hour: $(git --git-dir="$UPSTREAM" log --since='1 hour ago' --oneline | wc -l | tr -d ' ')"
echo "Latest commit: $(git --git-dir="$UPSTREAM" log -1 --oneline)"
echo ""
echo "=== Active Tasks ==="
# ls-tree prints only the entries, so an empty result falls through to "None"
git --git-dir="$UPSTREAM" ls-tree --name-only HEAD current_tasks/ 2>/dev/null \
    | grep -v '\.gitkeep$' || echo "None"

Run it in a watch loop:

$ watch -n 30 ./monitor.sh

Cost

This burns through API credits fast. Some rough numbers based on Opus 4.6 pricing:

A single agent session runs maybe 10-30 minutes depending on the task
Each session uses roughly 50-200k input tokens and 5-20k output tokens
With 4 agents running continuously for 8 hours, expect $200-800 depending on task complexity

Sonnet is significantly cheaper if your tasks don’t need Opus-level reasoning. For straightforward bug fixes, refactoring, or test writing, Sonnet handles it fine at a fraction of the cost.

Limitations

Worth being honest about where this falls apart:

Merge conflicts get messy. Claude handles simple conflicts fine. Complex ones — especially in the same function — can produce broken merges. More agents means more conflicts.
No real communication between agents. The shared files (TODO.md, session logs) are a rough approximation. Agents can’t discuss a design decision or coordinate on an approach.
Claude can get stuck in loops. Without a human to course-correct, an agent might spend an entire session trying the same failing approach repeatedly. The STUCK.md convention helps, but doesn’t fully solve this.
Tests are the bottleneck. The quality of autonomous output is directly limited by the quality of your test suite. No tests, no guardrails.
Context window resets. Each new session starts fresh. Claude has to re-read project files every time, which wastes tokens and time. Session logs mitigate this but don’t eliminate it.

Wrapping Up

The core setup is a handful of files: an agent loop, a prompt, a Dockerfile with its entrypoint, a setup script, and a compose file. Everything else (scaling scripts, specialized agents, pre-receive hooks, monitoring) builds on top of that foundation.

The hard part isn’t the infrastructure — it’s writing good tests and prompts that keep agents productive without supervision. Most of your time should go into the test harness and the agent prompt, not the scaffolding.

Start small. Run one agent on a well-tested project and watch what it does. Add more agents once you’re confident the tests catch regressions. Scale up the ambition from there.