harness
by mikkelkrogsholmv1.0.0
Framework for autonomous multi-session development. Breaks projects into discrete features and implements them one at a time with clean state between sessions.
MIT GitHub
Keywords
autonomouslong-runningincrementalmulti-sessionfeaturesworkflow
Commands
continueAutonomously implement ALL remaining features until complete
initInitialize a new long-running project for incremental development
statusShow project status without making changes
Documentation
<p align="center">
<img src="logo.svg" alt="Harness Logo" width="400" height="400">
</p>
# Harness
A Claude Code plugin for autonomous multi-session development. Based on [Anthropic's research on effective harnesses for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents).
## Philosophy
Large projects can't be completed in a single session. This plugin provides a framework for:
- **Breaking work into discrete tasks** - 50-200 testable tasks per project
- **Clarifying requirements upfront** - Asks questions before generating tasks
- **Documentation-driven development** - Each task includes relevant doc URLs
- **Native task integration** - Uses Claude Code's built-in task system for tracking
- **Autonomous execution** - Runs through ALL tasks without human intervention
- **Atomic commits** - Each task gets its own commit
- **Tracking progress visibly** - Native UI, session logs, and git history tell the story
## Installation
### Option A: Add as Marketplace (Recommended)
This method persists across sessions and can be shared with your team.
**Step 1: Add the marketplace**
```bash
# Inside Claude Code
/plugin marketplace add mikkelkrogsholm/harness
```
**Step 2: Install the plugin**
```bash
# Install for yourself (user scope - works across all projects)
/plugin install harness@mikkelkrogsholm-harness
# OR install for this project only (project scope - shared with team via git)
/plugin install harness@mikkelkrogsholm-harness --scope project
```
Commands will be namespaced as `/harness:init`, `/harness:continue`, `/harness:status`.
### Option B: Test with --plugin-dir (Session Only)
For quick testing without permanent installation:
```bash
git clone https://github.com/mikkelkrogsholm/harness /tmp/harness-plugin
claude --plugin-dir /tmp/harness-plugin
```
Note: This only loads the plugin for that session.
### Option C: Manual Installation (Copy to .claude/)
For maximum control or when marketplace isn't available:
```bash
# Clone the repo
git clone https://github.com/mikkelkrogsholm/harness /tmp/harness-plugin
# Create .claude directories in your project
mkdir -p .claude/{commands,agents,scripts}
# Copy components
cp /tmp/harness-plugin/commands/*.md .claude/commands/
cp /tmp/harness-plugin/agents/*.md .claude/agents/
cp /tmp/harness-plugin/scripts/*.sh .claude/scripts/
# Make scripts executable
chmod +x .claude/scripts/*.sh
# Fix hook paths for manual installation
sed -i '' 's|\${CLAUDE_PLUGIN_ROOT}/scripts/|.claude/scripts/|g' .claude/agents/*.md
```
Commands will be `/harness-init`, `/harness-continue`, `/harness-status` (no namespace prefix).
### Option D: Project-Level Plugin Configuration
Add to your project's `.claude/settings.json` to auto-prompt team members:
```json
{
"extraKnownMarketplaces": ["mikkelkrogsholm/harness"],
"enabledPlugins": ["harness@mikkelkrogsholm-harness"]
}
```
When team members open the project and trust the folder, they'll be prompted to install.
## Usage
### 1. Initialize a Project
```
/harness:init Build a task management app with auth, projects, and real-time sync
```
(Or `/harness-init` for manual installation)
The bootstrap agent will:
1. **Ask clarifying questions** about:
- Tech stack (frontend, backend, database)
- Authentication method
- Scope (MVP vs full product)
- Third-party integrations
- Special requirements
2. **Research documentation** for chosen technologies
3. **Create tasks and files**:
- **Native tasks** in Claude Code's task system with metadata:
- Priority (1-5)
- Category (core, functional, ui, integration, polish)
- Documentation URLs
- Verification steps
- `claude-progress.txt` - Session log with assumptions documented
- `init.sh` - Development environment setup
### 2. Continue Development
```
/harness:continue
```
(Or `/harness-continue` for manual installation)
This runs autonomously until complete:
```
WHILE incomplete tasks exist:
1. Get next pending task (by priority)
2. Call incremental-workflow agent (fresh context)
3. Agent consults documentation URLs from task metadata
4. Agent implements, verifies, commits
5. Mark task completed, continue to next
END WHILE
```
Stops only when all tasks are completed or all remaining are blocked.
### 3. Check Status
```
/harness:status
```
(Or `/harness-status` for manual installation)
Shows progress without making changes:
- Overall completion percentage
- Progress by category (from task metadata)
- Recent session activity
- Git state
- Next task to implement
- Native Claude Code task UI integration
## How It Works
### Orchestrator Pattern
The `/harness:continue` command acts as an **orchestrator** that loops through tasks. For each task, it delegates to the `incremental-workflow` agent which runs in its own **isolated context**. This gives each task implementation:
- Fresh context window (no token buildup)
- Isolated execution
- Clean return to orchestrator
This is the same pattern as the proven [Motlin /todo-all method](https://motlin.com/blog/claude-code-running-for-hours).
### Native Task Integration
Tasks are stored in Claude Code's native task system, providing:
- **Visual tracking** in the Claude Code UI
- **No external dependencies** - no jq required
- **Persistent across sessions** via SessionStart/SessionEnd hooks
- **Metadata support** for priority, category, verification, and documentation URLs
- **Clean API** via TaskCreate, TaskList, TaskGet, TaskUpdate tools
### Two Agents
| Agent | Purpose | Hooks |
|-------|---------|-------|
| `project-bootstrap` | Asks questions, researches docs, creates tasks | Stop: Auto-commits bootstrap files |
| `incremental-workflow` | Consults docs, implements ONE task per invocation | Stop: Requires clean git state |
### Hook Flow
```
Session Lifecycle:
SessionStart → load-tasks.sh → Loads .claude/tasks/*.json into native task system
[Work happens...]
SessionEnd → save-tasks.sh → Saves native tasks back to .claude/tasks/*.json
Bootstrap (once per project):
project-bootstrap agent runs
Stop hook → bootstrap-commit.sh → Auto-commits progress log and init.sh
Workflow (per task):
incremental-workflow agent runs
[Implementation work...]
Stop hook → verify-clean-state.sh → Ensures clean git before stopping
```
### Documentation-Driven Development
Each task includes metadata with relevant documentation URLs:
```json
{
"subject": "Set up Next.js 14 project with App Router",
"description": "Initialize Next.js project with TypeScript and App Router...",
"activeForm": "Setting up Next.js project",
"metadata": {
"priority": 1,
"category": "core",
"documentation": [
"https://nextjs.org/docs/getting-started/installation",
"https://nextjs.org/docs/app/building-your-application/routing"
],
"verification": [
"npm run dev works",
"TypeScript compiles"
]
}
}
```
The `incremental-workflow` agent **fetches and reads these docs** before implementing each task, ensuring:
- Correct API usage
- Following official patterns
- Avoiding common pitfalls
### Hook Enforcement
The plugin uses hooks to enforce discipline and manage task persistence:
**SessionStart Hook** (setup-harness.sh):
- Creates `.claude/tasks/` directory structure
- Runs load-tasks.sh to load persisted tasks into native system
**SessionEnd Hook** (save-tasks.sh):
- Saves all native tasks to `.claude/tasks/*.json`
- Ensures task state persists across sessions
**Stop Hook** (incremental-workflow via verify-clean-state.sh):
- Checks for uncommitted changes
- Blocks stopping if working tree is dirty
- Forces clean handoffs between sessions
**Stop Hook** (project-bootstrap via bootstrap-commit.sh):
- Auto-commits bootstrap files after initialization
- Creates clean starting point for development
### Files Created
| File/Directory | Purpose |
|----------------|---------|
| `.claude/tasks/*.json` | Task persistence files (auto-managed by hooks) |
| `claude-progress.txt` | Session-by-session log with assumptions documented |
| `init.sh` | Script to set up development environment |
### Task Structure
Tasks are stored in Claude Code's native task system with rich metadata:
```json
{
"taskId": "task-001",
"subject": "Set up Next.js 14 project with App Router",
"description": "Initialize Next.js project with TypeScript, App Router, and configure Tailwind CSS.\n\nVerification:\n- npm run dev works\n- TypeScript compiles\n- Tailwind styles apply",
"activeForm": "Setting up Next.js project",
"status": "pending",
"metadata": {
"priority": 1,
"category": "core",
"documentation": [
"https://nextjs.org/docs/getting-started/installation",
"https://nextjs.org/docs/app/building-your-application/routing"
],
"verification": [
"npm run dev works",
"TypeScript compiles",
"Tailwind styles apply"
],
"tech_stack": {
"frontend": "Next.js 14",
"backend": "Node.js with tRPC",
"database": "PostgreSQL with Drizzle ORM",
"auth": "Better Auth"
}
}
}
```
**Metadata Fields:**
- `priority`: 1 (critical) → 5 (nice-to-have)
- `category`: `core`, `functional`, `ui`, `integration`, `polish`
- `documentation`: Array of relevant documentation URLs
- `verification`: Array of verification steps
- `tech_stack`: Technology choices (stored in first task only)
## Directory Structure
```
harness/
├── .claude-plugin/
│ └── plugin.json # Plugin manifest
├── agents/
│ ├── project-bootstrap.md # Bootstrap agent (asks questions, creates tasks)
│ └── incremental-workflow.md # Workflow agent (implements tasks)
├── commands/
│ ├── init.md # /harness:init
│ ├── continue.md # /harness:continue
│ └── status.md # /harness:status
├── hooks/
│ └── hooks.json # Hook configuration
├── scripts/
│ ├── setup-harness.sh # SessionStart hook (creates .claude/tasks/)
│ ├── load-tasks.sh # Loads persisted tasks
│ ├── save-tasks.sh # SessionEnd hook (saves tasks)
│ ├── bootstrap-commit.sh # Stop hook for bootstrap
│ ├── verify-clean-state.sh # Stop hook for workflow
│ └── migrate-features.sh # Migration utility
└── README.md
```
## Requirements
- Claude Code 1.0.33+ (for plugins with agents and hooks)
- Git (for commit hooks)
- No external dependencies (uses native task tools)
## Migration from feature_list.json
If you have an existing project using the old `feature_list.json` format, you can migrate to the task-based system:
### Automatic Migration
The `/harness:init` command now detects existing `feature_list.json` files and offers to migrate them:
```bash
# If feature_list.json exists
/harness:init
# Follow the prompts to migrate to task-based system
```
The migration will:
1. Convert each feature to a native task
2. Preserve priority, category, verification, and documentation
3. Back up `feature_list.json` to `feature_list.json.bak`
4. Create `.claude/tasks/` directory with persisted tasks
### Manual Migration
If you prefer manual migration, use the migration script:
```bash
# Run the migration script directly
./.claude/scripts/migrate-features.sh
# Or from the plugin directory
./scripts/migrate-features.sh /path/to/your/project
```
This creates a migration guide in `MIGRATION_GUIDE.json` that you can use with Claude Code's task tools.
## Benefits of Native Task Integration
The task-based architecture provides significant advantages over the previous `feature_list.json` approach:
### Visual Progress Tracking
- Tasks appear directly in Claude Code's native UI
- See task status, priorities, and categories at a glance
- No need to open external files to check progress
### No External Dependencies
- No `jq` installation required
- Uses Claude Code's built-in TaskCreate, TaskList, TaskGet, TaskUpdate tools
- Cleaner, more maintainable codebase
### Better Persistence
- SessionStart/SessionEnd hooks automatically manage task state
- Tasks persist across sessions via `.claude/tasks/*.json`
- Git-friendly format for team collaboration
### Rich Metadata Support
- Store priority, category, verification steps, and documentation URLs
- Access task details programmatically via TaskGet
- Filter and query tasks by metadata
### Cleaner Hook System
- Removed need for prevent-feature-edit.sh hook
- Task specifications are naturally immutable via TaskCreate
- Simpler hook configuration in hooks.json
### Improved Developer Experience
- Native tool integration feels more natural
- Better error messages from built-in tools
- Consistent API across all task operations
## Tips
### Starting Fresh
If you want to restart a project, delete the generated files:
```bash
rm -rf .claude/tasks/ claude-progress.txt init.sh
```
### Resuming After Context Limits
If a session hits context limits before completing, just run `/harness:continue` again. It picks up where it left off thanks to task persistence.
### Viewing Progress
Progress is visible directly in the Claude Code UI via the native task panel. You can also check:
```bash
# View progress log
cat claude-progress.txt
# List persisted task files
ls -la .claude/tasks/
# View git history
git log --oneline
```
### Managing the Plugin
```bash
# Update to latest version
/plugin update harness@mikkelkrogsholm-harness
# Disable temporarily
/plugin disable harness@mikkelkrogsholm-harness
# Re-enable
/plugin enable harness@mikkelkrogsholm-harness
# Uninstall
/plugin uninstall harness@mikkelkrogsholm-harness
```
## License
MIT