mirror of https://github.com/eyaltoledano/claude-task-master.git synced 2025-11-14 09:05:04 +00:00

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

2025-10-18 16:29:03 +02:00

12 KiB

Raw Permalink Blame History

Phase 1: Core Rails - State Machine & Orchestration

Objective

Build the WorkflowOrchestrator as a state machine that guides AI sessions through TDD workflow, rather than directly executing code.

Architecture Overview

Execution Model

The orchestrator acts as a state manager and guide, not a code executor:

┌─────────────────────────────────────────────────────────────┐
│                    Claude Code (MCP Client)                  │
│  - Queries "what to do next"                                │
│  - Executes work (writes tests, code, runs commands)        │
│  - Reports completion                                        │
└────────────────┬────────────────────────────────────────────┘
                 │ MCP Protocol
                 ▼
┌─────────────────────────────────────────────────────────────┐
│              WorkflowOrchestrator (tm-core)                  │
│  - Maintains state machine (RED → GREEN → COMMIT)           │
│  - Returns work units with context                          │
│  - Validates preconditions                                  │
│  - Records progress                                         │
│  - Persists state for resumability                          │
└─────────────────────────────────────────────────────────────┘

Why This Approach?

Separation of Concerns: State management separate from code execution
Leverage Existing Tools: Uses Claude Code's capabilities instead of reimplementing
Human-in-the-Loop: Easy to inspect state and intervene at any phase
Simpler Implementation: Orchestrator is pure logic, no AI model integration needed
Flexible Executors: Any tool (Claude Code, human, other AI) can execute work units

Core Components

1. WorkflowOrchestrator Service

Location: packages/tm-core/src/services/workflow-orchestrator.service.ts

Responsibilities:

Track current phase (RED/GREEN/COMMIT) per subtask
Generate work units with context for each phase
Validate phase completion criteria
Advance state machine on successful completion
Handle errors and retry logic
Persist run state for resumability

API:

interface WorkflowOrchestrator {
  // Start a new autopilot run
  startRun(taskId: string, options?: RunOptions): Promise<RunContext>;

  // Get next work unit to execute
  getNextWorkUnit(runId: string): Promise<WorkUnit | null>;

  // Report work unit completion
  completeWorkUnit(
    runId: string,
    workUnitId: string,
    result: WorkUnitResult
  ): Promise<void>;

  // Get current run state
  getRunState(runId: string): Promise<RunState>;

  // Pause/resume
  pauseRun(runId: string): Promise<void>;
  resumeRun(runId: string): Promise<void>;
}

interface WorkUnit {
  id: string;                    // Unique work unit ID
  phase: 'RED' | 'GREEN' | 'COMMIT';
  subtaskId: string;             // e.g., "42.1"
  action: string;                // Human-readable description
  context: WorkUnitContext;      // All info needed to execute
  preconditions: Precondition[]; // Checks before execution
}

interface WorkUnitContext {
  taskId: string;
  taskTitle: string;
  subtaskTitle: string;
  subtaskDescription: string;
  dependencies: string[];        // Completed subtask IDs
  testCommand: string;           // e.g., "npm test"

  // Phase-specific context
  redPhase?: {
    testFile: string;            // Where to create test
    testFramework: string;       // e.g., "vitest"
    acceptanceCriteria: string[];
  };

  greenPhase?: {
    testFile: string;            // Test to make pass
    implementationHints: string[];
    expectedFiles: string[];     // Files likely to modify
  };

  commitPhase?: {
    commitMessage: string;       // Pre-generated message
    filesToCommit: string[];     // Files modified in RED+GREEN
  };
}

interface WorkUnitResult {
  success: boolean;
  phase: 'RED' | 'GREEN' | 'COMMIT';

  // RED phase results
  testsCreated?: string[];
  testsFailed?: number;

  // GREEN phase results
  testsPassed?: number;
  filesModified?: string[];
  attempts?: number;

  // COMMIT phase results
  commitSha?: string;

  // Common
  error?: string;
  logs?: string;
}

interface RunState {
  runId: string;
  taskId: string;
  status: 'running' | 'paused' | 'completed' | 'failed';
  currentPhase: 'RED' | 'GREEN' | 'COMMIT';
  currentSubtask: string;
  completedSubtasks: string[];
  failedSubtasks: string[];
  startTime: Date;
  lastUpdateTime: Date;

  // Resumability
  checkpoint: {
    subtaskId: string;
    phase: 'RED' | 'GREEN' | 'COMMIT';
    attemptNumber: number;
  };
}

2. State Machine Logic

Phase Transitions:

START → RED(subtask 1) → GREEN(subtask 1) → COMMIT(subtask 1)
                                               ↓
                        RED(subtask 2) ← ─ ─ ─ ┘
                             ↓
                        GREEN(subtask 2)
                             ↓
                        COMMIT(subtask 2)
                             ↓
                           (repeat for remaining subtasks)
                             ↓
                        FINALIZE → END

Phase Rules:

RED: Can only transition to GREEN if tests created and failing
GREEN: Can only transition to COMMIT if tests passing (attempt < maxAttempts)
COMMIT: Can only transition to next RED if commit successful
FINALIZE: Can only start if all subtasks completed

Preconditions:

RED: No uncommitted changes (or staged from previous GREEN that failed)
GREEN: RED phase complete, tests exist and are failing
COMMIT: GREEN phase complete, all tests passing, coverage meets threshold

3. MCP Integration

New MCP Tools (expose WorkflowOrchestrator via MCP):

// Start an autopilot run
mcp__task_master_ai__autopilot_start(taskId: string, dryRun?: boolean)

// Get next work unit
mcp__task_master_ai__autopilot_next_work_unit(runId: string)

// Complete current work unit
mcp__task_master_ai__autopilot_complete_work_unit(
  runId: string,
  workUnitId: string,
  result: WorkUnitResult
)

// Get run state
mcp__task_master_ai__autopilot_get_state(runId: string)

// Pause/resume
mcp__task_master_ai__autopilot_pause(runId: string)
mcp__task_master_ai__autopilot_resume(runId: string)

4. Git/Test Adapters

GitAdapter (packages/tm-core/src/services/git-adapter.service.ts):

Check working tree status
Validate branch state
Read git config (user, remote, default branch)
Does NOT execute git commands (that's executor's job)

TestAdapter (packages/tm-core/src/services/test-adapter.service.ts):

Detect test framework from package.json
Parse test output (failures, passes, coverage)
Validate coverage thresholds
Does NOT run tests (that's executor's job)

5. Run State Persistence

Storage Location: .taskmaster/reports/runs/<runId>/

Files:

state.json - Current run state (for resumability)
log.jsonl - Event stream (timestamped work unit completions)
manifest.json - Run metadata
work-units.json - All work units generated for this run

Example state.json:

{
  "runId": "2025-01-15-142033",
  "taskId": "42",
  "status": "paused",
  "currentPhase": "GREEN",
  "currentSubtask": "42.2",
  "completedSubtasks": ["42.1"],
  "failedSubtasks": [],
  "checkpoint": {
    "subtaskId": "42.2",
    "phase": "GREEN",
    "attemptNumber": 2
  },
  "startTime": "2025-01-15T14:20:33Z",
  "lastUpdateTime": "2025-01-15T14:35:12Z"
}

Implementation Plan

Step 1: WorkflowOrchestrator Skeleton

Create workflow-orchestrator.service.ts with interfaces
Implement state machine logic (phase transitions)
Add run state persistence (state.json, log.jsonl)
Write unit tests for state machine

Step 2: Work Unit Generation

Implement getNextWorkUnit() with context assembly
Generate RED phase work units (test file paths, criteria)
Generate GREEN phase work units (implementation hints)
Generate COMMIT phase work units (commit messages)

Step 3: Git/Test Adapters

Create GitAdapter for status checks only
Create TestAdapter for output parsing only
Add precondition validation using adapters
Write adapter unit tests

Step 4: MCP Integration

Add MCP tool definitions in packages/mcp-server/src/tools/
Wire up WorkflowOrchestrator to MCP tools
Test MCP tools via Claude Code
Document MCP workflow in CLAUDE.md

Step 5: CLI Integration

Update autopilot.command.ts to call WorkflowOrchestrator
Add --interactive mode that shows work units and waits for completion
Add --resume flag to continue paused runs
Test end-to-end flow

Step 6: Integration Testing

Create test task with 2-3 subtasks
Run autopilot start → get work unit → complete → repeat
Verify state persistence and resumability
Test failure scenarios (test failures, git issues)

Success Criteria

WorkflowOrchestrator can generate work units for all phases
MCP tools allow Claude Code to query and complete work units
State persists correctly between work unit completions
Run can be paused and resumed from checkpoint
Adapters validate preconditions without executing commands
End-to-end: Claude Code can complete a simple task via work units

Out of Scope (Phase 1)

Actual git operations (branch creation, commits) - executor handles this
Actual test execution - executor handles this
PR creation - deferred to Phase 2
TUI interface - deferred to Phase 3
Coverage enforcement - deferred to Phase 2

Example Usage Flow

# Terminal 1: Claude Code session
$ claude

# In Claude Code (via MCP):
> Start autopilot for task 42
[Calls mcp__task_master_ai__autopilot_start(42)]
→ Run started: run-2025-01-15-142033

> Get next work unit
[Calls mcp__task_master_ai__autopilot_next_work_unit(run-2025-01-15-142033)]
→ Work unit: RED phase for subtask 42.1
→ Action: Generate failing tests for metrics schema
→ Test file: src/__tests__/schema.test.js
→ Framework: vitest

> [Claude Code creates test file, runs tests]

> Complete work unit
[Calls mcp__task_master_ai__autopilot_complete_work_unit(
  run-2025-01-15-142033,
  workUnit-42.1-RED,
  { success: true, testsCreated: ['src/__tests__/schema.test.js'], testsFailed: 3 }
)]
→ Work unit completed. State saved.

> Get next work unit
[Calls mcp__task_master_ai__autopilot_next_work_unit(run-2025-01-15-142033)]
→ Work unit: GREEN phase for subtask 42.1
→ Action: Implement code to pass failing tests
→ Test file: src/__tests__/schema.test.js
→ Expected implementation: src/schema.js

> [Claude Code implements schema.js, runs tests, confirms all pass]

> Complete work unit
[...]
→ Work unit completed. Ready for COMMIT.

> Get next work unit
[...]
→ Work unit: COMMIT phase for subtask 42.1
→ Commit message: "feat(metrics): add metrics schema (task 42.1)"
→ Files to commit: src/__tests__/schema.test.js, src/schema.js

> [Claude Code stages files and commits]

> Complete work unit
[...]
→ Subtask 42.1 complete! Moving to 42.2...

Dependencies

Existing TaskService (task loading, status updates)
Existing PreflightChecker (environment validation)
Existing TaskLoaderService (dependency ordering)
MCP server infrastructure

Estimated Effort

7-10 days

Next Phase

Phase 2 will add:

PR creation via gh CLI
Coverage enforcement
Enhanced error recovery
Full resumability testing

12 KiB Raw Permalink Blame History