claude-task-master/.taskmaster/docs/tdd-workflow-phase-2-pr-resumability.md
Ralph Khreish ccb87a516a
feat: implement tdd workflow (#1309)
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-10-18 16:29:03 +02:00

10 KiB

Phase 2: PR + Resumability - Autonomous TDD Workflow

Objective

Add PR creation with GitHub CLI integration, resumable checkpoints for interrupted runs, and enhanced guardrails with coverage enforcement.

Scope

  • GitHub PR creation via gh CLI
  • Well-formed PR body using run report
  • Resumable checkpoints and --resume flag
  • Coverage enforcement before finalization
  • Optional lint/format step
  • Enhanced error recovery

Deliverables

1. PR Creation Integration

PRAdapter (packages/tm-core/src/services/pr-adapter.ts):

class PRAdapter {
  async isGHAvailable(): Promise<boolean>
  async createPR(options: PROptions): Promise<PRResult>
  async getPRTemplate(runReport: RunReport): Promise<string>

  // Fallback for missing gh CLI
  async getManualPRInstructions(options: PROptions): Promise<string>
}

interface PROptions {
  branch: string
  base: string
  title: string
  body: string
  draft?: boolean
}

interface PRResult {
  url: string
  number: number
}

PR Title Format:

Task #<id> [<tag>]: <title>

Example: Task #42 [analytics]: User metrics tracking

PR Body Template:

Located at .taskmaster/templates/pr-body.md:

## Summary

Implements Task #42 from TaskMaster autonomous workflow.

**Branch:** {branch}
**Tag:** {tag}
**Subtasks completed:** {subtaskCount}

{taskDescription}

## Subtasks

{subtasksList}

## Test Coverage

| Metric | Coverage |
|--------|----------|
| Lines | {lines}% |
| Branches | {branches}% |
| Functions | {functions}% |
| Statements | {statements}% |

**All subtasks passed with {totalTests} tests.**

## Commits

{commitsList}

## Run Report

Full execution report: `.taskmaster/reports/runs/{runId}/`

---

🤖 Generated with [Task Master](https://github.com/cline/task-master) autonomous TDD workflow

Token replacement:

  • {branch} → branch name
  • {tag} → active tag
  • {subtaskCount} → number of completed subtasks
  • {taskDescription} → task description from TaskMaster
  • {subtasksList} → markdown list of subtask titles
  • {lines}, {branches}, {functions}, {statements} → coverage percentages
  • {totalTests} → total test count
  • {commitsList} → markdown list of commit SHAs and messages
  • {runId} → run ID timestamp

2. GitHub CLI Integration

Detection:

which gh

If not found, show fallback instructions:

✓ Branch pushed: analytics/task-42-user-metrics
✗ gh CLI not found - cannot create PR automatically

To create PR manually:
  gh pr create \
    --base main \
    --head analytics/task-42-user-metrics \
    --title "Task #42 [analytics]: User metrics tracking" \
    --body-file .taskmaster/reports/runs/2025-01-15-142033/pr.md

Or visit:
  https://github.com/org/repo/compare/main...analytics/task-42-user-metrics

Confirmation gate:

Ready to create PR:
  Title: Task #42 [analytics]: User metrics tracking
  Base: main
  Head: analytics/task-42-user-metrics

Create PR? [Y/n]

Unless --no-confirm flag is set.

3. Resumable Workflow

State Checkpoint (state.json):

{
  "runId": "2025-01-15-142033",
  "taskId": "42",
  "phase": "subtask-loop",
  "currentSubtask": "42.2",
  "currentPhase": "green",
  "attempts": 2,
  "completedSubtasks": ["42.1"],
  "commits": ["a1b2c3d"],
  "branch": "analytics/task-42-user-metrics",
  "tag": "analytics",
  "canResume": true,
  "pausedAt": "2025-01-15T14:25:35Z",
  "pausedReason": "max_attempts_reached",
  "nextAction": "manual_review_required"
}

Resume Command:

$ tm autopilot --resume

Resuming run: 2025-01-15-142033
  Task: #42 [analytics] User metrics tracking
  Branch: analytics/task-42-user-metrics
  Last subtask: 42.2 (GREEN phase, attempt 2/3 failed)
  Paused: 5 minutes ago

Reason: Could not achieve green state after 3 attempts
Last error: POST /api/metrics returns 500 instead of 201

Resume from subtask 42.2 GREEN phase? [Y/n]

Resume logic:

  1. Load state from .taskmaster/reports/runs/<runId>/state.json
  2. Verify branch still exists and is checked out
  3. Verify no uncommitted changes (unless --force)
  4. Continue from last checkpoint phase
  5. Update state file as execution progresses

Multiple interrupted runs:

$ tm autopilot --resume

Found 2 resumable runs:
  1. 2025-01-15-142033 - Task #42 (paused 5 min ago at subtask 42.2 GREEN)
  2. 2025-01-14-103022 - Task #38 (paused 2 hours ago at subtask 38.3 RED)

Select run to resume [1-2]:

4. Coverage Enforcement

Coverage Check Phase (before finalization):

async function enforceCoverage(runId: string): Promise<void> {
  const testResults = await testRunner.runAll()
  const coverage = await testRunner.getCoverage()

  const thresholds = config.test.coverageThresholds
  const failures = []

  if (coverage.lines < thresholds.lines) {
    failures.push(`Lines: ${coverage.lines}% < ${thresholds.lines}%`)
  }
  // ... check branches, functions, statements

  if (failures.length > 0) {
    throw new CoverageError(
      `Coverage thresholds not met:\n${failures.join('\n')}`
    )
  }

  // Store coverage in run report
  await storeRunArtifact(runId, 'coverage.json', coverage)
}

Handling coverage failures:

⚠️  Coverage check failed:
  Lines: 78.5% < 80%
  Branches: 75.0% < 80%

Options:
  1. Add more tests and resume
  2. Lower thresholds in .taskmaster/config.json
  3. Skip coverage check: tm autopilot --resume --skip-coverage

Run paused. Fix coverage and resume with:
  tm autopilot --resume

5. Optional Lint/Format Step

Configuration:

{
  "autopilot": {
    "finalization": {
      "lint": {
        "enabled": true,
        "command": "npm run lint",
        "fix": true,
        "failOnError": false
      },
      "format": {
        "enabled": true,
        "command": "npm run format",
        "commitChanges": true
      }
    }
  }
}

Execution:

Finalization Steps:

  ✓ All tests passing (12 tests, 0 failures)
  ✓ Coverage thresholds met (85% lines, 82% branches)

  LINT Running linter... ⏳
  LINT ✓ No lint errors

  FORMAT Running formatter... ⏳
  FORMAT ✓ Formatted 3 files
  FORMAT ✓ Committed formatting changes: "chore: auto-format code"

  PUSH Pushing to origin... ⏳
  PUSH ✓ Pushed analytics/task-42-user-metrics

  PR Creating pull request... ⏳
  PR ✓ Created PR #123
      https://github.com/org/repo/pull/123

6. Enhanced Error Recovery

Pause Points:

  • Max GREEN attempts reached (current)
  • Coverage check failed (new)
  • Lint errors (if failOnError: true)
  • Git push failed (new)
  • PR creation failed (new)

Each pause saves:

  • Full state checkpoint
  • Last command output
  • Suggested next actions
  • Resume instructions

Automatic recovery attempts:

  • Git push: retry up to 3 times with backoff
  • PR creation: fall back to manual instructions
  • Lint: auto-fix if enabled, otherwise pause

7. Finalization Phase Enhancement

Updated workflow:

  1. Run full test suite
  2. Check coverage thresholds → pause if failed
  3. Run lint (if enabled) → pause if failed and failOnError: true
  4. Run format (if enabled) → auto-commit changes
  5. Confirm push (unless --no-confirm)
  6. Push branch → retry on failure
  7. Generate PR body from template
  8. Create PR via gh → fall back to manual instructions
  9. Update task status to 'review' (configurable)
  10. Save final run report

Final output:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

✅ Task #42 [analytics]: User metrics tracking - COMPLETE

  Branch: analytics/task-42-user-metrics
  Subtasks completed: 3/3
  Commits: 3
  Total tests: 12 (12 passed, 0 failed)
  Coverage: 85% lines, 82% branches, 88% functions, 85% statements

  PR #123: https://github.com/org/repo/pull/123

  Run report: .taskmaster/reports/runs/2025-01-15-142033/

Next steps:
  - Review PR and request changes if needed
  - Merge when ready
  - Task status updated to 'review'

Completed in 24 minutes

CLI Updates

New flags:

  • --resume → Resume from last checkpoint
  • --skip-coverage → Skip coverage checks
  • --skip-lint → Skip lint step
  • --skip-format → Skip format step
  • --skip-pr → Push branch but don't create PR
  • --draft-pr → Create draft PR instead of ready-for-review

Configuration Updates

Add to .taskmaster/config.json:

{
  "autopilot": {
    "finalization": {
      "lint": {
        "enabled": false,
        "command": "npm run lint",
        "fix": true,
        "failOnError": false
      },
      "format": {
        "enabled": false,
        "command": "npm run format",
        "commitChanges": true
      },
      "updateTaskStatus": "review"
    }
  },
  "git": {
    "pr": {
      "enabled": true,
      "base": "default",
      "bodyTemplate": ".taskmaster/templates/pr-body.md",
      "draft": false
    },
    "pushRetries": 3,
    "pushRetryDelay": 5000
  }
}

Success Criteria

  • Can create PR automatically with well-formed body
  • Can resume interrupted runs from any checkpoint
  • Coverage checks prevent low-quality code from being merged
  • Clear error messages and recovery paths for all failure modes
  • Run reports include full PR context for review

Out of Scope (defer to Phase 3)

  • Multiple test framework support (pytest, go test)
  • Diff preview before commits
  • TUI panel implementation
  • Extension/IDE integration

Testing Strategy

  • Mock gh CLI for PR creation tests
  • Test resume from each possible pause point
  • Test coverage failure scenarios
  • Test lint/format integration with mock commands
  • End-to-end test with PR creation on test repo

Dependencies

  • Phase 1 completed (core workflow)
  • GitHub CLI (gh) installed (optional, fallback provided)
  • Test framework supports coverage output

Estimated Effort

1-2 weeks

Risks & Mitigations

  • Risk: GitHub CLI auth issues

    • Mitigation: Clear auth setup docs, fallback to manual instructions
  • Risk: PR body template doesn't match all project needs

    • Mitigation: Make template customizable via config path
  • Risk: Resume state gets corrupted

    • Mitigation: Validate state on load, provide --force-reset option
  • Risk: Coverage calculation differs between runs

    • Mitigation: Store coverage with each test run for comparison

Validation

Test with:

  • Successful PR creation end-to-end
  • Resume from GREEN attempt failure
  • Resume from coverage failure
  • Resume from lint failure
  • Missing gh CLI (fallback to manual)
  • Lint/format integration enabled
  • Multiple interrupted runs (selection UI)