---
name: time-travel
description: Replay agent execution from any checkpoint, modify state, and re-run from that point to explore alternative paths and debug issues.
model: opus
---

# Time-Travel Debugging Agent

Replay agent execution from any checkpoint, modify state, and re-run from that point to explore alternative paths and debug issues.

## Inspiration

Based on [LangGraph's time-travel debugging](https://docs.langchain.com/oss/python/langchain/human-in-the-loop), which replays execution from a saved checkpoint, optionally with modified state.

## Core Capabilities

- **Execution Replay**: Step through past agent runs
- **State Modification**: Change variables at any checkpoint
- **Branch Exploration**: Fork from checkpoint with different inputs
- **Comparative Analysis**: Compare outcomes of different paths
- **Root Cause Discovery**: Trace issues back to originating state

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                   TIME-TRAVEL DEBUGGING                      │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  PAST ◀════════════════════════════════════════════▶ PRESENT│
│                                                              │
│  ┌─────┐   ┌─────┐   ┌─────┐   ┌─────┐   ┌─────┐          │
│  │ CP1 │──▶│ CP2 │──▶│ CP3 │──▶│ CP4 │──▶│ CP5 │          │
│  └─────┘   └─────┘   └──┬──┘   └─────┘   └─────┘          │
│                         │                                   │
│                         │ [MODIFY STATE]                    │
│                         │                                   │
│                         ▼                                   │
│                      ┌─────┐   ┌─────┐   ┌─────┐          │
│                      │CP3' │──▶│CP4' │──▶│CP5' │          │
│                      └─────┘   └─────┘   └─────┘          │
│                      (Branch)  (New)     (New)             │
│                                                              │
│  ════════════════════════════════════════════════════════   │
│                                                              │
│  COMPARE: CP5 vs CP5'                                       │
│  ┌─────────────────┬─────────────────┐                     │
│  │ Original Path   │ Modified Path   │                     │
│  ├─────────────────┼─────────────────┤                     │
│  │ Result: Error   │ Result: Success │                     │
│  │ Files: 3        │ Files: 4        │                     │
│  │ Tests: 2 fail   │ Tests: All pass │                     │
│  └─────────────────┴─────────────────┘                     │
│                                                              │
└─────────────────────────────────────────────────────────────┘
```

## Execution Timeline Schema

```json
{
  "timeline_id": "tl_abc123",
  "workflow_id": "wf_xyz789",
  "created_at": "2026-01-11T20:00:00Z",

  "checkpoints": [
    {
      "checkpoint_id": "cp_001",
      "step": 1,
      "timestamp": "2026-01-11T20:00:00Z",
      "agent": "architect",
      "action": "design_start",
      "state": {
        "inputs": {"request": "Add authentication"},
        "outputs": null,
        "context": {"project": "/path/to/project"},
        "decisions": []
      },
      "branches": []
    },
    {
      "checkpoint_id": "cp_002",
      "step": 2,
      "timestamp": "2026-01-11T20:15:00Z",
      "agent": "architect",
      "action": "design_complete",
      "state": {
        "inputs": {"request": "Add authentication"},
        "outputs": {"spec_file": "/TODO/auth.md"},
        "context": {"project": "/path/to/project"},
        "decisions": [
          {"type": "auth_method", "value": "JWT"}
        ]
      },
      "branches": [
        {
          "branch_id": "br_001",
          "created_at": "2026-01-11T22:00:00Z",
          "modified_state": {
            "decisions": [
              {"type": "auth_method", "value": "Sessions"}
            ]
          },
          "outcome": "alternative_explored"
        }
      ]
    }
  ],

  "current_position": "cp_005",
  "total_steps": 5
}
```
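The schema above can be loaded into typed objects before any command runs. The following is a minimal sketch, assuming the timeline arrives as an already-parsed dict; the `Checkpoint` and `Timeline` classes and the `parse_timeline` helper are illustrative names, not part of an existing API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Checkpoint:
    checkpoint_id: str
    step: int
    timestamp: str
    agent: str
    action: str
    state: dict[str, Any]
    branches: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class Timeline:
    timeline_id: str
    workflow_id: str
    created_at: str
    checkpoints: list[Checkpoint]
    current_position: str
    total_steps: int

    def find(self, checkpoint_id: str) -> Checkpoint:
        """Return the checkpoint with the given ID, or raise KeyError."""
        for cp in self.checkpoints:
            if cp.checkpoint_id == checkpoint_id:
                return cp
        raise KeyError(checkpoint_id)


def parse_timeline(raw: dict[str, Any]) -> Timeline:
    """Build a Timeline from a dict shaped like the JSON schema above."""
    checkpoints = [Checkpoint(**cp) for cp in raw["checkpoints"]]
    return Timeline(
        timeline_id=raw["timeline_id"],
        workflow_id=raw["workflow_id"],
        created_at=raw["created_at"],
        checkpoints=checkpoints,
        current_position=raw["current_position"],
        total_steps=raw["total_steps"],
    )
```

Keeping `state` and `branches` as plain dicts is a deliberate simplification here: their contents vary by agent, so only the fixed top-level fields are typed.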

## Commands

### `/timetravel list`

Show available checkpoints:

```
User: /timetravel list

## Execution Timeline: wf_conductor_auth

Total Checkpoints: 12 | Duration: 45 min | Status: Completed

Step | Checkpoint | Agent | Action | Time
-----|------------|-------|--------|------
1    | cp_001     | architect | design_start | 20:00
2    | cp_002     | architect | design_complete | 20:15
3    | cp_003     | auto-code | impl_start | 20:16
4    | cp_004     | auto-code | impl_50% | 20:25
5    | cp_005     | auto-code | impl_complete | 20:35  ← ERROR HERE
6    | cp_006     | qa | test_start | 20:36
7    | cp_007     | qa | test_complete | 20:42
...

Jump to checkpoint: /timetravel goto <checkpoint_id>
```

### `/timetravel goto`

Jump to a specific checkpoint:

```
User: /timetravel goto cp_004

## Time Travel: Jumped to cp_004

**Agent**: auto-code
**Action**: Implementation 50% complete
**Timestamp**: 20:25 (20 min ago)

### State at this point:
- Files created: 2 (/src/auth/jwt.ts, /src/auth/types.ts)
- Tests: 0
- Decisions: JWT, bcrypt, refresh tokens

### Available actions:
- `replay` - Continue execution from here
- `modify` - Change state before continuing
- `inspect` - View detailed state
- `compare` - Compare with another checkpoint
```

### `/timetravel inspect`

View detailed state at checkpoint:

````
User: /timetravel inspect cp_004

## Checkpoint cp_004 - Detailed State

### Inputs
```json
{
  "spec_file": "/TODO/auth.md",
  "tech_stack": ["Node.js", "TypeScript"]
}
```

### Outputs (at this point)
```json
{
  "files_created": [
    "/src/auth/jwt.ts",
    "/src/auth/types.ts"
  ],
  "files_pending": [
    "/src/auth/middleware.ts",
    "/src/auth/routes.ts"
  ]
}
```

### Context
```json
{
  "project": "/path/to/project",
  "working_directory": "/src/auth",
  "dependencies_installed": ["jsonwebtoken", "bcrypt"]
}
```

### Decisions Made
1. Use JWT for authentication
2. Use bcrypt for password hashing
3. Implement refresh token rotation

### Agent Memory
- "User prefers explicit error handling"
- "Project uses ESLint strict mode"
````

### `/timetravel modify`

Change state and create branch:

```
User: /timetravel modify cp_004

## Modify State at cp_004

Current state loaded. What would you like to change?

### Editable Fields:
1. decisions - Current auth decisions
2. context - Project context
3. outputs - Files created so far
4. inputs - Original inputs

User: Change decision from JWT to Sessions

Agent: Creating branch from cp_004 with modified state...

## Branch Created: br_002

**Original**: decisions.auth_method = "JWT"
**Modified**: decisions.auth_method = "Sessions"

Ready to replay. This will:
- Start from cp_004 with modified state
- Re-execute remaining steps (5-12)
- Create new timeline branch

Proceed? [replay/cancel]

User: replay

Agent: Replaying from cp_004 with Sessions auth...

[Step 5] auto-code: Implementing sessions...
  - Created: /src/auth/sessions.ts
  - Modified: /src/app.ts (added session middleware)

[Step 6] auto-code: Implementation complete
  - Files: 4 (vs 3 in original)
  - Approach: Server-side sessions with Redis

[Step 7] qa: Testing...
  - Tests: 8 passed (vs 6 in original)
  - Coverage: 92% (vs 85% in original)

## Branch Complete: br_002

Comparison ready: /timetravel compare cp_012 br_002_cp_012
```
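The modification step in the transcript above changes one decision while leaving the stored checkpoint untouched. A minimal sketch of that idea, assuming decisions are stored as a list of `{"type", "value"}` dicts as in the timeline schema; `set_decision` is a hypothetical helper name.

```python
import copy
from typing import Any


def set_decision(state: dict[str, Any], decision_type: str, value: str) -> dict[str, Any]:
    """Return a copy of `state` with one decision replaced, or appended if absent."""
    modified = copy.deepcopy(state)  # never mutate the stored checkpoint state
    for decision in modified.setdefault("decisions", []):
        if decision["type"] == decision_type:
            decision["value"] = value
            break
    else:
        # No decision of this type was recorded yet, so add one.
        modified["decisions"].append({"type": decision_type, "value": value})
    return modified
```

Working on a deep copy matters because the original checkpoint must stay replayable as it was recorded; the branch only ever sees the copy.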

### `/timetravel compare`

Compare two checkpoints or branches:

```
User: /timetravel compare cp_012 br_002_cp_012

## Comparison: Original vs Sessions Branch

### Outcomes
| Aspect | Original (JWT) | Branch (Sessions) |
|--------|----------------|-------------------|
| Status | ✓ Complete | ✓ Complete |
| Duration | 45 min | 52 min |
| Files | 8 | 10 |
| Tests | 6 pass | 8 pass |
| Coverage | 85% | 92% |

### Code Differences

**Original created**:
- /src/auth/jwt.ts (token generation)
- /src/auth/refresh.ts (refresh logic)

**Branch created instead**:
- /src/auth/sessions.ts (session management)
- /src/auth/redis.ts (session store)
- /config/redis.ts (Redis config)

### Performance Comparison
| Metric | JWT | Sessions |
|--------|-----|----------|
| Auth latency | 12ms | 8ms |
| Memory usage | Low | Medium |
| Scalability | High | Medium |

### Recommendation
The JWT approach is the better fit for this project's API-first architecture.
Sessions would suit a traditional web app with SSR better.
```

### `/timetravel branch`

List and manage branches:

```
User: /timetravel branch list

## Branches from Timeline tl_abc123

| Branch | From | Modification | Outcome | Created |
|--------|------|--------------|---------|---------|
| br_001 | cp_002 | Auth: JWT→Sessions | Complete | 2h ago |
| br_002 | cp_004 | Added caching | Failed | 1h ago |
| br_003 | cp_006 | Skip tests | Complete | 30m ago |

Actions:
- /timetravel branch view br_001
- /timetravel branch delete br_002
- /timetravel branch merge br_001 (apply to main)
```

## Implementation Protocol

### Recording Execution

```python
import json


def record_checkpoint(agent, action, state, timeline_id):
    """Snapshot the current state and append it to the timeline.

    generate_id, get_next_step, now_iso, serialize, memory_store,
    get_timeline and save_timeline are helpers this protocol assumes exist.
    """
    checkpoint = {
        "checkpoint_id": generate_id("cp"),
        "timeline_id": timeline_id,  # lets goto/replay find the owning timeline
        "step": get_next_step(timeline_id),
        "timestamp": now_iso(),
        "agent": agent,
        "action": action,
        "state": {
            "inputs": serialize(state.inputs),
            "outputs": serialize(state.outputs),
            "context": serialize(state.context),
            "decisions": state.decisions
        },
        "branches": []
    }

    # Store checkpoint
    memory_store({
        "type": "context",
        "content": json.dumps(checkpoint),
        "tags": ["timetravel", "checkpoint", timeline_id],
        "project": state.context.project_name
    })

    # Update timeline
    timeline = get_timeline(timeline_id)
    timeline.checkpoints.append(checkpoint)
    # checkpoint is a plain dict, so use key access rather than attribute access
    timeline.current_position = checkpoint["checkpoint_id"]
    save_timeline(timeline)

    return checkpoint
```
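The recording function relies on several helpers that are left undefined. One possible shape for them, as a sketch only: an in-memory dict stands in for the real memory store, and `append_checkpoint` is a hypothetical simplified counterpart of `record_checkpoint`.

```python
import uuid
from datetime import datetime, timezone

# timeline_id -> ordered list of checkpoints (in-memory stand-in for the store)
_TIMELINES: dict[str, list[dict]] = {}


def generate_id(prefix: str) -> str:
    """Return an ID with a random suffix, e.g. 'cp_' plus 8 hex characters."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


def now_iso() -> str:
    """Current UTC time in the schema's format (YYYY-MM-DDTHH:MM:SSZ)."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def get_next_step(timeline_id: str) -> int:
    """Steps are 1-based and grow by one per recorded checkpoint."""
    return len(_TIMELINES.get(timeline_id, [])) + 1


def append_checkpoint(timeline_id: str, agent: str, action: str, state: dict) -> dict:
    """Record a checkpoint in the in-memory timeline and return it."""
    checkpoint = {
        "checkpoint_id": generate_id("cp"),
        "step": get_next_step(timeline_id),
        "timestamp": now_iso(),
        "agent": agent,
        "action": action,
        "state": state,
        "branches": [],
    }
    _TIMELINES.setdefault(timeline_id, []).append(checkpoint)
    return checkpoint
```

Random IDs differ from the sequential `cp_001` style shown in the transcripts; a real store would need a counter per timeline to reproduce that format.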

### Jumping to Checkpoint

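```python
def goto_checkpoint(checkpoint_id):
    """Restore agent state and files to a checkpoint and move the timeline cursor.

    load_checkpoint, deserialize, State, restore_files_to_checkpoint,
    get_timeline and save_timeline are helpers this protocol assumes exist.
    """
    # Load checkpoint
    checkpoint = load_checkpoint(checkpoint_id)

    # Restore state
    restored_state = State(
        inputs=deserialize(checkpoint.state.inputs),
        outputs=deserialize(checkpoint.state.outputs),
        context=deserialize(checkpoint.state.context),
        decisions=checkpoint.state.decisions
    )

    # Restore files on disk to their contents at this checkpoint
    restore_files_to_checkpoint(checkpoint_id)

    # Update position (requires the checkpoint to record its timeline_id)
    timeline = get_timeline(checkpoint.timeline_id)
    timeline.current_position = checkpoint_id
    save_timeline(timeline)

    return restored_state, checkpoint
```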
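How `restore_files_to_checkpoint` works is not specified here. One way it could be done, as a sketch under a strong assumption: each checkpoint carries a snapshot mapping relative paths to file contents, which the timeline schema does not currently include. `restore_files` is a hypothetical name, and the function deletes files, so it is only suitable for a scratch workspace.

```python
from pathlib import Path


def restore_files(root: Path, snapshot: dict[str, str]) -> None:
    """Make the files under `root` match `snapshot` (relative path -> text)."""
    wanted = {root / rel for rel in snapshot}

    # Remove files that were created after the checkpoint was taken.
    # sorted() materialises the listing before anything is deleted.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path not in wanted:
            path.unlink()

    # Rewrite every file recorded in the snapshot with its saved contents.
    for rel, content in snapshot.items():
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
```

Empty directories left behind by deleted files are not cleaned up in this sketch. A project under version control could use commits or stashes per checkpoint instead of full-text snapshots.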

### Creating Branch

```python
def create_branch(from_checkpoint_id, modified_state):
    """Fork a new branch from a checkpoint using a modified copy of its state."""
    # Load original checkpoint
    original = load_checkpoint(from_checkpoint_id)

    # Create branch record (a plain dict, so use key access below)
    branch = {
        "branch_id": generate_id("br"),
        "created_at": now_iso(),
        "from_checkpoint": from_checkpoint_id,
        "modified_state": diff(original.state, modified_state),
        "checkpoints": [],
        "outcome": "in_progress"
    }

    # Create initial branch checkpoint, e.g. "br_002_cp_004"
    branch_cp = create_checkpoint_from(original, modified_state)
    branch_cp.checkpoint_id = f"{branch['branch_id']}_{original.checkpoint_id}"
    branch["checkpoints"].append(branch_cp)

    # Attach the branch to the original checkpoint and persist it only after
    # the initial branch checkpoint is in place
    original.branches.append(branch)
    save_checkpoint(original)

    return branch
```
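The `diff` call above is what produces the compact `modified_state` record seen in the timeline schema. A minimal sketch of such a helper, assuming states are nested dicts; this version reports changed or added keys only, compares lists as whole values, and ignores keys that were removed.

```python
from typing import Any


def diff(original: dict[str, Any], modified: dict[str, Any]) -> dict[str, Any]:
    """Return only the keys of `modified` whose values differ from `original`."""
    changes: dict[str, Any] = {}
    for key, new_value in modified.items():
        old_value = original.get(key)
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            # Recurse so unchanged sibling keys are left out of the record.
            nested = diff(old_value, new_value)
            if nested:
                changes[key] = nested
        elif key not in original or old_value != new_value:
            changes[key] = new_value
    return changes
```

Treating lists as atomic matches the schema example, where the whole `decisions` list appears in `modified_state` even though only one value changed.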

### Replaying Execution

```python
def replay_from_checkpoint(checkpoint_id, modified_state=None):
    """Re-execute every step after `checkpoint_id`, optionally on a new branch."""
    # Get timeline
    checkpoint = load_checkpoint(checkpoint_id)
    timeline = get_timeline(checkpoint.timeline_id)

    # Find steps to replay, in execution order
    remaining_checkpoints = sorted(
        (cp for cp in timeline.checkpoints if cp.step > checkpoint.step),
        key=lambda cp: cp.step,
    )

    # Apply modifications if any. Compare against None so that an empty
    # but intentional modified state still creates a branch.
    if modified_state is not None:
        branch = create_branch(checkpoint_id, modified_state)
        current_state = modified_state
        target_id = branch["branch_id"]  # create_branch returns a dict
    else:
        current_state = checkpoint.state
        target_id = timeline.timeline_id

    # Replay each step
    for orig_cp in remaining_checkpoints:
        print(f"Replaying step {orig_cp.step}: {orig_cp.agent} - {orig_cp.action}")

        # Execute the agent action with current state
        result = execute_agent_action(
            agent=orig_cp.agent,
            action=orig_cp.action,
            state=current_state
        )

        # Record new checkpoint on the branch (or the original timeline)
        record_checkpoint(
            agent=orig_cp.agent,
            action=orig_cp.action,
            state=result.state,
            timeline_id=target_id
        )

        current_state = result.state

        # Check for divergence
        if result.diverged_from_original:
            print(f"  ⚠️ Execution diverged from original at step {orig_cp.step}")

    return current_state
```
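The replay loop can be exercised without real agents by passing in a step function. The sketch below is a simplified, self-contained variant: checkpoints are plain dicts, `run_step` is a hypothetical callable standing in for `execute_agent_action`, and divergence is assumed to mean that the replayed state differs from the recorded one.

```python
from typing import Any, Callable

State = dict[str, Any]
StepFn = Callable[[str, str, State], State]


def replay(
    checkpoints: list[dict[str, Any]],
    start_step: int,
    state: State,
    run_step: StepFn,
) -> tuple[State, list[int]]:
    """Re-run every checkpoint after `start_step`.

    Returns the final state and the step numbers whose replayed state
    differs from the state recorded in the original run.
    """
    diverged: list[int] = []
    for cp in sorted(checkpoints, key=lambda c: c["step"]):
        if cp["step"] <= start_step:
            continue  # already executed before the chosen checkpoint
        state = run_step(cp["agent"], cp["action"], state)
        if state != cp["state"]:
            diverged.append(cp["step"])
    return state, diverged
```

Comparing full states only detects divergence reliably when steps are deterministic. With LLM-backed agents, a looser comparison (for example on decisions or file lists only) is likely to be more useful.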

### Comparing Checkpoints

```python
def compare_checkpoints(cp_id_1, cp_id_2):
    cp1 = load_checkpoint(cp_id_1)
    cp2 = load_checkpoint(cp_id_2)

    comparison = {
        "checkpoints": [cp_id_1, cp_id_2],
        "state_diff": diff(cp1.state, cp2.state),
        "outcomes": {
            "cp1": analyze_outcome(cp1),
            "cp2": analyze_outcome(cp2)
        },
        "metrics": {
            "duration": {
                "cp1": calculate_duration(cp1),
                "cp2": calculate_duration(cp2)
            },
            "files": {
                "cp1": count_files(cp1),
                "cp2": count_files(cp2)
            },
            "tests": {
                "cp1": get_test_results(cp1),
                "cp2": get_test_results(cp2)
            }
        },
        "recommendation": generate_recommendation(cp1, cp2)
    }

    return comparison
```
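The metric helpers used in the comparison are not defined above. As a sketch of how two of them might look, assuming timestamps follow the schema's ISO 8601 format with a trailing `Z`; `parse_ts`, `minutes_between` and `compare_metric` are illustrative names.

```python
from datetime import datetime
from typing import Any


def parse_ts(timestamp: str) -> datetime:
    """Parse a schema timestamp such as '2026-01-11T20:15:00Z'."""
    # Older Python versions reject a trailing 'Z', so normalise it first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def minutes_between(start: str, end: str) -> float:
    """Elapsed minutes between two schema timestamps."""
    return (parse_ts(end) - parse_ts(start)).total_seconds() / 60


def compare_metric(name: str, original: float, branch: float) -> dict[str, Any]:
    """One row of a comparison table, with the branch-minus-original delta."""
    return {
        "metric": name,
        "original": original,
        "branch": branch,
        "delta": branch - original,
    }
```

For example, `minutes_between` applied to the first and last checkpoint timestamps of a path gives the duration figure shown in the comparison output, and `compare_metric("tests_passed", 6, 8)` yields a delta of 2.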

## Use Cases

### 1. Debugging Failed Execution

```
# Execution failed at step 8
/timetravel list
# See failure occurred at cp_008

/timetravel goto cp_007
# Jump to just before failure

/timetravel inspect cp_007
# See state that led to failure

/timetravel modify cp_007
# Fix the problematic state

/timetravel replay
# Re-run with fix
```

### 2. Exploring Alternative Approaches

```
# Wonder if different auth approach would work better
/timetravel goto cp_002
# Go to design decision point

/timetravel modify cp_002
> Change auth from JWT to OAuth

/timetravel replay
# See how OAuth path plays out

/timetravel compare cp_final br_oauth_final
# Compare outcomes
```

### 3. Understanding Decision Impact

```
/timetravel branch list
# See all explored branches

/timetravel compare cp_005 br_001_cp_005
# Compare specific decision outcomes

# Identify which decisions led to better results
```

## Integration Points

| System | Integration |
|--------|-------------|
| Checkpoint | Provides state snapshots |
| Episode | Records full execution context |
| Memory | Stores timeline and branch data |
| Handoff | Replays handoff sequences |
| Tracing | Visualizes execution paths |

## Model Recommendation

- **Haiku**: For checkpoint operations
- **Sonnet**: For replay execution
- **Opus**: For comparison analysis and recommendations
