
feat: add oncall mode for structured incident response #11143

Open
saneroen wants to merge 3 commits into RooCodeInc:main from saneroen:feat/oncall-mode

saneroen commented Jan 31, 2026

Related GitHub Issue

Closes: #11142

Roo Code Task Context (Optional)

Description

This PR adds a new Oncall mode as a custom mode that enables structured incident response workflows using markdown files. The mode supports workflow orchestration, MCP tool integration, and automatic directory setup for incident logging.

Key Changes:

  • Added oncall mode configuration to .roomodes as a custom mode with workflow execution instructions
  • Implemented automatic incidents/ directory creation when switching to oncall mode in ClineProvider.ts
  • Added example workflows: k8s-troubleshooting.md and example-service-runbook.md in .roo/rules-oncall/
  • Created auto-loaded workflow rules in .roo/rules-oncall/ (workflows are automatically loaded as context)
  • Added slash commands for quick workflow execution (/k8s-troubleshoot, /oncall-workflow)
  • Added incidents/ to .gitignore to prevent committing incident logs
  • Configured file restrictions to allow editing incident logs and workflow files
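For illustration, here is a minimal TypeScript sketch of how a mode entry and its edit restriction could fit together. The field names follow Roo Code's custom-mode format (slug, name, roleDefinition, groups, with an optional fileRegex on the edit group). The concrete values, the regex, and the canEdit helper are assumptions made for this example and are not the contents of this PR's .roomodes.

```typescript
// Hypothetical shape of the oncall entry; values are illustrative, not taken from the PR.
type GroupEntry = string | [string, { fileRegex: string; description?: string }]

interface CustomMode {
    slug: string
    name: string
    roleDefinition: string
    groups: GroupEntry[]
}

const oncallMode: CustomMode = {
    slug: "oncall",
    name: "🚨 Oncall",
    roleDefinition: "You are an oncall engineer following structured incident-response workflows.",
    groups: [
        "read",
        // Assumed restriction: only incident logs and workflow files are editable.
        [
            "edit",
            {
                fileRegex: "^(incidents/.*|\\.roo/rules-oncall/.*\\.md)$",
                description: "Incident logs and workflows",
            },
        ],
        "command",
        "mcp",
    ],
}

// Returns true when the mode's edit group permits writing to relativePath.
function canEdit(mode: CustomMode, relativePath: string): boolean {
    for (const group of mode.groups) {
        if (group === "edit") return true // unrestricted edit group
        if (Array.isArray(group) && group[0] === "edit") {
            return new RegExp(group[1].fileRegex).test(relativePath)
        }
    }
    return false // the mode has no edit group at all
}

console.log(canEdit(oncallMode, "incidents/pages/2026-01-31.md")) // true
console.log(canEdit(oncallMode, "src/index.ts")) // false
```

With a restriction along these lines, step 6 of the test procedure amounts to checking one path that matches the pattern and one that does not.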

Design Choices:

  • Custom Mode: Implemented as a custom mode in .roomodes (project-specific) rather than built-in, allowing teams to customize it per project
  • Workflow Location: All workflows consolidated in .roo/rules-oncall/ (uses existing rules infrastructure, auto-loaded as context)
  • Simple Markdown: Workflows are simple markdown files that users can create and edit without code changes
  • Non-blocking Directory Creation: A failure to create the directories is logged and does not abort the mode switch
  • MCP Integration: Leverages existing MCP infrastructure - no new MCP-specific code needed
  • Example Workflows: Provided as templates to demonstrate the pattern without being prescriptive

Roadmap Alignment:

This contribution aligns with the Enhanced User Experience roadmap goal:

  • Streamlines incident response workflow for oncall engineers
  • Reduces friction by providing structured, reusable workflows
  • Improves daily-use tool experience for operational teams
  • Makes it easy to follow runbooks and document incidents consistently

Test Procedure

Manual Testing Steps:

  1. Mode Configuration:

    • Switch to Oncall mode from mode selector
    • Verify mode appears with 🚨 icon
    • Confirm mode switch completes successfully
  2. Directory Creation:

    • Switch to Oncall mode
    • Verify incidents/ directory is created in workspace root
    • Verify incidents/pages/ and incidents/escalations/ subdirectories exist
    • Check that incidents/ is in .gitignore
  3. Workflow Execution:

    • Ask Roo: "Help me troubleshoot a Kubernetes pod issue using the k8s workflow"
    • Verify Roo reads k8s-troubleshooting.md workflow from .roo/rules-oncall/
    • Confirm Roo follows workflow steps sequentially
    • Verify Roo can use MCP tools if configured
  4. Slash Commands:

    • Type /k8s-troubleshoot in chat
    • Verify command executes and switches to oncall mode
    • Test /oncall-workflow k8s-troubleshooting
  5. Custom Workflows:

    • Create a custom workflow file in .roo/rules-oncall/
    • Ask Roo to execute it
    • Verify Roo can read and follow custom workflows (they're auto-loaded as context)
  6. File Restrictions:

    • Try to edit a file outside allowed patterns in oncall mode
    • Verify restriction works correctly
    • Verify incident log files can be edited

Testing Environment:

  • VS Code with Roo Code extension
  • MCP servers configured (optional, for full testing)

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).

  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).

  • Self-Review: I have performed a thorough self-review of my code.

  • Testing: New and/or updated tests have been added to cover my changes (if applicable).

  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).

  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

UI Changes:

  • Oncall mode appears in mode selector with 🚨 icon
  • No other UI changes - workflows are executed through chat interaction

Documentation Updates

  • No documentation updates are required.

  • Yes, documentation updates are required.

Documentation to Update:

  • Add Oncall mode to modes documentation (as an example custom mode)
  • Document workflow creation and execution in .roo/rules-oncall/
  • Explain MCP integration for oncall workflows
  • Add examples of creating custom workflows
  • Document automatic incidents/ directory creation

Additional Notes

  • The incidents/ directory is created automatically when switching to oncall mode but ignored by git
  • Workflows are stored as markdown files in .roo/rules-oncall/ - users can version control their workflows
  • Workflows in .roo/rules-oncall/ are automatically loaded as context when the mode is active
  • Example workflows are provided as templates, not prescriptive requirements
  • MCP tool integration requires users to configure their own MCP servers
  • Directory creation errors are logged but don't block mode switch
  • This is implemented as a custom mode in .roomodes - teams can customize it per project or copy it to global settings
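As a rough picture of what "automatically loaded as context" means for the workflow files, the sketch below joins markdown files from a rules directory into one context string. It is not Roo Code's actual loader: the in-memory RuleFiles map, the .md filter, the alphabetical ordering, and the heading format are all assumptions chosen to keep the example self-contained.

```typescript
// Hypothetical in-memory stand-in for the files under .roo/rules-oncall/.
type RuleFiles = Record<string, string>

// Concatenates markdown workflows in filename order, each under a small header.
function buildWorkflowContext(files: RuleFiles): string {
    return Object.entries(files)
        .filter(([name]) => name.endsWith(".md")) // assumed: only markdown files count as workflows
        .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
        .map(([name, body]) => `# Rules from ${name}\n${body.trim()}`)
        .join("\n\n")
}

const context = buildWorkflowContext({
    "k8s-troubleshooting.md": "1. Check pod status\n2. Inspect recent events\n",
    "example-service-runbook.md": "1. Confirm the alert\n",
    "notes.txt": "not a workflow",
})

console.log(context)
```

Because every workflow in the directory ends up in the context, a team adding its own markdown file next to the examples would not need any code change for the mode to see it.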

Note: Ensure issue #11142 is assigned to you before submitting. If not assigned, comment "Claiming" on the issue and DM Hannes Rudolph (hrudolph) on Discord to get assigned.

Get in Touch

Discord: santy2509


Important

Introduces oncall mode for structured incident response with markdown workflows, automatic directory setup in ClineProvider.ts, and new slash commands.

  • Behavior:
    • Adds oncall mode to .roomodes for structured incident response using markdown workflows.
    • Implements automatic incidents/ directory creation in ClineProvider.ts when switching to oncall mode.
    • Introduces slash commands /k8s-troubleshoot and /oncall-workflow for executing workflows.
  • Workflows:
    • Adds example workflows k8s-troubleshooting.md and example-service-runbook.md in .roo/rules-oncall/.
    • Workflows are auto-loaded as context in oncall mode.
  • Misc:
    • Adds incidents/ to .gitignore.
    • Updates package.json version to 3.46.2.

This description was created by Ellipsis for bbd40f9.

@saneroen saneroen requested review from cte, jr and mrubens as code owners January 31, 2026 23:53
dosubot bot added the size:XL (This PR changes 500-999 lines, ignoring generated files) and Enhancement (New feature or request) labels Jan 31, 2026

roomote bot commented Jan 31, 2026


Reviewed merge commit (37914bc). No new issues found. One existing issue remains unaddressed.

  • Missing test coverage for oncall mode directory creation in handleModeSwitch

Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.

Comment on lines +1361 to +1380
// Create incidents directory when switching to oncall mode
if (newMode === "oncall") {
    try {
        const workspacePath = getWorkspacePath()
        if (workspacePath) {
            const incidentsDir = path.join(workspacePath, "incidents")
            const pagesDir = path.join(incidentsDir, "pages")
            const escalationsDir = path.join(incidentsDir, "escalations")

            // Create directories if they don't exist
            await fs.mkdir(pagesDir, { recursive: true })
            await fs.mkdir(escalationsDir, { recursive: true })
        }
    } catch (error) {
        // Log error but don't fail mode switch if directory creation fails
        this.log(
            `Failed to create incidents directory when switching to oncall mode: ${error instanceof Error ? error.message : String(error)}`,
        )
    }
}

This new oncall mode directory creation logic lacks test coverage. The existing handleModeSwitch tests in ClineProvider.sticky-mode.spec.ts only cover switching to "architect", "debug", and "code" modes. Consider adding tests that verify: (1) directories are created when switching to oncall mode, (2) mode switch succeeds even when directory creation fails, and (3) existing directories are handled gracefully with { recursive: true }.
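One possible way to exercise the three cases listed above is to pull the directory logic into a small helper and run it against a temporary workspace. The sketch below does that with plain Node APIs. The ensureIncidentDirs helper, its boolean return value, and the runChecks driver are hypothetical; a real test in ClineProvider.sticky-mode.spec.ts would presumably go through handleModeSwitch with the project's own test harness and mocks, which are not shown in this PR.

```typescript
import * as fs from "node:fs/promises"
import * as os from "node:os"
import * as path from "node:path"

// Hypothetical helper mirroring the reviewed snippet; returns true on success so checks can assert on it.
async function ensureIncidentDirs(workspacePath: string, log: (msg: string) => void): Promise<boolean> {
    try {
        const incidentsDir = path.join(workspacePath, "incidents")
        await fs.mkdir(path.join(incidentsDir, "pages"), { recursive: true })
        await fs.mkdir(path.join(incidentsDir, "escalations"), { recursive: true })
        return true
    } catch (error) {
        // Log and swallow the error so the caller (the mode switch) can continue.
        log(`Failed to create incidents directory: ${error instanceof Error ? error.message : String(error)}`)
        return false
    }
}

async function isDir(p: string): Promise<boolean> {
    return fs.stat(p).then(
        (s) => s.isDirectory(),
        () => false,
    )
}

async function runChecks(): Promise<string[]> {
    const results: string[] = []
    const logs: string[] = []
    const workspace = await fs.mkdtemp(path.join(os.tmpdir(), "oncall-"))
    try {
        // (1) Directories are created on first call.
        const created = await ensureIncidentDirs(workspace, (m) => logs.push(m))
        const bothExist =
            (await isDir(path.join(workspace, "incidents", "pages"))) &&
            (await isDir(path.join(workspace, "incidents", "escalations")))
        results.push(created && bothExist ? "created: ok" : "created: FAIL")

        // (3) Existing directories are tolerated thanks to { recursive: true }.
        const again = await ensureIncidentDirs(workspace, (m) => logs.push(m))
        results.push(again && logs.length === 0 ? "existing dirs: ok" : "existing dirs: FAIL")

        // (2) Failure is logged, not thrown: the "workspace" is a regular file, so mkdir beneath it fails.
        const blocker = path.join(workspace, "not-a-dir")
        await fs.writeFile(blocker, "")
        const failed = await ensureIncidentDirs(blocker, (m) => logs.push(m))
        results.push(!failed && logs.length === 1 ? "failure logged: ok" : "failure logged: FAIL")
    } finally {
        await fs.rm(workspace, { recursive: true, force: true })
    }
    return results
}

runChecks().then((r) => console.log(r.join("\n")))
```

Extracting the helper is only one option; keeping the logic inline in handleModeSwitch and mocking fs.mkdir to reject would cover case (2) equally well.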

Fix it with Roo Code or mention @roomote and request a fix.


saneroen commented Feb 1, 2026

Claiming

I've implemented the oncall mode feature and have a PR ready. Requesting assignment.


Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT] Add Oncall mode with markdown workflow orchestration and MCP integration
