Merged
123 changes: 123 additions & 0 deletions .github/workflows/traceability.yml
@@ -0,0 +1,123 @@
name: Refresh SEP traceability manifest

# Regenerates src/seps/traceability.json by running the conformance suite against
# the reference SDK and recording which check IDs were emitted, then opens a PR
# with the diff. NOT a PR gate: runs on demand / on a schedule and proposes an
# update for review. plan.modelcontextprotocol.io reads the committed file from
# main.
#
# Depends on the `conformance sdk` subcommand (#277), which clones+builds the SDK
# and runs the client+server suites. The `run` job executes third-party SDK code,
# so it has NO repo write token (read-only perms, persist-credentials: false) and
# only uploads results as an artifact; the separate `propose` job holds the
# write/PR permissions and never executes SDK code.

on:
  workflow_dispatch:
    inputs:
      sdk:
        description: 'SDK ref to run against (e.g. typescript-sdk@<sha>)'
        default: 'typescript-sdk@main'
  schedule:
    - cron: '0 6 * * 1' # Weekly, Monday 06:00 UTC.

concurrency:
  group: traceability-refresh
  cancel-in-progress: true

jobs:
  run:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    env:
      SDK_REF: ${{ inputs.sdk || 'typescript-sdk@main' }}
    steps:
      - uses: actions/checkout@v6
        with:
          persist-credentials: false # no git token while SDK code runs

      - uses: actions/setup-node@v6
        with:
          node-version: 24
          cache: npm

      - run: npm ci
      - run: npm run build

      - name: Run conformance suites against the reference SDK
        # `sdk` requires --mode client|server; run both into the same results dir
        # (the second reuses the cached checkout + build via --skip-build).
        run: |
          node dist/index.js sdk "$SDK_REF" --mode client --suite all -o results
          node dist/index.js sdk "$SDK_REF" --mode server --suite all --skip-build -o results

      - name: Fail if no results were produced
        run: |
          if [ -z "$(find results -name checks.json -print -quit 2>/dev/null)" ]; then
            echo "No checks.json produced; the suite run failed, not proposing a manifest."
            exit 1
          fi

      - uses: actions/upload-artifact@v4
        with:
          name: conformance-results
          path: results
          retention-days: 7

  propose:
    needs: run
    runs-on: ubuntu-latest
    # Requires the repo/org setting "Allow GitHub Actions to create and approve
    # pull requests" to be enabled, otherwise `gh pr create` fails.
    permissions:
      contents: write
      pull-requests: write
    env:
      SDK_REF: ${{ inputs.sdk || 'typescript-sdk@main' }}
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-node@v6
        with:
          node-version: 24
          cache: npm
      - run: npm ci
      - run: npm run build

      - uses: actions/download-artifact@v4
        with:
          name: conformance-results
          path: results

      - name: Regenerate manifest
        run: |
          set -euo pipefail
          # Record the resolved sha (stable per SDK commit) so the manifest's
          # `source` only changes when the SDK actually advances (no per-run noise).
          ref="${SDK_REF#*@}"
          sha="$(git ls-remote https://github.com/modelcontextprotocol/typescript-sdk.git "$ref" | cut -f1)"
          node dist/index.js traceability --results results \
            --source "typescript-sdk@${sha:0:12}"

      - name: Open/update the rolling refresh PR
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          set -euo pipefail
          if git diff --quiet -- src/seps/traceability.json; then
            echo "traceability.json unchanged"
            exit 0
          fi
          # One rolling branch/PR, force-updated each run, so the schedule does
          # not accrue a new PR every week.
          branch="traceability-refresh"
          git config user.name 'github-actions[bot]'
          git config user.email 'github-actions[bot]@users.noreply.github.com'
          git checkout -B "$branch"
          git add src/seps/traceability.json
          git commit -m "chore: refresh SEP traceability manifest ($SDK_REF)"
          git push --force origin "$branch"
          gh pr view "$branch" >/dev/null 2>&1 || gh pr create \
            --head "$branch" \
            --title 'chore: refresh SEP traceability manifest' \
            --body 'Automated refresh from a conformance run against the reference SDK. Review the coverage diff before merging.'
5 changes: 5 additions & 0 deletions .prettierignore
@@ -0,0 +1,5 @@
# Generated by `conformance traceability` — formatting is owned by the
# generator (deterministic JSON.stringify), not Prettier. Without this, the
# repo's `prettier --check .` would reformat the file and fight the generator's
# output (and the refresh workflow's `git diff` check).
src/seps/traceability.json
22 changes: 22 additions & 0 deletions AGENTS.md
@@ -82,6 +82,27 @@ npx @modelcontextprotocol/conformance new-sep <NNNN>

The command looks up PR #`<NNNN>` in `modelcontextprotocol/modelcontextprotocol` (SEP numbers are PR numbers), derives `spec_url` from the `docs/specification/draft/*.mdx` file it changes, and writes `src/seps/sep-<NNNN>.yaml` with TODO `requirements[]` rows. Use `--spec-path` or `--spec-url` to skip the lookup. The `new-sep` Claude Code skill drives the same flow end-to-end, parses the spec diff, and fills in the requirement rows.

### Traceability manifest

`src/seps/traceability.json` is a generated map that records, for each SEP, which declared `check:` IDs are actually emitted when the conformance suite runs against the reference SDK. It is consumed by plan.modelcontextprotocol.io to track SEP-2484 progress.

The emitted check IDs come from a real suite run (not a source scan), so dynamic (template-literal) IDs resolve to their concrete values. Generate the manifest from a results directory:

```sh
# 1. Run the suite against the reference SDK, collecting checks.json files:
node dist/index.js client --command '<sdk conformance client>' --suite all -o results
node dist/index.js server --url '<sdk conformance server url>' --suite all -o results
# 2. Build the manifest from those results:
npm run traceability -- --results results
npm run traceability -- --results results --strict # exit 1 on any untested (advisory)
```

Manifest shape: `{ schemaVersion, docs, source, seps }`, where `seps` is keyed by SEP number. Each requirement is `tested` (its check ID was emitted) or `untested` (declared but never emitted: either a real gap, or a check that only fires against a deliberately broken implementation and therefore needs a negative test). **"tested" means a scenario emitted the check ID, NOT that any SDK passes it**; per-SDK results live in `tier-check`. Matching is exact, so a scenario's emitted check IDs must match the requirement slugs in the yaml (one check ID per MUST/SHOULD, emitted once per case). `source` records what was run against (e.g. `typescript-sdk@<sha>`); the `docs` field points back here.
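
As a rough illustration of the shape and the exact-match rule, the sketch below types a manifest and classifies declared check IDs against an emitted set. Only the four top-level keys and the `tested`/`untested` statuses come from the description above; the `requirements`, `check`, and `status` field names, the `classify` helper, and all sample values are assumptions made for illustration and may not match the generator's real output.

```typescript
// Hypothetical types: only schemaVersion/docs/source/seps and the two statuses
// are taken from the description; every other name here is an assumption.
type RequirementStatus = 'tested' | 'untested';

interface ManifestRequirement {
  check: string; // declared check ID from the SEP yaml (assumed field name)
  status: RequirementStatus;
}

interface SepEntry {
  requirements: ManifestRequirement[]; // assumed field name
}

interface TraceabilityManifest {
  schemaVersion: number;
  docs: string;
  source: string;
  seps: Record<string, SepEntry>; // keyed by SEP number
}

// Exact matching: a requirement counts as tested only if its check ID appears
// verbatim in the set of emitted IDs; prefixes or near-misses do not count.
function classify(
  declared: string[],
  emitted: Set<string>
): ManifestRequirement[] {
  return declared.map(
    (check): ManifestRequirement => ({
      check,
      status: emitted.has(check) ? 'tested' : 'untested',
    })
  );
}

// Illustrative values only; none of these IDs or strings are real.
const exampleManifest: TraceabilityManifest = {
  schemaVersion: 1,
  docs: 'AGENTS.md',
  source: 'typescript-sdk@0123456789ab',
  seps: {
    '0000': {
      requirements: classify(
        ['sep-0000-must-example', 'sep-0000-should-example'],
        new Set(['sep-0000-must-example', 'sep-0000-should-example-v2'])
      ),
    },
  },
};

console.log(JSON.stringify(exampleManifest, null, 2));
```

In this sketch `sep-0000-should-example` stays `untested` even though a similarly named ID was emitted, which is the practical consequence of exact matching.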

Contract for consumers (plan.modelcontextprotocol.io): a SEP appears only if it has a traceability yaml or emits `sep-NNNN-*` check IDs. **A SEP absent from the manifest has no conformance artifacts, so treat it as not-started** (diff against your own SEP list to find them). `untracked` lists emitted IDs with no yaml row (usually scenario gates).
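
A consumer-side check for the absent-means-not-started rule might look like the following minimal sketch. The `findNotStarted` helper, the `ManifestLike` type, and the sample SEP numbers are hypothetical; the only thing assumed about the manifest is that `seps` is an object keyed by SEP number, as described above.

```typescript
// Hypothetical consumer helper; not part of the conformance CLI.
interface ManifestLike {
  seps: Record<string, unknown>; // keyed by SEP number
}

// Returns the SEP numbers the consumer knows about that have no entry in the
// manifest, i.e. the ones to treat as not-started.
function findNotStarted(
  ownSepNumbers: string[],
  manifest: ManifestLike
): string[] {
  return ownSepNumbers.filter(
    (sep) => !Object.prototype.hasOwnProperty.call(manifest.seps, sep)
  );
}

// Illustrative data only: the consumer tracks three SEPs, the manifest has one.
const notStarted = findNotStarted(['1000', '1001', '1002'], {
  seps: { '1001': {} },
});

console.log(notStarted);
```

Using an own-property check rather than a truthiness test keeps a SEP with an empty entry from being misreported as missing.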

The manifest is refreshed by `.github/workflows/traceability.yml` (manual/scheduled), which runs the suite against typescript-sdk and opens a PR with the diff — it is **not** a PR gate. Untested checks are advisory for now; the intended future policy is that an untested check must be backed by a negative test.

## Examples: prove it passes and fails

A new scenario should come with:
@@ -101,3 +122,4 @@ Use the existing CLI runner (`npx @modelcontextprotocol/conformance client|serve
- `npm test` passes
- For non-trivial scenario changes, run against at least one real SDK (typescript-sdk or python-sdk) to see actual output. For changes to shared infrastructure (runner, tier-check), test against go-sdk or csharp-sdk too.
- Scenario is registered in the right suite in `src/scenarios/index.ts`
- If you changed a `sep-*.yaml` or scenario check IDs, `src/seps/traceability.json` will drift; the traceability workflow refreshes it via PR (or regenerate locally with `--results` from a suite run)
1 change: 1 addition & 0 deletions package.json
@@ -19,6 +19,7 @@
"lint:fix": "eslint src/ examples/ --fix && prettier --write .",
"lint:fix_check": "npm run lint:fix && git diff --exit-code --quiet",
"tier-check": "node dist/index.js tier-check",
"traceability": "tsx src/index.ts traceability",
"check": "npm run typecheck && npm run lint",
"typecheck": "tsgo --noEmit",
"prepack": "npm run build",
4 changes: 4 additions & 0 deletions src/index.ts
@@ -47,6 +47,7 @@ import {
import { createTierCheckCommand } from './tier-check';
import { createNewSepCommand } from './new-sep';
import { createSdkCommand } from './sdk-runner';
import { createTraceabilityCommand } from './traceability';
import packageJson from '../package.json';

// Note on naming: `command` refers to which CLI command is calling this.
@@ -548,6 +549,9 @@ program.addCommand(createNewSepCommand());
// SDK command - run local conformance against an SDK at a specific ref
program.addCommand(createSdkCommand());

// SEP traceability manifest command
program.addCommand(createTraceabilityCommand());

// List scenarios command
program
.command('list')