feat: add session profiling template and protocol (#44) #75

Closed
Alan-Jowett wants to merge 1 commit into microsoft:main from Alan-Jowett:feat/profile-session-template
@Alan-Jowett
Member

Adds the session profiling template and protocol from issue #44, building on the structure proposed by @themechbro in PR #72.

Relationship to PR #72

@themechbro's PR #72 established the right scope — a reasoning protocol with 5 phases and a template, not a dashboard. This PR keeps that structure and adds the PromptKit conventions needed for the assembly engine to compose it:

  • Full YAML frontmatter (persona, protocols, format, params, contracts)
  • {{param}} placeholders for substitution
  • SPDX headers
  • Detailed phase instructions in the protocol (comparable to traceability-audit.md)
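As an illustration of the {{param}} convention listed above, here is a minimal Python sketch of placeholder substitution. It is not PromptKit's assembly engine: the `substitute` helper and its fail-on-missing behavior are assumptions made for this example.

```python
import re

# Hypothetical helper for illustration only; the real assembly engine may
# handle whitespace, escaping, and missing params differently.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def substitute(template: str, params: dict[str, str]) -> str:
    """Replace every {{name}} in template with params[name]."""
    names = {m.group(1) for m in PLACEHOLDER.finditer(template)}
    missing = sorted(names - set(params))
    if missing:
        # Fail loudly instead of leaving an unresolved placeholder behind.
        raise KeyError("missing params: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda m: params[m.group(1)], template)


print(substitute("Focus areas: {{focus_areas}}", {"focus_areas": "all"}))
# prints "Focus areas: all"
```

Raising on a missing parameter is a design choice for the sketch: an unresolved placeholder in an assembled prompt is easy to overlook.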

The commit includes a Co-authored-by trailer for @themechbro.

What's included

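Protocol (protocols/reasoning/session-profiling.md):

  • 5 phases: segment log, map to components, detect inefficiencies, quantify impact, produce recommendations
  • 7 inefficiency types: redundant reasoning, false starts, re-derivation, protocol loops, unused context, verbose compliance, persona drift
  • Each recommendation tied to a specific PromptKit component file
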
Template (templates/profile-session.md):

  • Persona: specification-analyst (systematic, evidence-driven analysis)
  • Protocols: anti-hallucination + self-verification + session-profiling
  • Format: investigation-report (F-NNN findings with severity)
  • Params: session_log, assembled_prompt, focus_areas
  • Non-goals: not a quality audit, won't recommend removing guardrails

Manifest: additive-only — 2 new entries, no reformatting of existing content.

Usage

npx @alan-jowett/promptkit assemble profile-session \
  -p session_log="$(cat session.log)" \
  -p assembled_prompt="$(cat prompt.md)" \
  -p focus_areas="all" \
  -o session-profile.md

Closes #44.

Add profile-session template and session-profiling reasoning protocol
for analyzing completed LLM session logs to identify token inefficiencies
and structural waste.

Built on the structure from PR microsoft#72 by @themechbro — keeps the 5-phase
methodology and adds full PromptKit conventions:

Protocol (session-profiling):
- 5 phases: segment log, map to components, detect inefficiencies,
  quantify impact, produce recommendations
- 7 inefficiency types: redundant reasoning, false starts, re-derivation,
  protocol loops, unused context, verbose compliance, persona drift
- Each recommendation tied to a specific PromptKit component file

Template (profile-session):
- Full frontmatter: persona (specification-analyst), protocols
  (anti-hallucination, self-verification, session-profiling),
  format (investigation-report), params, contracts
- Params: session_log, assembled_prompt, focus_areas
- Non-goals: not a quality audit, not a guardrail remover

Manifest: additive-only changes (protocol + template entries, no
reformatting of existing content).

Co-authored-by: themechbro <109350438+themechbro@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 24, 2026 16:22
Contributor

Copilot AI left a comment

Pull request overview

Adds a new “session profiling” capability to PromptKit by introducing a reasoning protocol and a template that analyze completed LLM session logs, attribute token usage back to PromptKit components, and produce optimization recommendations.

Changes:

  • Added profile-session template for running session log + assembled prompt profiling and emitting an investigation-report output.
  • Added session-profiling reasoning protocol with a 5-phase methodology (segment → map → detect → quantify → recommend).
  • Updated manifest.yaml to register the new protocol and template for assembly.
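The "attribute token usage back to PromptKit components" step can be pictured as a simple aggregation. The sketch below is hypothetical: the `Segment` type, the component labels, and the token numbers are made up for illustration and are not taken from the protocol file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One slice of a session log, already mapped to a component (assumed shape)."""
    component: str   # illustrative label, e.g. "protocol:self-verification"
    est_tokens: int  # estimated tokens spent in this slice


def breakdown(segments: list[Segment]) -> list[tuple[str, int, float]]:
    """Return (component, tokens, share of total) rows, largest first."""
    totals: dict[str, int] = defaultdict(int)
    for seg in segments:
        totals[seg.component] += seg.est_tokens
    grand_total = sum(totals.values())
    rows = [
        (name, tokens, tokens / grand_total if grand_total else 0.0)
        for name, tokens in totals.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up numbers, purely to show the shape of the result.
sample = [
    Segment("persona", 400),
    Segment("protocol:self-verification", 900),
    Segment("persona", 200),
]
for name, tokens, share in breakdown(sample):
    print(f"{name}: {tokens} tokens ({share:.0%})")
```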

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| templates/profile-session.md | New template that ingests session_log + assembled_prompt and directs the model to output an investigation-report-style profiling report. |
| protocols/reasoning/session-profiling.md | New reasoning protocol defining the phased workflow and inefficiency taxonomy for profiling sessions. |
| manifest.yaml | Registers the new protocol and template so PromptKit can discover and assemble them. |

where tokens were spent, which PromptKit components contributed to
inefficiency, and what concrete changes would reduce waste. The goal is
observability — turning an opaque session transcript into an attributed
breakdown of cost and quality.

Copilot AI Mar 24, 2026

The protocol intro says the goal is a breakdown of “cost and quality,” but the accompanying template’s Non-Goals explicitly says not to evaluate whether the session output was correct/high-quality. Consider rephrasing here to avoid implying that correctness/quality is in scope (e.g., focus on cost drivers and efficiency tradeoffs).

Suggested change
- breakdown of cost and quality.
+ breakdown of cost drivers and efficiency tradeoffs.

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +121
3. Assess **quality impact** — did this inefficiency also degrade the
output quality (not just cost)? A redundant derivation wastes tokens
but may not harm quality; a false start followed by an incorrect
restart harms both.

Copilot AI Mar 24, 2026

Phase 4 asks to assess “quality impact” (including whether output was incorrect), which conflicts with the template’s stated scope of cost/efficiency rather than output correctness. Please align the protocol with the template by either removing correctness evaluation from this step or tightly scoping “quality impact” to efficiency-related effects (e.g., increased retries/loops) without judging task correctness.

Suggested change
- 3. Assess **quality impact** — did this inefficiency also degrade the
-    output quality (not just cost)? A redundant derivation wastes tokens
-    but may not harm quality; a false start followed by an incorrect
-    restart harms both.
+ 3. Assess **efficiency-related impact** — beyond raw token count, did
+    this inefficiency cause extra interaction steps (e.g., additional
+    self-verification cycles, repeated clarifications, or re-running
+    protocol phases) that increased latency or operational cost? Do not
+    judge whether the final task output was correct.

Comment on lines +52 to +57
2. For each finding, classify using the investigation-report format:
   - Use finding IDs (F-001, F-002, …) with severity levels
   - Attribute each finding to a specific PromptKit component
     (persona file, protocol file and phase, format rule, or
     template parameter)
   - Include estimated token cost of each inefficiency

Copilot AI Mar 24, 2026

The investigation-report format requires specific severity values (Critical/High/Medium/Low/Informational) and includes a required Category field per finding. Consider updating these instructions to (a) enumerate the allowed severity values and (b) specify where the RE-* inefficiency codes should go (e.g., use them as the finding Category) so outputs reliably conform to the format.
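A conformance check along these lines could be sketched as follows. This is a hypothetical validator, not part of PromptKit. The severity list comes from the comment above; the dictionary field names, the three-digit `F-NNN` pattern, and the free-text category value are assumptions. The actual RE-* codes are not shown in this conversation, so none are used here.

```python
import re

# Severity values as enumerated in the review comment.
ALLOWED_SEVERITIES = ("Critical", "High", "Medium", "Low", "Informational")

# Assumption: IDs are "F-" plus exactly three digits, as in F-001.
FINDING_ID = re.compile(r"^F-\d{3}$")


def validate_finding(finding: dict) -> list[str]:
    """Return a list of problems; an empty list means the finding conforms."""
    problems = []
    if not FINDING_ID.match(str(finding.get("id", ""))):
        problems.append("id must look like F-001")
    if finding.get("severity") not in ALLOWED_SEVERITIES:
        problems.append("severity must be one of: " + ", ".join(ALLOWED_SEVERITIES))
    if not finding.get("category"):
        # The comment suggests carrying the inefficiency code in this field.
        problems.append("category is required")
    return problems


print(validate_finding({"id": "F-001", "severity": "High", "category": "false start"}))
# prints []
```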

Comment on lines +60 to +63
4. In the Executive Summary, report:
- Total estimated session tokens
- Estimated wasteful tokens and percentage
- Top 3 optimization opportunities by token savings

Copilot AI Mar 24, 2026

The investigation-report format constrains the Executive Summary to 2–4 sentences. The current instruction to include multiple bullet items (total tokens, waste %, top 3 opportunities) may cause outputs to violate that format requirement; consider relocating these metrics to another section (e.g., Scope/Findings) or explicitly requiring they be condensed into the 2–4 sentence limit.

Suggested change
- 4. In the Executive Summary, report:
-    - Total estimated session tokens
-    - Estimated wasteful tokens and percentage
-    - Top 3 optimization opportunities by token savings
+ 4. In the Executive Summary, write a concise 2–4 sentence narrative that
+    includes the total estimated session tokens, the estimated wasteful
+    tokens and percentage, and the top 3 optimization opportunities by
+    token savings.
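The three figures the Executive Summary is asked to carry (total tokens, wasteful tokens with percentage, top 3 opportunities) are simple arithmetic over the findings. A minimal sketch, assuming a hypothetical `Finding` record with an estimated wasted-token count; the names and numbers below are illustrative only:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    finding_id: str          # e.g. "F-001"
    est_wasted_tokens: int   # estimated tokens attributed to this inefficiency


def summarize(total_tokens: int, findings: list[Finding], top_n: int = 3):
    """Return (wasted tokens, wasted percentage, IDs of the top_n findings)."""
    wasted = sum(f.est_wasted_tokens for f in findings)
    pct = 100.0 * wasted / total_tokens if total_tokens else 0.0
    top = sorted(findings, key=lambda f: f.est_wasted_tokens, reverse=True)[:top_n]
    return wasted, pct, [f.finding_id for f in top]


# Made-up estimates, purely for illustration.
findings = [
    Finding("F-001", 100),
    Finding("F-002", 50),
    Finding("F-003", 200),
    Finding("F-004", 25),
]
wasted, pct, top = summarize(1000, findings)
print(f"{wasted} of 1000 tokens ({pct:.1f}%) wasted; top findings: {', '.join(top)}")
# prints "375 of 1000 tokens (37.5%) wasted; top findings: F-003, F-001, F-002"
```

Computing the figures first and then writing them into a 2–4 sentence narrative keeps the numbers consistent with the findings while respecting the format's length limit.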

- Do NOT evaluate whether the session's *output* was correct or
high-quality. This is a cost/efficiency analysis, not a quality audit.
- Do NOT suggest changes to the user's input parameters (session_log,
code_context, etc.) — only to PromptKit components (persona,

Copilot AI Mar 24, 2026

The Non-Goals section references code_context, but this template’s params are session_log, assembled_prompt, and focus_areas. Please update this line to reference the actual parameters (or remove the extraneous example) so the template stays self-consistent for users.

Suggested change
- code_context, etc.) — only to PromptKit components (persona,
+ assembled_prompt, focus_areas, etc.) — only to PromptKit components (persona,

protocols, format, template).
- Do NOT recommend removing guardrail protocols (anti-hallucination,
self-verification) unless they are demonstrably causing loops.
Guardrails have a cost but exist for a reason.

Copilot AI Mar 24, 2026

This template is missing a "Quality checklist" section. CONTRIBUTING.md specifies that template bodies should include a quality checklist before finalizing output; adding one here would keep new templates consistent with the documented authoring guidelines.

Suggested change
- Guardrails have a cost but exist for a reason.
+ Guardrails have a cost but exist for a reason.
+ ## Quality checklist
+ Before finalizing your output, verify that:
+ - [ ] You have executed the session-profiling protocol and followed all steps in the Instructions.
+ - [ ] The investigation report includes all required sections from the `investigation-report` format; if any section has no content, it explicitly states "None identified".
+ - [ ] Every finding has a unique ID (F-001, F-002, …), a severity level, an estimated token cost, and is attributed to a specific PromptKit component (persona, protocol phase, format rule, or template parameter).
+ - [ ] The Executive Summary reports total estimated session tokens, estimated wasteful tokens and percentage, and the top 3 optimization opportunities by token savings.
+ - [ ] The Remediation Plan contains concrete, file-level changes to PromptKit components and does not suggest changes to user inputs or the removal of guardrail protocols without clear evidence of looping.

@Alan-Jowett Alan-Jowett deleted the feat/profile-session-template branch March 24, 2026 17:13

Development

Successfully merging this pull request may close these issues.

Feature Request: Prompt Execution Profiler (Token & Structure Analysis)

2 participants