feat: Add ai-optimization skill for SageMaker AI Optimization APIs #147
Lokiiiiii wants to merge 1 commit into awslabs:main
Conversation
Benchmark polling loop has no post-loop status check — failed jobs silently proceed to results download.
File: benchmark-workflow.md (polling block)
The benchmark polling loop breaks on Failed but has no status gate afterward. The SKILL.md workflow proceeds to "download results," which will fail with a confusing error when the tar.gz doesn't exist.
Fix: add post-loop status handling matching the recommendation workflow.
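The suggested post-loop gate could look like the sketch below. This is a hedged illustration, not the skill's actual code: the `Status` and `FailureReason` field names are assumptions about the describe call's response shape and should be checked against the real API.

```python
def check_terminal_status(describe_response):
    """Fail fast before downloading results.

    `describe_response` is the dict returned by the job's describe call.
    NOTE: the "Status" / "FailureReason" field names are assumptions here,
    not verified against the SageMaker AI Optimization API.
    """
    status = describe_response.get("Status")
    if status == "Failed":
        reason = describe_response.get("FailureReason", "unknown")
        raise RuntimeError(f"Benchmark job failed: {reason}. Skipping results download.")
    if status != "Completed":
        raise RuntimeError(f"Benchmark job ended in unexpected status: {status}")
    return status
```

In the notebook this would run immediately after the polling loop, before any S3 download cell, so a failed job surfaces its failure reason instead of a confusing missing-object error.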
Polling loops have no timeout — potential infinite hang.
Files: benchmark-workflow.md, recommendation-workflow.md
Both while True loops have no max duration. If a job enters an unexpected non-terminal state, the notebook cell hangs indefinitely. Add a MAX_WAIT_SECONDS guard (e.g., 3600s for benchmark, 7200s for recommendation).
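A minimal sketch of such a guard, with the clock and sleep injectable so the deadline logic is testable. The terminal-state names ("Completed", "Failed", "Stopped") are assumptions and should be matched to the actual API values:

```python
import time

def poll_until_terminal(get_status, max_wait_seconds=3600, interval=30,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until a terminal state, or raise after max_wait_seconds.

    NOTE: terminal-state names are assumptions; adjust to the real API values.
    """
    deadline = clock() + max_wait_seconds
    while True:
        status = get_status()
        if status in ("Completed", "Failed", "Stopped"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job still '{status}' after {max_wait_seconds}s; giving up")
        sleep(interval)
```

`get_status` would wrap the describe call; use a smaller `max_wait_seconds` for benchmark jobs and a larger one for recommendation jobs, per the comment above.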
No error handling on S3 download and tar extraction.
File: benchmark-results.md
s3.get_object() and tarfile.open() have no try/except. Users with wrong IAM permissions see raw AccessDenied tracebacks. A corrupted archive gives an opaque ReadError. Add try/except with actionable messages.
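One possible shape for that error handling, sketched with an injectable client so the logic can be exercised without AWS credentials. The error-code strings follow common S3 behavior; verify them against the exceptions you actually see:

```python
import tarfile

def download_and_extract(bucket, key, dest="benchmark-results",
                         local_path="results.tar.gz", s3=None):
    """Download the results archive and extract it, with actionable errors.

    `s3` defaults to a real boto3 client but is injectable for testing.
    """
    if s3 is None:
        import boto3  # deferred so the function is importable without boto3
        s3 = boto3.client("s3")
    try:
        s3.download_file(bucket, key, local_path)
    except Exception as err:
        # botocore's ClientError carries the code under err.response["Error"]["Code"]
        code = getattr(err, "response", {}).get("Error", {}).get("Code", "")
        if code in ("AccessDenied", "403"):
            raise RuntimeError(
                f"Access denied for s3://{bucket}/{key}. "
                "Check that your role has s3:GetObject on the results bucket."
            ) from err
        if code in ("NoSuchKey", "404"):
            raise RuntimeError(
                f"s3://{bucket}/{key} not found. Did the job finish successfully?"
            ) from err
        raise
    try:
        with tarfile.open(local_path, "r:gz") as tar:
            tar.extractall(dest)
    except tarfile.ReadError as err:
        raise RuntimeError(
            f"{local_path} is not a readable gzip archive; the job may have "
            "produced partial output. Re-check the job status and artifact path."
        ) from err
    return dest
```

Unrecognized errors are re-raised unchanged, so only the two cases called out in the comment get friendlier messages.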
Pandas marked "optional" but unconditionally used.
File: benchmark-results.md, line 6
Comment says # optional, for tabular display but pd.DataFrame(...) is called unconditionally. Either remove the "optional" comment or add a fallback.
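If the fallback route is taken, one minimal sketch (the helper name is hypothetical, not from the skill's code):

```python
def to_table(rows):
    """Return a pandas DataFrame when pandas is installed, else the raw rows.

    This keeps the "# optional" comment honest: without pandas the cell
    degrades to a plain list of dicts instead of crashing with ImportError.
    """
    try:
        import pandas as pd
    except ImportError:
        return rows
    return pd.DataFrame(rows)
```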
sm client used without being defined in benchmark-results.md.
File: benchmark-results.md, line 12
The code uses sm.describe_ai_benchmark_job(...) but sm = boto3.client("sagemaker") is never created in this code block. Either add the client creation or a comment noting it comes from a prior cell.
New skill covering the 14 SageMaker AI Optimization API operations (AIWorkloadConfig, AIBenchmarkJob, AIRecommendationJob) with guided workflows for benchmarking LLM inference and getting deployment recommendations.

Skill structure:
- SKILL.md (110 lines): Main skill with intent-matching description
- references/benchmark-workflow.md (96 lines): Benchmark job guide
- references/benchmark-results.md (78 lines): Results download code
- references/recommendation-workflow.md (96 lines): Recommendation guide
- references/recommendation-options.md (74 lines): Config options + dataset
- references/recommendation-deploy.md (41 lines): ModelPackage deployment
- references/interpreting-results.md (78 lines): Metrics presentation

All files conform to DESIGN_GUIDELINES.md limits (SKILL.md <300, references <100 lines each). Code samples verified against the public Smithy model.
Force-pushed from c304eaa to 38bdf3e
Pull request overview
Adds a new ai-optimization skill to the sagemaker-ai plugin, providing guided agent workflows for SageMaker AI Optimization APIs (AIWorkloadConfig, AIBenchmarkJob, AIRecommendationJob) to benchmark LLM inference and generate deployment recommendations.
Changes:
- Introduces ai-optimization skill (SKILL.md) plus reference guides for benchmarking, recommendations, results interpretation, and deployment.
- Adds notebook-oriented Python examples for creating/monitoring jobs and downloading benchmark artifacts.
- Updates plugins/sagemaker-ai/README.md to list and describe the new skill.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| plugins/sagemaker-ai/skills/ai-optimization/SKILL.md | Defines the skill’s scope, principles, and step-based workflow. |
| plugins/sagemaker-ai/skills/ai-optimization/references/benchmark-workflow.md | Benchmark job workflow and sample notebook cells. |
| plugins/sagemaker-ai/skills/ai-optimization/references/benchmark-results.md | Notebook cell for downloading and summarizing benchmark results from S3 artifacts. |
| plugins/sagemaker-ai/skills/ai-optimization/references/recommendation-workflow.md | Recommendation job workflow and sample notebook cell for job creation/polling. |
| plugins/sagemaker-ai/skills/ai-optimization/references/recommendation-options.md | Option catalog (instance types, optimization, datasets, framework) with snippets. |
| plugins/sagemaker-ai/skills/ai-optimization/references/recommendation-deploy.md | Sample deployment notebook cell using recommendation output ModelPackage. |
| plugins/sagemaker-ai/skills/ai-optimization/references/interpreting-results.md | Guidance for presenting benchmark/recommendation outputs and metrics. |
| plugins/sagemaker-ai/README.md | Adds ai-optimization to the skill list and includes an “AI Optimization” section. |
- **Throughput (Output Token Throughput)** — Tokens generated per second. Higher is better for batch processing.
- **Optimizations** — What the service applied:
  - **Kernel Tuning** — GPU kernel optimizations specific to the hardware. Improves throughput with no quality impact.
  - **Speculative Decoding** — Uses a smaller draft model to predict tokens ahead. Improves latency but requires a compatible draft model.
The speculative decoding description here says it “improves latency”, but recommendation-options.md describes speculative decoding as improving throughput (up to 10x). Please reconcile these so the skill gives consistent guidance on what speculative decoding optimizes.
Suggested change:
- Before: **Speculative Decoding** — Uses a smaller draft model to predict tokens ahead. Improves latency but requires a compatible draft model.
- After: **Speculative Decoding** — Uses a smaller draft model to predict tokens ahead. Primarily improves throughput (and can also reduce per-token latency) but requires a compatible draft model.
```python
sm.create_ai_workload_config(
    AIWorkloadConfigName=config_name,
    AIWorkloadConfigs={"WorkloadSpec": {"Inline": json.dumps(workload_spec)}},
    DatasetConfig={
```
This section instructs generating a notebook cell, but the snippet references json.dumps(workload_spec) and config_name without defining/importing them in the cell. Please make the cell self-contained (add the missing import/definitions or clearly point to where they come from) to avoid copy/paste failures.
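One way to make the cell self-contained is to gather the missing import and definitions into a small request builder, as sketched below. This is a hedged illustration: the builder function is hypothetical, and the parameter names simply mirror the snippet under review rather than a verified API shape.

```python
import json

def build_workload_config_request(config_name, workload_spec, dataset_config):
    """Assemble the create_ai_workload_config kwargs in one place.

    NOTE: field names mirror the snippet under review; verify against the API.
    """
    return {
        "AIWorkloadConfigName": config_name,
        "AIWorkloadConfigs": {"WorkloadSpec": {"Inline": json.dumps(workload_spec)}},
        "DatasetConfig": dataset_config,
    }

# In the notebook cell (create the client here, or note it comes from a prior cell):
# sm = boto3.client("sagemaker")
# sm.create_ai_workload_config(**build_workload_config_request(
#     config_name, workload_spec, dataset_config))
```

Keeping the request assembly separate from the AWS call also means the serialized `workload_spec` can be inspected before anything is created.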
| 10 | `hyperpod-ssm` | Remote command execution and file transfer on HyperPod cluster nodes via SSM | [SKILL.md](skills/hyperpod-ssm/SKILL.md) |
| 11 | `hyperpod-version-checker` | Check and compare software component versions across HyperPod cluster nodes | [SKILL.md](skills/hyperpod-version-checker/SKILL.md) |
| 12 | `hyperpod-issue-report` | Generate diagnostic reports for HyperPod troubleshooting and support cases | [SKILL.md](skills/hyperpod-issue-report/SKILL.md) |
| 13 | `ai-optimization` | Guided workflows for benchmarking LLM inference and getting deployment recommendations (best instance, optimizations) | [SKILL.md](skills/ai-optimization/SKILL.md) |
Since this PR adds a new skill (ai-optimization), it’s a new feature for the sagemaker-ai plugin, but the plugin manifest versions remain 1.1.0. Per docs/MAINTAINERS_GUIDE.md:60, please bump the plugin version (in both .claude-plugin/plugin.json and .codex-plugin/plugin.json) accordingly.
# Benchmark Workflow

Guide the user through creating and running an AI Benchmark Job.

## Step 1: Gather Endpoint Information
This reference file is 107 lines, which exceeds the repo’s guidance to keep reference files under 100 lines (docs/DESIGN_GUIDELINES.md:321-322, 555-556). Please trim or split this into smaller reference files so the skill stays within context limits.
import boto3

# sm client is defined in a prior cell (Step 3)
The comment says the SageMaker client is defined in a prior cell in “Step 3”, but in this workflow it’s created in Step 2 (workload config). Update the step reference (or redefine the client in this snippet) so the instructions are consistent.
Suggested change:
- Before: `# sm client is defined in a prior cell (Step 3)`
- After: `# sm client is defined in a prior cell (Step 2)`
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.