feat: improve skill scores for apache/beam#37863

Open
popey wants to merge 1 commit into apache:master from popey:improve/skill-review-optimization

@popey commented Mar 16, 2026

Hullo @apache 👋

I ran your skills through `tessl skill review` at work and found some targeted improvements. Here's the full before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| license-compliance | 70% | 96% | +26% |
| gradle-build | 77% | 96% | +19% |
| io-connectors | 77% | 94% | +17% |
| beam-concepts | 77% | 90% | +13% |
| contributing | 81% | 94% | +13% |
| python-development | 77% | 90% | +13% |
| ci-cd | 81% | 89% | +8% |
| java-development | 83% | 90% | +7% |
| runners | 85% | 90% | +5% |

<details>
<summary>Changes summary</summary>

**Descriptions (all 9 skills)**
- Expanded action verbs beyond generic "Guides understanding" to specific actions like "Configures, debugs, implements"
- Added natural trigger terms users would actually type (e.g., "build.gradle", "gradlew", "pull request", "CLA", "RAT check")
- Ensured every description has an explicit "Use when..." clause with multiple trigger scenarios

**beam-concepts**: Removed explanatory prose Claude already knows (historical context, property definitions), tightened PCollection/PTransform descriptions, added verification step to Dead Letter Queue pattern

**ci-cd**: Replaced verbose workflow tables with compact naming convention reference, added concrete `gh` CLI commands for listing/rerunning workflows, added executable debugging workflow with copy-paste ready commands
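As a rough illustration of the kind of `gh` commands this item refers to, the sketch below reruns only the failed jobs of the most recent failed run of a workflow. The helper name `rerun_latest_failed` and its workflow argument are hypothetical and are not taken from the skill file; `gh run list` and `gh run rerun` are GitHub CLI subcommands.

```shell
# Hypothetical helper (not from the skill file): rerun the failed jobs of the
# most recent failed run of one workflow, using the GitHub CLI.
rerun_latest_failed() {
  workflow="$1"   # workflow file name or display name (placeholder argument)

  # Ask for the newest failed run and extract its numeric run ID.
  run_id="$(gh run list --workflow "$workflow" --status failure --limit 1 \
    --json databaseId --jq '.[0].databaseId')"

  if [ -z "$run_id" ] || [ "$run_id" = "null" ]; then
    echo "no failed runs found for $workflow" >&2
    return 1
  fi

  # Rerun only the jobs that failed in that run.
  gh run rerun "$run_id" --failed
}
```

To read the failing output before rerunning, `gh run view <run-id> --log-failed` prints the logs of the failed steps only.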

**contributing**: Added validation checkpoint to run pre-commit tests locally before pushing, expanded trigger terms to include "pull request", "CLA", "how to contribute"

**gradle-build**: Replaced flat error list with structured troubleshooting workflow including explicit verification steps for each error type
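To make the shape of such a workflow concrete, here is a minimal sketch that maps an error category to the first command to try. The function name `first_fix_for` and the category labels are illustrative assumptions; the Gradle commands are the ones quoted in this PR's diff, and `:module` is a placeholder for the affected project path.

```shell
# Illustrative only: map an error category to the first command suggested in
# this PR's troubleshooting steps. Labels and function name are hypothetical.
first_fix_for() {
  case "$1" in
    NoClassDefFoundError) echo "./gradlew clean" ;;
    proto)                echo "./gradlew generateProtos" ;;
    dependency-conflict)  echo "./gradlew :module:dependencies --configuration runtimeClasspath" ;;
    *)                    echo "unknown error category: $1" >&2; return 1 ;;
  esac
}

# After applying the suggested fix, re-run the original build command to
# confirm that the failure is gone.
```

A flat lookup like this only captures the first step of each playbook; longer playbooks would need more structure than a single command per category.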

**io-connectors**: Replaced bare component list for creating new connectors with step-by-step workflow including test and verification checkpoints

**java-development**: Added artifact verification step after `publishToMavenLocal`
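A slightly stricter check than listing the directory is sketched below. It assumes the default Maven local repository under `~/.m2/repository`, which is where Gradle's `publishToMavenLocal` task publishes by default; the helper name `verify_local_artifact` is hypothetical.

```shell
# Hypothetical helper: succeed only if the artifact directory in the local
# Maven repository contains at least one published .pom file.
verify_local_artifact() {
  artifact_dir="$1"
  if [ -d "$artifact_dir" ] && find "$artifact_dir" -name '*.pom' | grep -q .; then
    echo "published: $artifact_dir"
  else
    echo "missing: $artifact_dir" >&2
    return 1
  fi
}

# Usage, with the Kafka IO path quoted in this PR's diff:
#   verify_local_artifact ~/.m2/repository/org/apache/beam/beam-sdks-java-io-kafka
```

Checking for a `.pom` rather than just the directory guards against an empty directory left over from an earlier, failed publish.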

**license-compliance**: Added explicit 5-step compliance workflow with RAT check validation loop, consolidated repetitive license headers (8 near-identical blocks) into grouped format by comment style
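Before running the RAT check, a quick local pre-check can catch files with no header at all. The sketch below looks for the opening phrase of the standard ASF source header in the first lines of a file. The helper names `has_asf_header` and `missing_headers` and the 20-line window are assumptions, and this does not replace the RAT check itself.

```shell
# Hypothetical pre-check: does the file mention the ASF license header near
# the top? This is a coarse filter, not a substitute for the RAT check.
has_asf_header() {
  head -n 20 "$1" | grep -qF "Licensed to the Apache Software Foundation (ASF)"
}

# Print every file from the argument list that lacks the header.
missing_headers() {
  for f in "$@"; do
    has_asf_header "$f" || echo "$f"
  done
}
```

Because the check is a plain substring match, it works the same for `#`, `//`, and `<!-- -->` comment styles.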

**python-development**: Added tarball verification step after building source distribution
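A minimal sketch of what such a verification could look like, going one step beyond listing the file: it checks that a `.tar.gz` exists in the output directory and that `tar` can read its table of contents. The helper name `verify_sdist` is hypothetical, and the `dist` default is an assumption about where the build writes its output.

```shell
# Hypothetical helper: check that dist_dir holds at least one readable
# source tarball.
verify_sdist() {
  dist_dir="${1:-dist}"
  for tarball in "$dist_dir"/*.tar.gz; do
    # If the glob matched nothing, the literal pattern is left in place.
    [ -f "$tarball" ] || { echo "no tarball in $dist_dir" >&2; return 1; }
    # Listing the archive fails if it is truncated or not a gzipped tar.
    tar -tzf "$tarball" >/dev/null || { echo "unreadable: $tarball" >&2; return 1; }
    echo "ok: $tarball"
  done
}
```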

**runners**: Added structured debugging workflow: start with DirectRunner to isolate logic errors, then escalate to target runner
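The escalation order can be sketched as a small wrapper: run the pipeline with `--runner=DirectRunner` first, and only try the target runner if that succeeds. The wrapper name `run_with_escalation` and the pipeline command in the usage comment are hypothetical; `--runner` is the standard Beam pipeline option for selecting a runner.

```shell
# Hypothetical wrapper: try the pipeline locally on DirectRunner first, then
# on the target runner only if the local run succeeded.
run_with_escalation() {
  target_runner="$1"
  shift   # the remaining arguments are the pipeline command

  if ! "$@" --runner=DirectRunner; then
    echo "failed on DirectRunner: likely a pipeline logic error" >&2
    return 1
  fi

  # The local run passed, so any failure from here on is more likely
  # specific to the target runner or its environment.
  "$@" --runner="$target_runner"
}

# Usage (my_pipeline is a placeholder module name):
#   run_with_escalation FlinkRunner python -m my_pipeline
```

A real target runner usually needs extra options (for example, project or cluster settings), which can be appended to the pipeline command.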

</details>

---

 - [x] No issue referenced (skill improvements only, no functional code changes)
 - [ ] Update `CHANGES.md` with noteworthy changes. *(N/A — skill files only)*
 - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). *(Small contribution — skill metadata and content improvements only)*

---

Honest disclosure — I work at @tesslio where we build tooling around skills like these. Not a pitch - just saw room for improvement and wanted to contribute.

Want to self-improve your skills? Just point your agent (Claude Code, Codex, etc.) at [this Tessl guide](https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices) and ask it to optimize your skill. Ping me - [@popey](https://github.com/popey) - if you hit any snags.

Thanks in advance 🙏

@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the clarity, completeness, and usability of the Apache Beam skill definitions. By refining descriptions, adding practical workflows, and incorporating verification steps across various development areas like CI/CD, Gradle builds, I/O connectors, and language-specific development, the changes aim to provide more targeted and actionable guidance for developers working with Apache Beam.

Highlights

  • Enhanced Skill Descriptions: Expanded action verbs and added natural trigger terms to all skill descriptions, ensuring each has an explicit 'Use when...' clause with multiple trigger scenarios for better clarity and utility.
  • Improved Beam Concepts Documentation: Removed verbose explanatory prose, tightened descriptions for PCollection and PTransform, and added a verification step for the Dead Letter Queue pattern.
  • Streamlined CI/CD Guidance: Replaced verbose workflow tables with a compact naming convention reference, included concrete gh CLI commands for workflow management, and introduced an executable debugging workflow with copy-paste ready commands.
  • Refined Contribution Process: Added a validation checkpoint to run pre-commit tests locally before pushing and expanded trigger terms to include 'pull request', 'CLA', and 'how to contribute'.
  • Structured Gradle Build Troubleshooting: Replaced a flat list of common errors with a structured troubleshooting workflow, including explicit verification steps for various error types.
  • Clearer I/O Connectors Workflow: Replaced a bare component list for creating new connectors with a step-by-step workflow, incorporating test and verification checkpoints.
  • Added Java Artifact Verification: Included an artifact verification step after publishToMavenLocal in the Java development guidance.
  • Detailed License Compliance Workflow: Introduced an explicit 5-step compliance workflow with RAT check validation, and consolidated repetitive license headers into grouped formats by comment style.
  • Python Tarball Verification: Added a tarball verification step after building source distributions in the Python development guidance.
  • Enhanced Runner Debugging Workflow: Implemented a structured debugging workflow for runners, advising to start with DirectRunner to isolate logic errors before escalating to target runners.

Changelog
  • .agent/skills/beam-concepts/SKILL.md
    • Updated the skill description to be more comprehensive, covering troubleshooting and specific use cases.
    • Removed historical context and simplified the 'Pipeline' abstraction description.
    • Condensed the 'PCollection' properties into a single, concise sentence.
    • Removed redundant descriptive text for 'PTransform' and 'ParDo'.
    • Removed descriptive text for 'GroupByKey'.
    • Combined 'CoGroupByKey' and 'Combine' into a single section.
    • Combined 'Flatten' and 'Partition' into a single section and removed their individual descriptions.
    • Simplified the description of 'Windowing' types.
    • Removed descriptive text for 'Triggers', 'Side Inputs', 'Pipeline Options', and 'Schema'.
    • Added a verification step for the Dead Letter Queue pattern, including a PAssert example.
  • .agent/skills/ci-cd/SKILL.md
    • Updated the skill description to include more specific actions like debugging, analyzing logs, and troubleshooting flaky tests.
    • Replaced detailed workflow tables for 'Key Workflows' with a compact naming convention reference.
    • Replaced 'Triggering Specific Workflows' and 'Workflow Dispatch' sections with a 'Re-running Specific Workflows' section, including gh CLI commands.
    • Replaced generic 'Common Failure Patterns' with a structured 'Debugging Workflow' that includes steps for checking flakiness, re-running, local reproduction, and verifying fixes.
    • Updated the 'Local Debugging' section to provide more specific gradlew commands and troubleshooting steps for Java and Python.
  • .agent/skills/contributing/SKILL.md
    • Updated the skill description to include specific aspects like signing the CLA, running precommit checks, and following commit conventions.
    • Added steps to run pre-commit tests locally before pushing a pull request and to verify tests pass.
    • Clarified that pre-commit tests run automatically on the PR.
  • .agent/skills/gradle-build/SKILL.md
    • Updated the skill description to include specific actions like configuring build.gradle, running tasks, and resolving dependency conflicts.
    • Replaced the 'Common Errors' section with a structured 'Troubleshooting Workflow' for build failures, detailing steps for NoClassDefFoundError, proto-related errors, and dependency conflicts.
  • .agent/skills/io-connectors/SKILL.md
    • Updated the skill description to include specific actions like implementing read/write transforms, configuring connection parameters, and testing connectors.
    • Replaced the 'Key components' list for creating new connectors with a step-by-step 'Workflow' that includes testing and verification checkpoints.
  • .agent/skills/java-development/SKILL.md
    • Updated the skill description to include specific actions like compiling, running tests, building SDK containers, and publishing artifacts.
    • Added an artifact verification step using ls ~/.m2/repository/ after publishToMavenLocal.
  • .agent/skills/license-compliance/SKILL.md
    • Updated the skill description to include specific actions like adding, validating, and formatting license headers.
    • Replaced the 'Overview' section with a structured 'Workflow' for license compliance, detailing steps from creating a new file to validating with gradlew rat.
    • Consolidated repetitive license header examples for Go, Python, YAML, Shell, Markdown, XML, and HTML into grouped formats by comment style, providing concise references.
  • .agent/skills/python-development/SKILL.md
    • Updated the skill description to include specific actions like configuring virtual environments, running pytest, building SDK tarballs, and debugging pipeline execution.
    • Added a tarball verification step using ls -la dist/*.tar.gz after building source distribution.
  • .agent/skills/runners/SKILL.md
    • Updated the skill description to include specific actions like configuring, optimizing, and troubleshooting runners for pipeline execution.
    • Replaced the generic 'Debugging' section with a structured 'Debugging Workflow' that advises starting with DirectRunner and escalating to target runners.
    • Removed the redundant 'Enable logging' point from the Direct Runner section as it is covered in the general debugging workflow.

Activity
  • The author, popey, initiated this pull request to improve the skill scores for Apache Beam's documentation, based on a tessl skill review.
  • The changes focus on skill metadata and content improvements, with no functional code changes, and therefore no issue was referenced.
  • The author noted this as a small contribution, not requiring an Apache Individual Contributor License Agreement.
@github-actions (Contributor)

Assigning reviewers:

R: @tvalentyn added as fallback since no labels match configuration

Note: If you would like to opt out of this review, comment assign to next reviewer.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@@ -0,0 +1,59 @@
Hullo @apache 👋

Contributor: Why is this part of the PR?

./gradlew -Ppublishing -p sdks/java/io/kafka publishToMavenLocal

# Verify: check artifact exists
ls ~/.m2/repository/org/apache/beam/beam-sdks-java-io-kafka/

Contributor: This is Gradle, so I'm not sure this verify step will work here.

2. **NoClassDefFoundError**: run `./gradlew clean` then retry; if persists, delete `~/.gradle/caches`
3. **Proto-related errors**: run `./gradlew generateProtos` then retry build
4. **Dependency conflicts**: run `./gradlew :module:dependencies --configuration runtimeClasspath` to inspect, use `--scan` for detailed analysis
5. Verify fix: re-run the original build command to confirm success

Contributor: tbh, I prefer the previous structure, as some troubleshooting playbooks will be longer than 3 steps.
