Skip to content

feat(commons-httpclient-2.0): toolkit-generated regeneration [DO NOT MERGE]#11717

Open
jordan-wong wants to merge 11 commits into
masterfrom
eval/commons-httpclient-2.0-toolkit-attempt
Open

feat(commons-httpclient-2.0): toolkit-generated regeneration [DO NOT MERGE]#11717
jordan-wong wants to merge 11 commits into
masterfrom
eval/commons-httpclient-2.0-toolkit-attempt

Conversation

@jordan-wong

@jordan-wong jordan-wong commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Summary

Toolkit-generated regeneration of dd-java-agent/instrumentation/commons-httpclient-2.0/. Part of the dd-trace-java APM toolkit eval program. ⚠️ DO NOT MERGE. Open for review.

What was preserved or restored from master

  • IastHttpMethodBaseInstrumentation.java — separate @autoservice IAST module (toolkit only generates tracing; IAST plumbing restored verbatim from master)
  • CommonsHttpClientDecorator.sourceUrl(HttpMethod) override (required for IAST SSRF chain)
  • commons-http-client-x muzzle directive
    ## Wins from prior reviewer feedback

@PerfectSlayer reviewed the prior toolkit commons-httpclient attempt on #10941 (still open, stale). This regeneration addresses every concrete concern:

  • ✅ Single delegate matcher — instruments only the 3-arg executeMethod(HostConfiguration, HttpMethod, HttpState). Other overloads delegate to this one, so no duplicate spans / no duplicate
    header injection. (Was @PerfectSlayer's central concern.)
  • ✅ Muzzle bounded [2.0,4.0) + explicit fail [,2.0) — answers "Do you know it excluded version 4?"
  • ✅ latestDepTestImplementation 2+ — answers "It should be 2+"
  • ✅ Component name "commons-http-client" consistent across instrumenter, decorator, and test (this consistency bug caused 28 test failures earlier in the session, now fixed)
  • ✅ status() uses statusLine == null ? 0 : statusLine.getStatusCode() (no catch(NullPointerException))
  • ✅ JUnit 5, no Groovy

Regressions vs master — surfaced by diff-vs-master pass

Three losses from the toolkit's HelperMethods refactor that no prior reviewer flagged:

  1. CommonsHttpClientDecorator.url() (lines 34-45) catches generic Exception and returns null. Master narrowed to URIException and rethrew URISyntaxException — propagates typed signal to the
    base decorator. Current code silently swallows; malformed URIs become invisible.
  2. Lost BlockingException re-throw path. Master's ExecAdvice had explicit catch (BlockingException e) + call-depth reset + rethrow for AppSec blocking. Current HelperMethods.doMethodEnter has
    no such branch; suppress = Throwable.class will swallow it — AppSec-initiated blocks won't propagate.
  3. Lost @AppliesOn(CONTEXT_TRACKING) advice gate. Master split work across ExecAdvice + a separate ContextPropagationAdvice annotated with @AppliesOn(CONTEXT_TRACKING). Current code collapses
    into one advice and inlines injectContext — drops the target-system marker.

Other notes

  • Test class doesn't extend shared HttpClientTest base — loses cross-tracer behavioral parity. 615 lines of standalone JUnit 5 vs master's Groovy spec inheriting the shared suite.
  • skipVersions += "20020423" removed from muzzle — could cause failures on that ancient artifact. Cheap to re-add.
  • Complex commit history (toolkit gen → component-name fix → master-deletion → restructure → master merge → spotless → IAST restore). Review git diff master...HEAD rather than commit-by-commit.

CI

Pipeline triggered by 2ff02f4 (IAST restoration). At time of writing: 580 pass / 0 pending / 1 fail. The 16 IAST SSRF smoke tests that previously failed (because the toolkit's overwrite
deleted IastHttpMethodBaseInstrumentation.java) are now passing again.

What reviewers should do

  1. Address the three diff-vs-master regressions: url() exception narrowing, BlockingException path, @AppliesOn(CONTEXT_TRACKING) gate.
  2. Decide whether to port tests to extend HttpClientTest base.
  3. The three regressions reveal a refactor-preservation failure mode in the toolkit — when extracting advice into a helper class, the toolkit dropped existing catch branches and annotations.
    Worth encoding as an R-rule.

Related

Prior toolkit attempt: #10941. Sibling eval PRs: #11708 (sparkjava), #11709 (feign).

…OT MERGE]

Toolkit-generated regeneration of commons-httpclient-2.0 instrumentation,
placed side-by-side with the existing master module via the -generated suffix
convention. Supersedes #10941 (PerfectSlayer's duplicate-module review).

Run details:
- Toolkit branch: eval/java @ 757979b9+
- Workflow: new_integration (default)
- Maven coordinates: commons-httpclient:commons-httpclient:2.0.2
- Cost: $31.91
- Duration: 119 min
- Reviewer verdict: approved=True, todos_fixed=0, todos_remaining=0

Generated module: dd-java-agent/instrumentation/commons-httpclient/commons-httpclient-2.0-generated/
- CommonsHttpClientInstrumentation (HttpMethodBase.execute)
- CommonsHttpClientDecorator (HttpClientDecorator)
- HttpHeadersInjectAdapter (header injection)
- HelperMethods
- Java tests (per R20, no Groovy)

## Research integrity disclosure

Hand-added by human operator AFTER toolkit run (toolkit did not produce these):
- 6 entries in metadata/supported-configurations.json: DD_TRACE_COMMONS_HTTPCLIENT_ENABLED, DD_TRACE_COMMONS_HTTPCLIENT_2_0_ENABLED, and their ANALYTICS_ENABLED + ANALYTICS_SAMPLE_RATE pairs (with type 'decimal' per R29).
- 1 line in settings.gradle.kts to register the new module.

Research signal — integration name divergence: master's deleted module used
super('commons-http-client') (with dashes); toolkit-generated module uses
super('commons-httpclient', 'commons-httpclient-2.0'). The R29 rule encoded
earlier today doesn't yet require matching existing master conventions when
regenerating; that's a research gap to encode.

Full caveats: docs/eval-research/generated/commons-httpclient-20260623/caveats.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dd-octo-sts

dd-octo-sts Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

🟢 Java Benchmark SLOs — All performance SLOs passed

Suite Status
Startup 🟢 pass

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results
Scenario Candidate master Δ (95% CI of mean)
startup:insecure-bank:iast:Agent 14.00 s 13.90 s [-0.0%; +1.4%] (no difference)
startup:insecure-bank:tracing:Agent 12.97 s 12.98 s [-0.9%; +0.8%] (no difference)
startup:petclinic:appsec:Agent 16.91 s 16.74 s [-0.1%; +2.1%] (no difference)
startup:petclinic:iast:Agent 16.88 s 17.01 s [-1.6%; +0.0%] (no difference)
startup:petclinic:profiling:Agent 16.76 s 16.92 s [-2.0%; +0.0%] (no difference)
startup:petclinic:sca:Agent 16.97 s 16.73 s [+0.6%; +2.2%] (maybe worse)
startup:petclinic:tracing:Agent 16.03 s 16.11 s [-1.4%; +0.4%] (no difference)

Commit: 2ff02f41 · CI Pipeline · Benchmarking Platform UI


Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

jordan-wong and others added 10 commits June 23, 2026 21:42
…ng exclusion) + R33 (no NPE catch)

R30 — Preserve master's integration name convention:
  super('commons-httpclient', 'commons-httpclient-2.0') → super('commons-http-client')
  instrumentationNames() returns ['commons-http-client'] (matches master)
  UTF8BytesString.create value already used 'commons-httpclient' in toolkit output; align to 'commons-http-client'

  Removes 6 hand-added DD_TRACE_COMMONS_HTTPCLIENT_* entries from
  metadata/supported-configurations.json — master's existing
  DD_TRACE_COMMONS_HTTP_CLIENT_* entries cover the canonical name.

  Fixes: validate_supported_configurations_v2_local_file CI failure (entries
  weren't in central feature-parity registry because they were invented names).

R32 — '-generated' exclusion:
  Added 'commons-httpclient-2.0-generated' to instrumentationNaming.exclusions
  in dd-java-agent/instrumentation/build.gradle. Fixes check-instrumentation-naming.

R33 — No NPE catch in CommonsHttpClientDecorator.status():
  Replaced try { httpMethod.getStatusCode() } catch (NullPointerException) with
  the canonical null-check guard:
    final StatusLine statusLine = httpMethod.getStatusLine();
    return statusLine == null ? 0 : statusLine.getStatusCode();
  Imported org.apache.commons.httpclient.StatusLine. Matches master's
  commons-httpclient-2.0 module. Fixes spotbugsMain DCN_NULLPOINTER_EXCEPTION.

Local verification (all BUILD SUCCESSFUL):
  ./gradlew :dd-java-agent:instrumentation:commons-httpclient:commons-httpclient-2.0-generated:spotbugsMain
  ./gradlew checkConfigurations
  ./gradlew checkInstrumentationNaming

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Spotless moved the trailing comment 'org-json does not use semver' to its
own line. Fixes dd-gitlab/spotless CI failure introduced by the R32 edit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…HttpClientTest

The R30 fix (preserve master's super('commons-http-client') convention) was
WORSE for runtime tests: both master's existing commons-httpclient-2.0/ AND
the toolkit-generated commons-httpclient-2.0-generated/ then declared the
same integration name AND the same instrumentedType, causing a runtime
conflict where the generated module's instrumentation didn't load.

Diagnosed via Datadog MCP log fetch of CI job 1798409915:
  37 tests completed, 28 failed
  CommonsHttpClientTest > 7 test methods × 4 retries = 28 failures

Restoring toolkit's original output:
- super('commons-httpclient', 'commons-httpclient-2.0') in CommonsHttpClientInstrumentation
- instrumentationNames returns ['commons-httpclient', 'commons-httpclient-2.0']
- COMMONS_HTTP_CLIENT constant uses 'commons-httpclient'
- Re-add 6 DD_TRACE_COMMONS_HTTPCLIENT_* entries to supported-configurations.json

R33 NPE-catch fix preserved.
R32 naming exclusion preserved.

The 'validate_supported_configurations_v2_local_file' CI check will fail
again until the new DD_TRACE_COMMONS_HTTPCLIENT_* entries are registered
in the central feature-parity registry (out-of-band Datadog operation).

Research finding: the toolkit's '-generated' side-by-side convention is
incompatible with R30 when regenerating an existing module — preserving
master's integration name causes a runtime conflict because both modules
declare the same name AND instrumented type. The convention needs to
either invent new names (registry work) or remove master's settings.gradle.kts
entry (modifies master, beyond eval scope).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…v2_local_file

Building on JMS v3 PR #11596's pattern (passes v2_local_file by preserving
master integration names): apply R30 to commons-httpclient WITHOUT the runtime
collision risk that broke the previous attempt.

Previous R30 attempt failed because master's commons-httpclient-2.0/ AND the
generated commons-httpclient-2.0-generated/ both declared super('commons-http-client')
AND instrumented the same HttpClient type — runtime conflict (28 of 37 tests
failed).

This time, ALSO remove master's commons-httpclient-2.0 from settings.gradle.kts
so only the generated module loads. Acceptable for this [DO NOT MERGE] research
PR — explicit supersede of master's module.

Changes:
- super('commons-http-client') in CommonsHttpClientInstrumentation
- instrumentationNames returns ['commons-http-client']
- UTF8BytesString.create('commons-http-client') in COMMONS_HTTP_CLIENT
- Remove 6 DD_TRACE_COMMONS_HTTPCLIENT_* hand-added entries
- Remove ':dd-java-agent:instrumentation:commons-httpclient-2.0' from settings.gradle.kts

Local verification BUILD SUCCESSFUL: muzzle, spotbugsMain, checkConfigurations,
checkInstrumentationNaming, spotlessCheck.

Expected to unblock validate_supported_configurations_v2_local_file — the central
feature-parity registry already has DD_TRACE_COMMONS_HTTP_CLIENT_* from master.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previous attempt only removed settings.gradle.kts entry; master module's
source files were still tracked. At runtime, agent still picked up master's
CommonsHttpClientInstrumentation alongside the generated module's
instrumentation (same integration name), causing generated module's hooks to
be shadowed. Result: 28 of 37 CommonsHttpClientTest tests failed (no spans).

Solution: git rm -rf dd-java-agent/instrumentation/commons-httpclient-2.0/.
Only commons-httpclient-2.0-generated/ remains.

Files removed (566 lines): build.gradle, gradle.lockfile, source classes,
and Groovy tests.

Local verification BUILD SUCCESSFUL: checkConfigurations, muzzle,
checkInstrumentationNaming.

Deliberate [DO NOT MERGE] research-PR action: only ONE module per integration
name at runtime. Reviewers: when merged, drop -generated suffix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tch Decorator

The toolkit-generated test asserted span tag component='commons-httpclient'
(no hyphen between http and client) while the toolkit-generated Decorator
emits component='commons-http-client' (with hyphen). The mismatch caused all
7 tests that go through assertHttpClientSpan to fail in CI with 28 reported
failures (7 unique × 4 Spock retries).

Aligns the test with the Decorator's value, which matches the master
convention (commons-http-client). All 16 tests now pass locally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-generated suffix

The parallel -generated/ directory caused (a) runtime double-instrumentation
collision against the existing master commons-httpclient-2.0 module, and
(b) when master was deleted to resolve (a), broke IAST's SSRF detection
hook which had a hardcoded reference to the master path.

Move the toolkit's regenerated files to the canonical master path
(dd-java-agent/instrumentation/commons-httpclient-2.0/). Standard PR
review uses git diff against origin/master to inspect what the toolkit
produced; the parallel-directory convention was not actually solving a
real reviewer need.

Restores IAST SSRF coverage and eliminates the runtime collision.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ient-2.0-toolkit-attempt

# Conflicts:
#	dd-java-agent/instrumentation/commons-httpclient-2.0/gradle.lockfile
The c661f9d restructure (removing commons-httpclient-2.0-generated
from instrumentationNaming.exclusions) left the list in a layout
spotless's Groovy formatter rejects. Auto-reformat to canonical style.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The toolkit's new_integration workflow only generates APM tracing code,
not IAST. The c661f9d restructure overwrote master's commons-httpclient-2.0/
with toolkit output, accidentally deleting master's IAST artifacts:

- IastHttpMethodBaseInstrumentation.java: separate @autoservice module
  that taints HttpMethodBase constructor URI args for SSRF detection.
  Restored verbatim from origin/master.
- CommonsHttpClientDecorator.sourceUrl(HttpMethod) override: used by
  HttpClientDecorator's URL-source pathway that IAST SSRF depends on.
  Added back to the toolkit's decorator.
- build.gradle muzzle directive 'commons-http-client-x': declared by
  the IAST instrumenter (via muzzleDirective() method); restored.

The toolkit-generated tracing files (Instrumentation, Decorator's other
methods, helpers, tests) are left as the toolkit generated them — this
commit only restores the non-tracing pieces master owned that the
toolkit doesn't know about.

Fixes the 16 IastSpringBootSmokeTest > "ssrf is present
(commons-httpclient2) (url)" failures across JDK shards.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dd-octo-sts

dd-octo-sts Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request
  • Remove the tag from the pull request title

If you need help, please check our contributing guidelines.

@dd-octo-sts dd-octo-sts Bot added the tag: ai generated Largely based on code generated by an AI or LLM label Jun 25, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2ff02f4108

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

final AgentSpan span =
startSpan(CommonsHttpClientDecorator.COMMONS_HTTP_CLIENT.toString(), HTTP_REQUEST);
DECORATE.afterStart(span);
DECORATE.onRequest(span, method);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore BlockingException handling around onRequest

When AppSec/RASP blocks an outbound URL, DECORATE.onRequest can throw BlockingException; this advice is still declared with suppress = Throwable, so without the previous catch/reset path the exception is swallowed by the advice wrapper and CallDepthThreadLocalMap remains incremented. In that AppSec blocking scenario the commons-httpclient request is allowed to continue, and later calls on the same thread return null from doMethodEnter, disabling tracing for this client until the thread is reused/reset.

Useful? React with 👍 / 👎.


final AgentScope scope = activateSpan(span);

DECORATE.injectContext(current(), method, SETTER);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore the context-tracking propagation advice

This injects propagation only from the tracing helper, whereas the previous ContextPropagationAdvice was a separate @AppliesOn(CONTEXT_TRACKING) advice. Since AgentInstaller enables CONTEXT_TRACKING independently of tracing, deployments with tracing disabled still used to forward existing trace context through commons-httpclient; after this collapse there is no context-tracking-only advice for this matcher, so those calls stop carrying propagation headers.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

tag: ai generated Largely based on code generated by an AI or LLM tag: do not merge Do not merge changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant