Skip to content

Add property to allow autoconfiguration of SDK telemetry version#8037

Open
anuraaga wants to merge 4 commits intoopen-telemetry:mainfrom
anuraaga:telemetry-version-knob
Open

Add property to allow autoconfiguration of SDK telemetry version#8037
anuraaga wants to merge 4 commits intoopen-telemetry:mainfrom
anuraaga:telemetry-version-knob

Conversation

@anuraaga
Copy link
Contributor

While discussing how to reflect telemetry version in declarative config in a cross-language way (open-telemetry/semantic-conventions#3293), it would be helpful to have a traditional config property for it to be able to roll out new metrics to javaagent users. This defines otel.experimental.sdk.telemetry.version to allow that.

#7928

@anuraaga anuraaga requested a review from a team as a code owner January 30, 2026 04:12
@codecov
Copy link

codecov bot commented Jan 30, 2026

Codecov Report

❌ Patch coverage is 79.59184% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.17%. Comparing base (b56bc54) to head (8010939).
⚠️ Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
.../otlp/internal/InternalTelemetryConfiguration.java 57.14% 2 Missing and 1 partial ⚠️
.../autoconfigure/InternalTelemetryConfiguration.java 57.14% 2 Missing and 1 partial ⚠️
...sdk/autoconfigure/LoggerProviderConfiguration.java 81.81% 1 Missing and 1 partial ⚠️
...sdk/autoconfigure/TracerProviderConfiguration.java 80.00% 1 Missing and 1 partial ⚠️

❌ Your patch check has failed because the patch coverage (79.59%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##               main    #8037    +/-   ##
==========================================
  Coverage     90.16%   90.17%            
- Complexity     7478     7603   +125     
==========================================
  Files           834      840     +6     
  Lines         22559    22837   +278     
  Branches       2239     2282    +43     
==========================================
+ Hits          20341    20593   +252     
- Misses         1515     1528    +13     
- Partials        703      716    +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@jack-berg jack-berg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. As a principle, I / we want to make sure that declarative config is a strict superset of what's possible with env vars / system properties, so we'll want to make sure that #7928 is also resolved on the declarative config side of things as well.

Comment on lines 314 to 333
.anySatisfy(
scopeMetrics -> {
assertThat(scopeMetrics.getScope().getName())
.isEqualTo("io.opentelemetry.sdk.logs");
assertThat(scopeMetrics.getMetricsList())
.satisfiesExactlyInAnyOrder(
metric ->
assertThat(metric.getName())
.isEqualTo("otel.sdk.log.created"),
metric ->
assertThat(metric.getName())
.isEqualTo("otel.sdk.processor.log.processed"),
metric ->
assertThat(metric.getName())
.isEqualTo(
"otel.sdk.processor.log.queue.capacity"),
metric ->
assertThat(metric.getName())
.isEqualTo("otel.sdk.processor.log.queue.size"));
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#nit if we're not going to assert on the attributes, maybe a more terse assert syntax

Suggested change
.anySatisfy(
scopeMetrics -> {
assertThat(scopeMetrics.getScope().getName())
.isEqualTo("io.opentelemetry.sdk.logs");
assertThat(scopeMetrics.getMetricsList())
.satisfiesExactlyInAnyOrder(
metric ->
assertThat(metric.getName())
.isEqualTo("otel.sdk.log.created"),
metric ->
assertThat(metric.getName())
.isEqualTo("otel.sdk.processor.log.processed"),
metric ->
assertThat(metric.getName())
.isEqualTo(
"otel.sdk.processor.log.queue.capacity"),
metric ->
assertThat(metric.getName())
.isEqualTo("otel.sdk.processor.log.queue.size"));
})
.anySatisfy(
scopeMetrics -> {
assertThat(scopeMetrics.getScope().getName())
.isEqualTo("io.opentelemetry.sdk.logs");
assertThat(
scopeMetrics.getMetricsList().stream()
.map(Metric::getName)
.collect(Collectors.toSet()))
.isEqualTo(
new HashSet<>(
Arrays.asList(
"otel.sdk.log.created",
"otel.sdk.processor.log.processed",
"otel.sdk.processor.log.queue.capacity",
"otel.sdk.processor.log.queue.size")));

Or maybe even something like maintaining a Map<String, Set<String>>, with keys of each expected scope and values being the set of metric names expected for that scope.

Not critical, but this test does get pretty long with so many verbose asserts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call - I added a helper too, it seems to be both much smaller and more readable

import javax.annotation.Nullable;

/** Reads the desired SDK internal telemetry version from {@link ConfigProperties}. */
final class InternalTelemetryConfiguration {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too bad we need two copies of this code. Maybe each can have a note in the javadoc referencing the other to increase the chance of keeping changes in sync?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah added the references

final class InternalTelemetryConfiguration {

@Nullable
static InternalTelemetryVersion getVersion(ConfigProperties config) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey I was thinking about internal telemetry and defaults more. Right now, we have SDK components which have different versions of internal telemetry according to their programmatic config API. For example, SdkTracerProvider / SdkLoggerProvider / SimpleSpanProcessor / SimpleLogRecordProcessor did not emit any telemetry prior to the new semantic conventions, and so they default to the latest. BatchSpanProcessor, BatchLogRecordProcessor, Otlp{Signal}{Protocol}Exporter all emit legacy by default.

That means that if otel.experimental.sdk.telemetry.version is omitted, or if otel.experimental.sdk.telemetry.version=legacy, the SDK ends up emitting split brain telemetry.

I think we need to address that here in and in #8045. My proposal:

  • Both system prop / env var config and declarative config should take measures to prevent ever emitting split brain internal telemetry
  • System prop / env var config defaults to legacy if otel.experimental.sdk.telemetry.version is unspecified.
  • Declarative config defaults to no internal telemetry if .instrumentation/development.java.otel_sdk.internal_telemetry_version is unspecified
  • Once the semconv for internal telemetry is stable, declarative config updates the default to the stable version
  • If the internal telemetry version resolves to legacy, internal telemetry is only emitted from components that emit legacy. I.e. internal telemetry is disabled for BatchSpanProcessor, BatchLogRecordProcessor, Otlp{Signal}{Protocol}Exporter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metric names are just names, I don't know if there is any actual negative user impact from having inconsistent naming conventions in legacy mode, most commonly they wouldn't notice the new metrics, or appreciate them existing vs not (and maybe it gives a carrot for users to update to latest? ;) ). Agree that it feels a bit weird but not sure it's better to hide the new metrics to avoid the split brain. Should be simple to implement if we'd like that though.

Declarative config defaults to no internal telemetry

To confirm, in the other PR we discussed it could be confusing if the default is different for the two config mechanisms but looks like we'd go for that. Does it mean we need to add InternalTelemetryVersion.DISABLED?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and maybe it gives a carrot for users to update to latest? ;)

See, I was thinking that omitting them from legacy would provide a carrot to update to latest 😂

Does it mean we need to add InternalTelemetryVersion.DISABLED?

I was imagining implementing this by calling setMeterProvider with a noop, to avoid adding InternalTelemetryVersion.DISABLED and having two ways to accomplish the same thing.

To confirm, in the other PR we discussed it could be confusing if the default is different for the two config mechanisms but looks like we'd go for that.

Yes and I agree that its confusing. The crux for me right now is whether or not we are allowed to migrate declarative config to change the default in the future when semconv stabilizes. I know you made the argument that its reasonable to do so with sufficient lead time and easy config knobs, but right now the precedent for opentelemetry-java and for the broader opentelemetry project is that such a change is breaking and only allowed in major versions. I think that declarative config is likely to be the dominant config mechanism in the long term, and so the top priority for me is making sure that it doesn't carry around the legacy internal telemetry baggage indefinitely. Can accomplish this by:

  1. Relaxing the restriction that changing the telemetry schema is breaking and only allowed in major version.
  2. Accepting inconsistent defaults between system props / env vars and declarative config, and having declarative config default to disabled for the time being.

Option 2 is a two way door. With option 1, the otel-java SIG needs to convince itself that this type of change is allowed, and we need to hope that the broader otel community doesn't come out with very explicit guidance to the contrary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See, I was thinking that omitting them from legacy would provide a carrot to update to latest

You're right, that's a much better carrot - reworked so the autoconfiguration knob now does not enable new metrics (by not setting a MeterProvider) when set to legacy, and made the default explicitly legacy instead of null.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants