out_opentelemetry: add metadata token authentication#11721

Open
evgfitil wants to merge 1 commit into fluent:master from evgfitil:feat/otel-metadata-token-auth

@evgfitil evgfitil commented Apr 16, 2026

Closes #11675

The OTel output plugin supports OAuth2 client-credentials authentication but has no way to fetch Bearer tokens from cloud instance metadata endpoints - the standard mechanism on platforms where a VM has a linked service account.

Today the only options are a sidecar proxy or a custom build. The out_stackdriver plugin already solves this for GCP via gce_metadata.c; this PR brings the same capability to the OTel plugin in a cloud-neutral way - no vendor-specific naming, no new dependencies.

New configuration options

| Key | Description | Default |
| --- | --- | --- |
| metadata_token_url | HTTP URL of the metadata token endpoint. Presence enables the feature. | (none) |
| metadata_token_header | Extra header for the token request (e.g. Metadata-Flavor: Google) | (none) |
| metadata_token_refresh | Token refresh interval in seconds | 3600 |
| metadata_token_scope | Optional scope appended as ?scopes=<value> query parameter | (none) |
| metadata_token_audience | Optional audience appended as ?audience=<value> query parameter | (none) |

Token caching, periodic refresh, and 401 retry reuse the existing oauth2 code paths, so no new token injection mechanism is introduced.
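To illustrate how the scope and audience options could end up in the request path, here is a minimal sketch in plain C. The helper name `append_query_param` and the fixed-size buffer are assumptions made for illustration; the PR itself builds the path with `flb_sds_cat`, and this sketch does not reproduce that code. It picks `?` or `&` depending on whether the URL already carries a query string, and skips empty values, which matches the behaviour the test names (`..._url_with_existing_query`, `..._empty_scope_ignored`) suggest.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the plugin's code): append "key=value" to a
 * NUL-terminated path held in a buffer of `size` bytes.
 * - uses '?' when the path has no query string yet, '&' otherwise
 * - skips NULL or empty values
 * - assumes `value` is already safe to place in a URL
 * Returns 0 on success, -1 if the result would not fit.
 */
static int append_query_param(char *path, size_t size,
                              const char *key, const char *value)
{
    size_t len;
    int n;

    if (value == NULL || value[0] == '\0') {
        return 0;                       /* nothing to append */
    }

    len = strlen(path);
    if (len >= size) {
        return -1;
    }

    n = snprintf(path + len, size - len, "%c%s=%s",
                 strchr(path, '?') ? '&' : '?', key, value);
    if (n < 0 || (size_t) n >= size - len) {
        path[len] = '\0';               /* undo the partial write */
        return -1;
    }

    return 0;
}
```

Calling it first with `scopes` and then with `audience` on `/token` would yield `/token?scopes=...&audience=...`.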


Testing

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption
Config and logs - token fetch with metadata_token_scope
[SERVICE]
    flush     5
    log_level debug
    daemon    off

[INPUT]
    Name  dummy
    Dummy {"message": "gcp-scope-test"}
    Rate  1
    Tag   test.logs

[OUTPUT]
    Name                     opentelemetry
    Match                    *
    Host                     127.0.0.1
    Port                     4318
    Logs_uri                 /v1/logs
    metadata_token_url       http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token
    metadata_token_header    Metadata-Flavor: Google
    metadata_token_refresh   3600
    metadata_token_scope     https://www.googleapis.com/auth/cloud-platform
[debug] metadata: token refreshed, expires in 2825 seconds
[debug] metadata: token still valid, skipping refresh
[debug] metadata: token still valid, skipping refresh
[debug] metadata: token still valid, skipping refresh
[info] [engine] service has stopped (0 pending tasks)
Config and logs - token fetch without scope
[OUTPUT]
    Name                     opentelemetry
    Match                    *
    Host                     127.0.0.1
    Port                     4318
    Logs_uri                 /v1/logs
    metadata_token_url       http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token
    metadata_token_header    Metadata-Flavor: Google
    metadata_token_refresh   3600
[debug] metadata: token refreshed, expires in 2612 seconds
[debug] metadata: token still valid, skipping refresh
[debug] metadata: token still valid, skipping refresh
[info] [engine] service has stopped (0 pending tasks)
Config and logs - token refresh cycle (metadata_token_refresh=90)
[debug] metadata: token refreshed, expires in 90 seconds
[debug] metadata: token still valid, skipping refresh
...
# refresh #2
[debug] metadata: token refreshed, expires in 90 seconds
...
# refresh #3
[debug] metadata: token refreshed, expires in 90 seconds
...
# refresh #4
[debug] metadata: token refreshed, expires in 90 seconds
[info] [engine] service has stopped (0 pending tasks)
Valgrind - 25 tests, 0 leaks, 0 errors

Build: cmake -DFLB_DEV=On -DFLB_VALGRIND=On -DFLB_TESTS_RUNTIME=On

==74627== Command: ./bin/flb-rt-out_opentelemetry
==74627==
Test default_config...                          [ OK ]
Test metadata_token_url_sets_context...         [ OK ]
Test metadata_token_default_refresh...          [ OK ]
Test metadata_token_custom_refresh...           [ OK ]
Test metadata_token_mutual_exclusion...         [ OK ]
Test metadata_token_https_rejected...           [ OK ]
Test metadata_token_low_refresh_rejected...     [ OK ]
Test no_metadata_token_backward_compat...       [ OK ]
Test metadata_token_fetch_on_first_flush...     [ OK ]
Test metadata_token_refresh_on_expiry...        [ OK ]
Test metadata_token_custom_header...            [ OK ]
Test metadata_token_fetch_failure...            [ OK ]
Test metadata_token_legacy_post...              [ OK ]
Test metadata_token_401_recovery...             [ OK ]
Test metadata_token_refresh_interval_override...[ OK ]
Test metadata_token_missing_expires_in...       [ OK ]
Test metadata_token_short_expires_in...         [ OK ]
Test metadata_token_scope_query_param...        [ OK ]
Test metadata_token_audience_query_param...     [ OK ]
Test metadata_token_both_query_params...        [ OK ]
Test metadata_token_scope_without_url_ignored...[ OK ]
Test metadata_token_scope_url_with_existing_query...[ OK ]
Test metadata_token_audience_url_with_existing_query...[ OK ]
Test metadata_token_empty_scope_ignored...      [ OK ]
Test metadata_token_empty_audience_ignored...   [ OK ]
SUCCESS: All unit tests have passed.
==74627==
==74627== HEAP SUMMARY:
==74627==     in use at exit: 0 bytes in 0 blocks
==74627==   total heap usage: 85,100 allocs, 85,100 frees, 27,321,304 bytes allocated
==74627==
==74627== All heap blocks were freed -- no leaks are possible
==74627==
==74627== ERROR SUMMARY: 0 errors from 0 contexts
  • [N/A] Run local packaging test
  • [N/A] Set ok-package-test label (requires maintainer)

Documentation

  • Documentation required for this feature

Backporting

  • [N/A] Backport to latest stable release

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added metadata token authentication support to the OpenTelemetry output plugin with automatic token refresh capabilities.
    • Introduced five new configuration options: metadata_token_url, metadata_token_header, metadata_token_refresh, metadata_token_scope, and metadata_token_audience.
  • Tests

    • Added 24 new runtime tests for metadata token authentication scenarios and backward compatibility validation.

Signed-off-by: Evgenii Akhmetzianov <evgfitil@gmail.com>

coderabbitai bot commented Apr 16, 2026

Walkthrough

This pull request adds cloud metadata endpoint token authentication to the OpenTelemetry output plugin. It introduces new configuration options (metadata_token_url, metadata_token_header, metadata_token_refresh, metadata_token_scope, metadata_token_audience) that enable fetching IAM bearer tokens via HTTP GET from a metadata endpoint, parsing the JSON response, and reusing the existing OAuth2 infrastructure for token caching and refresh. The implementation includes thread-safe token management, 401 recovery logic, and comprehensive runtime tests covering token fetch timing, expiry handling, and query parameter construction.
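The TTL computation mentioned here is not shown in the conversation, so the following is only a sketch of one rule that would be consistent with the debug logs in the PR description (a 3600 s refresh interval with "expires in 2825 seconds", and a 90 s interval with "expires in 90 seconds"). The function names, the fallback for a missing `expires_in`, and the skew parameter are assumptions, not the plugin's actual logic.

```c
#include <time.h>

/*
 * Assumed rule: the cached token lives for the smaller of the
 * endpoint-reported lifetime and the configured refresh interval.
 * A missing or non-positive expires_in falls back to the interval.
 */
static long effective_ttl(long expires_in, long refresh_interval)
{
    if (expires_in <= 0) {
        return refresh_interval;
    }
    return expires_in < refresh_interval ? expires_in : refresh_interval;
}

/*
 * Assumed validity check: treat the token as expired `skew` seconds
 * early so a request in flight does not carry a token that is about
 * to lapse.
 */
static int token_is_valid(time_t now, time_t expires_at, long skew)
{
    return now + skew < expires_at;
}
```

Under this rule, `effective_ttl(2825, 3600)` gives 2825 and `effective_ttl(3599, 90)` gives 90, matching the logged values.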

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Build Configuration: plugins/out_opentelemetry/CMakeLists.txt, tests/runtime/CMakeLists.txt | Added opentelemetry_metadata.c source file to plugin build and registered new runtime test executable for OpenTelemetry output. |
| Metadata Token Core Implementation: plugins/out_opentelemetry/opentelemetry_metadata.c, plugins/out_opentelemetry/opentelemetry_metadata.h | New module implementing metadata token lifecycle: JSON parsing of token responses (extracting access_token and expires_in), HTTP GET request construction with optional query parameters (scope, audience) and custom headers, token refresh with TTL computation and mutex-protected atomic OAuth2 context updates, and cleanup routines. |
| Plugin Header Extensions: plugins/out_opentelemetry/opentelemetry.h | Added metadata token configuration fields, mutex for thread safety, upstream connection handle, and HTTP client initialization tracking to struct opentelemetry_context. |
| Configuration & Context Management: plugins/out_opentelemetry/opentelemetry_conf.c | Integrated metadata token setup with validation (enforcing mutual exclusivity with explicit OAuth2 config), HTTP client initialization tracking, and cleanup during context destruction. |
| Plugin Core Logic: plugins/out_opentelemetry/opentelemetry.c | Added metadata token injection into HTTP POST requests via Bearer token, 401 handling with token invalidation/retry, pre-dispatch token refresh during flush, and five new public configuration parameters. Includes mutex-protected token acquisition for HTTP/2 mode. |
| Runtime Test Suite: tests/runtime/out_opentelemetry.c | Comprehensive test module (25 test functions) validating metadata token URL validation, mutual exclusivity with OAuth2, token fetch timing, expiry handling, custom headers, query parameter construction (scope/audience), legacy HTTP POST transport, and 401 recovery. |

Sequence Diagrams

sequenceDiagram
    participant Flush as Flush Operation
    participant Metadata as Metadata Token Module
    participant HTTP as HTTP Client
    participant Endpoint as Metadata Endpoint
    participant OAuth2 as OAuth2 Context
    participant Request as OTLP Request

    Flush->>Metadata: flb_otel_metadata_token_refresh()
    Metadata->>OAuth2: Lock oauth2_mutex
    alt Token Valid (within skew)
        Metadata->>OAuth2: Unlock, skip refresh
    else Token Expired
        Metadata->>OAuth2: Unlock temporarily
        Metadata->>HTTP: GET metadata_token_url
        HTTP->>Endpoint: HTTP GET request
        Endpoint-->>HTTP: 200 + {access_token, expires_in}
        HTTP-->>Metadata: Response body
        Metadata->>Metadata: Parse JSON, compute TTL
        Metadata->>OAuth2: Lock oauth2_mutex
        Metadata->>OAuth2: Update access_token, expires_at
        Metadata->>OAuth2: Unlock
    end
    Flush->>Request: Dispatch OTLP data
    Request->>Request: Inject Bearer token from OAuth2
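
The locking pattern in this first diagram (check under the lock, release it for the network call, re-acquire it to publish the result) can be sketched with POSIX threads. Everything below is illustrative: `struct token_cache`, `refresh_if_needed`, and `fake_fetch` are hypothetical names, and `fake_fetch` merely stands in for the HTTP GET against the metadata endpoint.

```c
#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical token cache guarded by a mutex. */
struct token_cache {
    pthread_mutex_t lock;
    char token[256];
    time_t expires_at;
};

/* Fetch callback: fills buf and ttl, returns 0 on success. */
typedef int (*fetch_fn)(char *buf, size_t size, long *ttl);

/* Stand-in for the metadata request; returns tok-1, tok-2, ... */
static int fake_fetch(char *buf, size_t size, long *ttl)
{
    static int calls = 0;

    snprintf(buf, size, "tok-%d", ++calls);
    *ttl = 90;
    return 0;
}

/* Returns 0 if the cached token was reused, 1 if refreshed, -1 on error. */
static int refresh_if_needed(struct token_cache *c, fetch_fn fetch, time_t now)
{
    char fresh[256];
    long ttl = 0;
    int valid;

    /* 1. Inspect the cache under the lock. */
    pthread_mutex_lock(&c->lock);
    valid = c->token[0] != '\0' && now < c->expires_at;
    pthread_mutex_unlock(&c->lock);
    if (valid) {
        return 0;
    }

    /* 2. Do the slow fetch without holding the lock. */
    if (fetch(fresh, sizeof(fresh), &ttl) != 0) {
        return -1;
    }

    /* 3. Publish token and expiry together under the lock. */
    pthread_mutex_lock(&c->lock);
    snprintf(c->token, sizeof(c->token), "%s", fresh);
    c->expires_at = now + ttl;
    pthread_mutex_unlock(&c->lock);
    return 1;
}
```

One trade-off of releasing the lock during the fetch is that two threads may both see an expired token and both fetch; the last writer wins, which is harmless here but costs an extra request.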
sequenceDiagram
    participant Plugin as Plugin
    participant HTTP as HTTP Client
    participant Metadata as Metadata Module
    participant Server as OTLP Server

    Plugin->>Metadata: Token acquire (HTTP/2 path)
    Metadata->>Metadata: Lock metadata_mutex
    Metadata->>HTTP: Fetch access_token from OAuth2
    Metadata->>Metadata: Copy token to owned SDS
    Metadata->>Metadata: Unlock metadata_mutex
    Metadata-->>Plugin: Bearer token
    Plugin->>HTTP: Construct request + Bearer header
    HTTP->>Server: POST with Authorization header
    Server-->>HTTP: 200 OK
    HTTP-->>Plugin: Success
    Plugin->>Metadata: Free copied token
sequenceDiagram
    participant OTLP as OTLP Request
    participant Server as OTLP Server
    participant Plugin as Plugin
    participant Metadata as Metadata Module

    OTLP->>Server: POST with Bearer token
    Server-->>OTLP: 401 Unauthorized
    OTLP->>Plugin: Handle 401 response
    Plugin->>Plugin: Invalidate oauth2_ctx->access_token
    Plugin->>Metadata: flb_otel_metadata_token_refresh()
    Metadata->>Metadata: Fetch fresh token from metadata endpoint
    Metadata-->>Plugin: New Bearer token
    Plugin->>Server: Retry POST with new Bearer token
    Server-->>Plugin: 200 OK
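
The 401 recovery flow in this last diagram amounts to a single bounded retry. The sketch below shows that control flow only; `send_with_401_retry` and the two callbacks are hypothetical stand-ins for the plugin's POST and for `flb_otel_metadata_token_refresh()`, whose real signature is not shown in this conversation.

```c
#include <stddef.h>
#include <string.h>

typedef int (*send_fn)(const char *token);     /* returns an HTTP status */
typedef const char *(*refresh_fn)(void);       /* fresh token, or NULL */

static int refresh_calls = 0;

/* Stand-in server: only accepts the token "fresh". */
static int fake_send(const char *token)
{
    return strcmp(token, "fresh") == 0 ? 200 : 401;
}

/* Stand-in refresh: counts calls and hands back a working token. */
static const char *fake_refresh(void)
{
    refresh_calls++;
    return "fresh";
}

/*
 * Send once; on 401, fetch a new token and retry exactly once.
 * Any other status, or a failed refresh, is returned unchanged so
 * the caller's normal retry policy still applies.
 */
static int send_with_401_retry(const char *token,
                               send_fn do_send, refresh_fn do_refresh)
{
    int status = do_send(token);

    if (status != 401) {
        return status;
    }

    token = do_refresh();
    if (token == NULL) {
        return status;
    }

    return do_send(token);
}
```

Limiting the retry to one attempt avoids hammering the metadata endpoint when the server rejects even a freshly issued token.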

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

docs-required

Suggested reviewers

  • cosmo0920
  • patrick-stephens

Poem

🐰 A metadata token hops into view,
Cloud credentials fetched, oh what's new!
With mutex locks and 401 retries,
Beneath Bearer headers, the token flies—
Bearer auth blooms where OAuth2 lies! 🌸

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 13.73% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'out_opentelemetry: add metadata token authentication' directly matches the main change: adding metadata token authentication support to the out_opentelemetry plugin. |
| Linked Issues check | ✅ Passed | All code requirements from issue #11675 are met: metadata_token_url/header/refresh configuration options, HTTP GET token fetching with JSON parsing, oauth2 context reuse, and cloud-neutral design without new dependencies. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to metadata token authentication: new config parameters, token lifecycle functions, supporting structures in headers, HTTP client initialization tracking, and comprehensive test coverage with 25 test cases. |


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
tests/runtime/CMakeLists.txt (1)

263-263: Consider adding port serialization for this test.

The test file uses several mock server ports (18901, 18902, 18903, 18904). While these ports don't conflict with the existing serialized tests (2020, 4318, 5170), if other tests start using similar high ports, there could be conflicts. Consider adding a flb_runtime_lock_tests entry if port conflicts arise in CI.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/runtime/CMakeLists.txt` at line 263, The test registered with
FLB_RT_TEST for FLB_OUT_OPENTELEMETRY uses hardcoded mock ports and should be
serialized to avoid CI port conflicts; update the test registration by either
replacing FLB_RT_TEST(FLB_OUT_OPENTELEMETRY "out_opentelemetry.c") with the
serialized variant (e.g., FLB_RT_TEST_SERIALIZED or your repo's equivalent) or
add FLB_OUT_OPENTELEMETRY to the flb_runtime_lock_tests list so the runtime test
harness runs it under the test lock/serialization mechanism.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ac41ace1-7fb5-4569-8888-35916cd02409

📥 Commits

Reviewing files that changed from the base of the PR and between 63ed88e and 50d4814.

📒 Files selected for processing (8)
  • plugins/out_opentelemetry/CMakeLists.txt
  • plugins/out_opentelemetry/opentelemetry.c
  • plugins/out_opentelemetry/opentelemetry.h
  • plugins/out_opentelemetry/opentelemetry_conf.c
  • plugins/out_opentelemetry/opentelemetry_metadata.c
  • plugins/out_opentelemetry/opentelemetry_metadata.h
  • tests/runtime/CMakeLists.txt
  • tests/runtime/out_opentelemetry.c


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50d4814712


Comment on lines +324 to +326
sep = strstr(ctx->metadata_token_header, ": ");
if (sep) {
    name_len = (size_t)(sep - ctx->metadata_token_header);

P2: Accept header syntax without mandatory space after ':'

The metadata header parser only accepts values containing the exact ": " separator, so a valid header like Metadata-Flavor:Google is silently dropped. For metadata endpoints that require this header (for example, GCP), token fetches will fail and every flush will retry even though the user supplied a syntactically valid HTTP header. Split on ':' and trim optional whitespace instead of requiring a literal colon-space sequence.
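
A minimal sketch of the suggested fix, splitting on the first `:` and trimming optional whitespace, might look like the following. `split_header` is a hypothetical standalone helper that uses fixed-size output buffers; it is not the plugin's parser, which works on `ctx->metadata_token_header` directly.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: split "Name: value" or "Name:value" into its
 * two parts. Whitespace around the value (and before the colon) is
 * trimmed. Returns 0 on success, -1 on a missing colon, an empty
 * name, or an output buffer that is too small.
 */
static int split_header(const char *line,
                        char *name, size_t name_size,
                        char *value, size_t value_size)
{
    const char *sep = strchr(line, ':');
    const char *start;
    const char *end;
    size_t name_len;
    size_t value_len;

    if (sep == NULL) {
        return -1;
    }

    /* Name: everything before the colon, minus trailing whitespace. */
    name_len = (size_t) (sep - line);
    while (name_len > 0 && isspace((unsigned char) line[name_len - 1])) {
        name_len--;
    }
    if (name_len == 0 || name_len >= name_size) {
        return -1;
    }

    /* Value: everything after the colon, trimmed on both sides. */
    start = sep + 1;
    while (*start != '\0' && isspace((unsigned char) *start)) {
        start++;
    }
    end = start + strlen(start);
    while (end > start && isspace((unsigned char) end[-1])) {
        end--;
    }
    value_len = (size_t) (end - start);
    if (value_len >= value_size) {
        return -1;
    }

    memcpy(name, line, name_len);
    name[name_len] = '\0';
    memcpy(value, start, value_len);
    value[value_len] = '\0';
    return 0;
}
```

With this approach both `Metadata-Flavor: Google` and `Metadata-Flavor:Google` produce the same name and value.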


Comment on lines +198 to +200
tmp = flb_sds_cat(ctx->metadata_token_path,
                  ctx->metadata_token_scope,
                  strlen(ctx->metadata_token_scope));

P2: URL-encode metadata query parameters before appending

The scope/audience values are concatenated into the request URI without percent-encoding. If a value contains reserved characters (e.g., spaces in multi-scope strings, &, or =), the generated query string is malformed or semantically altered, which can break token retrieval or send unintended parameters. Encode user-provided query values before appending them to metadata_token_path.
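
As a rough illustration of the encoding step being asked for, the sketch below percent-encodes every byte outside the RFC 3986 unreserved set (letters, digits, `-`, `.`, `_`, `~`). `percent_encode` is a hypothetical helper written for this example; Fluent Bit may already ship a suitable encoder, and that would be the better choice in the actual patch.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: percent-encode `in` into `out` (capacity
 * `out_size`, including the terminating NUL). Bytes in the RFC 3986
 * unreserved set are copied; everything else becomes %XX.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int percent_encode(const char *in, char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;
    unsigned char c;

    if (out_size == 0) {
        return -1;
    }

    for (; *in != '\0'; in++) {
        c = (unsigned char) *in;
        if (isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            if (o + 1 >= out_size) {
                return -1;              /* no room for byte + NUL */
            }
            out[o++] = (char) c;
        }
        else {
            if (o + 3 >= out_size) {
                return -1;              /* no room for %XX + NUL */
            }
            out[o++] = '%';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        }
    }

    out[o] = '\0';
    return 0;
}
```

Encoding everything outside the unreserved set is deliberately conservative: characters such as `/` and `:` are legal inside a query string, but escaping them is always safe for a standards-compliant decoder and keeps `&` and `=` in user input from altering the parameter list.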

edsiper (Member) commented Apr 17, 2026

Hi @evgfitil, thanks for contributing this.

The feature makes sense, but metadata_token_* does not look like the right public API.

Those names describe the implementation detail instead of the auth model. This is still OAuth2/Bearer auth with a different token source, so it would be cleaner to keep it under oauth2.* and model metadata as a source/mode instead of adding a second flat auth namespace.

A cleaner shape would be something like:

  • oauth2.token_source metadata
  • oauth2.metadata_url
  • oauth2.metadata_header

and then reuse the existing generic fields for the rest, such as oauth2.scope, oauth2.audience, and the refresh setting.

Development

Successfully merging this pull request may close these issues.

out_opentelemetry: add cloud metadata endpoint token authentication

2 participants