in_node_exporter_metrics: add timex linux collector#11718

Open
ngillen wants to merge 1 commit into fluent:master from ngillen:node_exporter-timex
Conversation

@ngillen ngillen commented Apr 15, 2026

Extend node_exporter_metrics to expose timex metrics: parse the struct timex filled in by adjtimex() and align the output with Prometheus node_exporter conventions.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features
    • Added a timex metric collector for Linux that monitors NTP/clock synchronization (offset, frequency adjustment, error estimates, status, PPS metrics, counters, TAI offset, and sync status). Disabled by default; enable via the collector.timex.scrape_interval configuration option.
    • Non-Linux builds expose a no-op placeholder so the collector symbol is available cross-platform.


coderabbitai bot commented Apr 15, 2026

📝 Walkthrough

Adds a new "timex" collector to the in_node_exporter_metrics plugin with build changes, plugin wiring, new timex metric fields, and a Linux-specific implementation that polls NTP state via adjtimex() to publish gauges and counters.
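
The polling step described in the walkthrough can be sketched as follows, assuming Linux with glibc. `read_timex_state` and `sync_status_from_state` are hypothetical helper names used only for illustration, not functions from this PR; only `adjtimex()`, `struct timex`, `TIME_OK` and `TIME_ERROR` come from `<sys/timex.h>`.

```c
#include <string.h>
#include <sys/timex.h>

/* Hypothetical helper: query the kernel clock state without changing it.
 * With modes == 0, adjtimex() only reads the state and needs no privileges.
 * Returns the clock state (TIME_OK ... TIME_ERROR) or -1 with errno set. */
int read_timex_state(struct timex *tx)
{
    memset(tx, 0, sizeof(*tx));
    tx->modes = 0;
    return adjtimex(tx);
}

/* Hypothetical helper: 1.0 unless adjtimex() reported TIME_ERROR,
 * in which case the clock is treated as unsynchronized (0.0). */
double sync_status_from_state(int state)
{
    return (state == TIME_ERROR) ? 0.0 : 1.0;
}
```

A caller would check the return value of `read_timex_state()` for -1 before passing it to `sync_status_from_state()`, since -1 signals a failed call rather than a clock state.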

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build: plugins/in_node_exporter_metrics/CMakeLists.txt | Added ne_timex.c to the plugin source list. |
| Plugin wiring: plugins/in_node_exporter_metrics/ne.c | Registered timex_collector in collector list and added collector.timex.scrape_interval config property. |
| Core headers: plugins/in_node_exporter_metrics/ne.h | Enabled timex in default metrics and extended struct flb_ne with multiple timex gauge and counter fields. |
| Collector public header: plugins/in_node_exporter_metrics/ne_timex.h | New header declaring extern struct flb_ne_collector timex_collector. |
| Collector implementation (platform abstraction): plugins/in_node_exporter_metrics/ne_timex.c | Added collector symbol; on non-Linux provides stub with NULL callbacks, on Linux includes Linux-specific implementation. |
| Collector implementation (Linux): plugins/in_node_exporter_metrics/ne_timex_linux.c | New Linux collector: configures metric handles, calls adjtimex(), maps fields to gauges/counters, computes sync status and unit scaling, and exposes init/update callbacks. |
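
The "unit scaling" mentioned for the Linux collector can be illustrated with a small sketch. The `SKETCH_*` constants and the function names below are assumptions made for this example: the names echo the `MICROSECONDS`, `NANOSECONDS` and `PPM16FRAC` identifiers that appear in the PR's code, but their actual definitions are not shown in this conversation. The values follow adjtimex(2), which describes `freq` as ppm with a 16-bit fractional part and `offset` as microseconds unless `STA_NANO` is set.

```c
#include <sys/timex.h>

/* Assumed values; the PR's own definitions are not visible here. */
#define SKETCH_MICROSECONDS 1000000.0
#define SKETCH_NANOSECONDS  1000000000.0
#define SKETCH_PPM16FRAC    (1000000.0 * 65536.0) /* ppm, 16-bit fraction */

/* Number of tx.offset units per second: nanoseconds when STA_NANO is
 * set in the status word, microseconds otherwise. */
double offset_divisor(int status)
{
    return (status & STA_NANO) ? SKETCH_NANOSECONDS : SKETCH_MICROSECONDS;
}

/* Convert the raw offset field to seconds. */
double offset_seconds(long offset, int status)
{
    return (double) offset / offset_divisor(status);
}

/* Express the raw frequency field as a ratio around 1.0. */
double frequency_ratio(long freq)
{
    return 1.0 + (double) freq / SKETCH_PPM16FRAC;
}
```

The explicit `(double)` casts keep the divisions in floating point even if a constant were later changed to an integer literal.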

Sequence Diagram

sequenceDiagram
    participant Plugin as Plugin
    participant Timex as Timex Collector
    participant Kernel as adjtimex() Syscall
    participant Metrics as CMT Metrics

    Plugin->>Timex: register timex_collector
    Plugin->>Timex: cb_init (ne_timex_init)
    Timex->>Metrics: create gauges & counters (offset, freq, errors, counters...)
    Plugin->>Timex: cb_update (ne_timex_update)
    Timex->>Kernel: adjtimex()
    Kernel-->>Timex: NTP/timex state (struct)
    Timex->>Metrics: cmt_gauge_set / cmt_counter_set (scaled values)
    Timex-->>Plugin: update complete

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

backport to v4.0.x

Poem

🐰 I hopped in code at break of day,

timex ticks now counted on the way,
adjtimex sings its timing rhyme,
gauges bloom and counters chime,
a rabbit cheers—metrics keep time!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: adding timex Linux collector to the in_node_exporter_metrics plugin. |

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f9ff4b9b5

cmt_gauge_set(ctx->timex_tick, ts, tx.tick / MICROSECONDS, 0, NULL);
cmt_gauge_set(ctx->timex_pps_freq, ts, tx.ppsfreq / PPM16FRAC, 0, NULL);
cmt_gauge_set(ctx->timex_jitter, ts, tx.jitter / divisor, 0, NULL);
cmt_gauge_set(ctx->timex_shift, ts, tx.shift, 0, NULL);
P2: Export PPS shift as seconds, not raw shift exponent

node_timex_pps_shift_seconds is currently set directly from tx.shift, but struct timex.shift is a shift value that represents an interval as a power of two. Exporting it raw yields values like 4 instead of the actual interval in seconds (16), so the metric is systematically underreported and no longer matches node_exporter-compatible semantics for this series.

Author

According to `man 2 adjtimex`, `timex.shift` is the PPS interval duration in seconds.
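
To make the disagreement in this thread concrete, here is a small sketch of the two readings of `tx.shift`. Both function names are hypothetical and neither is taken from the PR; the sketch does not settle which reading is correct, it only shows what each would export.

```c
/* Reading 1: export the field unchanged, treating it as already being
 * the value to report (the author's position in this thread). */
double pps_shift_raw(long shift)
{
    return (double) shift;
}

/* Reading 2: treat the field as a power-of-two exponent (the review's
 * position). Out-of-range values map to 0.0 to avoid an undefined shift. */
double pps_shift_pow2(long shift)
{
    if (shift < 0 || shift >= 63) {
        return 0.0;
    }
    return (double) (1ULL << shift);
}
```

For `shift = 4`, the first reading reports 4 and the second reports 16, which is exactly the gap the review comment points at.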

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_node_exporter_metrics/ne_timex_linux.c`:
- Around line 253-256: ne_timex_init currently ignores timex_configure() result
and always returns 0, which allows null metric handles later; change
ne_timex_init to check the return value of timex_configure(ctx) and return a
non-zero error code (propagate the failure) when timex_configure fails so
callers know init failed; update any callers (collector init path) to handle a
non-zero return if necessary.
- Around line 259-269: ne_timex_update currently ignores errors from
timex_update and always returns 0; change it to check the return value of
timex_update(ctx, ts) and propagate a non-zero error when timex_update (which
wraps adjtimex) fails. Specifically, call ret = timex_update(ctx, ts); if ret
indicates failure, log or report the error and return a non-zero code from
ne_timex_update instead of 0 so callers can detect collection failure. Ensure
you reference ne_timex_update and timex_update in the change.
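
The error propagation requested in the two inline comments above can be sketched as follows. Everything here is a hypothetical stand-in: `struct sketch_ctx` and the `sketch_*` functions only mimic the roles of `timex_configure`, `ne_timex_init`, `timex_update` and `ne_timex_update`, whose real signatures are not shown in this conversation.

```c
#include <stddef.h>

/* Hypothetical stand-in for the plugin context. */
struct sketch_ctx {
    void *gauge; /* metric handle; NULL until configured */
};

/* Stand-in for timex_configure(): fails when no handle is available. */
int sketch_configure(struct sketch_ctx *ctx, void *handle)
{
    if (handle == NULL) {
        return -1;
    }
    ctx->gauge = handle;
    return 0;
}

/* Stand-in for ne_timex_init(): propagate the configure failure
 * instead of always returning 0. */
int sketch_init(struct sketch_ctx *ctx, void *handle)
{
    int ret;

    ret = sketch_configure(ctx, handle);
    if (ret != 0) {
        return -1;
    }
    return 0;
}

/* Stubs that play the role of the inner update (the adjtimex() wrapper). */
int sketch_inner_ok(struct sketch_ctx *ctx)   { (void) ctx; return 0; }
int sketch_inner_fail(struct sketch_ctx *ctx) { (void) ctx; return -1; }

/* Stand-in for ne_timex_update(): refuse to run without a handle and
 * pass the inner result through to the caller. */
int sketch_update(struct sketch_ctx *ctx,
                  int (*inner_update)(struct sketch_ctx *))
{
    if (ctx->gauge == NULL) {
        return -1;
    }
    return inner_update(ctx);
}
```

Whether the plugin's collector framework acts on a non-zero return from these callbacks is a separate question that this sketch does not answer.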
📥 Commits

Reviewing files that changed from the base of the PR and between 63ed88e and 7f9ff4b.

📒 Files selected for processing (6)
  • plugins/in_node_exporter_metrics/CMakeLists.txt
  • plugins/in_node_exporter_metrics/ne.c
  • plugins/in_node_exporter_metrics/ne.h
  • plugins/in_node_exporter_metrics/ne_timex.c
  • plugins/in_node_exporter_metrics/ne_timex.h
  • plugins/in_node_exporter_metrics/ne_timex_linux.c

Comment thread plugins/in_node_exporter_metrics/ne_timex_linux.c Outdated
Comment thread plugins/in_node_exporter_metrics/ne_timex_linux.c Outdated
Extend node_exporter_metrics by exposing timex metrics by parsing struct timex
from adjtimex and aligning the output with Prometheus node_exporter conventions.

Signed-off-by: Nicolas Gillen <nicolas.gillen@agregio-solutions.com>
@ngillen ngillen force-pushed the node_exporter-timex branch from 7f9ff4b to 5a21c14 Compare April 15, 2026 21:50
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_node_exporter_metrics/ne_timex_linux.c`:
- Around line 123-126: The gauge node_timex_pps_shift_seconds is exporting the
raw exponent (tx.shift) instead of the PPS interval in seconds; change the
export so that the value reported is 2^tx.shift (e.g., use (double)(1ULL <<
tx.shift) or pow(2.0, tx.shift)) before calling cmt_gauge_set for
node_timex_pps_shift_seconds (refer to tx.shift and the export site around the
existing cmt_gauge_set on line ~242); ensure the result is a floating-point
seconds value and handle tx.shift as an unsigned integer to avoid overflow.
📥 Commits

Reviewing files that changed from the base of the PR and between 7f9ff4b and 5a21c14.

📒 Files selected for processing (6)
  • plugins/in_node_exporter_metrics/CMakeLists.txt
  • plugins/in_node_exporter_metrics/ne.c
  • plugins/in_node_exporter_metrics/ne.h
  • plugins/in_node_exporter_metrics/ne_timex.c
  • plugins/in_node_exporter_metrics/ne_timex.h
  • plugins/in_node_exporter_metrics/ne_timex_linux.c
✅ Files skipped from review due to trivial changes (2)
  • plugins/in_node_exporter_metrics/CMakeLists.txt
  • plugins/in_node_exporter_metrics/ne_timex.h
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/in_node_exporter_metrics/ne.c
  • plugins/in_node_exporter_metrics/ne_timex.c

Comment on lines +123 to +126
/* node_timex_pps_shift_seconds */
g = cmt_gauge_create(ctx->cmt, "node", "timex", "pps_shift_seconds",
"Pulse per second interval duration.",
0, NULL);
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Upstream node_exporter timex shift handling:"
curl -fsSL https://raw.githubusercontent.com/prometheus/node_exporter/master/collector/timex_linux.go \
  | rg -n "pps_shift_seconds|Shift|Pow|Exp2"

echo
echo "Linux adjtimex(2) docs for 'shift' semantics:"
curl -fsSL https://man7.org/linux/man-pages/man2/adjtimex.2.html \
  | rg -n "shift|interval"

Repository: fluent/fluent-bit

Length of output: 171


🏁 Script executed:

# First, locate and examine the actual file
fd -t f "ne_timex_linux.c" --exec wc -l {} \;

Repository: fluent/fluent-bit

Length of output: 116


🏁 Script executed:

# Read the specific file to verify the referenced code
cat -n plugins/in_node_exporter_metrics/ne_timex_linux.c | sed -n '120,130p'

Repository: fluent/fluent-bit

Length of output: 447


🏁 Script executed:

# Check the actual metric update code around line 242
cat -n plugins/in_node_exporter_metrics/ne_timex_linux.c | sed -n '235,250p'

Repository: fluent/fluent-bit

Length of output: 1145


🏁 Script executed:

# Search for all references to timex_shift in the codebase
rg "timex_shift" --type c --type h -B 2 -A 2

Repository: fluent/fluent-bit

Length of output: 1387


🏁 Script executed:

# Check if there's any local documentation or comments about timex shift semantics
rg -i "shift.*exponent|shift.*2\^|shift.*power" plugins/in_node_exporter_metrics/

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Look at the full timex_update function to understand context
cat -n plugins/in_node_exporter_metrics/ne_timex_linux.c | sed -n '200,250p'

Repository: fluent/fluent-bit

Length of output: 2113


🏁 Script executed:

# Check if there are any constants or macros related to shift conversion
rg "SHIFT|shift" plugins/in_node_exporter_metrics/ne_timex_linux.c

Repository: fluent/fluent-bit

Length of output: 258


🏁 Script executed:

# Try alternative web search using different approach for node_exporter verification
curl -fsSL https://raw.githubusercontent.com/prometheus/node_exporter/main/collector/timex_linux.go 2>/dev/null | head -200 || echo "Fallback: Checking local Go docs reference"

Repository: fluent/fluent-bit

Length of output: 43


node_timex_pps_shift_seconds exports the raw exponent instead of converted seconds.

The metric name and description indicate this should be the PPS interval duration in seconds. However, tx.shift is a small integer exponent (0-31) where the actual interval is 2^shift seconds. Line 242 exports the raw value without conversion, unlike other timex metrics (e.g., tx.tick / MICROSECONDS, tx.stabil / PPM16FRAC) which apply appropriate conversions.

For example, tx.shift = 4 should represent 16 seconds (2^4), not the value 4.

Suggested fix
 static int timex_update(struct flb_ne *ctx, uint64_t ts)
 {
     struct timex tx = {};
     int ret = 0;

     double sync_status = 0;
     double divisor = 0;
+    double pps_shift_seconds = 0;
     
     ret = adjtimex(&tx);
     
     if (ret == -1)
     {
         flb_plg_error(ctx->ins, "error on  adjtimex: error: %d, %s", errno, strerror(errno));
         return -1;
     }
     
     if (ret == TIME_ERROR)
     {
         sync_status = 0;
     }
     else
     {
         sync_status = 1;
     }
     
     if (tx.status & STA_NANO)
     {
         divisor = NANOSECONDS;
     }
     else
     {
         divisor = MICROSECONDS;
     }
     
     cmt_gauge_set(ctx->timex_sync_status, ts, sync_status, 0, NULL);
     cmt_gauge_set(ctx->timex_offset, ts, tx.offset / divisor, 0, NULL);
     cmt_gauge_set(ctx->timex_freq, ts, 1.0 + tx.freq / PPM16FRAC, 0, NULL);
     cmt_gauge_set(ctx->timex_maxerror, ts, tx.maxerror / MICROSECONDS, 0, NULL);
     cmt_gauge_set(ctx->timex_esterror, ts, tx.esterror / MICROSECONDS, 0, NULL);
     cmt_gauge_set(ctx->timex_status, ts, tx.status, 0, NULL);
     cmt_gauge_set(ctx->timex_constant, ts, tx.constant, 0, NULL);
     cmt_gauge_set(ctx->timex_tick, ts, tx.tick / MICROSECONDS, 0, NULL);
     cmt_gauge_set(ctx->timex_pps_freq, ts, tx.ppsfreq / PPM16FRAC, 0, NULL);
     cmt_gauge_set(ctx->timex_jitter, ts, tx.jitter / divisor, 0, NULL);
-    cmt_gauge_set(ctx->timex_shift, ts, tx.shift, 0, NULL);
+    if (tx.shift >= 0 && tx.shift < 63) {
+        pps_shift_seconds = (double) (1ULL << tx.shift);
+    }
+    cmt_gauge_set(ctx->timex_shift, ts, pps_shift_seconds, 0, NULL);
     cmt_gauge_set(ctx->timex_stabil, ts, tx.stabil / PPM16FRAC, 0, NULL);