Skip to content

Conversation

@kumarUjjawal
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

to_hex (used by benchmark items hex_int / hex_long) previously routed evaluation through make_scalar_function(..., vec![]), which converts scalar inputs into size‑1 arrays before execution. This adds avoidable overhead for constant folding / scalar evaluation.

What changes are included in this PR?

  • Add match-based scalar fast path for integer scalars:
    • Int8/16/32/64 and UInt8/16/32/64
  • Remove make_scalar_function(..., vec![]) usage
Type Before After Speedup
to_hex/scalar_i32 270.73 ns 86.676 ns 3.12x
to_hex/scalar_i64 254.71 ns 89.254 ns 2.85x

Are these changes tested?

Yes

Are there any user-facing changes?

No

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Feb 2, 2026
@comphead
Copy link
Contributor

comphead commented Feb 2, 2026

might be related #20044


#[inline]
fn to_hex_scalar<T: ToHex>(value: T) -> String {
let mut hex_buffer = [0u8; 16];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we prob can make it reusable buffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. My initial thinking was the scalar fast path buffer is a fixed 16‑byte stack array, so it’s already very cheap and not heap-allocating. The array path is where buffer reuse matters and we already reuse a single buffer across the whole batch there. What do you think, should we still go ahead with the buffer reuse change?

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kumarUjjawal I think this is great, however you can check if the reusable buffer can bring up more benefits

@alamb alamb added the performance Make DataFusion faster label Feb 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation performance Make DataFusion faster sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants