Cleanup / overhaul StringViewArrayBuilder and related types #21684

@neilconway

Is your feature request related to a problem or challenge?

StringViewArrayBuilder / StringArrayBuilder / LargeStringArrayBuilder have a fairly special-purpose API; they are basically only well-suited to use by concat and concat_ws, which are indeed the only places that use them today. That makes it particularly odd that they are pub, not pub(crate).

In addition, there are a bunch of places where we use Arrow's StringBuilder / StringViewBuilder that would benefit from a better API: by passing the NULL bitmap to finish(), we can avoid the overhead of constructing the NULL bitmap iteratively.

Describe the solution you'd like

  1. Rename StringViewArrayBuilder -> ConcatStringViewBuilder, and likewise for the other two types.
  2. Make the Concat[...]Builder types pub(crate)
  3. Add a new family of specialized builders (StringArrayBuilder, LargeStringArrayBuilder, StringViewArrayBuilder) with an API along the lines of with_capacity, append_value(&str), append_placeholder(), and finish(Option<NullBuffer>). These should be pub(crate).
  4. Switch most of the places where we use Arrow's StringBuilder to use one of the new bulk-NULL builders. This should cover call-sites like initcap, reverse, translate, substrindex, replace, lpad/rpad, repeat, min_max_bytes, first_last/state, concat_elements_utf8view, lower, and upper.
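
To make the proposed API shape concrete, here is a rough, std-only sketch of how such a builder might behave. Everything in it is hypothetical: ToyStringArray, ToyStringArrayBuilder, and upper_with_bulk_nulls are illustrative names, and a Vec<bool> stands in for Arrow's NullBuffer. It is not the actual DataFusion or Arrow implementation, only a model of the "append placeholders, attach the NULL bitmap once at finish()" idea.

```rust
use std::convert::TryFrom;

/// Toy stand-in for an Arrow string array: contiguous bytes, offsets,
/// and an optional validity mask (hypothetical, not an Arrow type).
#[derive(Debug, PartialEq)]
pub struct ToyStringArray {
    values: Vec<u8>,
    offsets: Vec<i32>,
    /// `None` means "no NULLs"; otherwise one flag per row (true = valid).
    validity: Option<Vec<bool>>,
}

impl ToyStringArray {
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    pub fn get(&self, i: usize) -> Option<&str> {
        if let Some(v) = &self.validity {
            if !v[i] {
                return None;
            }
        }
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        Some(std::str::from_utf8(&self.values[start..end]).expect("only valid UTF-8 is appended"))
    }
}

/// Hypothetical builder that never tracks NULLs row by row.
pub struct ToyStringArrayBuilder {
    values: Vec<u8>,
    offsets: Vec<i32>,
}

impl ToyStringArrayBuilder {
    pub fn with_capacity(item_capacity: usize, data_capacity: usize) -> Self {
        let mut offsets = Vec::with_capacity(item_capacity + 1);
        offsets.push(0);
        Self { values: Vec::with_capacity(data_capacity), offsets }
    }

    /// Append a non-NULL value.
    pub fn append_value(&mut self, value: &str) {
        self.values.extend_from_slice(value.as_bytes());
        let end = i32::try_from(self.values.len()).expect("offset overflow");
        self.offsets.push(end);
    }

    /// Append an empty slot; whether it is NULL is decided by the mask passed to `finish`.
    pub fn append_placeholder(&mut self) {
        let last = *self.offsets.last().expect("offsets always starts with 0");
        self.offsets.push(last);
    }

    /// Attach the validity mask in one step instead of building it incrementally.
    pub fn finish(self, nulls: Option<Vec<bool>>) -> ToyStringArray {
        if let Some(n) = &nulls {
            assert_eq!(n.len(), self.offsets.len() - 1, "mask length must match row count");
        }
        ToyStringArray { values: self.values, offsets: self.offsets, validity: nulls }
    }
}

/// Example call-site in the style of `upper`: the output has the same NULLs as the input,
/// so the input's mask is reused wholesale.
pub fn upper_with_bulk_nulls(input: &[Option<&str>]) -> ToyStringArray {
    // In this toy model the "input bitmap" is derived here; a real array would already carry one.
    let validity: Vec<bool> = input.iter().map(|v| v.is_some()).collect();
    let mut builder = ToyStringArrayBuilder::with_capacity(input.len(), 0);
    for v in input {
        match v {
            Some(s) => builder.append_value(&s.to_uppercase()),
            None => builder.append_placeholder(),
        }
    }
    builder.finish(Some(validity))
}
```

In this model, a function whose output NULLs mirror its input NULLs does no per-row bitmap work: the loop only appends values or placeholders, and the mask is handed over once at the end.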

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement