Skip to content

[refine](column) replace field-based nested default with column-level operation in ColumnVariant#62547

Open
Mryange wants to merge 2 commits intoapache:masterfrom
Mryange:refine-variant-dev-4.16
Open

[refine](column) replace field-based nested default with column-level operation in ColumnVariant#62547
Mryange wants to merge 2 commits intoapache:masterfrom
Mryange:refine-variant-dev-4.16

Conversation

@Mryange
Copy link
Copy Markdown
Contributor

@Mryange Mryange commented Apr 16, 2026

What problem does this PR solve?

Issue Number: N/A

Problem Summary:

ColumnVariant::try_insert_default_from_nested was filling missing nested subcolumn rows by
converting the last row of a sibling leaf column to a Field, recursively walking the Field
tree via FieldVisitorReplaceScalars to replace scalar values with the type default, and then
inserting the resulting Field back into the column.

This approach was unnecessarily expensive (column → Field → Field tree traversal → column) and
inconsistent with the batch counterpart try_insert_many_defaults_from_nested, which operates
entirely at the column level using cut + clone_with_default_values.

The scalar visitor (FieldVisitorReplaceScalars) existed as duplicate dead code in both
column_variant.cpp and variant_util.cpp.

Changes

  • try_insert_default_from_nested: rewrite to use the same column-level approach as
    try_insert_many_defaults_from_nestedleaf->data.cut(leaf_size - 1, 1).clone_with_default_values(field_info) — eliminating the Field round-trip and the dependency on get_default() / FieldVisitorReplaceScalars.
  • Remove FieldVisitorReplaceScalars from both column_variant.cpp and variant_util.cpp; neither file has any remaining callers.
  • Remove field_visitor unit test in column_variant_test.cpp that directly tested the now-deleted visitor class.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

/review

@github-actions
Copy link
Copy Markdown
Contributor

OpenCode automated review failed and did not complete.

Error: Review step was skipped (possibly timeout or cancelled)
Workflow run: https://github.com/apache/doris/actions/runs/24491092035

Please inspect the workflow logs and rerun the review after the underlying issue is resolved.

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a blocking behavioral regression in ColumnVariant::try_insert_default_from_nested.

Critical checkpoints:

  • Goal / correctness: The PR aims to refactor nested default synthesis to a column-level implementation. It does not fully preserve the previous behavior: missing nested siblings now use a synthesized row even when the donor sibling row is NULL, which changes the inserted value from the old insert_default() fallback to a copied/synthesized nested row. No test in this PR proves the null-row case still behaves correctly.
  • Change size / focus: The code deletion is small and focused, but it also removed the only direct coverage of the old transformation logic without replacing it with assertions for the changed path.
  • Concurrency: I did not find new concurrency or lock-order issues in the touched code.
  • Lifecycle / config / compatibility: No lifecycle, config, or wire/storage compatibility changes here.
  • Parallel paths: The batch helper try_insert_many_defaults_from_nested() already used the column-level clone path, but the single-row helper previously had an extra null guard. This PR made the two paths look similar by dropping that guard, which is the source of the regression.
  • Conditional checks: The removed last_field.is_null() check was behaviorally significant; removing it is not a mechanical cleanup.
  • Test coverage: Existing tests in column_variant_test.cpp do not assert the actual inserted row shape/nullability for try_insert_default_from_nested, and this PR deletes the visitor-only test without adding a replacement for the null-donor case.
  • Test results / observability / transactionality / FE-BE propagation: Not applicable for this change.
  • Performance: The new path is plausibly cheaper, but correctness regressed in the null-donor case.
  • Other issues: none beyond the above blocker and the missing targeted regression test.

Please restore the old null-row behavior (or add an explicit justification if semantics are intentionally changing) and add a targeted unit/regression test that covers a missing nested sibling when the donor nested leaf row is NULL.

if (last_field.is_null()) {
size_t leaf_size = leaf->data.size();
if (leaf_size == 0) {
return false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes behavior for nullable nested siblings. Before this patch, try_insert_default_from_nested() called get_last_field() and returned false when the donor leaf row was NULL, so the caller fell back to entry->data.insert_default(). After the refactor we always cut the last donor row and rebuild a default-shaped subcolumn from it, which means a row like items.tags = null can now synthesize a nested sibling value for a missing items.id instead of keeping that sibling NULL too. That is a user-visible semantic change in the JSON/variant parse path, not just a refactor. Please preserve the old null guard here or add an explicit test showing the new behavior is intended.

@@ -3082,30 +3082,6 @@ TEST_F(ColumnVariantTest, get_field_info_all_types) {
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch removes the only direct test around the deleted transformation logic, but it does not add any replacement coverage for the behavior that changed in try_insert_default_from_nested(). The existing try_insert_default_from_nested test only exercises the call and never asserts the inserted row contents/nullability. Please add a targeted test for the blocking case above: donor nested sibling row is NULL, target sibling path is missing, and we must still fall back to insert_default() rather than materializing a synthesized nested row.

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

/review

1 similar comment
@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • be/src/core/column/column_variant.cpp: try_insert_default_from_nested() now rebuilds the synthetic row through clone_with_default_values(FieldInfo), but the local FieldInfo only carries scalar_type_id and num_dimensions. For metadata-carrying nested types, especially decimal arrays copied through ColumnVariant::try_insert() / insert_from(), that loses the entry type's precision and scale and can fail while materializing the missing sibling row. The old single-row path inserted a typed default Field into the existing column and preserved that metadata.

Critical Checkpoints

  • Goal / correctness: The PR aims to replace the field round-trip with a column-level operation. It improves the fast path idea, but it does not preserve behavior for metadata-carrying nested types, and the current tests do not prove correctness for the changed path.
  • Scope / minimality: The change is localized, but it also changes the single-row nested-default semantics by routing it through a helper with different type-metadata behavior.
  • Concurrency: No new concurrency, locking, or lifecycle issues were introduced in the touched code.
  • Configuration / compatibility: No config, persistence, or FE/BE protocol compatibility changes are involved here.
  • Parallel code paths: try_insert_many_defaults_from_nested() already uses the column-level helper; this PR makes the single-row path share the same metadata-loss risk instead of keeping its previous typed-field behavior.
  • Tests: Coverage is insufficient for this behavior change. The remaining try_insert_default_from_nested test only smoke-calls the helper and does not assert the produced nested default values or typed metadata.
  • Observability / persistence / transactions: Not applicable for this refactor.
  • Performance: The intended optimization is reasonable, but correctness needs to be preserved before the single-row path switches implementations.

I did not find additional blocking issues beyond this regression after tracing the insert, parse, finalize, and copy paths around ColumnVariant nested defaults.

}

auto default_scalar = entry->data.get_least_common_type()->get_default();
FieldInfo field_info = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the single-row path from inserting a typed default Field into the existing entry column to rebuilding a temporary subcolumn from FieldInfo. That is not equivalent for metadata-carrying types. Here field_info only keeps scalar_type_id and num_dimensions; precision and scale stay at their defaults. If a missing nested sibling is ARRAY<DECIMAL(10,2)> and we hit this path through ColumnVariant::try_insert() or insert_from(), clone_with_default_values() reaches create_array_of_type(TYPE_DECIMAL..., 0, 0) and throws from create_decimal(). The old implementation used entry->data.get_least_common_type()->get_default() and inserted that typed field directly, so it preserved the decimal metadata. Please preserve the full type metadata here before switching the single-row path over to clone_with_default_values().

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found in this PR.

Critical checkpoint conclusions:

  • Goal / correctness: The change does achieve its stated goal of removing the per-row column -> Field -> visitor -> column round-trip and aligning try_insert_default_from_nested() with the existing column-level batch path in try_insert_many_defaults_from_nested().
  • Change size / focus: The modification is small and focused to the nested-default helper plus dead-code/test cleanup.
  • Concurrency: Not applicable. This path is synchronous column manipulation with no new shared-state or lock behavior.
  • Lifecycle / static init: Not applicable.
  • Config / compatibility / FE-BE protocol: Not applicable. No new config, protocol, or persisted format changes.
  • Parallel code paths: The main parallel path here is the batch helper; this PR improves consistency by making the single-row helper use the same column-level approach.
  • Conditional checks: The previous null-row guard is preserved with is_null_at(leaf_size - 1).
  • Tests: Existing unit coverage does touch try_insert_default_from_nested() and clone_with_default_values(), but the dedicated assertions around exact defaulted nested shapes are still fairly weak, so mixed-dimension and null-donor cases remain the main residual testing gap.
  • Observability / persistence / transactionality: Not applicable.
  • Performance: Likely improved, since the new path stays at the column level and removes unnecessary Field materialization and visitor traversal.

Overall summary: no critical blocking issues identified during review; residual risk is limited to test depth rather than an obvious correctness problem in the implementation.

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.12% (20189/38005)
Line Coverage 36.68% (190091/518272)
Region Coverage 32.90% (147482/448210)
Branch Coverage 34.05% (64597/189694)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 0.00% (0/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.73% (27439/37214)
Line Coverage 57.35% (296323/516654)
Region Coverage 54.35% (245839/452320)
Branch Coverage 56.13% (106788/190262)

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

run beut

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 0.00% (0/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.73% (27439/37214)
Line Coverage 57.35% (296290/516654)
Region Coverage 54.35% (245827/452320)
Branch Coverage 56.12% (106782/190262)

@Mryange Mryange force-pushed the refine-variant-dev-4.16 branch from 2cc10c2 to 1048ac1 Compare April 16, 2026 11:25
@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 0.00% (0/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.72% (27434/37214)
Line Coverage 57.33% (296183/516654)
Region Coverage 54.34% (245785/452320)
Branch Coverage 56.09% (106719/190262)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.12% (20189/38005)
Line Coverage 36.68% (190122/518272)
Region Coverage 32.96% (147729/448210)
Branch Coverage 34.07% (64633/189694)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 90.91% (10/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.93% (26769/37214)
Line Coverage 54.94% (283866/516654)
Region Coverage 52.01% (235273/452320)
Branch Coverage 53.44% (101676/190262)

@Mryange
Copy link
Copy Markdown
Contributor Author

Mryange commented Apr 16, 2026

/review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants