yaooqinn commented Dec 25, 2025

What changes were proposed in this pull request?

This PR optimizes ORC serialization performance by pre-allocating OrcList with the exact array size instead of relying on dynamic resizing.

Key changes:

  • Pre-allocate OrcList with numElements from the input array
  • Avoid multiple ArrayList resize operations and element copying during serialization
  • Cache numElements value to avoid redundant calls in the loop condition
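The pattern behind these changes can be sketched in plain Java. This is only an illustration: `PreallocatedListSketch`, `copyWithGrowth`, and `copyPreallocated` are hypothetical names, and a plain `ArrayList` over an `int[]` stands in for OrcList and the Spark array input, so this is not the actual serializer code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the array-conversion step; not Spark's actual serializer code.
final class PreallocatedListSketch {

    // Before: start from an empty list and let it grow as elements are appended.
    static List<Integer> copyWithGrowth(int[] source) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < source.length; i++) {
            result.add(source[i]);
        }
        return result;
    }

    // After: read the element count once and size the backing array up front.
    static List<Integer> copyPreallocated(int[] source) {
        int numElements = source.length; // cached once and reused as the loop bound
        List<Integer> result = new ArrayList<>(numElements);
        for (int i = 0; i < numElements; i++) {
            result.add(source[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] data = new int[65_536];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Both variants produce the same contents; only the allocation behavior differs.
        System.out.println(copyWithGrowth(data).equals(copyPreallocated(data)));
    }
}
```

Both methods return equal lists, which is why the change is not user-visible: only the number of backing-array allocations differs.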

Why are the changes needed?

Problem:
When serializing arrays to ORC format, the current implementation creates an empty OrcList (which extends ArrayList) and grows it dynamically. For large arrays, this triggers multiple resize operations, each requiring:

  1. Allocating a new larger backing array
  2. Copying all existing elements to the new array
  3. Discarding the old array

Performance Impact:
For an array with 65,536 elements, the default ArrayList growth pattern (initial capacity 10, 1.5x capacity increase) causes 22 resize operations, copying roughly 142,000 elements in total.
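The resize cost for a given array size can be estimated by simulating the growth rule. The sketch below assumes an initial capacity of 10 and a growth step of `capacity + (capacity >> 1)`, which is how OpenJDK's ArrayList grows; other runtimes may differ. `ResizeSimulation` and its members are hypothetical names used only for illustration.

```java
// Hypothetical helper that counts resizes and copied elements for a growing list.
final class ResizeSimulation {

    static final class Result {
        final int resizes;
        final long elementsCopied;
        final int finalCapacity;

        Result(int resizes, long elementsCopied, int finalCapacity) {
            this.resizes = resizes;
            this.elementsCopied = elementsCopied;
            this.finalCapacity = finalCapacity;
        }
    }

    // Assumes initial capacity 10 and 1.5x growth (OpenJDK ArrayList behavior).
    static Result simulate(int targetSize) {
        int capacity = 10;
        int resizes = 0;
        long copied = 0;
        while (capacity < targetSize) {
            copied += capacity;                    // the full backing array is copied on each resize
            capacity = capacity + (capacity >> 1); // grow by half of the current capacity
            resizes++;
        }
        return new Result(resizes, copied, capacity);
    }

    public static void main(String[] args) {
        Result r = simulate(1_000_000);
        System.out.printf("resizes=%d copied=%d finalCapacity=%d%n",
                r.resizes, r.elementsCopied, r.finalCapacity);
    }
}
```

For 1,000,000 elements this reproduces the figures reported in the analysis output in this description: 29 resizes, 2,430,972 elements copied, and a final capacity of 1,215,487.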

Solution:
By pre-allocating the OrcList with the known size, we eliminate all resize operations and the associated element copying. The resize analysis and benchmark output for a 1,000,000-element array:

=== Resize Behavior Analysis: 1000000 elements ===
Target size: 1,000,000 elements
Total resizes: 29
Total elements copied: 2,430,972
Final capacity: 1,215,487

Resize sequence:
  1. Resize:     10 →     15 (copy     10 elements)
  2. Resize:     15 →     22 (copy     15 elements)
  3. Resize:     22 →     33 (copy     22 elements)
  4. Resize:     33 →     49 (copy     33 elements)
  5. Resize:     49 →     73 (copy     49 elements)
  6. Resize:     73 →    109 (copy     73 elements)
  7. Resize:    109 →    163 (copy    109 elements)
  8. Resize:    163 →    244 (copy    163 elements)
  9. Resize:    244 →    366 (copy    244 elements)
  10. Resize:    366 →    549 (copy    366 elements)
  11. Resize:    549 →    823 (copy    549 elements)
  12. Resize:    823 →  1,234 (copy    823 elements)
  13. Resize:  1,234 →  1,851 (copy  1,234 elements)
  14. Resize:  1,851 →  2,776 (copy  1,851 elements)
  15. Resize:  2,776 →  4,164 (copy  2,776 elements)
  16. Resize:  4,164 →  6,246 (copy  4,164 elements)
  17. Resize:  6,246 →  9,369 (copy  6,246 elements)
  18. Resize:  9,369 → 14,053 (copy  9,369 elements)
  19. Resize: 14,053 → 21,079 (copy 14,053 elements)
  20. Resize: 21,079 → 31,618 (copy 21,079 elements)
  21. Resize: 31,618 → 47,427 (copy 31,618 elements)
  22. Resize: 47,427 → 71,140 (copy 47,427 elements)
  23. Resize: 71,140 → 106,710 (copy 71,140 elements)
  24. Resize: 106,710 → 160,065 (copy 106,710 elements)
  25. Resize: 160,065 → 240,097 (copy 160,065 elements)
  26. Resize: 240,097 → 360,145 (copy 240,097 elements)
  27. Resize: 360,145 → 540,217 (copy 360,145 elements)
  28. Resize: 540,217 → 810,325 (copy 540,217 elements)
  29. Resize: 810,325 → 1,215,487 (copy 810,325 elements)

Overhead ratio: 2.43x
(For every element added, 2.43 elements are copied during resizes)

=== Testing Array Size: 1000000 ===
Array Size: 1,000,000 elements
Iterations: 50
Without pre-allocation: 1328.69 ms total (26.574 ms/iter)
With pre-allocation:    620.36 ms total (12.407 ms/iter)
Time saved:             708.33 ms (53.3% improvement)
Expected resize count:  29 times
Final capacity needed:  1215487
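The benchmark source is not included in this description. A rough sketch of how such a comparison might be timed is shown here, assuming a simple `System.nanoTime` loop over a plain `ArrayList<Integer>` rather than OrcList or a harness such as JMH. `PreallocationBenchmarkSketch`, `fill`, and `timeNanos` are hypothetical names, and the timings it prints will vary by machine and JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical timing harness; not the benchmark that produced the numbers above.
final class PreallocationBenchmarkSketch {

    // Builds a list of `size` elements, with or without an up-front capacity.
    static List<Integer> fill(int size, boolean preallocate) {
        List<Integer> list;
        if (preallocate) {
            list = new ArrayList<>(size);
        } else {
            list = new ArrayList<>();
        }
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        return list;
    }

    // Returns the total elapsed nanoseconds for `iterations` fills.
    static long timeNanos(int size, int iterations, boolean preallocate) {
        long sink = 0; // consume the results so the work is not optimized away
        long start = System.nanoTime();
        for (int iter = 0; iter < iterations; iter++) {
            sink += fill(size, preallocate).size();
        }
        long elapsed = System.nanoTime() - start;
        if (sink != (long) size * iterations) {
            throw new IllegalStateException("unexpected element count");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int size = 1_000_000;
        int iterations = 50;
        // Warm-up runs so the JIT has compiled both paths before measuring.
        timeNanos(size, 5, false);
        timeNanos(size, 5, true);
        long without = timeNanos(size, iterations, false);
        long with = timeNanos(size, iterations, true);
        System.out.printf("Without pre-allocation: %.2f ms total%n", without / 1e6);
        System.out.printf("With pre-allocation:    %.2f ms total%n", with / 1e6);
    }
}
```

A hand-rolled loop like this is sensitive to JIT warm-up and garbage collection, so it is only suitable for a coarse comparison.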

Does this PR introduce any user-facing change?

No. This is a performance optimization with no functional changes. The output remains identical.

How was this patch tested?

  1. Existing Tests: All existing ORC-related tests pass, confirming that correctness is maintained
  2. Performance Testing: See the resize analysis and benchmark output above

Was this patch authored or co-authored using generative AI tooling?

GitHub Copilot w/ Claude Sonnet 4.5

Copilot AI left a comment

Pull request overview

This PR optimizes ORC array serialization performance by pre-allocating OrcList with the exact array size instead of using dynamic resizing. The optimization eliminates multiple ArrayList resize operations and element copying, resulting in a ~53% performance improvement for large arrays (1M elements) based on the benchmark results in the PR description.

Key changes:

  • Pre-allocate OrcList with known array size to avoid dynamic resizing
  • Cache numElements() value to avoid redundant method calls in loop condition
