Skip to content

perf: optimized validString in CommentedString#1067

Merged
fortmarek merged 4 commits intotuist:mainfrom
ChrisBenua:optimisedCommentedString
Feb 26, 2026
Merged

perf: optimized validString in CommentedString#1067
fortmarek merged 4 commits intotuist:mainfrom
ChrisBenua:optimisedCommentedString

Conversation

@ChrisBenua
Copy link
Contributor

@ChrisBenua ChrisBenua commented Feb 20, 2026

Resolves #1066

Short description 📝

Optimised CommendString.validString

Solution 📦

  • Replace StringProtocol.contains with checking that one UInt8 array (specifically UnsafeBufferPointer) is subarray of another. Why this approach is faster? Because StringProtocol.contains spends most of its time on string indexing (which requires much logic to preserve grapheme clusters).
  • Replace rangeOfCharacter with checking that UInt8 array (specifically UnsafeBufferPointer) contains specific value.
  • Instead of calling append for each string character in reduce function - calculate needed capacity upfront and create String using String(unsafeUninitializedCapacity:) init

Implementation 👩‍💻👨‍💻

Detail in a checklist the steps that you took to implement the PR.

  • Replace StringProtocol.contains and rangeOfCharacters with UnsafeBufferPointer<UInt8> operations
  • Get rid of reduce + append. It required too much memory allocation. Instead - calculate needed capacity upfront and create String using String(unsafeUninitializedCapacity:) init

Performance

Performance Comparison

Measured with M2 Max 32 GB RAM, MacOS 26.2, release build. If I run these benchmarks many times, average deviation is about 3-4%

Previous validString implementation performance:

simple alphanumeric (fast path)       0.47 µs/call (0.047 s total, 100000 iters)
underscore+slash (valid, special)     1.30 µs/call (0.130 s total, 100000 iters)
// and ___ (needs quoting)            0.77 µs/call (0.077 s total, 100000 iters)
needs escaping (\" \n \t \\)          0.74 µs/call (0.074 s total, 100000 iters)
spaces (invalid chars, no escape)     0.39 µs/call (0.039 s total, 100000 iters)
"true"                                0.01 µs/call (0.001 s total, 100000 iters)
"false"                               0.01 µs/call (0.001 s total, 100000 iters)
long clean (2200 chars)               119.46 µs/call (11.946 s total, 100000 iters)
long dirty (1500 chars, escaping)     36.50 µs/call (3.650 s total, 100000 iters)
mixed batch (10 strings)              161.31 µs/call (8.065 s total, 50000 iters)

Introduced validString implementation performance:

simple alphanumeric (fast path)       0.03 µs/call (0.003 s total, 100000 iters) - 15x boost
underscore+slash (valid, special)     0.06 µs/call (0.006 s total, 100000 iters) - 20x boost
// and ___ (needs quoting)            0.09 µs/call (0.009 s total, 100000 iters) - 8x boost
needs escaping (\" \n \t \\)          0.09 µs/call (0.009 s total, 100000 iters) - 8x boost
spaces (invalid chars, no escape)     0.07 µs/call (0.007 s total, 100000 iters) - 5x boost
"true"                                0.01 µs/call (0.001 s total, 100000 iters) - same
"false"                               0.00 µs/call (0.000 s total, 100000 iters) - same
long clean (2200 chars)               5.50 µs/call (0.550 s total, 100000 iters) - 20x boost
long dirty (1500 chars, escaping)     2.30 µs/call (0.230 s total, 100000 iters) - 16x boost
mixed batch (10 strings)              8.12 µs/call (0.406 s total, 50000 iters) - 20x boost

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Feb 20, 2026
@dosubot
Copy link

dosubot bot commented Feb 20, 2026

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.

How did I do? Any feedback?  Join Discord

@ChrisBenua
Copy link
Contributor Author

@dosubot could you review this PR? It optimized CommentedString.validString calculation by avoiding String indexing overhead by using UnsafeBufferPointer-s instead. Also avoid multiple String allocations by replacing reduce and string concatenation by calculating needed capacity upfront and create String using String(unsafeUninitializedCapacity:) init

@dosubot
Copy link

dosubot bot commented Feb 20, 2026

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.

How did I do? Any feedback?  Join Discord

Copy link
Member

@fortmarek fortmarek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤯

Thanks a lot for this PR, the improvements are impressive. I have a couple minor nits, but otherwise, looks good!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Feb 23, 2026
@ChrisBenua ChrisBenua changed the title Optimized CommentedString.validString perf: Optimized CommentedString.validString Feb 24, 2026
@ChrisBenua ChrisBenua requested a review from fortmarek February 24, 2026 10:57
@ChrisBenua ChrisBenua changed the title perf: Optimized CommentedString.validString perf: optimized validString in CommentedString Feb 24, 2026
@ChrisBenua
Copy link
Contributor Author

@fortmarek Marek, could you please re-run the workflows and re-approve PR? I apologize for not checking the linting rules before opening the PR. I've run mise run lint locally, and while it flagged several existing Swift files in XcodeProj for not conforming to current linting standards, I’ve verified that the changes in this PR adhere to the linting rules.

Thank you for your understanding!

@fortmarek
Copy link
Member

Thanks again for the PR 🙏

@fortmarek fortmarek merged commit 2832e79 into tuist:main Feb 26, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve CommentedString.validString performance

2 participants