Optimize HashBytes for 64-bit and remove dead code #129640
Open
AaronRobinsonMSFT wants to merge 4 commits into
On 64-bit hosts, process 8 bytes at a time using a multiply-xorshift mixing step instead of byte-at-a-time DJB2. The tail bytes (0-7) fall through to the original byte loop. 32-bit hosts are unchanged.

Remove unused `EEUnicodeHashTableHelper` and `EEUnicodeStringHashTable` along with their method implementations. Fix stale comment in `dynamicmethod.cpp` that referenced the removed type.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
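For readers unfamiliar with the shape of this first commit, the following is a minimal sketch of an 8-bytes-at-a-time loop with a DJB2 byte loop for the tail. It is not the PR's code: the function name `HashBytesSketch`, the multiplier, the shift amount, and the way the 64-bit state is folded back to 32 bits are all assumptions chosen for illustration, and the DJB2 variant shown (seed 5381, multiply by 33, then XOR) is assumed to match the original loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch; names and constants are illustrative, not taken from the PR.
inline uint32_t HashBytesSketch(const uint8_t* data, size_t size)
{
    uint32_t hash = 5381; // DJB2 seed (assumed to match the original loop)
    const uint8_t* end = data + size;

#if UINTPTR_MAX > 0xFFFFFFFFu
    // 64-bit hosts only: consume 8 bytes per iteration.
    uint64_t state = hash;
    while (static_cast<size_t>(end - data) >= 8)
    {
        uint64_t chunk;
        std::memcpy(&chunk, data, sizeof(chunk)); // safe for unaligned input
        state ^= chunk;
        state *= 0x9E3779B97F4A7C15ull;           // assumed odd multiplier
        state ^= state >> 32;                     // xorshift: fold high bits into low bits
        data += 8;
    }
    hash = static_cast<uint32_t>(state);          // assumed fold back to 32 bits
#endif

    // Tail bytes (0-7 on 64-bit hosts, everything on 32-bit hosts): DJB2-style byte loop.
    for (; data < end; ++data)
        hash = ((hash << 5) + hash) ^ *data;

    return hash;
}
```

Inputs shorter than 8 bytes never enter the chunked loop, so they hash exactly as the byte loop alone would.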
Contributor
Tagging subscribers to this area: @agocke
Contributor
Pull request overview
This PR updates CoreCLR’s internal hashing helpers by optimizing HashBytes on 64-bit hosts and removing unused Unicode/string hashing helper types, primarily impacting hash-table usage in the VM and shared hash utilities.
Changes:
- Add a 64-bit `HashBytes` fast-path in `utilcode.h` that consumes data in 8-byte chunks before hashing the remaining tail bytes.
- Remove the unused `EEUnicodeHashTableHelper`/`EEUnicodeStringHashTable` implementation and related dead code.
- Remove unused string hashing helpers / traits and update a comment that referenced removed code.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/vm/eehash.h | Removes unused Unicode hash table helper declaration/typedef. |
| src/coreclr/vm/eehash.cpp | Removes unused Unicode hash table helper implementation; keeps string literal helper. |
| src/coreclr/vm/dynamicmethod.cpp | Updates an in-code comment to no longer reference removed helper. |
| src/coreclr/inc/utilcode.h | Adds 64-bit chunking fast-path to HashBytes; removes unused string-hash helpers. |
| src/coreclr/inc/shash.h | Removes unused case-insensitive string compare/hash traits wrapper. |
Copilot's findings
- Files reviewed: 5/5 changed files
- Comments generated: 1
am11
reviewed
Jun 19, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rewrite `HashBytes` to use xxHash32 `QueueRound`/`MixFinal` primitives instead of byte-at-a-time DJB2. Extract xxHash32 code from `typehashingalgorithms.h` into a shared `inc/dn_xxhash.h` header.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Include `clrtypes.h` for `UINT32`/`DWORD`/`UINT_PTR`/`_rotl`/`HOST_64BIT`
- Use `UINT_PTR` cast in `MixPointerIntoHash` for consistency
- Fix typo: mixin -> mix in

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jkotas
reviewed
Jun 20, 2026
    inline static UINT32 XXHash32_MixEmptyState()
    {
        // Unlike System.HashCode, these hash values are required to be stable, so don't
Member
It is not clear why these hash values are required to be stable now that this is standalone and no longer part of `typehashingalgorithms.h`.
Optimize `HashBytes` (`src/coreclr/inc/utilcode.h`) to use xxHash32 primitives (`XXHash32_QueueRound`/`XXHash32_MixFinal`) processing 4 bytes at a time, replacing the original byte-at-a-time DJB2 hash.

`HashBytes` is used by the string literal interning path (`EEUnicodeStringLiteralHashTableHelper::Hash`) which hashes every string literal resolved during JIT compilation (`ldstr` token resolution). This makes it part of the steady-state cost for all managed applications.

The xxHash32 primitives are extracted from `typehashingalgorithms.h` into a new shared header `src/coreclr/inc/dn_xxhash.h` so they can be reused by both the type hashing and `HashBytes` paths.

Also removes unused dead code: `EEUnicodeHashTableHelper`, `EEUnicodeStringHashTable`, `EEStringData::IsOnlyLowChars`, `HashStringN`, `HashiStringA`, `HashiStringN`, and `CaseInsensitiveStringCompareHash`.

Measurement
A microbenchmark calling `string.IsInterned()` 200M times against a pool of 1000 non-interned strings was used to isolate the hashing cost. Profiled on macOS ARM64 (Apple M5) with a full Release build (`clr+libs -c Release`). Each row is the median of 5 runs.

The direct primitives approach was chosen because the `xxHash` class queue machinery (position tracking, branching, accumulator state) adds overhead that erases the gain when used for bulk byte-stream hashing. The primitives inline cleanly into a tight loop.

Note
This PR was generated with the assistance of GitHub Copilot.
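
To illustrate the "primitives in a tight loop" approach described in the summary, here is a minimal, hedged sketch. The three primitive names come from the PR text; their bodies use the standard xxHash32 prime constants and are assumed to mirror what `dn_xxhash.h` contains. `HashBytesXX` is a hypothetical name, and seeding the state with the length and feeding tail bytes one at a time are assumptions, not necessarily what the PR does.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Rotate left; r must be in 1..31.
inline uint32_t RotL32(uint32_t v, int r) { return (v << r) | (v >> (32 - r)); }

// Fixed (non-randomized) starting state: xxHash32 Prime5.
inline uint32_t XXHash32_MixEmptyState() { return 374761393u; }

// Mix one 32-bit value into the running hash (Prime3, rotate by 17, Prime4).
inline uint32_t XXHash32_QueueRound(uint32_t hash, uint32_t queuedValue)
{
    return RotL32(hash + queuedValue * 3266489917u, 17) * 668265263u;
}

// Final avalanche step.
inline uint32_t XXHash32_MixFinal(uint32_t hash)
{
    hash ^= hash >> 15;
    hash *= 2246822519u; // Prime2
    hash ^= hash >> 13;
    hash *= 3266489917u; // Prime3
    hash ^= hash >> 16;
    return hash;
}

// Hypothetical sketch of a 4-bytes-at-a-time HashBytes; not the PR's code.
inline uint32_t HashBytesXX(const uint8_t* data, size_t size)
{
    // Assumption: the length is folded into the initial state.
    uint32_t hash = XXHash32_MixEmptyState() + static_cast<uint32_t>(size);

    while (size >= 4)
    {
        uint32_t word;
        std::memcpy(&word, data, sizeof(word)); // safe for unaligned input
        hash = XXHash32_QueueRound(hash, word);
        data += 4;
        size -= 4;
    }

    // Assumption: each remaining tail byte (0-3) gets its own round.
    for (; size > 0; ++data, --size)
        hash = XXHash32_QueueRound(hash, *data);

    return XXHash32_MixFinal(hash);
}
```

The loop keeps only a single `uint32_t` of state and has no position tracking, which is the property the summary credits for the primitives inlining cleanly.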