Investigate RAPIDS string-ID cuGraph build regression #991
Draft
Not intended for merging, just an exploration.
Summary
Start #977 as a dedicated follow-on branch, with the pure cuDF/cuGraph reproducer moved into a tracked benchmark path.
Current committed scope:
- benchmarks/gfql/filter_pagerank/pure_rapids_string_build_repro.py
Related issues:
Why
We already narrowed the remaining RAPIDS 25.02 -> 26.02 GPU regression to graph build / renumbering rather than the PageRank kernel. This script gives us a pure-RAPIDS artifact to revalidate and iterate on without PyGraphistry in the container.
Fresh DGX revalidation
Primary repro shape on dgx-spark:
- synthetic_string_gplus_shape
- 10,000,000 edges
- 107,614 unique vertices
Results:
- 25.02-cuda12.8
  - build: 0.1866s
  - pagerank: 0.0074s
  - total: 0.1941s
- 26.02-cuda13
  - build: 0.3130s
  - pagerank: 0.0034s
  - total: 0.3170s
Delta:
- build: +67.74%
- total: +63.32%
The kernel is still not the source of the slowdown.
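As a quick sanity check on the deltas reported above, here is a minimal sketch that recomputes the per-phase percentage change from the posted timings. The names `RESULTS`, `pct_delta`, and `phase_deltas` are hypothetical helpers for illustration only and are not taken from the repro script.

```python
# Timings in seconds, copied from the dgx-spark results in this PR description.
# RESULTS, pct_delta, and phase_deltas are illustrative names, not part of the repro script.
RESULTS = {
    "25.02-cuda12.8": {"build": 0.1866, "pagerank": 0.0074, "total": 0.1941},
    "26.02-cuda13": {"build": 0.3130, "pagerank": 0.0034, "total": 0.3170},
}


def pct_delta(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def phase_deltas(results, old_key, new_key):
    """Percentage change for every phase recorded under old_key."""
    old, new = results[old_key], results[new_key]
    return {phase: pct_delta(old[phase], new[phase]) for phase in old}


if __name__ == "__main__":
    deltas = phase_deltas(RESULTS, "25.02-cuda12.8", "26.02-cuda13")
    for phase, delta in deltas.items():
        print(f"{phase}: {delta:+.2f}%")
```

The pagerank delta comes out negative (the kernel got faster), while the build delta is positive and close to the total delta, which is consistent with attributing the regression to graph build.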
Controls
- synthetic_offset (sparse integer IDs)
  - 25.02-cuda12.8: 0.2224s
  - 26.02-cuda13: 0.2173s
  - delta: -2.29%
- synthetic_string_offset (high-cardinality string IDs)
  - 25.02-cuda12.8: 0.6864s
  - 26.02-cuda13: 0.8303s
  - delta: +20.96%
- synthetic_string_gplus_shape --store-transposed
  - 25.02-cuda12.8: 0.2006s
  - 26.02-cuda13: 0.3166s
  - delta: +57.83%
Current interpretation:
- store_transposed does not materially change the story
Next steps on this branch