Investigate RAPIDS string-ID cuGraph build regression#991

Draft
lmeyerov wants to merge 1 commit into master from feat/rapids-string-id-cugraph-repro


@lmeyerov commented Mar 31, 2026

Summary

Start #977 as a dedicated follow-on branch with the pure cuDF/cuGraph reproducer moved into a tracked benchmark path.

Current committed scope:

  • add benchmarks/gfql/filter_pagerank/pure_rapids_string_build_repro.py
  • keep the branch focused on the string-ID graph-build regression itself
  • do not mix in Graphistry product optimizations, helper-tooling workflow, or cache/reuse work

Related issues:

  • #977: this repro / regression triage
  • #978: separate cache/reuse feature thread
  • #988: separate DGX helper-tooling thread

Why

We already narrowed the remaining RAPIDS 25.02 -> 26.02 GPU regression to graph build / renumbering rather than the PageRank kernel. This script gives us a pure-RAPIDS artifact to revalidate and iterate on without PyGraphistry in the container.
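The build-versus-kernel split described above can be checked by timing the two phases separately. The following is a minimal sketch, not the code in the tracked script: `time_phases`, `fake_build`, and `fake_kernel` are hypothetical names, and the stand-in workloads are plain Python so the harness runs without a GPU.

```python
import time
from typing import Any, Callable, Dict


def time_phases(build_graph: Callable[[], Any],
                run_kernel: Callable[[Any], Any]) -> Dict[str, float]:
    """Time the graph-build phase and the kernel phase separately."""
    t0 = time.perf_counter()
    graph = build_graph()
    t1 = time.perf_counter()
    run_kernel(graph)
    t2 = time.perf_counter()
    return {"build": t1 - t0, "kernel": t2 - t1, "total": t2 - t0}


# Stand-ins (assumptions) so the sketch runs anywhere. In a RAPIDS container
# the build callable would wrap cugraph.Graph(...).from_cudf_edgelist(...)
# and the kernel callable would wrap cugraph.pagerank(graph).
def fake_build() -> Dict[str, list]:
    return {"a": ["b"], "b": ["a"]}


def fake_kernel(graph: Dict[str, list]) -> Dict[str, float]:
    return {v: 1.0 / len(graph) for v in graph}


timings = time_phases(fake_build, fake_kernel)
for phase, seconds in timings.items():
    print(f"{phase}: {seconds:.6f}s")
```

Keeping the two timers apart is what lets a slowdown be attributed to graph build / renumbering rather than to the kernel.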

Fresh DGX revalidation

Primary repro shape on dgx-spark:

  • synthetic_string_gplus_shape
  • 10,000,000 edges
  • 107,614 unique vertices
  • repeated low-cardinality string/object IDs
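A scaled-down illustration of this shape, many edges drawn from a small pool of string IDs, might look like the sketch below. It is an assumption-laden stand-in rather than the script's generator: the `make_string_edges` name, the `v0000000`-style ID format, and the uniform sampling are all hypothetical, and the sizes are reduced by roughly 1000x while keeping a similar edges-per-vertex ratio.

```python
import random
from typing import List, Tuple


def make_string_edges(n_edges: int, n_vertices: int,
                      seed: int = 0) -> Tuple[List[str], List[str]]:
    """Build an edge list whose endpoints repeat a small set of string IDs."""
    rng = random.Random(seed)
    # Hypothetical ID format; the real dataset's ID strings are not shown here.
    ids = [f"v{i:07d}" for i in range(n_vertices)]
    src = [rng.choice(ids) for _ in range(n_edges)]
    dst = [rng.choice(ids) for _ in range(n_edges)]
    return src, dst


# About 93 edges per unique vertex, close to 10,000,000 / 107,614.
src, dst = make_string_edges(n_edges=10_000, n_vertices=108)
unique_vertices = set(src) | set(dst)
print(len(src), len(unique_vertices))
```

Because every ID recurs many times, a graph build on such data has to map a large number of duplicate strings onto a small vertex set, which is the renumbering path under investigation.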

Results:

  • 25.02-cuda12.8
    • build: 0.1866s
    • pagerank: 0.0074s
    • total: 0.1941s
  • 26.02-cuda13
    • build: 0.3130s
    • pagerank: 0.0034s
    • total: 0.3170s

Delta:

  • build: +67.74%
  • total: +63.32%

The PageRank kernel is still not the source of the slowdown: it got faster (0.0074s to 0.0034s) while build time grew.
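For reference, the deltas above follow from the reported timings as a simple relative change. A small sketch, where `pct_delta` is a hypothetical helper name and the inputs are the 25.02 and 26.02 numbers listed in the results:

```python
def pct_delta(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Timings in seconds, copied from the results above (25.02 -> 26.02).
build = pct_delta(0.1866, 0.3130)
pagerank = pct_delta(0.0074, 0.0034)
total = pct_delta(0.1941, 0.3170)

print(f"build:    {build:+.2f}%")     # +67.74%
print(f"pagerank: {pagerank:+.2f}%")  # -54.05%
print(f"total:    {total:+.2f}%")     # +63.32%
```

The negative PageRank delta alongside the positive build delta is the arithmetic behind attributing the regression to graph build.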

Controls

  • synthetic_offset (sparse integer IDs)
    • 25.02-cuda12.8: 0.2224s
    • 26.02-cuda13: 0.2173s
    • delta: -2.29%
  • synthetic_string_offset (high-cardinality string IDs)
    • 25.02-cuda12.8: 0.6864s
    • 26.02-cuda13: 0.8303s
    • delta: +20.96%
  • synthetic_string_gplus_shape --store-transposed
    • 25.02-cuda12.8: 0.2006s
    • 26.02-cuda13: 0.3166s
    • delta: +57.83%

Current interpretation:

  • integer sparse IDs are not the problem
  • string/object IDs are implicated
  • repeated low-cardinality string/object IDs remain the strongest clean repro shape
  • store_transposed does not materially change the story

Next steps on this branch

  • decide whether this branch stays a pure upstream-facing repro package
  • if yes, keep the branch small and use it as the filing artifact
  • if no, justify one narrow local mitigation with fresh data before adding more code

@lmeyerov lmeyerov marked this pull request as ready for review March 31, 2026 07:24
@lmeyerov lmeyerov marked this pull request as draft March 31, 2026 09:57
@lmeyerov (Contributor, Author) commented

Not intended for merging, just an exploration
