Speed up scalar Sobol with closed-form evaluation #97

Open: wantonsushi wants to merge 3 commits into AcademySoftwareFoundation:main from wantonsushi:sobol-fast-2d-ahmed
@wantonsushi

Hello OpenQMC maintainers!

This PR implements the Ahmed 2024 paper mentioned in issue #67 for the scalar path of sobolReversedIndex. I used the existing benchmark tool to check timings and the generate tool to confirm that the output is unchanged.

Changes

  • include/oqmc/owen.h: the scalar path uses the closed form.
  • src/tools/cli/matrices.cpp: derives the steps from the generator matrices and prints them, the same way it already prints directions[].
  • src/tests/owen.cpp: OwenTest.SobolReversedIndex checks the result against the direction matrices for every 16-bit index and dimension.
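
To illustrate the kind of check described above, here is a minimal sketch (not the code from this PR) that compares a closed-form evaluation against a column-by-column matrix product for every 16-bit index. It assumes the Pascal matrix mod 2, which generates the second Sobol dimension up to bit-ordering conventions. All function names are hypothetical, and the actual steps, masks, and index reversal in OpenQMC may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: column j of the 16x16 Pascal matrix mod 2.
// Bit i is set when C(i, j) is odd, which by Lucas' theorem happens
// exactly when every set bit of j is also set in i.
std::uint16_t pascalColumn(int j)
{
    std::uint16_t column = 0;
    for (int i = 0; i < 16; ++i)
    {
        if ((i & j) == j)
        {
            column |= static_cast<std::uint16_t>(1u << i);
        }
    }
    return column;
}

// Reference path: XOR together the columns selected by the set bits of x.
std::uint16_t pascalByColumns(std::uint16_t x)
{
    std::uint16_t y = 0;
    for (int j = 0; j < 16; ++j)
    {
        if ((x >> j) & 1u)
        {
            y ^= pascalColumn(j);
        }
    }
    return y;
}

// Closed form: four masked shift-XOR steps, one per bit of the 4-bit
// row/column position. Each step XORs the bits whose position has bit k
// clear into the position 2^k higher.
std::uint16_t pascalClosedForm(std::uint16_t x)
{
    std::uint32_t v = x;
    v ^= (v & 0x5555u) << 1;
    v ^= (v & 0x3333u) << 2;
    v ^= (v & 0x0F0Fu) << 4;
    v ^= (v & 0x00FFu) << 8;
    return static_cast<std::uint16_t>(v);
}

// Exhaustive comparison over all 65536 indices.
bool closedFormMatchesColumns()
{
    for (std::uint32_t x = 0; x <= 0xFFFFu; ++x)
    {
        const auto index = static_cast<std::uint16_t>(x);
        if (pascalClosedForm(index) != pascalByColumns(index))
        {
            return false;
        }
    }
    return true;
}
```

Calling closedFormMatchesColumns() from a test body and asserting that it returns true covers the whole 16-bit range. The closed form replaces up to sixteen conditional XORs with four unconditional steps, which is where a scalar speedup of this kind would come from.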

I also tried to implement a SIMD version (sobol_simd_experiment.cpp). It's not in this PR since it doesn't beat the existing SIMD, but I'm sharing it so you can reproduce the SIMD timings below.

Timings

Using benchmark sobol samples, median of 9 runs, in microseconds:

| Target | Old (µs) | New (µs) | Speedup |
|---|---|---|---|
| CPU scalar | 60034 | 56482 | 1.06x |
| GPU (RTX 4070) | 128830 | 124873 | 1.03x |
| CPU SSE | 47545 | 62700 | 0.76x |
| CPU AVX | 47627 | 62856 | 0.76x |
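
The speedup figures are consistent with dividing the old median by the new median, so values below 1 mean the new code is slower. A small sketch of that arithmetic follows; the helper names are hypothetical and are not part of the benchmark tool.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of a set of run times. Returns 0.0 for an empty input.
// Takes the vector by value so the caller's data is left unsorted.
double median(std::vector<double> runs)
{
    if (runs.empty())
    {
        return 0.0;
    }
    std::sort(runs.begin(), runs.end());
    const std::size_t mid = runs.size() / 2;
    if (runs.size() % 2 == 1)
    {
        return runs[mid];
    }
    return (runs[mid - 1] + runs[mid]) / 2.0;
}

// Speedup as old time over new time: above 1 is faster, below 1 is slower.
double speedup(double oldMicroseconds, double newMicroseconds)
{
    return oldMicroseconds / newMicroseconds;
}
```

For the scalar row, speedup(60034, 56482) is roughly 1.063, which rounds to the 1.06x reported. For the SSE row, speedup(47545, 62700) is roughly 0.758, which rounds to 0.76x.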

The speedup is small because Sobol generation is only one part of a draw; scrambling and state hashing account for the rest. The experimental SIMD implementation places one dimension per lane, using four lanes. The existing SIMD path is faster because it packs all sixteen matrix columns into a wider register, so I left the SIMD paths alone.

Verification

  • generate sobol output is byte-identical to before on scalar, SSE and AVX
  • Tests pass (168/168), clang-format and clang-tidy clean

@joshbainbridge

Collaborator

Hi @wantonsushi. I've just given the PR a scan and it is looking good. This is great work, thank you.

Very interesting to see the comparison. I'll follow up with a closer read of the code details sometime tomorrow.
