ENH: add parameter finder for degrees or freedom for students_t distribution by dschmitz89 · Pull Request #1385 · boostorg/math

dschmitz89 · 2026-04-12T13:26:43Z

Towards #1305

This adds a parameter finder for the student t distribution with respect to the degrees or freedom.

Disclaimer: this PR is heavily LLM supported as C++ is not (yet?) my forte.

I verified the math with a simple Python program before. I also ran a simple sanity script to see that the function actually executes. I am not so confident about the tests though as I cannot really decipher the output of b2.

Approach: We use bracket_and_solve_root with an initial guess from a second order Edgeworth expansion of the CDF. The initial guess is very good for $df > 1$ where the t distribution behaves similar to the normal distribution but useless for low degrees of freedom (see the plot below). In this case or if it gives a relative error of the CDF worse than 10%, we fall back to an initial guess of 0.01.

Detailed approach for finding the initial guess

Given a quantile $x$ and a CDF value $p = P(T \leq x)$, we want to recover the degrees of freedom $\nu$.

Step 1 — Edgeworth warm start.

We use the 2nd-order Edgeworth expansion of the $t$ CDF in powers of $1/\nu$:

$$F(x;\nu) \approx \Phi(x) - \frac{\varphi(x)(x + x^3)}{4\nu} + \frac{\varphi(x)(3x + 5x^3 + 7x^5 - 3x^7)}{96\nu^2}$$
where $\Phi(x)$ is the standard normal CDF and $\phi(x)$ the standard normal PDF.
Setting $u = 1/\nu$ and $F(x;\nu) = p$ yields the quadratic

$$B u^2 - A u + (\Phi(x) - p) = 0$$

where

$$A = \frac{\varphi(x)(x + x^3)}{4}, \qquad B = \frac{\varphi(x)(3x + 5x^3 + 7x^5 - 3x^7)}{96}$$

The physically meaningful root (smallest positive $u$, i.e. largest $\nu$) gives a closed-form starting estimate $\hat\nu = 1/u$.

Step 2 — Validation.

We plug $\hat\nu$ into the exact $t$ CDF. If the relative residual $|F(x;\hat\nu) - p|/|p|$ exceeds 10%, we fall back to a safe low starting value $\nu_0 = 10^{-2}$.

dschmitz89 · 2026-04-15T19:55:41Z

CC @JacobHass8 would you have time to help out with a cursory initial review?

JacobHass8 · 2026-04-15T20:30:46Z

+       BOOST_CHECK_CLOSE(
+          students_t_distribution<RealType>::invert_probability_with_respect_to_degrees_of_freedom(
+             static_cast<RealType>(-6.96455673428326),
+             static_cast<RealType>(0.01)),
+          static_cast<RealType>(2),
+          tol_inv);


I see these test values are used in quite a few places. It would be nice to combine all these into a single function which checks every method. This ensures we only have to define the values for the spot check in one place. It also cuts down on the amount of code. Maybe this would be for a separate PR though?

JacobHass8 · 2026-04-15T20:38:38Z

+   //
+   // Step 1: Edgeworth warm start.
+   // F(x; nu) ~ Phi(x) - phi(x)(x + x^3)/(4*nu) + phi(x)(3x + 5x^3 + 7x^5 - 3x^7)/(96*nu^2)
+   // Substituting u = 1/nu and setting F(x; nu) = p gives:
+   //   B*u^2 - A*u + (Phi(x) - p) = 0
+   // where:
+   //   A = phi(x) * (x + x^3) / 4
+   //   B = phi(x) * (3x + 5x^3 + 7x^5 - 3x^7) / 96
+   //
+   normal_distribution<RealType, Policy> std_normal(0, 1);
+   RealType phi = pdf(std_normal, x_abs);
+   RealType Phi = cdf(std_normal, x_abs);
+
+   RealType x2 = x_abs * x_abs;
+   RealType x3 = x2 * x_abs;
+   RealType x5 = x3 * x2;
+   RealType x7 = x5 * x2;
+
+   RealType A = phi * (x_abs + x3) / 4;
+   RealType B = phi * (3 * x_abs + 5 * x3 + 7 * x5 - 3 * x7) / 96;
+   RealType C = Phi - p_adj;
+
+   RealType hint = static_cast<RealType>(0.01);
+   RealType discriminant = A * A - 4 * B * C;
+   if (discriminant >= 0 && B != 0)
+   {
+      RealType sqrt_disc = sqrt(discriminant);
+      // Two roots of B*u^2 - A*u + C = 0; pick the smallest positive u (largest nu = 1/u).
+      RealType u1 = (A - sqrt_disc) / (2 * B);
+      RealType u2 = (A + sqrt_disc) / (2 * B);
+      RealType u = -1;
+      if (u1 > 0 && u2 > 0)
+         u = (u1 < u2) ? u1 : u2;
+      else if (u1 > 0)
+         u = u1;
+      else if (u2 > 0)
+         u = u2;
+
+      if (u > 0)
+      {
+         RealType nu_hat = 1 / u;
+         // Step 2: validate by checking relative residual of the exact CDF.
+         students_t_distribution<RealType, Policy> t_hat(nu_hat);
+         RealType exact_cdf = cdf(t_hat, x_abs);
+         RealType residual = (exact_cdf > p_adj) ? (exact_cdf - p_adj) : (p_adj - exact_cdf);
+         RealType relative_residual = (p_adj != 0) ? residual / p_adj : residual;
+         if (relative_residual <= tools::epsilon<RealType>() * 4)
+             return nu_hat; // Edgeworth estimate is already exact to machine precision.
+         if (relative_residual <= static_cast<RealType>(0.1))
+            hint = nu_hat;
+      }
+   }


With all this in the function, it was initially hard for me to parse what was going on. Could this be put in a separate function called something like boost::math::detail::edgeworth_approximation? Then you could replace all this with hint = boost::math::detail::edgeworth_approximation(x_abs, p).

Maybe refactor out, and use in the existing function? That would get automatically tested in the existing tests as well.

jzmaddock · 2026-04-16T11:33:58Z

 }

+template <class RealType, class Policy>
+BOOST_MATH_GPU_ENABLED RealType students_t_distribution<RealType, Policy>::invert_probability_with_respect_to_degrees_of_freedom(


We already have a function for this here:

math/include/boost/math/distributions/students_t.hpp

Line 314 in 625d767

BOOST_MATH_GPU_ENABLED RealType students_t_distribution<RealType, Policy>::find_degrees_of_freedom(

However, you have a better initial guess to the root finder which would be a very welcome addition!

I found this function but was not sure how to map the inversion problem we have here (finding degrees of freedom given a probability and a quantile) to this one arising from sample size calculation.

The signatures are:
find_degrees_of_freedom(RealType difference_from_mean, RealType alpha, RealType beta, RealType sd, RealType hint)
invert_probability_with_respect_to_degrees_of_freedom(RealType x, RealType p)

jzmaddock · 2026-04-16T11:35:19Z

+   {
+      // CDF(0; df) = 0.5 for all df; only solvable when p == 0.5.
+      if (p_adj == static_cast<RealType>(0.5))
+         return boost::math::numeric_limits<RealType>::infinity();


Small point, this should return policies::raise_overflow_error for consistency.

jzmaddock · 2026-04-16T11:37:13Z

+   RealType x5 = x3 * x2;
+   RealType x7 = x5 * x2;
+
+   RealType A = phi * (x_abs + x3) / 4;


If an LLM wrote most of this then colour me quite impressed ;)

A small point of C++ etiquette: variables are normally all_lower_case and upper case is reserved for macros.

My workflow was basically:

work out the edgeworth expansion with claude opus iteratively (it got the signs wrong in the beginning, I simply plotted to check the correctness

asked claude sonnet to invert the expression

worked out a simple python implementation to see that it works

asked sonnet again to translate the python implementation to something that follows boost idioms

iterate on the PR code with manual interventions to get what we have now

…est with df=1e6

codecov · 2026-04-16T23:08:23Z

Codecov Report

❌ Patch coverage is 87.50000% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.34%. Comparing base (625d767) to head (b97dceb).

Files with missing lines	Patch %	Lines
include/boost/math/distributions/students_t.hpp	84.21%	9 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1385      +/-   ##
===========================================
- Coverage    95.35%   95.34%   -0.01%     
===========================================
  Files          825      825              
  Lines        68203    68269      +66     
===========================================
+ Hits         65033    65091      +58     
- Misses        3170     3178       +8

Files with missing lines	Coverage Δ
test/test_students_t.cpp	`99.55% <100.00%> (+0.03%)`	⬆️
include/boost/math/distributions/students_t.hpp	`89.04% <84.21%> (-2.15%)`	⬇️

... and 1 file with indirect coverage changes

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 625d767...b97dceb. Read the comment docs.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

ENH: add parameter finder for degrees or freedom

de6a562

dschmitz89 changed the title ~~ENH: add parameter finder for degrees or freedom~~ ENH: add parameter finder for degrees or freedom for students_t distribution Apr 12, 2026

dschmitz89 added 2 commits April 12, 2026 16:19

Add early exit if approximation is precise to machine precision

7b015ec

Loosen tolerance and add tests in the tails

a1609a9

dschmitz89 mentioned this pull request Apr 12, 2026

ENH: more inverses for distribution CDFs #1305

Open

8 tasks

JacobHass8 reviewed Apr 15, 2026

View reviewed changes

jzmaddock reviewed Apr 16, 2026

View reviewed changes

dschmitz89 added 7 commits April 16, 2026 21:26

Modularize and simplify

1cf9ea2

Use overflow error instead of inf

1132241

Relax all test tolerances

123fdd1

Should use a linter ..

24c88ba

Relax test tolerance for float precision and remove ill conditioned t…

0c8365a

…est with df=1e6

Fix indentation

1868814

Loosen tolerance for very high degrees of freedom

b97dceb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: add parameter finder for degrees or freedom for students_t distribution#1385

ENH: add parameter finder for degrees or freedom for students_t distribution#1385
dschmitz89 wants to merge 10 commits intoboostorg:developfrom
dschmitz89:invert_student_t_to_df

dschmitz89 commented Apr 12, 2026 •

edited

Loading

Uh oh!

dschmitz89 commented Apr 15, 2026

Uh oh!

JacobHass8 Apr 15, 2026

Uh oh!

JacobHass8 Apr 15, 2026

Uh oh!

jzmaddock Apr 16, 2026

Uh oh!

jzmaddock Apr 16, 2026

Uh oh!

dschmitz89 Apr 16, 2026

Uh oh!

jzmaddock Apr 16, 2026

Uh oh!

jzmaddock Apr 16, 2026

Uh oh!

dschmitz89 Apr 16, 2026

Uh oh!

codecov bot commented Apr 16, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

dschmitz89 commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dschmitz89 commented Apr 15, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

dschmitz89 commented Apr 12, 2026 •

edited

Loading

codecov bot commented Apr 16, 2026 •

edited

Loading