Document measured local build system requirements #60

Open
MaxGhenis wants to merge 1 commit into main from docs/system-requirements
@MaxGhenis

Contributor

What

Adds SYSTEM_REQUIREMENTS.md — the measured memory, disk, and CPU footprint of
developing and building populace locally — and links it from the README. Docs
only; no code changes.

Why

There was no answer to "what does a machine need to build populace?" beyond the
solver docstring's dense-matrix estimate. This makes build-machine sizing and
contributor onboarding concrete, and surfaces one avoidable scaling ceiling.

Measured findings

Measured on an Apple M5 Max with 128 GB RAM, Python 3.14.4, and torch 2.12 (the
full environment certificate and repro scripts are in the doc). Peak RSS is read
from ru_maxrss, with one configuration per process.

  • RAM is the binding resource, with two distinct ceilings:
    • Calibration compile peaks at ≈ 2 × n_targets × n_records × 8 bytes,
      because build_constraint_matrix densifies one row per target and vstacks
      before compressing to CSR. 300k × 6,288 targets → 29 GB; extrapolates to
      ~94 GB at 1M records and ~280 GB at 3M, so the generate-big rungs are
      cloud-only purely because of the compiler, not the solve.
    • SCF wealth imputation (27 chained QRF targets) → ~15 GB, >30 min,
      single-threaded; the heaviest single build stage.
  • Full contract suite peaks at 11 GB (data loader 9 GB + policyengine-us
    adapter 4 GB); pure-library suites are <0.5 GB.
  • Disk ≈ 18 GB (1 GB venv + 14 GB policyengine-us-data storage + 3.3 GB HF
    cache).
  • Phases run sequentially, so the end-to-end build peak is the max of stage
    peaks (~29 GB), not their sum.
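
The sizing rule in the first bullet can be checked with a few lines of arithmetic. This is a minimal sketch, not code from the repository: the helper names are hypothetical, and it assumes the "GB" figures above are binary gigabytes (2^30 bytes), which is what makes the formula land on the quoted ~94 and ~280 extrapolations.

```python
BYTES_PER_FLOAT64 = 8
GIB = 1024 ** 3  # assumption: the quoted "GB" figures are binary gigabytes


def compile_peak_bytes(n_targets: int, n_records: int) -> int:
    """Estimated compile peak: two dense float64 copies of the target matrix."""
    return 2 * n_targets * n_records * BYTES_PER_FLOAT64


def build_peak(stage_peaks: dict[str, float]) -> float:
    """Stages run sequentially, so the build peak is the max, not the sum."""
    return max(stage_peaks.values())


if __name__ == "__main__":
    for n_records in (300_000, 1_000_000, 3_000_000):
        gib = compile_peak_bytes(6_288, n_records) / GIB
        print(f"{n_records:>9,} records: ~{gib:.0f} GiB")

    # Stage peaks taken from the measured findings above (GB).
    stages = {"calibration compile": 29, "SCF imputation": 15, "contract suite": 11}
    print("end-to-end peak:", build_peak(stages), "GB")
```

At 300k records the formula gives roughly 28 GiB, slightly under the measured 29 GB, which is consistent with the formula counting only the two dense copies and not interpreter or library overhead.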

Flagged follow-up

The calibration compile densification is the one avoidable ceiling: building the
CSR incrementally (accumulate data/indices/indptr per row, or
scipy.sparse.vstack per-row sparse rows) would shrink the compile-time
intermediate from n_targets dense rows to roughly the nonzeros plus at most one
dense row at a time, and move the 1M–3M rungs within reach of a large box. The existing
test_sparse_solve.py equivalence tests cover the behavior this must preserve.
Noted in the doc's Follow-up section; not addressed here.
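
The per-row accumulation mentioned above can be sketched as follows. This is an illustration only, not a patch to build_constraint_matrix, whose real signature is not shown here: the function name and the row generator are hypothetical, and it uses only the standard library so that the CSR bookkeeping is visible.

```python
from array import array


def build_csr_incrementally(rows, n_cols):
    """Accumulate CSR arrays one row at a time.

    `rows` is any iterable of dense rows (hypothetical input). Feeding it from
    a generator means only one dense row is alive at once, so the intermediate
    is O(n_cols + nnz) rather than O(n_rows * n_cols).
    """
    data = array("d")          # nonzero values
    indices = array("q")       # column index of each nonzero
    indptr = array("q", [0])   # indptr[i]:indptr[i + 1] slices row i
    n_rows = 0
    for row in rows:
        if len(row) != n_cols:
            raise ValueError(f"row {n_rows} has {len(row)} columns, expected {n_cols}")
        for col, value in enumerate(row):
            if value != 0.0:
                data.append(value)
                indices.append(col)
        indptr.append(len(data))
        n_rows += 1
    return data, indices, indptr, (n_rows, n_cols)


def example_rows():
    # Stand-in for one densified target row per iteration.
    yield [1.0, 0.0, 0.0, 2.0]
    yield [0.0, 0.0, 0.0, 0.0]
    yield [0.0, 3.0, 0.0, 0.0]


data, indices, indptr, shape = build_csr_incrementally(example_rows(), n_cols=4)
```

The resulting triple has the layout that scipy.sparse.csr_matrix((data, indices, indptr), shape=shape) accepts, so a real implementation could hand it over without ever materializing the dense stack. Whether the savings reach the full estimate depends on how sparse the target rows actually are.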

Reproducibility

The doc carries an environment certificate and the two benchmark scripts
(calibration + imputation), so the numbers can be refreshed on any machine.
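
For readers who want to see the measurement approach before opening the doc, here is a minimal sketch of "peak RSS via ru_maxrss, one config per process". It is not one of the two benchmark scripts: the child workload and function names are placeholders. It relies on the fact that ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.

```python
import resource
import subprocess
import sys

# Placeholder workload: allocate and touch N bytes, then report this
# process's own peak RSS. A real benchmark would run one build config here.
CHILD = """
import resource, sys
n_bytes = int(sys.argv[1])
buf = b"\\x01" * n_bytes
print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
"""


def peak_rss_bytes(ru_maxrss: int, platform: str = sys.platform) -> int:
    """Normalize ru_maxrss to bytes (macOS reports bytes, Linux kilobytes)."""
    return ru_maxrss if platform == "darwin" else ru_maxrss * 1024


def measure_peak_rss(n_bytes: int) -> int:
    """Run one config in a fresh interpreter so peaks do not bleed across runs."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(n_bytes)],
        check=True,
        capture_output=True,
        text=True,
    )
    return peak_rss_bytes(int(result.stdout.strip()))


if __name__ == "__main__":
    for size in (1_000_000, 64_000_000):
        print(f"{size:>11,} byte workload: peak RSS {measure_peak_rss(size):,} bytes")
```

Running each configuration in its own process matters because ru_maxrss is a high-water mark: once a process has peaked, later and smaller configs in the same process would report the earlier peak.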

🤖 Generated with Claude Code

Add SYSTEM_REQUIREMENTS.md: measured memory/disk/CPU footprint of developing
and building populace locally, plus a hardware-budget table. Link it from the
README development section.

Key measured findings (M5 Max, 128 GB, Python 3.14.4 / torch 2.12):
- RAM is the binding constraint, with two distinct ceilings:
  - calibration compile peaks at ~2 x n_targets x n_records x 8 bytes
    (build_constraint_matrix densifies rows then vstacks before CSR):
    300k x 6,288 targets -> 29 GB; extrapolates to ~94 GB at 1M records.
  - SCF wealth imputation (27 chained QRF targets) -> ~15 GB, >30 min,
    single-threaded.
- Full contract suite peaks at 11 GB (data loader 9 GB + PE-US adapter 4 GB);
  pure-library suites are <0.5 GB.
- Disk: ~18 GB (1 GB venv + 14 GB us-data storage + 3.3 GB HF cache).
- Sequential phases -> end-to-end build peak is the max of stages (~29 GB).

Includes an environment certificate and the reproduction scripts. Flags the
calibration compile densification as an avoidable ceiling (build CSR
incrementally) that currently caps local builds below the generate-big rungs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis force-pushed the docs/system-requirements branch from e146904 to 83bee59 Compare June 15, 2026 18:13