Context
Per AGENTS.md ("keep the US pack thin; push shared abstractions upstream into core; if a seam is useful for both UK and US, move it to microplex"), several heavy US-local modules reimplement — or could move to — core surfaces. Core already owns sources, targets, fusion, calibration (Calibrator + microcalibrate adapter), reweighting (reweighting.py, targets/reweighting.py, targets/bundles.py), and an eval harness (eval/harness.py, eval/reweighting_benchmark.py).
Candidates ranked by leverage. Verify each US version genuinely duplicates core (vs intentional country-specific extension) before relocating.
1. eCPS-replacement comparison harness → core eval (biggest win)
src/microplex_us/pipelines/ecps_replacement_comparison.py (1,607 lines) is ~90% country-agnostic: ~28 of ~33 functions are matched-household sampling (_write_matched_dataset, _household_weights, _entity_*), symmetric refit (_fit_dense_refit, _objective), holdout (_build_holdout_target_mask, _validate_common_targets, _filter_loss_inputs_by_scope), scoring/diagnostics (_target_loss_diagnostics, _refit_matrix_score_*, _target_family_breakdown, _target_bucket_breakdown, _diagnostic_unweighted_msre, _protected_family_losses), and utils (_sha256, _dataset_descriptor, …).
The only US-specific seam is _extract_pe_native_loss_inputs (shells to the PE-US scorer + US target DB to build the loss matrix) plus the US bad-target list.
→ Move the harness to microplex.eval (likely merging with the existing, apparently-overlapping eval/reweighting_benchmark.py), parameterized by a loss-input extractor protocol + target provider + baseline resolver. US becomes a ~200-line provider implementing the PE-US extractor. Also unblocks #117 (CI eval).
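The seam described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the existing API: LossInputs, LossInputExtractor, ToyExtractor and score are hypothetical names, and the scoring function is a plain mean squared relative error standing in for whatever the harness actually computes.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class LossInputs:
    """Hypothetical container: one row per target, one column per household."""
    matrix: Sequence[Sequence[float]]
    target_values: Sequence[float]
    target_names: Sequence[str]


class LossInputExtractor(Protocol):
    """Hypothetical country seam: turn a dataset path into loss inputs."""

    def extract(self, dataset_path: str) -> LossInputs: ...


def mean_squared_relative_error(inputs: LossInputs, weights: Sequence[float]) -> float:
    """Country-agnostic scoring: compare weighted totals against targets."""
    errors = []
    for row, target in zip(inputs.matrix, inputs.target_values):
        estimate = sum(a * w for a, w in zip(row, weights))
        errors.append(((estimate - target) / target) ** 2)
    return sum(errors) / len(errors)


def score(extractor: LossInputExtractor, dataset_path: str,
          weights: Sequence[float]) -> float:
    """Core harness entry point: only the extractor knows about the country."""
    return mean_squared_relative_error(extractor.extract(dataset_path), weights)


class ToyExtractor:
    """Stand-in for a country provider; a real one would call the PE-US scorer."""

    def extract(self, dataset_path: str) -> LossInputs:
        return LossInputs(
            matrix=[[1.0, 1.0], [1.0, 0.0]],
            target_values=[10.0, 4.0],
            target_names=["target_a", "target_b"],
        )
```

Under this shape the US pack would only supply an extractor class; sampling, refit, holdout and diagnostics would take a LossInputs value and never import anything US-specific.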
2. PE-native refit solver → core reweighting
src/microplex_us/pipelines/pe_native_optimization.py holds optimize_pe_native_loss_weights (the monotone accelerated projected-gradient refit) and rewrite_policyengine_us_dataset_weights. AGENTS.md says reweighting/solver belongs in core and local code "should remain a thin adapter over core bundle/reweighting surfaces." The projected-gradient + simplex-projection solver is pure numerics (loss matrix → weights), so it belongs in core. Keep only the PE-entity weight I/O (household → tax_unit/spm_unit/family/marital rewrite) local, parameterized by the entity list (the #221 empty-derived-weight-group guard generalizes).
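To show why the solver has no country dependency, here is a minimal sketch of projected gradient with a simplex projection. It is deliberately the plain, non-accelerated variant, not the monotone accelerated refit in optimize_pe_native_loss_weights. The function names, the mean squared relative error objective, and the choice to hold total weight fixed are all assumptions made for illustration.

```python
from typing import List, Sequence


def project_to_simplex(values: Sequence[float], total: float) -> List[float]:
    """Euclidean projection onto {w : w >= 0, sum(w) == total} via sorting."""
    ordered = sorted(values, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - total) / count
        if value - candidate > 0:
            theta = candidate  # the last index passing this test sets the shift
    return [max(v - theta, 0.0) for v in values]


def refit_weights(matrix: Sequence[Sequence[float]], targets: Sequence[float],
                  initial_weights: Sequence[float], step: float = 1.0,
                  iterations: int = 500) -> List[float]:
    """Minimize mean squared relative error over nonnegative weights."""
    total = sum(initial_weights)  # assumption: total weight is held fixed
    weights = project_to_simplex(initial_weights, total)
    n_targets = len(targets)
    for _ in range(iterations):
        # Relative residual of each weighted total against its target.
        residuals = [
            (sum(a * w for a, w in zip(row, weights)) - t) / t
            for row, t in zip(matrix, targets)
        ]
        # Gradient of the mean squared relative error w.r.t. each weight.
        gradient = [
            (2.0 / n_targets)
            * sum(r / t * row[k] for r, t, row in zip(residuals, targets, matrix))
            for k in range(len(weights))
        ]
        stepped = [w - step * g for w, g in zip(weights, gradient)]
        weights = project_to_simplex(stepped, total)
    return weights
```

Nothing here touches entities or PE datasets: the inputs are a loss matrix, a target vector and starting weights, which is the part proposed for core. The household → derived-entity rewrite would wrap the returned weights on the US side.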
3. CPS-passthrough / income-split mechanism → core fusion/donor
(#226–#228) The splitter does three things: it preserves survey-measured totals when collapsing donor clones onto a survey scaffold, derives component splits from the survey total, and imputes only donor-specific detail + clone records. The same mechanism applies to UK FRS + admin clones. US keeps only the variable specs + split fractions. This is the core-targeted version of #229.
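A rough sketch of the mechanism, assuming records are plain dicts: SplitSpec, collapse_onto_scaffold and every variable name in the usage example are hypothetical, and the real splitter may handle weights and clone records differently.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Sequence


@dataclass(frozen=True)
class SplitSpec:
    """Hypothetical country-supplied spec: the only part the US pack would keep."""
    total_variable: str              # survey-measured total to pass through
    fractions: Mapping[str, float]   # component name -> share of the total
    donor_only: Sequence[str]        # detail that exists only on the donor


def collapse_onto_scaffold(survey_record: Mapping[str, float],
                           donor_record: Mapping[str, float],
                           spec: SplitSpec) -> Dict[str, float]:
    """Keep the survey total, split it into components, copy donor-only detail."""
    share_sum = sum(spec.fractions.values())
    if share_sum <= 0:
        raise ValueError("split fractions must sum to a positive value")
    merged = dict(survey_record)  # the survey total is preserved untouched
    total = survey_record[spec.total_variable]
    for name, share in spec.fractions.items():
        # Components are derived from the survey total, never from the donor.
        merged[name] = total * share / share_sum
    for name in spec.donor_only:
        merged[name] = donor_record[name]
    return merged
```

For example, with a spec whose total is "dividends", fractions {"qualified": 0.75, "ordinary": 0.25} and donor-only detail ["detail_x"], a survey record with dividends of 1000.0 yields components of 750.0 and 250.0 that sum back to the survey total, while detail_x comes from the donor.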
4. De-dup drifted modules
pe_targets.py, target_registry.py, unified_calibration.py, supabase_targets.py exist in both microplex core and microplex_us. Confirm whether the US copies are drifted duplicates and collapse to a single source of truth in core.
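A first pass at the confirmation step could be a byte-level comparison like the sketch below. The helper names are hypothetical, and the directories holding each copy are not stated here, so they are left as parameters. A "drifted" result only says the files differ; it does not distinguish accidental drift from an intentional country-specific extension, which still needs a manual diff.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def classify_duplicates(core_dir: Path, us_dir: Path,
                        names: Iterable[str]) -> Dict[str, str]:
    """Label each module name as 'identical', 'drifted' or 'missing'."""
    report = {}
    for name in names:
        core_file, us_file = core_dir / name, us_dir / name
        if not (core_file.is_file() and us_file.is_file()):
            report[name] = "missing"    # present on one side only, or neither
        elif file_digest(core_file) == file_digest(us_file):
            report[name] = "identical"  # safe to delete the US copy
        else:
            report[name] = "drifted"    # needs a manual diff before collapsing
    return report
```

Run it over the four module names listed above, passing whichever directories hold the core and US copies.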
Refs: #229 (passthrough extraction — target should be core), #117 (CI eval).