
Make microunit the default tax-unit constructor (#113) #123

Merged
MaxGhenis merged 5 commits into main from prototype/microunit-activation
May 31, 2026

@MaxGhenis
Contributor

What

Makes microunit the default tax-unit constructor for microplex's PolicyEngine entity tables — it is microplex's required tax-unit engine (#113), not an optional prototype. Builds on the now-merged #114 (the delegation seam).

Supersedes #116, which GitHub auto-closed when its base branch wire-microunit (the #114 head) was deleted on merge. Same reviewed content, rebased onto main.

  • The high-fidelity adapter is on by default whenever the real CPS-derived fields (person_number, spouse_person_number, family_relationship) are present, which is the case for the production candidate. microunit re-partitions each household and replaces the unreliable CPS-provided tax_unit_id (Census TAX_ID). In short: "replace the CPS tax units, keep the CPS SPM units."
  • The CPS adapter triggers only on CPS-derived frames, so PUF/Forbes frames (meaningful per-return tax_unit_id, no CPS fields) are untouched. SPM/family/marital group IDs are preserved separately (see "Restore eCPS export and entity ID parity", #112).
  • The coarse relationship_to_head-only heuristic stays opt-in (config.microunit_construct_from_normalized). Legacy role-flag reconstruction is a fail-safe only: it is never silently substituted when the microunit input is sufficient.
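
As a rough illustration of the precedence described in the bullets above, the selection could be sketched as follows. The function name, its return labels, and the idea of deciding from column names alone are assumptions made for illustration; only the field names and the opt-in flag come from this PR.

```python
from typing import Iterable

# Field names quoted in the PR description.
HIGH_FIDELITY_FIELDS = ("person_number", "spouse_person_number", "family_relationship")


def choose_tax_unit_path(columns: Iterable[str], construct_from_normalized: bool = False) -> str:
    """Hypothetical helper: report which construction path a frame would take."""
    present = set(columns)
    # 1. Real CPS-derived fields present: high-fidelity microunit path, on by default.
    if all(field in present for field in HIGH_FIDELITY_FIELDS):
        return "microunit_high_fidelity"
    # 2. Only relationship_to_head available: heuristic path, and only when opted in.
    if construct_from_normalized and "relationship_to_head" in present:
        return "microunit_heuristic"
    # 3. Everything else (e.g. PUF/Forbes frames) keeps existing IDs or the legacy path.
    return "existing_ids_or_legacy"
```

Under these assumptions, a PUF-style frame carrying only tax_unit_id falls through to the third branch, which matches the "untouched" behavior described above.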

filing_status is PE's job (delegated)

microplex does not export filing_status — it's a PE formula variable, computed from the partition + marital units. microunit's filing_status_input is internal bookkeeping only, so the entity tests no longer assert it.

Review cycle (independent review, then fixes)

A two-round independent review ran on this change. Round 1 raised three findings, all of which were addressed (commit "Fix cycle review findings"):

  1. Documented that microunit intentionally replaces the CPS tax_unit_id even with policyengine_prefer_existing_tax_unit_ids=True (that path is a fallback for households microunit doesn't build), and added a lock-in test.
  2. Real bug fixed: the high-fidelity adapter assumed only the 1-based CPS A_FAMREL coding, silently mis-coding children as spouses on 0-based family_relationship frames (which the pipeline supports elsewhere). Now normalizes the coding scheme per household.
  3. Added regression tests for both boundaries.
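
A minimal sketch of the per-household normalization from finding 2, written over plain dict rows rather than the pipeline's frames. The detection rule used here (a household whose smallest family_relationship code is 0 is treated as 0-based) is an assumption for illustration and may differ from the signal the real adapter uses; the function name is hypothetical.

```python
def normalize_family_relationship(rows):
    """Return copies of rows with family_relationship shifted to 1-based coding.

    Assumption: a household is 0-based when its minimum code is 0.
    """
    # First pass: find the smallest code in each household.
    min_code = {}
    for row in rows:
        hh = row["household_id"]
        code = row["family_relationship"]
        min_code[hh] = min(code, min_code.get(hh, code))

    # Second pass: shift 0-based households up by one, leave the rest alone.
    normalized = []
    for row in rows:
        shift = 1 if min_code[row["household_id"]] == 0 else 0
        normalized.append({**row, "family_relationship": row["family_relationship"] + shift})
    return normalized
```

Shifting a whole household at once keeps the relative codes intact, so the downstream A_EXPRRP and parent-pointer mapping sees the same values for a 0-based frame as for its 1-based equivalent.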

Round 2 verdict: no actionable findings.

Tests / behavior

  • test_us.py 160 passed; delegation suite 13 passed; ruff clean.
  • tax_units/hh ≈ 1.42 vs authoritative 1.38 (legacy under-split at 1.16).
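
For reference, a tax_units/hh figure of this kind can be computed as distinct tax units divided by distinct households. The sketch below is unweighted and pairs tax_unit_id with household_id in case IDs are not globally unique; both choices are assumptions, and the numbers reported above may have been computed differently.

```python
def tax_units_per_household(rows):
    """Unweighted ratio of distinct tax units to distinct households (illustrative)."""
    households = {row["household_id"] for row in rows}
    if not households:
        raise ValueError("no households in input")
    # Pair the IDs so a tax_unit_id reused across households is counted once per household.
    tax_units = {(row["household_id"], row["tax_unit_id"]) for row in rows}
    return len(tax_units) / len(households)
```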

⚠️ Entity-convergence (#113)

microunit is eCPS's tax-unit engine, so any change in loss from this PR reflects entity convergence toward eCPS, not an independent MP improvement.

Tracked follow-up

Thread A_HSCOL/enrollment into the adapter so microunit's qualifying-child-to-24 student extension fires (currently under-claims 19–23-yo student own-child dependents vs eCPS) → #122.

🤖 Generated with Claude Code

MaxGhenis and others added 5 commits May 31, 2026 16:36
…ized frame

Synthesize microunit's CPS input columns (PH_SEQ/A_LINENO/A_AGE/A_MARITL/
A_SPOUSE/PEPAR1/PEPAR2/A_EXPRRP) from microplex's normalized materialization
columns (household_id/age/relationship_to_head), so the microunit tax-unit
delegation can fire on real synthesized data. Gated OFF by default via
allow_normalized_adapter / config.microunit_construct_from_normalized.

HEURISTIC AND UNVALIDATED: the relationship->A_EXPRRP and married->A_MARITL maps
are approximate and PEPAR1/PEPAR2 assume a child's parents are the household head
and spouse. Fidelity must be validated against the legacy reconstruction before
trust (see #115). Default behavior is unchanged with the flag off.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… CPS data)

Running the adapter against real CPS ASEC surfaced two issues:
- microunit requires one reference person per household; guarantee the line-1
  member is the single head and demote spurious extra heads (multi-family).
- microunit still raises on households it cannot resolve, so wrap
  construct_tax_units in a fail-safe: log and return None (caller falls back to
  the legacy reconstruction) instead of crashing materialization.

10 tests passing (adds fail-safe + head-promotion coverage).
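
The fail-safe described above could look roughly like this. It is a sketch only: the constructor and the legacy reconstruction are passed in as callables so that no microunit signature is assumed, and both function names are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def construct_tax_units_safely(construct, frame):
    """Call construct(frame); on any failure, log the error and return None."""
    try:
        return construct(frame)
    except Exception:
        # Log with traceback instead of crashing materialization.
        logger.exception("microunit could not construct tax units; using legacy fallback")
        return None


def build_tax_units(frame, construct, legacy):
    """Caller side: fall back to the legacy reconstruction when the result is None."""
    result = construct_tax_units_safely(construct, frame)
    return legacy(frame) if result is None else result
```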

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ds (#115)

The heuristic adapter (from collapsed relationship_to_head) crashed microunit on
real heterogeneous households. microplex actually carries the real CPS-derived
fields at materialization, so build microunit's contract from those instead:
- A_LINENO from person_number (real 1-based within-household line number)
- A_SPOUSE from spouse_person_number (real spouse line pointer)
- A_EXPRRP from family_relationship; A_MARITL from spouse presence / surviving
- person_number==1 always anchors a valid reference person (no more crashes)
PEPAR1/PEPAR2 stay heuristic (child's parents = household head + spouse).

The normalized adapter now dispatches to this path when the real fields are
present, else the relationship_to_head heuristic. Validated on 8000 real CPS
households: microunit constructs without crashing, tax_units/hh=1.42 vs
authoritative 1.38 (vs legacy reconstruction's ~1.27 under-split). 12 tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
microunit is microplex's required tax-unit engine, not an optional prototype.
Default the high-fidelity path ON when the real CPS-derived fields
(person_number/spouse_person_number/family_relationship) are present -- which the
production candidate carries -- while the coarse relationship_to_head-only
heuristic stays opt-in via config so minimal frames don't get the lossy path.

Tests: the four build_policyengine_entity_tables role-flag tests now exercise the
microunit default path. filing_status is PE-computed (delegated; microplex does
not export it), so its non-authoritative internal value is no longer asserted.
The young-adult-child case asserts microunit's genuine qualifying-child age rule
(a 19+ non-student own-child gets its own unit, not folded). Adds a legacy-
fallback test (no high-fidelity fields). Full test_us.py: 159 passed; delegation
suite 12 passed; ruff clean.

Follow-up: thread A_HSCOL/enrollment into the adapter so microunit's
qualifying-child-to-24 student extension fires (currently under-claims 19-23yo
student own-child dependents vs eCPS).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s/tests

An independent review of the microunit delegation+activation change surfaced
three findings; fixes:

1. (docs / intended-behavior) microunit's default-on path replaces the
   CPS-provided tax_unit_id (Census TAX_ID) even when
   policyengine_prefer_existing_tax_unit_ids is True. This is the intended
   "replace the CPS tax units, keep the SPM units" behavior -- the existing-ID
   path is a fallback for households microunit does not construct -- but the
   method docstring wrongly claimed the authoritative-ID path "is never routed
   here". Rewrote the docstring to document the override + fallback ordering and
   the separate SPM/family/marital preservation.

2. (real bug) The high-fidelity adapter assumed only the 1-based CPS A_FAMREL
   coding, so a 0-based family_relationship frame -- which the pipeline supports
   elsewhere (see _normalize_relationship_to_head and data_sources.cps) --
   silently mis-coded children as spouses and dropped their parent pointers.
   Normalize the coding scheme per household (shift 0-based households up by one)
   before the A_EXPRRP / parent-pointer mapping.

3. (missing coverage) Added two regression tests for the previously-untested
   boundaries: bad CPS tax_unit_id [100,100,200] + high-fidelity fields ->
   microunit folds to one unit with spm_unit_id preserved; and a 0-based
   family_relationship frame -> same A_EXPRRP/parent pointers and partition as
   the 1-based frame.

test_us.py 160 passed; delegation suite 13 passed; ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis merged commit 0ac4d16 into main May 31, 2026
4 checks passed
@MaxGhenis MaxGhenis deleted the prototype/microunit-activation branch May 31, 2026 20:39