Publish Microplex target diagnostics artifact for aggregate analysis #133

## Problem

The calibration diagnostics dashboard can consume the public `microplex-us` artifacts under `artifacts/`, but those files only expose summary-level information:

- `pe_us_data_rebuild_parity.json`
- `live_pe_us_data_rebuild_checkpoint_modelpass_regression_summary_20260410.json`
- `live_pe_us_data_rebuild_checkpoint_national_irs_other_drilldown_20260410.json`

The current public JSONs report headline counts and losses, e.g. `n_targets_kept`, `n_national_targets`, `n_state_targets`, win rates, and broad native-loss metrics. They do not include the target-level rows needed to understand Microplex on its own: which aggregate values it produces, how those compare to the target oracle, and which target families drive error.

The us-data baseline is useful comparator context, but it should not be the only framing. Analysts need a Microplex-first target performance table.

## Requested artifact

Please publish a run-level artifact that contains Microplex aggregate performance against the active PolicyEngine target oracle. Either of these would work:

1. A full target diagnostics JSON/CSV, committed or otherwise publicly downloadable.
2. The Microplex output H5, publicly downloadable, so downstream services can compute the same diagnostics.

## Suggested target diagnostics schema

A row-oriented JSON or CSV would be easiest for dashboards/API consumers. Suggested fields:

- `run_id` / `artifact_id`
- `candidate_dataset`, e.g. `microplex`
- `target_id`
- `variable`
- `entity`
- `geography` / `geo_level` / `state` where applicable
- `period`
- `target_value`
- `microplex_aggregate`
- `microplex_absolute_error`
- `microplex_relative_error`
- `loss_contribution` or equivalent weighted term
- `family` / target group
- `in_loss`
- `supported_by_microplex`

Optional comparator fields are also useful, but should be treated as secondary context:

- `baseline_dataset`, e.g. `enhanced_cps_2024.h5`
- `us_data_aggregate`
- `us_data_absolute_error`
- `us_data_relative_error`
- `delta_absolute_error`
- `delta_relative_error`
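To make the schema concrete, here is a minimal Python sketch of one row and its derived error fields. The formulas are assumptions, not the definitions Microplex uses today: absolute error is taken as `|aggregate - target|`, relative error as absolute error divided by `|target_value|` (left empty when the target is zero), and each delta as the Microplex error minus the us-data error. The `TargetRow` class, the `to_record` helper, and all sample values are hypothetical and only illustrate the shape.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TargetRow:
    """One target-level row; field names follow the suggested schema."""
    run_id: str
    candidate_dataset: str
    target_id: str
    variable: str
    entity: str
    geo_level: str
    period: int
    target_value: float
    microplex_aggregate: float
    family: str
    in_loss: bool
    supported_by_microplex: bool
    us_data_aggregate: Optional[float] = None  # optional comparator


def abs_error(aggregate: float, target: float) -> float:
    # Assumed definition: unsigned distance from the target value.
    return abs(aggregate - target)


def rel_error(aggregate: float, target: float) -> Optional[float]:
    # Assumed definition: absolute error scaled by |target|.
    # Undefined for a zero target, so return None instead of dividing.
    if target == 0:
        return None
    return abs(aggregate - target) / abs(target)


def to_record(row: TargetRow) -> dict:
    """Flatten a row and add the derived error and delta fields."""
    rec = asdict(row)
    rec["microplex_absolute_error"] = abs_error(row.microplex_aggregate, row.target_value)
    rec["microplex_relative_error"] = rel_error(row.microplex_aggregate, row.target_value)
    if row.us_data_aggregate is not None:
        rec["us_data_absolute_error"] = abs_error(row.us_data_aggregate, row.target_value)
        rec["us_data_relative_error"] = rel_error(row.us_data_aggregate, row.target_value)
        # Assumed sign convention: positive delta means Microplex is further off.
        rec["delta_absolute_error"] = (
            rec["microplex_absolute_error"] - rec["us_data_absolute_error"]
        )
        if rec["microplex_relative_error"] is not None:
            rec["delta_relative_error"] = (
                rec["microplex_relative_error"] - rec["us_data_relative_error"]
            )
    return rec


# Hypothetical row with made-up numbers, purely to show the record shape.
example = to_record(
    TargetRow(
        run_id="example-run",
        candidate_dataset="microplex",
        target_id="example-target",
        variable="example_variable",
        entity="tax_unit",
        geo_level="national",
        period=2024,
        target_value=100.0,
        microplex_aggregate=90.0,
        family="example_family",
        in_loss=True,
        supported_by_microplex=True,
        us_data_aggregate=105.0,
    )
)
print(json.dumps(example, indent=2))
```

Keeping the comparator columns optional means a Microplex-only export is still a valid file, and consumers can detect comparator support by checking for the `us_data_*` keys.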

## Why this matters

The calibration diagnostics dashboard now has a Microplex target-performance page and API endpoint, but it can only show rollups from the current public artifacts. Analysts want to answer questions like:

- How does Microplex perform against the target oracle by aggregate?
- Which aggregates drive Microplex native loss?
- Are failures concentrated in IRS SOI, state AGI distributions, ACA, SNAP, CTC, etc.?
- For a state/federal reform analysis, which upstream calibration nodes are trustworthy or suspect?
- Where is us-data useful comparator context, and where is Microplex simply unsupported or wrong against the target value?

Without target-level rows or an H5, the dashboard has to label the view as aggregate-only (run-level rollups) and cannot provide a full per-target/per-aggregate diagnostic view.
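As an illustration of what target-level rows would unlock, the sketch below ranks target families by their share of total loss. It assumes rows shaped like the suggested schema, with `family`, `in_loss`, and a non-negative `loss_contribution`; the `loss_by_family` helper, the family labels, and the numbers are hypothetical.

```python
from collections import defaultdict


def loss_by_family(rows):
    """Sum loss_contribution per family over rows that are in the loss.

    Returns (family, total, share_of_total) tuples, largest total first.
    """
    totals = defaultdict(float)
    for row in rows:
        if row.get("in_loss"):
            totals[row["family"]] += row["loss_contribution"]
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [
        (family, total, total / grand_total if grand_total else 0.0)
        for family, total in ranked
    ]


# Made-up rows; real values would come from the published artifact.
rows = [
    {"family": "irs_soi", "in_loss": True, "loss_contribution": 0.5},
    {"family": "snap", "in_loss": True, "loss_contribution": 0.25},
    {"family": "irs_soi", "in_loss": True, "loss_contribution": 0.25},
    {"family": "aca", "in_loss": False, "loss_contribution": 4.0},  # excluded
]

for family, total, share in loss_by_family(rows):
    print(f"{family}: total={total:.2f} share={share:.0%}")
```

The same grouping could be keyed on `state` or `geo_level` instead of `family` to see whether error is concentrated geographically.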

## Downstream consumer

This is needed by `PolicyEngine/calibration-diagnostics` PR #9 and follow-up API work. Once this artifact exists, the dashboard can add an endpoint like `/microplex/targets` or `/microplex/diff` to expose the full Microplex aggregate performance table, with optional us-data comparator columns.
