Add Wooldridge ETWFE LinkedIn carousel by igerber · Pull Request #272 · igerber/diff-diff

igerber · 2026-04-05T13:56:25Z

Summary

Add 9-slide LinkedIn carousel PDF announcing the Wooldridge ETWFE estimator (v2.9)
Light theme with emerald green accent, targeting applied researchers
Narrative arc: nonlinear DiD gap → Wooldridge (2023, 2025) ETWFE → three likelihoods (OLS/logit/Poisson) → ASF + delta-method SEs → event study → code example → milestone + community credit → CTA
All claims validated against source code, methodology docs, and external academic literature
Softened initial claims after validation: "When Linear DiD Isn't Enough" (not "breaks"), "Standard parallel trends may not hold on the level scale" (not "biased ATT")

Methodology references (required if estimator / math changes)

Method name(s): N/A - no methodology code changes (carousel/marketing only)
Paper / source link(s): N/A
Any intentional deviations from the source (and why): N/A

Validation

Tests added/updated: No test changes (PDF generator script only)
Backtest / simulation / notebook evidence (if applicable): N/A
Manual verification: PDF renders 9 pages, all text readable, equations render cleanly, no footer overlap

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

github-actions · 2026-04-05T14:03:09Z

Overall Assessment

⚠️ Needs changes — one unmitigated P1 methodology documentation mismatch.

No estimator code or inference code changed in this PR. The review risk is in newly added docs/marketing copy drifting from the Methodology Registry and current API.

Executive Summary

The estimator implementation itself looks unchanged; I did not find a new numerical regression in ETWFE code paths.
P1: new Wooldridge/ETWFE copy repeatedly says OLS uses ASF-based ATT, but the registry and source limit ASF to the nonlinear logit/Poisson paths; OLS uses direct β_{g,t} extraction.
P2: new survey-support copy says all estimators accept survey_design / survey weights, but WooldridgeDiD.fit() has no survey_design parameter and the same PR documents that gap.
P2: the new SurveyDesign.subpopulation() example in llms-full shows the wrong return contract.
I did not find a security or performance blocker in the new carousel generator.

Methodology

P1 WooldridgeDiD / ETWFE is described incorrectly in the new public copy.
Impact: the changed docs/marketing now describe ETWFE as “ASF-based ATT” for OLS/logit/Poisson, which contradicts both the Methodology Registry and the implementation. The registry limits ASF to the nonlinear extensions, while the OLS path takes the estimated treatment-cell coefficients directly. Affected changed files: ROADMAP.md, docs/choosing_estimator.rst, docs/llms.txt, docs/llms-full.txt, carousel/generate_wooldridge_carousel.py. Source of truth: docs/methodology/REGISTRY.md, docs/methodology/REGISTRY.md, diff_diff/wooldridge.py, diff_diff/wooldridge.py.
Concrete fix: rewrite the changed copy to distinguish paths explicitly, e.g. “OLS estimates ATT(g,t) directly from the saturated regression; logit and Poisson use ASF-based ATT with delta-method SEs.”

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings.

Security

No findings. A basic strings scan of the committed PDF did not show obvious secrets or local-path leakage. Full PDF metadata inspection was limited because pdfinfo is not available in this sandbox.

Documentation/Tests

P2 The new survey-support copy overstates current support.
Impact: multiple changed files now say all estimators accept survey weights or survey_design, but WooldridgeDiD.fit() does not accept that parameter today. This will cause users/agents to generate invalid code and overstate supported weighting/inference paths. Problematic changed copy: ROADMAP.md, docs/llms.txt, docs/llms-full.txt. Contradictory source in the same PR: ROADMAP.md, docs/survey-roadmap.md, diff_diff/wooldridge.py.
Concrete fix: change the new copy to say WooldridgeDiD is the current exception, or state the supported count explicitly instead of “all estimators.”
P2 The new subpopulation() example is wrong.
Impact: docs/llms-full.txt shows sd_female = sd.subpopulation(...), but SurveyDesign.subpopulation() returns a (SurveyDesign, DataFrame) pair, and the returned DataFrame must be used with the returned design. This will mislead users and LLM consumers of this file. Source of truth: diff_diff/survey.py, docs/methodology/REGISTRY.md.
Concrete fix: replace the example with sd_sub, data_sub = sd.subpopulation(...) and show fit(..., survey_design=sd_sub) on data_sub.

Path to Approval

Fix the Wooldridge/ETWFE methodology wording in the changed docs/carousel so ASF is attached only to the nonlinear logit/Poisson paths, not OLS.
Fix the new survey-support copy in the changed files to explicitly exclude WooldridgeDiD until survey support is implemented.
Correct the subpopulation() example in docs/llms-full.txt to unpack the returned (SurveyDesign, DataFrame) pair and use the returned DataFrame in fit().

9-slide carousel (light theme, emerald accent) announcing the WooldridgeDiD estimator in v2.9. Covers nonlinear DiD methodology, three likelihoods (OLS/logit/Poisson), ASF-based ATT with delta-method SEs, event study visualization, and community contribution credit. All claims validated against source code and academic literature. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

OLS extracts ATT(g,t) directly from saturated regression coefficients; ASF-based ATT with delta-method SEs applies only to the nonlinear logit/Poisson paths. Updated slide 1 teaser and slide 5 to clearly distinguish the two paths. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

igerber · 2026-04-05T14:13:58Z

/ai-review

github-actions · 2026-04-05T14:20:00Z

🔁 AI review rerun (requested by @igerber)

Head SHA: bc1beea3d5b7551af49723165d272deb07539fa7

Overall Assessment

⚠️ Needs changes — the prior ASF/OLS blocker appears resolved, but one unmitigated P1 methodology mismatch remains in the new carousel copy.

Executive Summary

The previous P1 about ASF wording looks fixed: the current carousel now separates OLS direct-coefficient ATT from nonlinear ASF ATT.
The earlier survey-doc findings from the last review are no longer in the current diff; this re-review is limited to the new carousel script/PDF, and no estimator or inference code changed here.
P1 [Newly identified] Slide 3 presents an OLS-only unit-FE ETWFE equation as the general method while adjacent copy frames the estimator as one framework for OLS, logit, and Poisson.
P2 The slide 7 code example uses results.aggregate('event').summary(), but summary() defaults to the simple overall ATT table, not the event-study table.
P2 The generator script depends on PIL and fpdf, but those packages are not declared in pyproject.toml, so the committed PDF is not reproducible from the repo’s declared environment.
No security blocker found. The committed PDF is structurally 9 pages, and a basic strings scan did not reveal obvious secrets or local-path leakage.

Methodology

P1 [Newly identified] carousel/generate_wooldridge_carousel.py:L393-L413 shows Y_it = α_i + γ_t + ... as “Wooldridge ETWFE” while the same slide positions the estimator as covering OLS, logit, and Poisson.
Impact: that equation matches the OLS ETWFE path in the registry (docs/methodology/REGISTRY.md:L1094-L1103), but the documented nonlinear implementation uses additive cohort and time dummies rather than unit fixed effects (docs/methodology/REGISTRY.md:L1152-L1154, diff_diff/wooldridge.py:L735-L739, diff_diff/wooldridge.py:L923-L929). That distinction is methodological, not just cosmetic notation.
Concrete fix: label the displayed equation as the OLS specification only, or replace it with wording/notation that matches the nonlinear paths as implemented: additive cohort/time effects plus ASF ATT for logit/Poisson.

Code Quality

No findings.

Performance

No findings.

Maintainability

P2 carousel/generate_wooldridge_carousel.py:L8-L14 imports matplotlib, PIL, and fpdf, but pyproject.toml:L46-L75 only declares matplotlib as an optional dependency; Pillow and the PDF package are undeclared.
Impact: contributors cannot reliably regenerate carousel/diff-diff-wooldridge-carousel.pdf from the repo’s declared environment.
Concrete fix: add the exact image/PDF packages to a suitable extra (dev, docs, or a new carousel extra), or add a short carousel/README with the required install command and regeneration step.

Tech Debt

No findings.

Security

No findings. A basic scan of carousel/diff-diff-wooldridge-carousel.pdf showed a 9-page page tree and no obvious embedded secrets. Full metadata inspection was limited because pdfinfo/exiftool are not available in this environment.

Documentation/Tests

P2 carousel/generate_wooldridge_carousel.py:L680-L682 uses results.aggregate('event').summary(), but diff_diff/wooldridge_results.py:L165-L171 and diff_diff/wooldridge_results.py:L203-L214 default summary() to aggregation="simple".
Impact: a user who copies the slide code will see the simple overall ATT table, not the event-study output the slide is advertising.
Concrete fix: change the example to print(results.aggregate('event').summary('event')) or split it into results.aggregate('event') followed by print(results.summary('event')).

Path to Approval

Update slide 3 in carousel/generate_wooldridge_carousel.py:L393-L413 so the displayed model matches the documented estimator variants: either explicitly label it as the OLS ETWFE equation, or replace it with copy that correctly states logit/Poisson use additive cohort/time effects plus ASF ATT. Then regenerate carousel/diff-diff-wooldridge-carousel.pdf.

- Slide 3: label equation as OLS specification, add emerald note that logit/Poisson use cohort+time dummies with ASF for ATT - Slide 7: fix .summary() to .summary('event') so copied code produces the event-study table, not the default simple ATT - Tighten slide 3 bullet spacing to prevent footer overlap Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

igerber · 2026-04-05T14:33:16Z

/ai-review

github-actions · 2026-04-05T14:39:32Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 8e58af655975b26b3562ed3a4d08d4571d94153d

Overall Assessment

✅ Looks good — this re-review clears the prior P1 methodology issue and the prior slide-7 API-example issue. The only unmitigated item I found is P2 reproducibility for the generator environment.

Executive Summary

The prior P1 is resolved: slide 3 now explicitly labels the displayed equation as the OLS ETWFE specification and separately states the nonlinear logit/Poisson path as cohort+time dummies plus ASF ATT (carousel/generate_wooldridge_carousel.py:L395-L414; docs/methodology/REGISTRY.md:L1094-L1103, docs/methodology/REGISTRY.md:L1152-L1154).
The prior slide-7 example issue is also resolved: the carousel now uses .summary('event'), which matches the actual API behavior (carousel/generate_wooldridge_carousel.py:L693-L697; diff_diff/wooldridge_results.py:L165-L258).
No estimator, weighting, variance, or identification code changed in this PR; methodology risk is confined to the claims in the new carousel assets.
One prior P2 remains unresolved: the new generator imports PIL and fpdf, but those packages are still not declared in pyproject.toml, so the PDF is not reproducible from the repo’s declared environment (carousel/generate_wooldridge_carousel.py:L8-L14; pyproject.toml:L52-L75).
P3 informational: slide 5’s “Matches Stata jwdid_estat output” wording is slightly broader than the registry-supported claim, because the registry documents a small nonlinear QMLE SE deviation from Stata (carousel/generate_wooldridge_carousel.py:L615-L619; docs/methodology/REGISTRY.md:L1115-L1121, docs/methodology/REGISTRY.md:L1123-L1130).

Methodology

No unmitigated P0/P1 findings. The previous slide-3 methodology mismatch is fixed in both the source and the committed PDF.

Severity P3 — carousel/generate_wooldridge_carousel.py:L615-L619
Impact: “Matches Stata jwdid_estat output” reads like exact parity, but docs/methodology/REGISTRY.md:L1115-L1121 documents a small nonlinear QMLE SE difference from Stata, while docs/methodology/REGISTRY.md:L1123-L1130 supports parity on aggregation structure/weighting. This is informational only because the deviation is explicitly documented in the registry.
Concrete fix: tighten the wording to “Matches Stata jwdid_estat aggregation types/weighting” or add a small footnote pointing readers to the registry note on SE differences.

Code Quality

No findings.

Performance

No findings.

Maintainability

Severity P2 — unresolved from the prior review — carousel/generate_wooldridge_carousel.py:L8-L14; pyproject.toml:L52-L75
Impact: the new generator depends on matplotlib, PIL, and fpdf, but the repo only declares matplotlib in optional extras and does not declare the image/PDF packages at all. In the current environment, those imports are absent, so contributors cannot regenerate carousel/diff-diff-wooldridge-carousel.pdf from the project’s declared setup.
Concrete fix: add the missing packages backing PIL and fpdf to an explicit extra such as carousel or dev, and document the install command; if the team wants to defer that, track it in TODO.md so the limitation is explicit.

Tech Debt

No findings.

Security

No findings. I decompressed the PDF streams and verified the binary has a 9-page page tree, contains the updated slide text, and does not expose obvious local paths or secret-like strings.

Documentation/Tests

Severity P3 — carousel/generate_wooldridge_carousel.py:L805-L826; tests/test_imputation.py:L1248-L1255
Impact: this PR adds a new scripted artifact plus committed PDF, but no smoke test covers importability/regeneration or even a basic page-count check. The repo already has precedent for validating marketing claims with tests, so this remains manual-only.
Concrete fix: add a lightweight optional-deps smoke test that imports the generator and verifies a 9-page export/page tree when the relevant extra is installed, or at minimum add a test that the script’s dependencies are declared.

igerber and others added 2 commits April 5, 2026 10:10

igerber force-pushed the wooldridge-carousel branch from bee1e7c to bc1beea Compare April 5, 2026 14:13

igerber added the ready-for-ci Triggers CI test workflows label Apr 5, 2026

igerber merged commit 53b0e6f into main Apr 5, 2026

igerber deleted the wooldridge-carousel branch April 5, 2026 14:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Wooldridge ETWFE LinkedIn carousel#272

Add Wooldridge ETWFE LinkedIn carousel#272
igerber merged 3 commits intomainfrom
wooldridge-carousel

igerber commented Apr 5, 2026

Uh oh!

github-actions bot commented Apr 5, 2026

Uh oh!

igerber commented Apr 5, 2026

Uh oh!

github-actions bot commented Apr 5, 2026

Uh oh!

igerber commented Apr 5, 2026

Uh oh!

github-actions bot commented Apr 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented Apr 5, 2026

Summary

Methodology references (required if estimator / math changes)

Validation

Security / privacy

Uh oh!

github-actions bot commented Apr 5, 2026

Uh oh!

igerber commented Apr 5, 2026

Uh oh!

github-actions bot commented Apr 5, 2026

Uh oh!

igerber commented Apr 5, 2026

Uh oh!

github-actions bot commented Apr 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant