
Add legal artifact presets, FOSSA-compatible outputs #199

Merged
lelia merged 37 commits into main from lelia/add-legal-checks
May 29, 2026

Conversation

@lelia
Contributor

@lelia lelia commented May 11, 2026

Summary

Introduces a compliance-oriented --legal workflow to socketcli and an opt-in --legal-format fossa mode for producing FOSSA-compatible artifact shapes.

Changes

The --legal workflow enables license generation and default artifact output for:

  • socket-report.json
  • socket-summary.txt
  • socket-report-link.txt
  • socket-sbom.json
  • socket-license.json

The new --legal-format fossa mode adapts those outputs to match the structural shapes the real FOSSA CLI emits — captured from a UiPath Azure DevOps FOSSA pipeline as reference (CE-199):

  • fossa-analyze.json — the composed wrapper FOSSA pipelines actually produce: {project, vulnerability[], licensing[], quality[]}. The project sub-object is the 6-key fossa analyze --json shape with id formatted as <projectLocator>$<revision>. vulnerability[] items follow the /api/v2/issues shape (28 fields including source, depths, statuses, projects[], remediation, metrics, epss, etc.).
  • fossa-sbom.json — the fossa report --json attribution shape: 5 top-level keys (copyrightsByLicense, deepDependencies, directDependencies, licenses, project). Per-Dependency entries are the 14-key FOSSA attribution shape, with attribution text sourced from Package.licenseAttrib[].attribText, direct/deep partitioning by Package.direct, and dependencyPaths as <ancestor> > <package> chains computed from topLevelAncestors.
  • FOSSA-like default filenames: fossa-analyze.json, fossa-test.txt, fossa-link.txt, fossa-sbom.json. The Socket-side --sbom-file slot is suppressed in fossa mode (the FOSSA "SBOM" artifact is the attribution payload).
  • Consistent JSON formatting: both JSON outputs written with indent=2.
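As a rough illustration of the wrapper described above, the following sketch composes the four top-level keys and joins the project id with the `$` separator. The function names `format_project_id` and `build_analyze_wrapper` are hypothetical and are not taken from `fossa_compat.py`; the contents of the project sub-object and of each list are left to the caller because the PR does not enumerate them here.

```python
import json
from typing import Any


def format_project_id(project_locator: str, revision: str) -> str:
    """Join locator and revision as <projectLocator>$<revision>."""
    return f"{project_locator}${revision}"


def build_analyze_wrapper(
    project: dict[str, Any],
    vulnerability: list[dict[str, Any]],
    licensing: list[dict[str, Any]],
    quality: list[dict[str, Any]],
) -> dict[str, Any]:
    """Compose the four-key wrapper; item shapes are the caller's concern."""
    return {
        "project": project,
        "vulnerability": vulnerability,
        "licensing": licensing,
        "quality": quality,
    }


def dump_artifact(payload: dict[str, Any]) -> str:
    """Serialize with indent=2, matching the formatting noted in the PR."""
    return json.dumps(payload, indent=2)
```

For example, `format_project_id("1234/example-validation-project", "abc123")` yields `1234/example-validation-project$abc123`; the revision string here is a placeholder.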

Adds

  • Explicit file output support for JSON reports, summary text, and report links
  • Hardened legal artifact generation for sparse scan paths so artifact creation completes safely even when SBOM/package data is incomplete

Documented gaps

Fields with no Socket data source are emitted as consistent documented defaults (see module docstring at top of socketsecurity/fossa_compat.py). Examples: vulnerability[].epss, cvssVector, exploitability, cveStatus, published, customRiskScore, project timestamps, semver-distance labels; per-dependency description, downloadUrl, projectUrl, hash, isGolang, notes, otherLicenses; top-level copyrightsByLicense and licenses body-text map. partialFix and completeFix collapse to the same value since Socket has only one fix-version concept.
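A minimal sketch of how such gap defaults might be applied to one vulnerability entry, assuming a flat dict for simplicity. The helper name `apply_gap_defaults` and the flat placement of `partialFix` and `completeFix` are assumptions for illustration; the real layout is whatever the module docstring documents.

```python
from typing import Any, Optional


def apply_gap_defaults(
    vuln: dict[str, Any], fix_version: Optional[str]
) -> dict[str, Any]:
    """Return a copy of `vuln` with documented-gap fields filled in.

    Hypothetical helper: the key placement is illustrative only.
    """
    out = dict(vuln)
    # No Socket data source for this field, so emit a stable null default.
    out.setdefault("customRiskScore", None)
    # Socket has a single fix-version concept, so both fields share it.
    out["partialFix"] = fix_version
    out["completeFix"] = fix_version
    return out
```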

Testing

  • Unit test coverage for:
    • --legal and --legal-format defaults
    • FOSSA-compatible analyze report shape (top-level keysets, vulnerability/licensing/quality item shapes)
    • FOSSA attribution shape (5 top-level keys, 14-field per-Dependency entries, direct/deep partitioning, dependency-path computation, license attribution sourcing with fallback chain)
    • Sparse-data scenarios
  • Structural parity tests (tests/unit/test_fossa_parity.py) that load real FOSSA artifacts captured from the UiPath pipeline (committed to tests/fixtures/fossa/) and assert our builder output's keysets match at every level (top-level, project, dependency). These guard against future drift from FOSSA's actual shape.
  • 232 unit tests passing.
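The keyset-parity idea can be sketched as below. This is not the code in `tests/unit/test_fossa_parity.py`; `assert_same_keys` is a hypothetical helper, and in the real tests the expected dict would come from a captured fixture rather than an inline literal.

```python
def assert_same_keys(actual: dict, expected: dict, where: str) -> None:
    """Fail if the two dicts do not have exactly the same key set.

    Values are ignored on purpose: only structural shape is compared.
    """
    missing = set(expected) - set(actual)
    extra = set(actual) - set(expected)
    assert not missing and not extra, (
        f"{where}: missing={sorted(missing)} extra={sorted(extra)}"
    )


# Illustrative use with the five top-level attribution keys from the PR.
expected_top_level = {
    "copyrightsByLicense": {},
    "deepDependencies": [],
    "directDependencies": [],
    "licenses": {},
    "project": {},
}
built = dict.fromkeys(expected_top_level)
assert_same_keys(built, expected_top_level, "top-level")
```

Because only keys are compared, swapping fixture values (for example, sanitizing identifiers) does not affect the result.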

Test plan

  • Unit tests pass (uv run pytest tests/)
  • Structural parity tests assert keyset equality against real FOSSA fixtures
  • Manual end-to-end against a real Socket scan with --legal-format fossa and confirm outputs satisfy the customer's validation pipeline gate (file exists, non-empty, parseable JSON for the two JSON files)
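The gate in the last item (file exists, non-empty, parseable JSON for the JSON files) could be checked locally with something like the following. `artifact_passes_gate` is a hypothetical name; the customer's actual pipeline gate is not shown in this PR.

```python
import json
from pathlib import Path


def artifact_passes_gate(path: Path, expect_json: bool) -> bool:
    """Return True if the artifact exists, is non-empty, and (when
    expect_json is set) parses as JSON."""
    if not path.is_file() or path.stat().st_size == 0:
        return False
    if expect_json:
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except ValueError:
            # json.JSONDecodeError and UnicodeDecodeError both
            # subclass ValueError.
            return False
    return True
```

In fossa mode this would be called with `expect_json=True` for `fossa-analyze.json` and `fossa-sbom.json`, and `expect_json=False` for `fossa-test.txt` and `fossa-link.txt`.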

lelia added 3 commits May 11, 2026 12:04
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
@lelia lelia requested a review from Douglas (dacoburn) May 11, 2026 16:13
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
@github-actions

github-actions Bot commented May 11, 2026

🚀 Preview package published!

Install with:

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple socketsecurity==2.2.91.dev1

Docker image: socketdev/cli:pr-199

lelia added 8 commits May 11, 2026 12:15
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
@lelia lelia changed the title from "[DRAFT] Simplify compliance workflow with legal preset and artifacts" to "Add legal artifact presets, FOSSA-compatible outputs" May 18, 2026
@lelia lelia marked this pull request as ready for review May 18, 2026 18:55
@lelia lelia requested a review from a team as a code owner May 18, 2026 18:55
lelia added 5 commits May 18, 2026 19:15
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
lelia and others added 6 commits May 21, 2026 15:49
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
Real FOSSA artifacts use $ as the revision separator in project.id, not
-. Update _build_project_metadata and add two tests that pin the correct
separator and fallback behaviour.
Adds customRiskScore: None to vulnerability entries (FOSSA samples
include this field, sometimes null). Documents all gap fields and their
defaults in the module docstring. Locks the new key in EXPECTED_VULNERABILITY_KEYS.
Replaces the 2-key {project, dependencies} shape with the real FOSSA
attribution shape: copyrightsByLicense, deepDependencies,
directDependencies, licenses, project.

The SBOM project field is now the 2-key {name, revision} subset rather
than the 6-key analyze project shape. _partition_dependencies is a stub
returning ([], []) until Tasks 7-9 fill in per-dependency entries.
Add _build_dependency_entry and _build_dependency_licenses to produce
the 14-key per-dependency dict that matches real FOSSA attribution
output. License entries prefer licenseAttrib (full attribText + spdxExpr),
fall back to declared license string, or emit [] when unlicensed.

Also removes the stale test_fossa_attribution_payload_shape_is_stable
test, which asserted the pre-Task-6 two-key shape and was already
failing.
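The license fallback chain from the commit message above (prefer `licenseAttrib`, then the declared license string, then an empty list) might look roughly like this. The input key `license` and the output keys `name` and `attribution` are assumptions made for illustration; the real `_build_dependency_licenses` may use different names.

```python
from typing import Any


def build_dependency_licenses(package: dict[str, Any]) -> list[dict[str, str]]:
    """Sketch of the three-step license fallback for one package."""
    attribs = package.get("licenseAttrib") or []
    if attribs:
        # Preferred source: full attribution text plus SPDX expression.
        return [
            {
                "name": entry.get("spdxExpr", ""),
                "attribution": entry.get("attribText", ""),
            }
            for entry in attribs
        ]
    declared = package.get("license")  # hypothetical field name
    if declared:
        # Fallback: declared license string with no attribution text.
        return [{"name": declared, "attribution": ""}]
    # Unlicensed: emit an empty list rather than a placeholder entry.
    return []
```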
Replaces the stub that always returned [package.name] with real logic:
direct deps emit just their name; transitive deps emit one
"<ancestor> > <package>" chain per top-level ancestor, falling back to
name-only when ancestors are absent or not in the lookup.
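A sketch of the path rule just described, under stated assumptions: `build_dependency_paths` and the `names_by_id` lookup are hypothetical names, and ancestors missing from the lookup are skipped individually, with the name-only fallback used when none resolve. The actual implementation may treat a partially resolvable ancestor list differently.

```python
from typing import Any


def build_dependency_paths(
    package: dict[str, Any], names_by_id: dict[str, str]
) -> list[str]:
    """Compute dependencyPaths for one package.

    Direct deps emit just their name; transitive deps emit one
    "<ancestor> > <package>" chain per resolvable top-level ancestor.
    """
    name = package["name"]
    if package.get("direct"):
        return [name]
    chains = [
        f"{names_by_id[ancestor_id]} > {name}"
        for ancestor_id in package.get("topLevelAncestors") or []
        if ancestor_id in names_by_id
    ]
    # Fall back to name-only when no ancestor could be resolved.
    return chains or [name]
```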
Pin project.id to dollar separator, replace 2-key SBOM with 5-key
shape, and update per-dependency assertions to the 14-key
_build_dependency_entry contract.
@lelia
Contributor Author

lelia commented May 27, 2026

Eric Hibbs (@flowstate) review notes:

  • resolve the existing merge conflicts (pyproject.toml, uv.lock, socketsecurity/__init__.py)
  • sanitize any customer references/artifacts (e.g. tests/fixtures/fossa/README.md)
  • the --legal-format fossa format only includes new_alerts (plus unchanged_alerts with --strict-blocking enabled) which could under-represent full findings compared to the typical FOSSA pipeline
    • affected areas: _iter_selected_issues(), build_fossa_report_payload()
    • this could create a parity risk for FOSSA users expecting project-wide issue sets
  • README phrasing doesn't match the FOSSA SBOM shape - the code uses directDependencies / deepDependencies instead of a single dependencies key (L159 of README)
    • affected areas: fossa_compat.py and test_fossa_parity.py
    • this could cause confusion for users integrating against the schema

FOSSA's /api/v2/issues endpoint returns a point-in-time snapshot of all
issues at the scan revision, not only diff-new ones. The previous
implementation only included unchanged alerts when --strict-blocking
was set, causing FOSSA-mode output to under-represent project-wide
findings compared to the typical FOSSA pipeline.
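The behavior change in this commit can be sketched as follows. `DiffReport` here is a stand-in dataclass for illustration, not the SDK type, and `iter_selected_issues` is a simplified guess at what `_iter_selected_issues` does in FOSSA mode after the fix.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator


@dataclass
class DiffReport:
    """Hypothetical stand-in exposing only the two alert lists."""
    new_alerts: list[Any] = field(default_factory=list)
    unchanged_alerts: list[Any] = field(default_factory=list)


def iter_selected_issues(
    diff_report: DiffReport, strict_blocking: bool = False
) -> Iterator[Any]:
    """Yield every alert at the scan revision (point-in-time snapshot).

    strict_blocking is accepted for call compatibility but no longer
    gates whether unchanged alerts are included.
    """
    yield from diff_report.new_alerts
    yield from diff_report.unchanged_alerts
```

A regression check in the spirit of the one mentioned later in the thread would assert that both `strict_blocking=True` and `strict_blocking=False` yield the same set of alerts.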
Replace customer org ID and project name with generic placeholders
(1234/example-validation-project) across all four fixtures and the README.
Structural shape, key sets, value types, and per-field cardinality are
unchanged. Parity tests assert keysets only, so the substitution is
transparent to test behavior.
The SBOM artifact now matches FOSSA's `report --json attribution`
shape with five top-level keys, not the previously documented
`project` / `dependencies` two-key payload.
- pyproject.toml + socketsecurity/__init__.py: keep this branch's version
  (2.2.91, bumped for the legal-artifacts release)
- uv.lock: regenerated via `uv lock` after merge (don't hand-merge lockfiles)
@flowstate
Contributor

lelia thanks for the review. Addressed all four points in commits ff9edd1 through 7444463:

  • Merge conflicts — merged main in (7444463). Kept this branch's version (2.2.91) in pyproject.toml and socketsecurity/__init__.py; regenerated uv.lock via uv lock rather than hand-merging.
  • Customer references — sanitized the fixture JSONs and README (fee62de). All four fixtures now use 1234/example-validation-project instead of the captured customer identifiers. Structural shape and key sets are unchanged; the parity tests assert keysets only so this is transparent to them. Confirmed no DevTools-Validation-Pipeline, 6060/DevTools, or saml/6060 markers remain.
  • _iter_selected_issues parity risk — fixed (ff9edd1). The strict_blocking gate is removed; unchanged_alerts always flows into the FOSSA payload, matching /api/v2/issues?...&scope[revision]=...'s point-in-time-snapshot semantics. Added a regression test asserting both strict_blocking=True and =False produce the same CVE set in FOSSA mode. The data was already on diff_report.unchanged_alerts so no SDK changes needed.
  • README phrasing — updated L159 (aa472cb) from the old project / dependencies two-key shape to the actual five-key copyrightsByLicense / deepDependencies / directDependencies / licenses / project shape.

234 unit tests passing on the merged branch.

@lelia lelia merged commit d502ab3 into main May 29, 2026
15 checks passed
Eric Hibbs (flowstate) added a commit that referenced this pull request May 29, 2026
#199 landed on main between the original 2.2.91 bump and this PR opening,
so 2.2.91 ties main and fails check_version. Bump to 2.2.92.
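Why a tie fails can be shown with a small sketch, assuming the check requires the branch version to be strictly greater than the version on main. The real `check_version` logic is not shown in this thread; `parse_version` and `is_version_incremented` are hypothetical names, and only plain `X.Y.Z` versions are handled.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a plain dotted version such as "2.2.91" into integers."""
    return tuple(int(part) for part in version.split("."))


def is_version_incremented(branch_version: str, main_version: str) -> bool:
    """True only when the branch version is strictly ahead of main."""
    return parse_version(branch_version) > parse_version(main_version)
```

Under that assumption, `is_version_incremented("2.2.91", "2.2.91")` is false while `is_version_incremented("2.2.92", "2.2.91")` is true, which matches the bump described above.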
lelia pushed a commit that referenced this pull request May 29, 2026
* test: failing repro for empty title on gptDidYouMean alerts

* test: failing repro for empty title on unknown alert types

* test: lock in licenseSpdxDisj title fallback

* feat(core): add alert-type humanizer and override-map plumbing

* fix(core): fall back to humanized title for unmapped alert types

Resolves CUS2-2: gptDidYouMean and any future alert type without SDK
metadata previously rendered as a blank Alert column in the CLI output
table, SARIF report, and PR/security comments. Title resolution now
falls back through an explicit override map and a generic humanizer.

* test: hoist _humanize_alert_type import to module scope

* chore(release): bump to 2.2.92 to clear main collision

#199 landed on main between the original 2.2.91 bump and this PR opening,
so 2.2.91 ties main and fails check_version. Bump to 2.2.92.
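The title fallback from the fix(core) commit in this list (SDK metadata, then an explicit override map, then a generic humanizer) might be sketched like this. The resolution order, the `TITLE_OVERRIDES` contents, and `resolve_title` are assumptions; only the name `_humanize_alert_type` appears in the commit messages, and its real behavior may differ from the camelCase splitting shown here.

```python
import re

# Hypothetical override map; the entry is a placeholder, not a real mapping.
TITLE_OVERRIDES: dict[str, str] = {"exampleAlertType": "Example Override Title"}


def humanize_alert_type(alert_type: str) -> str:
    """Turn a camelCase alert type into space-separated words."""
    words = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", alert_type)
    return words[:1].upper() + words[1:]


def resolve_title(alert_type: str, sdk_title: str = "") -> str:
    """Pick the first non-empty title so the Alert column is never blank."""
    if sdk_title:
        return sdk_title
    return TITLE_OVERRIDES.get(alert_type) or humanize_alert_type(alert_type)
```

With this sketch, `resolve_title("gptDidYouMean")` returns `"Gpt Did You Mean"` instead of an empty string.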
lelia added a commit that referenced this pull request May 29, 2026
Patch release. Scope is maintenance only: dependency bundle + Dependabot
review hardening + housekeeping + CHANGELOG backfill. No behavior changes.

Targets 2.2.93 (not 2.2.92) to stay ahead of an in-flight 2.2.92 bug-fix
release landing separately.

CHANGELOG: 2.2.93 entry for this PR, plus backfilled entries for 2.2.81,
2.2.85, 2.2.86, 2.2.88, 2.2.89, and 2.2.91 (the #180 backfill covered
2.2.74-2.2.80; main reached 2.2.91 via #199 without a CHANGELOG note).

Version refs synced across pyproject.toml, socketsecurity/__init__.py, and
uv.lock per the version-incrementation CI check.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
lelia added a commit that referenced this pull request May 29, 2026
* chore: prettify, sort, and round out .gitignore

Reorganizes .gitignore into labeled sections (Python cache, venvs, build
artifacts, IDE, OS, logs, env files, generated output, project scratch,
Conductor) with sorted entries within each group and trailing slashes on
directory patterns for clarity.

Folds in three smaller intents that would otherwise be separate commits:
- Add .context/ for Conductor workspaces (collaboration scratch)
- Add coverage.xml + .pytest_cache/ to fully cover pytest-cov outputs
  (.coverage.* and htmlcov/ were already on main from prior work)
- Add *.swp / *.swo for vim swap files

Drops the stale `*.cpython-312.pyc\`` line with a literal-backtick typo;
it wasn't matching anything and `*.pyc` already covers the case.

No behavior changes anyone would notice from the resulting rule set.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* ci: add .github/dependabot.yml to tame Dependabot PR noise

The repo had no explicit Dependabot config, so Dependabot ran on full
defaults: one PR per package per manifest, across every manifest in
the tree -- including the e2e test fixtures that are intentionally
crafted to exercise Socket's scanner. The cumulative result was the
"PR pileup" this PR is consolidating.

New config:
- uv ecosystem (main app): grouped weekly into ONE minor/patch PR and
  one major PR; matches the existing python:uv labeling
- github-actions: grouped weekly into ONE minor/patch PR
- docker: separate weekly PR per Dockerfile change
- 7-day cooldown across all ecosystems to give upstream time to pull
  bad releases
- e2e fixtures (tests/e2e/fixtures/{simple-npm,simple-pypi}) are
  INTENTIONALLY excluded -- their pins should be chosen for supply-
  chain signal, not auto-bumped (this is why we had three fixture
  PRs in the cleanup)

Pattern adapted from SocketDev/socket-basics.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* ci: add dependabot-review workflow with Socket Firewall smoke jobs

For every Dependabot-authored PR, inspect what changed and conditionally
run Socket Firewall (sfw) install smoke jobs against the affected
manifests. Because sfw uses the anonymous Socket public-data API it
needs NO secret, so this runs cleanly under the standard `pull_request`
context -- no pull_request_target, no token-leak surface.

Jobs (all conditional on file diff):
- python-sfw-smoke:      pyproject.toml / uv.lock -> `sfw uv sync` plus
                         an import smoke on the modules that depend on
                         the upgraded packages (cryptography, gitpython,
                         requests, ...). Catches API-removal breaks
                         from minor/patch deprecations.
- fixture-npm-sfw-smoke: tests/e2e/fixtures/simple-npm/** -> `sfw npm
                         install` in a clean cwd.
- fixture-pypi-sfw-smoke: tests/e2e/fixtures/simple-pypi/** -> `sfw pip
                         install -r requirements.txt` in a clean venv.
- dockerfile-smoke:      `docker build --pull` (no push) when the
                         Dockerfile changes.
- workflow-notice:       Flag Dependabot PRs that touch workflow or
                         dependabot config files for explicit human
                         review (anti-supply-chain-confusion guardrail).

Pattern adapted from SocketDev/socket-basics dependabot-review.yml.
Action SHAs match the pins already in python-tests.yml and e2e-test.yml
so zizmor stays happy.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* ci: add lock-drift, import-smoke, and pip-audit; skip e2e on dependabot

python-tests.yml:
- `uv lock --locked` -- fails if uv.lock has drifted from pyproject.toml.
  Prevents the "forgot to commit the lockfile" class of mistake.
- Import smoke step that loads every top-level module touching the
  upgraded packages (cryptography, gitpython, requests, urllib3, ...).
  Catches API-removal breaks from minor/patch deprecations that the
  unit suite alone wouldn't surface.
- `uvx pip-audit --strict` against the synced env -- light CVE check
  on the resolved transitive tree. Runs in seconds via uv's caching.

e2e-test.yml:
- Skip e2e on Dependabot PRs. They don't have access to the Socket API
  secret so e2e would always fail on them, polluting the PR check UI.
  Supply-chain risk for dep bumps is covered by dependabot-review.yml's
  Socket Firewall smoke jobs, which need no secrets.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* ci: fix pip-audit invocation to scan exported requirements

`uvx pip-audit --disable-pip` requires `-r` plus either hashed
requirements or `--no-deps`. The previous invocation crashed at start.

Now: export the locked deps via `uv export --no-hashes --no-emit-project`
into a tmp requirements file (skipping the local editable install of
the project itself), then feed that to pip-audit with `--disable-pip
--no-deps`. Verified locally -- no known vulnerabilities found across
the 85 locked transitive deps.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* chore(deps): bump 9 main-app dependencies to latest

Bundles the nine open Dependabot PRs against the main app into a single
uv.lock regeneration. Where Dependabot's target trailed the latest published
release, we went to the current latest and re-verified through sfw:

- urllib3       2.6.3   -> 2.7.0     (closes #200)
- gitpython     3.1.46  -> 3.1.50    (closes #198)
- python-dotenv 1.2.1   -> 1.2.2     (closes #190)
- pytest        9.0.2   -> 9.0.3     (closes #188)
- uv            0.9.21  -> 0.11.17   (closes #210; Dependabot targeted 0.11.15)
- cryptography  46.0.5  -> 46.0.7    (closes #181)
- pygments      2.19.2  -> 2.20.0    (closes #177)
- requests      2.32.5  -> 2.33.0    (closes #175)
- idna          3.11    -> 3.15      (closes #205, CVE-2026-45409)

idna 3.14 fixed CVE-2026-45409 -- a quadratic-time DoS via oversized inputs
that bypassed the earlier CVE-2024-3651 mitigation. The rest are hygiene.

All nine final versions verified clean through Socket Firewall (sfw) on the
full transitive tree.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* chore(deps): bump e2e fixture manifests

Closes the open Dependabot PRs against the e2e test fixtures. axios went to
the current latest (1.16.1) rather than Dependabot's 1.16.0 target:

- tests/e2e/fixtures/simple-npm:  axios    1.15.0 -> 1.16.1  (closes #209)
- tests/e2e/fixtures/simple-pypi: requests 2.31.0 -> 2.33.0  (closes #187)
- tests/e2e/fixtures/simple-pypi: flask    3.0.0  -> 3.1.3   (closes #186)

These fixtures were stale rather than intentionally pinned. Socket Firewall
verified the install paths. The new .github/dependabot.yml intentionally
excludes tests/e2e/fixtures/** from future auto-bumps.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

* chore(release): 2.2.93 with CHANGELOG backfill

Patch release. Scope is maintenance only: dependency bundle + Dependabot
review hardening + housekeeping + CHANGELOG backfill. No behavior changes.

Targets 2.2.93 (not 2.2.92) to stay ahead of an in-flight 2.2.92 bug-fix
release landing separately.

CHANGELOG: 2.2.93 entry for this PR, plus backfilled entries for 2.2.81,
2.2.85, 2.2.86, 2.2.88, 2.2.89, and 2.2.91 (the #180 backfill covered
2.2.74-2.2.80; main reached 2.2.91 via #199 without a CHANGELOG note).

Version refs synced across pyproject.toml, socketsecurity/__init__.py, and
uv.lock per the version-incrementation CI check.

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>

---------

Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
