regression tests from cyber2 and cyber3 by AlexandreYang · Pull Request #267 · DataDog/rshell

AlexandreYang · 2026-05-19T20:49:33Z

What does this PR do?

Adds vulnerability-hunting test coverage from the 2026-05-19-gpt-5.5-cyber-2 campaign across the shell interpreter and several builtins. Includes both Go tests and YAML scenarios.

Areas covered:

Interpreter:
- Brace groups: redirect blocking (file / /dev/null / dynamic), fd duplication restoring streams, pipeline exit isolation, input redirect isolation, expansion output as data
- Heredocs: blocked output redirects, pipe reader closing early, quoted unsupported param literals, readonly in unquoted cmdsubst, stdin restoration after statement
- Functions: blocked in subshell, blocked with redirects (/dev/null and file), blocked with readonly body
- until loops: composition with readonly body, redirected loop body writes, redirect to /dev/null, subshell isolation
- while loops: additional clause coverage
- Command substitution: hardening tests
- Variable assignment: same-name multi-assign restores prior value
- Signal handling: pipeline panic recovery
- Variable size limits
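
To illustrate the signal-handling item above: one common way to keep a panicking pipeline stage from crashing the whole interpreter is to wrap the stage in a deferred recover and turn the panic into an ordinary error. The following is a minimal Go sketch under that assumption. It is not code from rshell, and `runStage` is a hypothetical name introduced only for this example.

```go
package main

import "fmt"

// runStage is a hypothetical helper (not rshell's API). It runs one pipeline
// stage and converts a panic raised inside it into a regular error, so the
// caller can report a failure status instead of terminating the process.
func runStage(stage func() error) (err error) {
	defer func() {
		// recover only intercepts panics raised on this goroutine.
		if r := recover(); r != nil {
			err = fmt.Errorf("stage panicked: %v", r)
		}
	}()
	return stage()
}

func main() {
	ok := runStage(func() error { return nil })
	bad := runStage(func() error { panic("boom") })
	fmt.Println("healthy stage:", ok)
	fmt.Println("panicking stage:", bad)
}
```

A regression test for this behaviour would assert that the panicking stage yields a non-nil error while a healthy stage still returns nil.
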
Builtins:
- cut: sandbox and special-file coverage
- df: more resilient POSIX/-h parsing in tests
- grep: numeric overflow rejection
- pwd: symlink cwd coverage
- strings: vuln hunt coverage
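
As an illustration of the grep item above, numeric overflow rejection usually means refusing a numeric argument that does not fit the target integer type rather than letting it wrap or be silently clamped. The sketch below shows one way to do that with Go's standard `strconv` package. It is an assumption about the general technique, not rshell's implementation, and `parseCount` is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseCount is a hypothetical helper (not taken from rshell). It parses a
// decimal count and rejects values that are out of range for int64, are not
// numbers at all, or are negative.
func parseCount(s string) (int64, error) {
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		// strconv reports overflow by wrapping strconv.ErrRange.
		if errors.Is(err, strconv.ErrRange) {
			return 0, fmt.Errorf("count %q: value out of range", s)
		}
		return 0, fmt.Errorf("count %q: not a number", s)
	}
	if n < 0 {
		return 0, fmt.Errorf("count %q: must not be negative", s)
	}
	return n, nil
}

func main() {
	for _, arg := range []string{"3", "99999999999999999999", "-1", "abc"} {
		n, err := parseCount(arg)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", n)
	}
}
```

A test in this style would feed a value just past the int64 maximum and assert that the call returns an error instead of a wrapped or truncated number.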

Motivation

Part of the ongoing vuln-hunt campaign to harden the safe shell interpreter against attacks via shell control structures, redirects, and builtin edge cases.

Testing

New scenario YAMLs are asserted against bash by default, except those that intentionally diverge for sandbox reasons.
New Go tests run with go test ./...

Checklist

Tests added/updated
Documentation updated (if applicable)

datadog-prod-us1-4 · 2026-05-19T20:51:05Z

⚠️ Warnings

🚦 4 Pipeline jobs failed

CI | Test (windows-latest)

Five tests failed on Windows because of unexpected errors related to file operations and syntax issues in the test cases.

Compliance | compliance

Header line mismatch in multiple test files. Expected license header not found.

Fuzz Tests | Fuzz Differential (wc)

wc -w stdout mismatch: expected '1 input.txt\n' but got '2 input.txt\n' in wc_differential_fuzz_test.go:171

View all 4 failed jobs.

This comment will be updated automatically if new data arrives.

Commit SHA: 710b093

AlexandreYang · 2026-05-20T07:56:31Z

@codex review this PR

Please use the following severity format for all findings:

P0 (red): Exploitable vulnerability with high impact (RCE, sandbox bypass, data breach). Blocking merge.
P1 (orange): Likely exploitable or high-risk — correctness bugs vs bash, data races, panics.
P2 (yellow): Potential vulnerability, bash divergence, missing test coverage, missing docs.
P3 (blue): Style, minor simplification, hardening suggestion, nice-to-have test.

Prefix each finding title with its priority label, e.g. "P0: ...", "P1: ...", etc.
Include a summary table at the top with columns: # | Priority | File | Finding.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e72f2db57

chatgpt-codex-connector · 2026-05-20T08:01:07Z

+    { cat; } </etc/passwd
+    cat <ok.txt
+expect:
+  stdout: "ok-payload"


P3: Use required block scalar for scenario output

AGENTS.md for /workspace/rshell says to “Always use the YAML |+ block scalar for input.script, expect.stdout, and expect.stderr values, even single-line ones.” This new scenario uses an inline scalar for expect.stdout here, and the same pattern appears in several of the added scenario files, so the tests do not follow the repository’s documented scenario format.


empty

dc38576

AlexandreYang added 9 commits May 19, 2026 23:39

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] signal-handling: test pipeline panic recovery

dbef982

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] heredoc: add scenario coverage

d8dad6b

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] grep: test numeric overflow rejection

1928976

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] function: add composition scenarios

1c83a99

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] cut: add sandbox and special-file tests

6d77f25

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] strings: add vuln hunt coverage

343b7d6

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] until: add composition scenarios

d022afd

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] pwd: add symlink cwd coverage

ba84b09

[vuln-hunt 2026-05-19-gpt-5.5-cyber-2] brace group: add vuln hunt coverage

3e72f2d

AlexandreYang changed the title ~~campaign with gpt 5.5 cyber~~ vuln-hunt: add coverage for brace groups, functions, heredocs, loops, and builtins May 20, 2026

AlexandreYang changed the title ~~vuln-hunt: add coverage for brace groups, functions, heredocs, loops, and builtins~~ test: add coverage for brace groups, functions, heredocs, loops, and builtins May 20, 2026

chatgpt-codex-connector Bot reviewed May 20, 2026

AlexandreYang added 15 commits May 20, 2026 11:01

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] negation: test blocked probes

8a3f455

test: add procnet reader vuln hunt tripwires

c245faf

test: cover procnet socket entry cap

e6373f3

test: add callctx openfile vuln hunt tripwires

726facc

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] field_splitting: test blocked paths

338723d

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] testcmd: blocked tests

ebb369c

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] until_clause: blocked tests

897999b

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] empty_script: blocked tests

fbb803b

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] ping: test blocked attacks

656a842

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] ls: test blocked attacks

10642d2

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] simple_command: test blocked attacks

80e5834

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] environment: test blocked attacks

ae3ff66

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] redirections: test blocked attacks

0f72b99

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] case_clause: test blocked attacks

ee382e3

[vuln-hunt 2026-05-20-gpt-5.5-cyber-3] help: blocked tests

a390972

AlexandreYang added 27 commits May 20, 2026 20:27

test: add parser lexer vuln hunt regressions

19b2a5d

test: add df mount enumeration vuln hunt regressions

049b0e0

test: add signal handling vuln hunt regressions

0c8edc6

test: add uname vuln hunt regressions

e2e7bb4

test: add allowed commands vuln hunt regressions

15fb3b0

test: add sort vuln hunt regressions

b13ac9a

test: add while clause vuln hunt regressions

e3e5d77

test: add executor context cancellation regression

6911a18

test: add tr vuln hunt regressions

424c8fd

test: add pwd vuln hunt regressions

1f5096d

test: add globbing vuln hunt regressions

de15796

test: add find vuln hunt regressions

8d29c55

test: add wc timeout regression

72b0d86

test: add false vuln hunt tripwires

a2fcfbd

test: add printf vuln hunt tripwires

cf2750e

test: add builtin import allowlist tripwires

6dbf0f0

test: add blocked commands vuln hunt tripwires

6699b7f

test: add blocked redirects vuln hunt tripwires

4962d56

test: add for clause vuln hunt tripwires

757f529

test: add readonly vuln hunt tripwires

9cff33f

test: add comments vuln hunt tripwires

6968529

test: add heredoc dash vuln hunt tripwires

b4b783f

test: add logic ops vuln hunt tripwires

93ab0ce

test: add continue vuln hunt tripwires

27a4eca

test: add output buffer vuln hunt tripwires

43c2e6f

test: add var expansion vuln hunt tripwires

a8df567

test: add true vuln hunt tripwires

710b093

AlexandreYang closed this May 21, 2026

AlexandreYang reopened this May 21, 2026

AlexandreYang changed the title ~~test: add coverage for brace groups, functions, heredocs, loops, and builtins~~ regression tests from cyber2 and cyber3 May 21, 2026

AlexandreYang wants to merge 73 commits into main from alex/vuln-hunt-codex-cyber