Add `atomvm:posix_kill/2` by pguyot · Pull Request #2348 · atomvm/AtomVM

pguyot · 2026-07-01T06:24:02Z

Also stop leaking socat processes in tests

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

petermm · 2026-07-01T12:54:27Z

correct commit this time around! pardon the brain-lapse

Amp - never fails on pedantic things, pick and choose:

PR Review — `Add atomvm:posix_kill/2`

Commit: 6802275 — Add atomvm:posix_kill/2 (also stop leaking socat processes in tests)
Author: Paul Guyot

Summary

Adds a new NIF atomvm:posix_kill/2 wrapping POSIX kill(2), wired through the
gperf NIF table with a HAVE_KILL feature guard (disabled on bare‑metal rp2/stm32).
Also fixes a socat process leak in the UART / serial‑dist tests by exec‑ing socat
(so the subprocess pid is socat, not the wrapping shell) and killing it with the
new NIF instead of spawning /bin/kill.

The integration (feature detection, gperf entry, platform guards) is correct, and
the socat exec + posix_kill fix is a real improvement. However, the NIF has
unchecked integer‑narrowing paths that can send the wrong signal to the wrong
target, and the accepted pid range includes POSIX process‑group semantics that
are undocumented.

Recommendation

Request changes — Finding 1 (integer truncation) is a correctness/safety bug
in a kill(2) wrapper and should be fixed before merge. Findings 2–3 are
API/documentation decisions worth making now; 4–6 are nice‑to‑haves.

Findings (ranked)

1. High — oversized pid/signal can be truncated and sent to the wrong target/signal

nif_atomvm_posix_kill validates the pid with term_is_any_integer and extracts
it with term_maybe_unbox_int, and validates the signal with term_is_non_neg_int
then term_to_int:

VALIDATE_VALUE(argv[0], term_is_any_integer);
VALIDATE_VALUE(argv[1], term_is_non_neg_int);

if (UNLIKELY(kill((pid_t) term_maybe_unbox_int(argv[0]), term_to_int(argv[1])) != 0)) {

src/libAtomVM/posix_nifs.c#L961-L969

Problems:

On 64‑bit builds, an unboxed integer can exceed C int/typical pid_t. The cast
truncates silently, so a huge pid/signal can collapse to a small value
(0, 1, 15, …). For kill, a truncated pid of 0 targets the caller's own
process group.
On 32‑bit builds, a boxed int64 value passes term_is_any_integer, but
term_maybe_unbox_int reads only one boxed word — garbage pid.
The signal path has the same narrowing issue (term_to_int after a non‑neg check
that doesn't bound to INT_MAX).

Fix: validate representability before casting, extract via term_to_int64,
and bound the signal to [0, INT_MAX].

--- a/src/libAtomVM/posix_nifs.c
+++ b/src/libAtomVM/posix_nifs.c
@@ (top includes)
 #include <errno.h>
+#if HAVE_KILL
+#include <limits.h>
+#endif
@@ #if HAVE_KILL
+static bool term_is_valid_pid(term t)
+{
+    if (!term_is_int64(t)) {
+        return false;
+    }
+    int64_t value = term_to_int64(t);
+    pid_t pid = (pid_t) value;
+    return ((int64_t) pid) == value;
+}
+
+static bool term_is_signal_number(term t)
+{
+    if (!term_is_int64(t)) {
+        return false;
+    }
+    int64_t value = term_to_int64(t);
+    return value >= 0 && value <= INT_MAX;
+}
+
 static term nif_atomvm_posix_kill(Context *ctx, int argc, term argv[])
 {
     UNUSED(argc);
-    VALIDATE_VALUE(argv[0], term_is_any_integer);
-    VALIDATE_VALUE(argv[1], term_is_non_neg_int);
+    VALIDATE_VALUE(argv[0], term_is_valid_pid);
+    VALIDATE_VALUE(argv[1], term_is_signal_number);
+
+    pid_t pid = (pid_t) term_to_int64(argv[0]);
+    int signal = (int) term_to_int64(argv[1]);
 
-    if (UNLIKELY(kill((pid_t) term_maybe_unbox_int(argv[0]), term_to_int(argv[1])) != 0)) {
+    if (UNLIKELY(kill(pid, signal) != 0)) {
         return errno_to_error_tuple_maybe_gc(ctx);
     }
     return OK_ATOM;
 }

term_is_int64 / term_to_int64 exist in
term.h#L743 and
term.h#L1376.

2. Medium — negative/zero pid enables process‑group semantics, undocumented

The spec is OsPid :: integer() and the NIF accepts all signs. POSIX kill gives
special meaning to non‑positive pids: 0 = caller's process group, -1 = all
permitted processes, < -1 = a specific process group. The docs describe it only
as "operating system process id, as returned by subprocess/4":

libs/eavmlib/src/atomvm.erl#L523-L534

Decide the intended surface:

Safe subprocess‑cleanup API (recommended): restrict to pid > 0 and change
the spec to pos_integer(). In the helper from Finding 1:

-    return ((int64_t) pid) == value;
+    return value > 0 && ((int64_t) pid) == value;

--spec posix_kill(OsPid :: integer(), Signal :: non_neg_integer()) ->
+-spec posix_kill(OsPid :: pos_integer(), Signal :: non_neg_integer()) ->

Full kill(2) wrapper: keep negative pids but document the 0 / -1 / < -1
semantics and the blast radius.

3. Medium/low — clarify what `ok` means

Docs say the function is used to "terminate a process". kill returning success
only means the request was accepted; the target may catch, block, or ignore the
signal, and signal 0 sends nothing (existence/permission probe only).

 %% @doc     Send a signal to a process using kill(2). Typically used to
-%%          terminate a process started with `subprocess/4'.
+%%          terminate a process started with `subprocess/4'. A return value of
+%%          `ok' means kill(2) accepted the request; it does not guarantee the
+%%          target has terminated, since the signal may be caught, blocked, or
+%%          ignored. Signal 0 performs a POSIX existence/permission check without
+%%          sending a signal.

4. Low — test coverage gap + platform skip

test_subprocess.erl
is good — the esrch case with pid 536870911 (2^29‑1, the 32‑bit small‑int max)
plus signal 0 nicely exercises boxed‑integer handling. Two improvements:

It only tests non‑integer badargs; add an oversized‑signal case to lock in
Finding 1.
It only skips BEAM ("ATOM" guard). If eavmlib tests ever run on an AtomVM
platform without subprocess/kill (emscripten, bare‑metal), this fails rather
than skips. Gate on atomvm:platform() =:= generic_unix (exists at
atomvm.erl#L132).

 test() ->
     case erlang:system_info(machine) of
-        "ATOM" ->
+        "ATOM" when atomvm:platform() =:= generic_unix ->
             ok = test_posix_kill(),
             ok = test_posix_kill_badarg(),
             ok = test_posix_kill_esrch(),
             ok;
         _ ->
             ok
     end.

 test_posix_kill_badarg() ->
-    ok =
-        try
-            atomvm:posix_kill(not_a_pid, 15),
-            fail
-        catch
-            error:badarg -> ok
-        end,
-    ok =
-        try
-            atomvm:posix_kill(1, not_a_signal),
-            fail
-        catch
-            error:badarg -> ok
-        end,
-    ok.
+    ok = expect_badarg(fun() -> atomvm:posix_kill(not_a_pid, 15) end),
+    ok = expect_badarg(fun() -> atomvm:posix_kill(1, not_a_signal) end),
+    ok = expect_badarg(fun() -> atomvm:posix_kill(1, 1 bsl 40) end),
+    ok.
+
+expect_badarg(Fun) ->
+    try Fun() of
+        _ -> fail
+    catch
+        error:badarg -> ok
+    end.

(If you adopt the positive‑pid API from Finding 2, also add badarg tests for 0
and -1.)

5. Low — errno atom mapping guard omits `HAVE_KILL`

posix_errno_to_term only atomizes errno values when file/time POSIX features are
present; otherwise it returns a raw integer:

#if HAVE_OPEN && HAVE_CLOSE || defined(HAVE_CLOCK_SETTIME) || defined(HAVE_SETTIMEOFDAY)
...
#else
    return term_from_int(err);
#endif

src/libAtomVM/posix_nifs.c#L76,
#L186-L189

On a hypothetical kill‑only platform, {error, esrch} would degrade to
{error, 3}. In practice any platform with kill also has open/close, so this
is theoretical — but cheap to make correct:

-#if HAVE_OPEN && HAVE_CLOSE || defined(HAVE_CLOCK_SETTIME) || defined(HAVE_SETTIMEOFDAY)
+#if HAVE_OPEN && HAVE_CLOSE || defined(HAVE_CLOCK_SETTIME) || defined(HAVE_SETTIMEOFDAY) || HAVE_KILL

6. Low — socat cleanup swallows the pre‑existing case

The socat fix is directionally correct: exec socat … makes the subprocess pid be
socat itself, so SIGTERM actually reaches it.

test_serial_dist_socat.erl#L223-L240,
test_uart.erl#L96-L121

Minor robustness: stop_socat asserts ok = atomvm:posix_kill(...). If socat
already exited, {error, esrch} will crash cleanup and can mask the real test
failure. Tolerating esrch in cleanup makes failures easier to diagnose:

 stop_socat({OsPid, Fd}) ->
-    ok = atomvm:posix_kill(OsPid, 15),
+    case atomvm:posix_kill(OsPid, 15) of
+        ok -> ok;
+        {error, esrch} -> ok
+    end,
     atomvm:posix_close(Fd).

What's correct (no action needed)

Feature detection & registration: HAVE_KILL probe in
CMakeLists.txt#L261,
IF_HAVE_KILL macro in
nifs.c#L1036-L1040,
gperf entry, and extern decl in the header.
Bare‑metal disabling: HAVE_KILL forced off on
rp2 and
stm32,
matching the newlib stub rationale for the neighboring symbols.
CHANGELOG entry present and accurate.

petermm · 2026-07-03T06:50:09Z

clean Amp (not a continuation of above):

PR Review — `Add atomvm:posix_kill/2` (also stop leaking socat processes in tests)

Commit: 36de127c5de8104277bfd1651f61c7343760c5cb

Summary

Adds an atomvm:posix_kill(OsPid, Signal) NIF wrapping kill(2) and returning
ok | {error, posix_error()}, gated behind a new HAVE_KILL feature check
(disabled on rp2/stm32). Uses the new NIF to stop leaking socat subprocesses in
the UART and serial-dist tests, switching the shell wrappers to exec socat ...
so the returned OS pid is socat itself rather than the wrapping shell.

Overall: the core NIF is sound (validation, errno mapping, and feature gating
are correct). Two test-side issues should be addressed before merge.

Verdict

Change requested — one test reintroduces the wrapper-shell pid bug the commit
otherwise fixes, and the UART tests still leak socat on failure paths.

Findings

#	Severity	Area	Issue
1	Major	`test_subprocess.erl`	Kills the wrapping shell, not `sleep`; test can hang/leak
2	Major	`test_uart.erl`	socat still leaks on assertion/error paths
3	Minor	`posix_nifs.c`	Local var `signal` shadows libc `signal(3)`

What is correct (no change needed)

pid_t round-trip check in term_is_valid_pid correctly rejects integers
that would truncate to a different (possibly own) process.
Signal <= INT_MAX boundary in term_is_signal_number is the right cap
before casting to int; oversized signals are rejected, not truncated
(covered by the 1 bsl 40 test case).
Allowing negative OsPid matches documented kill(2) process-group semantics.
The spec Signal :: non_neg_integer() / OsPid :: integer() is consistent
with the docs — not a mismatch.
|| HAVE_KILL added to the posix_errno_to_term guard is appropriate so
{error, esrch | eperm | einval} can be returned on kill-only POSIX builds.
Tolerating {error, esrch} in stop_socat is safe: eperm/einval still
crash the case expression, so real failures are not masked.

1. Major — `test_subprocess` kills the shell, not `sleep`

test_posix_kill/0 starts a subprocess via /bin/sh -c "sleep 30".
subprocess/4 returns the /bin/sh pid, and sleep inherits the stdout pipe's
write end. Killing only the shell does not reliably kill sleep, so
read_until_eof(Fd) can block until sleep 30 exits (or the 5s select timeout
fires). This contradicts the commit's own socat fix, which uses exec
specifically so the returned pid is the target process.

--- a/tests/libs/eavmlib/test_subprocess.erl
+++ b/tests/libs/eavmlib/test_subprocess.erl
     {ok, OsPid, Fd} = atomvm:subprocess(
-        "/bin/sh", ["sh", "-c", "sleep 30"], undefined, [stdout]
+        "/bin/sh", ["sh", "-c", "exec sleep 30"], undefined, [stdout]
     ),

2. Major — `test_uart` still leaks socat on failure paths

Each UART test calls start_socat() and only reaches stop_socat(SocatFd) at
the successful end of the function. If any assertion in between fails, the new
SIGTERM cleanup is skipped and socat keeps holding both ptys — exactly the
cascading pty-exhaustion the commit set out to prevent. The commit fixes
successful-run leaks but not failure-run leaks.

Suggested direction: guarantee cleanup with try ... after, e.g. a helper:

with_socat(Fun) ->
    {SocatFd, PtyA, PtyB} = start_socat(),
    try
        Fun(PtyA, PtyB)
    after
        stop_socat(SocatFd)
    end.

and wrap per-test socat usage (also closing any opened Fd via after):

test_posix_tcgetattr() ->
    with_socat(fun(PtyA, _PtyB) ->
        {ok, Fd} = atomvm:posix_open(PtyA, [o_rdwr, o_noctty]),
        try
            {ok, Tio} = atomvm:posix_tcgetattr(Fd),
            true = is_map(Tio),
            ok
        after
            atomvm:posix_close(Fd)
        end
    end).

Scope is moderate since several tests should be wrapped consistently.

3. Minor — `int signal` shadows libc `signal(3)`

In nif_atomvm_posix_kill, the local int signal shadows signal(3) from
<signal.h>. Legal C, but may trip -Wshadow and is easily avoided.

--- a/src/libAtomVM/posix_nifs.c
+++ b/src/libAtomVM/posix_nifs.c
     pid_t pid = (pid_t) term_to_int64(argv[0]);
-    int signal = (int) term_to_int64(argv[1]);
+    int signo = (int) term_to_int64(argv[1]);

-    if (UNLIKELY(kill(pid, signal) != 0)) {
+    if (UNLIKELY(kill(pid, signo) != 0)) {
         return errno_to_error_tuple_maybe_gc(ctx);
     }

Also stop leaking socat processes in tests Signed-off-by: Paul Guyot <pguyot@kallisys.net>

pguyot force-pushed the w25/posix-kill branch from 6802275 to 36de127 Compare July 3, 2026 05:51

pguyot force-pushed the w25/posix-kill branch 3 times, most recently from 5b353b5 to 11ae01c Compare July 3, 2026 21:42

Add atomvm:posix_kill/2

aaac61b

Also stop leaking socat processes in tests Signed-off-by: Paul Guyot <pguyot@kallisys.net>

pguyot force-pushed the w25/posix-kill branch from 11ae01c to aaac61b Compare July 4, 2026 06:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add `atomvm:posix_kill/2`#2348

Add `atomvm:posix_kill/2`#2348
pguyot wants to merge 1 commit into
atomvm:release-0.7from
pguyot:w25/posix-kill

pguyot commented Jul 1, 2026

Uh oh!

petermm commented Jul 1, 2026

Uh oh!

petermm commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

pguyot commented Jul 1, 2026

Uh oh!

petermm commented Jul 1, 2026

PR Review — Add atomvm:posix_kill/2

Summary

Recommendation

Findings (ranked)

1. High — oversized pid/signal can be truncated and sent to the wrong target/signal

2. Medium — negative/zero pid enables process‑group semantics, undocumented

3. Medium/low — clarify what ok means

4. Low — test coverage gap + platform skip

5. Low — errno atom mapping guard omits HAVE_KILL

6. Low — socat cleanup swallows the pre‑existing case

What's correct (no action needed)

Uh oh!

petermm commented Jul 3, 2026

PR Review — Add atomvm:posix_kill/2 (also stop leaking socat processes in tests)

Summary

Verdict

Findings

What is correct (no change needed)

1. Major — test_subprocess kills the shell, not sleep

2. Major — test_uart still leaks socat on failure paths

3. Minor — int signal shadows libc signal(3)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

PR Review — `Add atomvm:posix_kill/2`

3. Medium/low — clarify what `ok` means

5. Low — errno atom mapping guard omits `HAVE_KILL`

PR Review — `Add atomvm:posix_kill/2` (also stop leaking socat processes in tests)

1. Major — `test_subprocess` kills the shell, not `sleep`

2. Major — `test_uart` still leaks socat on failure paths

3. Minor — `int signal` shadows libc `signal(3)`