clone3 is not implemented — thread creation fails on modern-glibc binaries
Describe the bug
clone3 (the modern variant of clone, syscall 435 on most arches, 5435 on MIPS n64) has no handler in Qiling. It appears in the number→name maps (qiling/os/linux/map_syscall.py) but there is no ql_syscall_clone3:
$ grep -rn "def ql_syscall_clone" qiling/
qiling/os/posix/syscall/sched.py:13:def ql_syscall_clone(...) # only the legacy clone
Recent glibc (e.g. the toolchains shipping with Ubuntu 24.04 / glibc 2.39) issues clone3 from pthread_create, falling back to the legacy clone only if clone3 returns -ENOSYS. So on any guest built against a recent glibc, pthread_create hits the unimplemented clone3 and thread creation fails:
[!] 0x...: syscall ql_syscall_clone3 number = 0x153b(5435) not implemented
Create pthread error!
Crucially, the clone3→clone fallback never kicks in, because Qiling does not return -ENOSYS for unimplemented syscalls — QlOsPosix just logs a warning and leaves the return register untouched (qiling/os/posix/posix.py):
self.ql.log.warning(f'... syscall {syscall_name} ... not implemented')
glibc only falls back on a literal -ENOSYS, so it sees a garbage return value, assumes clone3 "worked" (or failed for a real reason), and gives up instead of retrying with clone.
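The fallback condition can be modelled in a few lines. This is a simplified sketch of the behaviour described above, not glibc's actual code; `create_thread`, `legacy_clone` and the stale register value are hypothetical names and values chosen for illustration.

```python
import errno


def create_thread(clone3, clone):
    """Model of the fallback: retry with legacy clone only on a literal -ENOSYS."""
    ret = clone3()
    if ret == -errno.ENOSYS:
        return clone()
    # Any other value is taken at face value: a TID on success, -errno on failure.
    return ret


def legacy_clone():
    return 1001  # hypothetical child TID returned by a working legacy clone


def clone3_enosys():
    return -errno.ENOSYS  # what a kernel without clone3 would return


def clone3_unhandled():
    return 5435  # hypothetical stale value left in the return register


print(create_thread(clone3_enosys, legacy_clone))     # fallback taken
print(create_thread(clone3_unhandled, legacy_clone))  # fallback skipped
```

In the second call the caller never reaches `legacy_clone`, which mirrors why an untouched return register prevents the retry.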
Repro
Any multithreaded binary built with a recent glibc reproduces it. With the legacy-clone test binaries currently in the rootfs (older glibc) the path is not exercised, which is why CI doesn't catch it. The symptom for a modern-glibc pthread binary is the warning above followed by Create pthread error! (or, if creation is allowed to proceed with a bogus return, a crash during thread setup).
Expected behavior
clone3 should create a thread just like clone does, so pthread_create works on guests built with a modern glibc.
Proposed fix
Add ql_syscall_clone3 that unpacks struct clone_args and delegates to the existing ql_syscall_clone. The non-obvious translations:
- child_stack = stack + stack_size: clone3 passes the stack base plus a separate size; legacy clone wants the highest stack address.
- exit_signal: it is its own struct clone_args field in clone3; legacy clone packs it into the low CSIGNAL byte of flags.
- x8664 pre-swap: ql_syscall_clone swaps newtls<->child_tidptr to undo x8664's raw-syscall register order; since clone3 hands over already-logical args, pre-swap on x8664 so it cancels out.
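The three translations can be sketched as a standalone function. This is an illustration only, not the code on the branch: `translate_clone3` and its parameters are hypothetical, the returned argument order (flags, child_stack, parent_tidptr, newtls, child_tidptr) is an assumption about what ql_syscall_clone expects, and only the first 64 bytes of struct clone_args (the original eight u64 fields) are read. The pidfd field and the later set_tid and cgroup fields are ignored here.

```python
import struct

CSIGNAL = 0xFF             # low byte of legacy clone flags carries the exit signal
CLONE_ARGS_SIZE_VER0 = 64  # flags, pidfd, child_tid, parent_tid, exit_signal,
                           # stack, stack_size, tls (eight u64 fields)


def translate_clone3(raw, big_endian=False, swap_tls_ctid=False):
    """Turn raw struct clone_args bytes into legacy clone arguments.

    swap_tls_ctid models the x8664 pre-swap: the caller sets it when the
    legacy handler is going to swap newtls and child_tidptr back again.
    """
    if len(raw) < CLONE_ARGS_SIZE_VER0:
        raise ValueError("clone_args shorter than CLONE_ARGS_SIZE_VER0")

    fmt = (">" if big_endian else "<") + "8Q"
    (flags, _pidfd, child_tid, parent_tid,
     exit_signal, stack, stack_size, tls) = struct.unpack_from(fmt, raw)

    # Legacy clone packs the exit signal into the CSIGNAL byte of flags.
    legacy_flags = (flags & ~CSIGNAL) | (exit_signal & CSIGNAL)

    # clone3 gives the lowest stack address plus a size; legacy clone
    # wants the highest address.
    child_stack = stack + stack_size

    newtls, child_tidptr = tls, child_tid
    if swap_tls_ctid:
        newtls, child_tidptr = child_tidptr, newtls

    return legacy_flags, child_stack, parent_tid, newtls, child_tidptr


# Example values are made up; 17 is SIGCHLD on x86 and most other arches.
raw = struct.pack("<8Q", 0x3D0F00, 0, 0x7000, 0x7008, 17, 0x10000, 0x2000, 0x9000)
print([hex(v) for v in translate_clone3(raw)])
print([hex(v) for v in translate_clone3(raw, swap_tls_ctid=True)])
```

Passing the endianness explicitly keeps the sketch usable for big-endian guests such as the MIPS64 BE case; a real handler would presumably read the struct through the emulator's memory API instead of from a bytes object.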
Implemented on retrocpugeek:fix/clone3-syscall (off dev):
- Fix: Implement the clone3 syscall
- Regression test: test_elf_multithread.ELFTest.test_clone3_translates_to_clone drives ql_syscall_clone3 directly and asserts the translation for both the generic path and the x8664 swap. It is self-contained and runs on stock unicorn (no clone3 binary needed).
Verified end-to-end separately: a real MIPS64 BE glibc pthread binary now creates and joins threads correctly with this handler in place.
Happy to open a PR against dev.