M4 Phase 6: Integration Testing and Three Critical Bug Fixes

With all the individual M4 subsystems in place (epoll, signalfd, timerfd, eventfd, Unix sockets, filesystem mounting, prctl, and capabilities), it was time to wire them together and prove they actually work in concert. Writing the integration test, mini_systemd.c, immediately uncovered three subtle bugs that had been lurking in the codebase.

The Downcast Bug: Method Resolution vs. Trait Objects

The most insidious bug: file.as_any().downcast_ref::<EpollInstance>() always returned None, even though Debug output showed type=EpollInstance. I spent hours assuming this was TypeId instability with custom target specs.

The real cause was Rust method resolution. Given file: &Arc<dyn FileLike>:

file.as_any()
  → Arc<dyn FileLike>: Downcastable (blanket impl, since Arc is Sized+Any+Send+Sync)
  → returns &dyn Any wrapping Arc<dyn FileLike> itself
  → downcast_ref::<EpollInstance>() fails — inner type is Arc, not EpollInstance

The blanket impl<T: Any + Send + Sync> Downcastable for T applies to Arc<dyn FileLike> because Arc<dyn FileLike> is itself Sized + 'static + Send + Sync. Method resolution tries the receiver type as written before auto-derefing, so it finds this impl on the Arc and never reaches dyn FileLike.

The fix is an explicit deref: (**file).as_any(). The first * strips the reference, the second derefs the Arc, and the call then dispatches through the dyn FileLike vtable to the concrete type's as_any(), returning the actual EpollInstance wrapped in &dyn Any.

This affected every downcast_ref call site in the codebase — epoll, timerfd, and the existing sendmsg/recvmsg SCM_RIGHTS code (which had been silently failing).
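The pitfall can be reproduced outside the kernel with a few lines of std-only Rust. This is a minimal sketch, not the kernel's actual code: the trait definitions are assumed from the description above, EpollInstance is an empty stand-in, and downcast_demo is a hypothetical helper name.

```rust
use std::any::Any;
use std::sync::Arc;

// Assumed shape of the kernel's Downcastable trait (not copied from the source).
trait Downcastable: Any + Send + Sync {
    fn as_any(&self) -> &dyn Any;
}

// The blanket impl: every Sized + 'static + Send + Sync type gets as_any(),
// and that includes Arc<dyn FileLike> itself.
impl<T: Any + Send + Sync> Downcastable for T {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

trait FileLike: Downcastable {}

// Empty stand-in for the real epoll object.
struct EpollInstance;
impl FileLike for EpollInstance {}

// Hypothetical helper: returns (buggy call succeeded, fixed call succeeded).
fn downcast_demo() -> (bool, bool) {
    let arc: Arc<dyn FileLike> = Arc::new(EpollInstance);
    let file: &Arc<dyn FileLike> = &arc;

    // Resolves to the blanket impl on Arc<dyn FileLike>, so the &dyn Any
    // wraps the Arc and the downcast to EpollInstance yields None.
    let buggy = file.as_any().downcast_ref::<EpollInstance>().is_some();

    // Explicit deref reaches dyn FileLike, so the call goes through the
    // vtable to EpollInstance's own as_any() and the downcast succeeds.
    let fixed = (**file).as_any().downcast_ref::<EpollInstance>().is_some();

    (buggy, fixed)
}
```

Calling downcast_demo() from a main function or a unit test shows the two call styles side by side: the first element of the tuple is false, the second is true.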

Signal Bitmask Off-by-One

waitpid was returning EINTR even though SIGCHLD was blocked via sigprocmask(SIG_BLOCK, ...). The cause: an off-by-one between internal and userspace signal bitmask conventions.

  • Internal signal_pending: 1 << signal (SIGCHLD=17 → bit 17)
  • Userspace sigset_t: 1 << (signal-1) (SIGCHLD=17 → bit 16)

has_pending_signals() compared them directly: pending & !blocked. Bit 17 (pending SIGCHLD) was never masked by bit 16 (blocked SIGCHLD), so the blocked signal still counted as deliverable and waitpid returned EINTR. Fix: align the internal representation to the userspace convention using 1 << (signal - 1).
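The mismatch is easy to see with the two conventions written out as functions. This is a sketch under assumptions: the u64 mask width, the helper names old_internal_bit and sigset_bit, and the signature of has_pending_signals are illustrative, not taken from the kernel source.

```rust
const SIGCHLD: u32 = 17;

// Old internal convention (hypothetical helper): signal N lives in bit N.
fn old_internal_bit(signal: u32) -> u64 {
    1u64 << signal
}

// Userspace sigset_t convention: signal N lives in bit N - 1.
fn sigset_bit(signal: u32) -> u64 {
    1u64 << (signal - 1)
}

// Assumed shape of the check: anything pending that is not blocked.
fn has_pending_signals(pending: u64, blocked: u64) -> bool {
    (pending & !blocked) != 0
}

// Before the fix: pending uses bit 17, blocked uses bit 16, so the
// blocked SIGCHLD still looks deliverable.
fn sigchld_deliverable_before_fix() -> bool {
    has_pending_signals(old_internal_bit(SIGCHLD), sigset_bit(SIGCHLD))
}

// After the fix: both masks use bit 16, so the block takes effect.
fn sigchld_deliverable_after_fix() -> bool {
    has_pending_signals(sigset_bit(SIGCHLD), sigset_bit(SIGCHLD))
}
```

With both masks on the same convention, a blocked SIGCHLD no longer appears in pending & !blocked, which is what stops waitpid from returning EINTR.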

socketpair and Timer Overflow

Two simpler fixes: implemented socketpair(AF_UNIX, SOCK_STREAM) by exposing UnixStream::new_pair() (the building block already existed), and fixed an "attempt to subtract with overflow" panic in elapsed_msecs() by switching the subtraction to saturating_sub.
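For the timer fix, the difference between plain subtraction and saturating_sub can be sketched as follows. The post only names elapsed_msecs(), so the two-argument signature and the helper would_overflow below are assumptions made for illustration.

```rust
// Hypothetical signature: elapsed time between two millisecond timestamps.
// saturating_sub clamps to 0 when start_ms is later than now_ms, where a
// plain `now_ms - start_ms` would panic in a build with overflow checks.
fn elapsed_msecs(now_ms: u64, start_ms: u64) -> u64 {
    now_ms.saturating_sub(start_ms)
}

// Hypothetical helper: reports whether the plain subtraction would have
// overflowed, using checked_sub so the demonstration itself never panics.
fn would_overflow(now_ms: u64, start_ms: u64) -> bool {
    now_ms.checked_sub(start_ms).is_none()
}
```

Clamping to zero treats a start time slightly ahead of the current reading as "no time elapsed", which is usually the right answer for a timer and avoids taking down the kernel over a clock ordering quirk.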

mini_systemd: 15 Tests, All Green

The integration test exercises the same codepaths as systemd PID 1 initialization:

| Test | What it exercises |
|------|-------------------|
| mount_proc, mount_meminfo, mount_mounts | /proc filesystem |
| prctl_name, prctl_subreaper | PR_SET_NAME, PR_SET_CHILD_SUBREAPER |
| capabilities | capget with v3 protocol |
| uid_gid | getuid/geteuid/getgid/getegid |
| epoll_create | epoll_create1(EPOLL_CLOEXEC) |
| signalfd | signalfd4 + epoll_ctl |
| timerfd | timerfd_create + timerfd_settime + epoll_ctl |
| eventfd | eventfd2 + write + epoll_ctl |
| unix_socket | socketpair + write + read |
| fork_exec | fork + _exit(42) + waitpid |
| epoll_eventfd, epoll_timerfd | Integrated epoll_wait loop |

All 15 tests pass under KVM. M4 is complete.