thoughts on jobserver implementation

Rasmus Villemoes
This is something I've had on my mind for a while, so the LKML thread
made me try to write it down.

The reason I started digging into the jobserver code is the following:
Currently, it's very hard to interact with the GNU Make jobserver, for
the simple reason that one doesn't know if the pipe is blocking or
non-blocking, meaning "downstream" users will have to implement two
widely different strategies. On Linux, that can be worked around by
opening /proc/self/fd/NN with or without O_NONBLOCK as desired, since
that gives a new file description (struct file* in kernel-speak).
But it's even worse for "upstream", i.e. a build system (say, Yocto)
that wants to set up a jobserver - without knowing whether Make expects
a blocking or non-blocking pipe, I don't see how one can actually set
oneself up as a top-level jobserver.
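
To make the /proc workaround concrete, here is a minimal sketch. It is
Linux-only, the helper name reopen_fd() is made up for illustration, and
it assumes /proc is mounted:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: reopen an inherited fd through /proc/self/fd so the
 * caller gets its own open file description. O_NONBLOCK can then be chosen
 * freely without affecting other users of the original description.
 * Returns the new fd, or -1 with errno set.
 */
static int reopen_fd(int fd, int flags)
{
    char path[64];

    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    return open(path, flags);
}
```

A client that wants non-blocking reads would call something like
reopen_fd(jobserver_rfd, O_RDONLY | O_NONBLOCK) and leave the inherited
descriptor alone.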

So, here's a proposal that I'm sure is flawed, but I won't learn how
unless I send it out:

(1) keep SIGCHLD, SIGINT, SIGTERM (and whatever other signals need
handling) blocked everywhere except where noted below - no need for
SA_RESTART.

(2) have a standard self-pipe for handling signals, all handled signals
(including SIGCHLD) use the same handler which simply does

  write(sigpipe[1], &sig, sizeof(sig))

no signal-unsafe stuff at all.

(3) use (and expect to inherit) a non-blocking jobserver pipe.

(4) main loop (very roughly, of course)

while ((jobs_running || eligible_jobs) && !stop) {
  struct pollfd pfd[2];

  if (!quitsigs && eligible_jobs && !jobs_running)
     start_a_job();

  add sigpipe[0] to pfd;
  if (eligible_jobs)
    add jobserver[0] to pfd;

  unblock_signals();
  // perhaps with a timeout, to deal with the "only if load is below foo" option
  ret = poll(pfd, nfd, timeout);
  block_signals();

  if (ret < 0 && errno == EINTR)
     continue; // or perhaps just an EINTR loop around poll

  if (sigpipe[0] is readable) {
    while (read sig != -EAGAIN) {
      switch (sig) {
      case SIGCHLD:
        // update jobs_running and eligible_jobs, write back tokens as
        // appropriate, deal with a failed job, etc.
        reap_children();
        break;
      case SIGINT:
      case SIGTERM:
        if (!quitsigs++) {
           print(waiting for jobs);
        } else {
          stop = 1;
        }
      }
    }
  }

  if (eligible_jobs && jobserver[0] is readable) {
     while (eligible_jobs && read a token) start_a_job();
  }
}

This way, there's only one single place where we block, namely in the
poll() call, and I don't see how we can miss an event (a child dying or
a SIGTERM/SIGINT): If there are no eligible jobs, we will only return
from poll() once we get a signal (first return may be EINTR, then we
loop around and see the sigpipe is readable). If we do have eligible
jobs, we may return from poll() because there's a token available, and
then a signal may come in right after, before we block signals. In that
case, we'll just do the "try to get a token (knowing that it may have
been snatched by someone else), start a job", but the signal will be
handled in the next loop iteration, and it's indistinguishable from the
signal coming in right after blocking signals.

Obviously, start_a_job() must (reset signal handlers and) unblock
signals after fork() so the child inherits an expected environment, and
the above implicitly relies on WNOHANG being available so we don't have
to rely on fragile signal counting. But even legacy platforms without
WNOHANG should be able to do the above: When handling a SIGCHLD, instead
of a WNOHANG loop, just do blocking wait() (still with signals blocked)
as long as jobs_running > 0. That will likely keep the tokens
under-utilized, but it only affects a tiny minority of platforms [*].
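
For the WNOHANG case, the reaping step could look roughly like the sketch
below. The signature (a counter of failed jobs passed by pointer) is only an
assumption for illustration; a real reap_children() would also update
jobs_running and write back tokens.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Reap every child that has already exited, without blocking. Returns the
 * number of children reaped and adds the unsuccessful ones to *failed.
 * One SIGCHLD may stand for several exits, hence the loop rather than
 * counting signals.
 */
static int reap_children(int *failed)
{
    int reaped = 0;
    int status;
    pid_t pid;

    for (;;) {
        pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;
            if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                (*failed)++;
            continue;
        }
        if (pid < 0 && errno == EINTR)
            continue;
        break;   /* 0: children still running; -1 with ECHILD: none left */
    }
    return reaped;
}
```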

In any case, I'd really appreciate it if the jobserver protocol became
more strictly defined, especially so that things above GNU Make could
set up a jobserver. Perhaps (if GNU Make can actually always be made to
work with an O_NONBLOCK pipe) as some base rules:

(1) create an O_NONBLOCK pipe
(2) fill the pipe initially with '+' tokens
(3) always write back the token that was read, unless following a later
revision of these rules that assigns different meanings to certain tokens.
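
To illustrate, the three rules above could be sketched in C as follows.
This only shows the proposed rules, not what GNU Make does today; the
function names are made up, and how the fds get advertised to children is
left out.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Rules (1) and (2): create a non-blocking pipe, preload ntokens '+' bytes. */
static int jobserver_create(int fds[2], int ntokens)
{
    if (pipe(fds) < 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int fl = fcntl(fds[i], F_GETFL);
        if (fl < 0 || fcntl(fds[i], F_SETFL, fl | O_NONBLOCK) < 0)
            goto fail;
    }
    for (int i = 0; i < ntokens; i++)
        if (write(fds[1], "+", 1) != 1)
            goto fail;
    return 0;
fail:
    close(fds[0]);
    close(fds[1]);
    return -1;
}

/* Returns 1 with *tok set, 0 if no token is available right now, -1 on error. */
static int jobserver_acquire(int rfd, char *tok)
{
    ssize_t n;

    do {
        n = read(rfd, tok, 1);
    } while (n < 0 && errno == EINTR);
    if (n == 1)
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;   /* real error, or EOF because all writers are gone */
}

/* Rule (3): hand back exactly the byte that was read. */
static int jobserver_release(int wfd, char tok)
{
    ssize_t n;

    do {
        n = write(wfd, &tok, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? 0 : -1;
}
```

A client then never blocks in read(): when jobserver_acquire() returns 0, it
goes back to poll() and waits for the read end to become readable.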

Even just "these are the rules followed by GNU Make on modern platforms
that have feature this and that" would be very helpful.

Thanks,
Rasmus

[*] and it won't react to SIGTERM during the wait(), hrrmm... I think I
have a way around that, but it's ugly and there are already way too many
places above where I'm probably wrong.