Sunday, September 2, 2007

Zombie Processes


Using fork to create processes can be very useful, but you must keep track of child processes.

Some information is associated with the termination of a child process, such as whether it exited normally and, if so, what its exit status was.

When a child process terminates, an association with its parent survives until the parent in turn either terminates normally or calls wait.

If a child process terminates while its parent is calling a wait function, the child process vanishes and its termination status is passed to its parent via the wait call.

But what happens when a child process terminates and the parent is not calling wait? Does it simply vanish? No, because then information about its termination (such as whether it exited normally and, if so, what its exit status is) would be lost. Instead, when a child process terminates, it becomes a zombie process.

The child process entry in the process table is therefore not freed up immediately. Although no longer active, the child process is still in the system because its exit code needs to be stored in case the parent subsequently calls wait. It becomes what is known as a defunct, or zombie, process. For example, in a fork program where the child prints fewer messages than the parent, the child will finish first and will exist as a zombie until the parent has finished.

If the parent then terminates abnormally, the child process automatically gets the process with PID 1 (init) as its parent. The child is now a zombie that is no longer running but has been inherited by init because of the abnormal termination of its parent. More generally, when a program exits, its children are inherited by a special process, the init program, which always runs with a process ID of 1 (it's the first process started when Linux boots). The init process automatically cleans up any zombie child processes that it inherits, and a zombie remains in the process table until init collects it. The bigger the table, the slower this procedure. You need to avoid zombie processes, as they consume resources until init cleans them up.

A zombie process is a process that has terminated but has not been cleaned up yet. It is the responsibility of the parent process to clean up its zombie children. The wait functions do this, too, so it’s not necessary to track whether your child process is still executing before waiting for it. Suppose, for instance, that a program forks a child process, performs some other computations, and then calls wait. If the child process has not terminated at that point, the parent process will block in the wait call until the child process finishes. If the child process finishes before the parent process calls wait, the child process becomes a zombie. When the parent process calls wait, the zombie child’s termination status is extracted, the child process is deleted, and the wait call returns immediately.
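The fork, compute, then wait sequence described above can be sketched as follows. This is only an illustration: the helper name run_child_and_collect is hypothetical, and the child does nothing except exit with the requested code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with exit_code, then
   collect it with wait. Returns the child's exit status, or -1 if fork
   or wait fails or the child did not exit normally. */
int run_child_and_collect (int exit_code)
{
  int status;
  pid_t child_pid = fork ();

  if (child_pid < 0)
    return -1;
  if (child_pid == 0)
    /* Child: terminate immediately. */
    _exit (exit_code);

  /* Parent: other computations could go here. If the child finishes
     first, it remains a zombie until the wait call below. */
  if (wait (&status) != child_pid)
    return -1;

  /* Only a normally exited child has a meaningful exit status. */
  if (WIFEXITED (status))
    return WEXITSTATUS (status);
  return -1;
}
```

Whether the child is still running or already a zombie when wait is called, the result is the same: the parent gets the termination status and the process table entry is released.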

(zombie.c) Making a Zombie Process

#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main ()
{
  pid_t child_pid;

  /* Create a child process. */
  child_pid = fork ();
  if (child_pid > 0)
    {
      /* This is the parent process. Sleep for a minute. */
      sleep (60);
    }
  else
    {
      /* This is the child process. Exit immediately. */
      exit (0);
    }
  return 0;
}
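One way to confirm that a terminated child really is a zombie is to look at its state before the parent reaps it. The sketch below is Linux-specific and makes a few assumptions: /proc is mounted, the helper names process_state and zombie_state_before_wait are hypothetical, and waitid with WNOWAIT is used only to wait for the child to terminate without cleaning it up.

```c
#define _POSIX_C_SOURCE 200809L  /* Request waitid and related declarations. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the one-letter state from /proc/<pid>/stat (Linux-specific).
   Returns the state character, or 0 if it cannot be read. */
char process_state (pid_t pid)
{
  char path[64];
  char line[512];
  char *after_name;
  FILE *stat_file;

  snprintf (path, sizeof (path), "/proc/%ld/stat", (long) pid);
  stat_file = fopen (path, "r");
  if (stat_file == NULL)
    return 0;
  if (fgets (line, sizeof (line), stat_file) == NULL)
    {
      fclose (stat_file);
      return 0;
    }
  fclose (stat_file);

  /* The line looks like "pid (name) state ...". The name may itself
     contain ')', so search for the last one. */
  after_name = strrchr (line, ')');
  if (after_name == NULL || after_name[1] != ' ')
    return 0;
  return after_name[2];
}

/* Hypothetical demo: returns the state of a terminated child that has
   not been reaped yet, then reaps it. */
char zombie_state_before_wait (void)
{
  siginfo_t info;
  char state;
  pid_t child_pid = fork ();

  if (child_pid < 0)
    return 0;
  if (child_pid == 0)
    _exit (0);

  /* Block until the child has terminated, but leave it unreaped. */
  if (waitid (P_PID, (id_t) child_pid, &info, WEXITED | WNOWAIT) != 0)
    return 0;
  state = process_state (child_pid);

  /* Clean up the zombie. */
  waitpid (child_pid, NULL, 0);
  return state;
}
```

On Linux the state letter for a zombie is Z, which is the same marker that ps shows in its status column for a defunct process.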

Cleaning Up Children Asynchronously

If you’re using a child process simply to exec another program, it’s fine to call wait immediately in the parent process, which will block until the child process completes. But often, you’ll want the parent process to continue running while one or more children execute concurrently. How can you be sure that you clean up child processes that have completed so that you don’t leave zombie processes, which consume system resources, lying around?

One approach would be for the parent process to call wait3 or wait4 periodically, to clean up zombie children. Calling wait for this purpose doesn’t work well because, if no children have terminated, the call will block until one does. However, wait3 and wait4 take an additional flag parameter, to which you can pass the flag value WNOHANG. With this flag, the function runs in nonblocking mode: it will clean up a terminated child process if there is one, or simply return if there isn’t. The return value of the call is the process ID of the terminated child in the former case, or zero in the latter case.
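A minimal sketch of the periodic, nonblocking cleanup follows. Note the substitution: it uses waitpid with WNOHANG, the POSIX call that accepts the same flag, rather than wait3 or wait4. The function names and the 10 ms polling interval are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* Request nanosleep and related declarations. */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Reap every child that has already terminated, without blocking.
   Returns the number of children cleaned up by this call. */
int reap_finished_children (void)
{
  int reaped = 0;
  int status;

  /* waitpid returns a PID for each zombie it cleans up, 0 if children
     exist but none has finished, and -1 if there are no children. */
  while (waitpid (-1, &status, WNOHANG) > 0)
    ++reaped;
  return reaped;
}

/* Hypothetical driver: start count short-lived children, then poll
   periodically until all of them have been reaped. */
int spawn_and_poll (int count)
{
  struct timespec pause_time = { 0, 10 * 1000 * 1000 };  /* 10 ms */
  int started = 0;
  int reaped = 0;
  int i;

  for (i = 0; i < count; ++i)
    {
      pid_t pid = fork ();
      if (pid == 0)
        _exit (0);
      if (pid > 0)
        ++started;
    }

  while (reaped < started)
    {
      /* The parent's real work would go here. */
      nanosleep (&pause_time, NULL);
      reaped += reap_finished_children ();
    }
  return reaped;
}
```

The drawback of polling is that a zombie can linger for up to one polling interval, and the parent has to remember to call the cleanup function regularly.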

A more elegant solution is to notify the parent process when a child terminates. There are several ways to do this using the methods discussed in “Interprocess Communication,” but fortunately Linux does this for you, using signals. When a child process terminates, Linux sends the parent process the SIGCHLD signal. The default disposition of this signal is to do nothing, which is why you might not have noticed it before. Thus, an easy way to clean up child processes is by handling SIGCHLD. Of course, when cleaning up the child process, it’s important to store its termination status if this information is needed, because once the process is cleaned up using wait, that information is no longer available. Listing 3.7 shows a program that uses a SIGCHLD handler to clean up its child processes.

Listing 3.7 (sigchld.c) Cleaning Up Children by Handling SIGCHLD

#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

sig_atomic_t child_exit_status;

void clean_up_child_process (int signal_number)
{
  /* Clean up the child process. */
  int status;
  wait (&status);
  /* Store its exit status in a global variable. */
  child_exit_status = status;
}

int main ()
{
  /* Handle SIGCHLD by calling clean_up_child_process. */
  struct sigaction sigchld_action;
  memset (&sigchld_action, 0, sizeof (sigchld_action));
  sigchld_action.sa_handler = &clean_up_child_process;
  sigaction (SIGCHLD, &sigchld_action, NULL);

  /* Now do things, including forking a child process. */
  /* ... */

  return 0;
}

Note how the signal handler stores the child process’s exit status in a global variable, from which the main program can access it. Because the variable is assigned in a signal handler, its type is sig_atomic_t.
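When several children can terminate close together, their SIGCHLD signals may be merged into a single delivery, so a handler that calls wait once can leave zombies behind. The variant below is a sketch, not the listing's method: the handler loops over waitpid with WNOHANG, and the names reap_all_children, install_sigchld_handler, and spawn_and_count are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* Request sigaction and related declarations. */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

volatile sig_atomic_t children_reaped = 0;
volatile sig_atomic_t last_exit_status = 0;

/* SIGCHLD handler: reap every child that has terminated so far. */
void reap_all_children (int signal_number)
{
  int saved_errno = errno;  /* waitpid may overwrite errno. */
  int status;

  (void) signal_number;
  while (waitpid (-1, &status, WNOHANG) > 0)
    {
      last_exit_status = status;
      ++children_reaped;
    }
  errno = saved_errno;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_sigchld_handler (void)
{
  struct sigaction sigchld_action;

  memset (&sigchld_action, 0, sizeof (sigchld_action));
  sigchld_action.sa_handler = &reap_all_children;
  sigchld_action.sa_flags = SA_RESTART;
  return sigaction (SIGCHLD, &sigchld_action, NULL);
}

/* Hypothetical driver: fork count children that exit at once, then
   sleep until the handler has reaped all of them. */
int spawn_and_count (int count)
{
  sigset_t block_chld, old_mask, wait_mask;
  int started = 0;
  int i;

  children_reaped = 0;
  if (install_sigchld_handler () != 0)
    return -1;

  /* Block SIGCHLD so that it is delivered only inside sigsuspend. */
  sigemptyset (&block_chld);
  sigaddset (&block_chld, SIGCHLD);
  sigprocmask (SIG_BLOCK, &block_chld, &old_mask);
  wait_mask = old_mask;
  sigdelset (&wait_mask, SIGCHLD);

  for (i = 0; i < count; ++i)
    {
      pid_t pid = fork ();
      if (pid == 0)
        _exit (0);
      if (pid > 0)
        ++started;
    }

  /* Atomically unblock SIGCHLD and wait for the handler to run. */
  while (children_reaped < started)
    sigsuspend (&wait_mask);

  sigprocmask (SIG_SETMASK, &old_mask, NULL);
  return children_reaped;
}
```

Blocking SIGCHLD around the fork loop and waiting with sigsuspend avoids the race in which the signal arrives between checking the counter and going to sleep.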

SIGCHLD can be useful for managing child processes. It’s ignored by default. The other job-control signals (SIGSTOP, SIGTSTP, SIGTTIN, and SIGTTOU) cause the process receiving them to stop, while SIGCONT causes a stopped process to resume. They are used by shell programs for job control and are rarely used by user programs.
