Wednesday, August 26, 2009

How do you prevent zombie processes?

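After running the ex1 program fragment, I tried to terminate the prog1 process with kill.
Checking the process list afterwards (ps -ef) showed an entry named [prog1].
No matter how many times I killed prog1 after that, it would not go away. It disappeared only when main() itself was terminated...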

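It turns out that this entry is a zombie process. So what is a zombie process?
In a fork()/execve() sequence, suppose the child terminates while the parent is still running.
If the parent neither installed a SIGCHLD handler that calls waitpid() to collect the child before calling fork(),
nor set that signal to be ignored, the child becomes a zombie: it has already stopped running,
but its process-table entry stays until the parent collects its exit status.
Because the process is already dead, even kill -9 as root cannot remove a zombie.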

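The remedy is to kill the zombie's parent (a zombie's parent necessarily still exists).
The zombie then becomes an "orphan process" and is handed over to init, the process with pid=1, which always reaps the zombies it inherits.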


Example that produces zombie processes:

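#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int forkExecProc (char *prog, char **arg_list);

int main (int argc, char *argv[])
{
    /* programs to launch; execvp() looks them up via PATH */
    char *prog_list[] = {"prog1", "prog2", "prog3"};

    for (int i=0; i<3; i++) {
        char *arg_list[] = {prog_list[i], NULL};
        forkExecProc (prog_list[i], arg_list);
    }

    /* the parent keeps running and never waits for its children,
       so every child that exits stays behind as a zombie */
    while (true) {
        sleep (2);
    }

    return 0;
}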

int forkExecProc (char *prog, char **arg_list)
{
    pid_t child;

    /* fork() failed: no child was created */
    if ((child = fork ()) < 0) {
        perror ("fork error");
    } else if (child == 0) { /* in the child */
        fprintf (stdout, "fork to execute : [%s]\n", arg_list[0]);
        fflush (stdout); /* buffered output would be lost across exec */
        /* replace the current process image with a new process image */
        execvp (prog, arg_list);

        /* execvp() returns only on error */
        perror ("execvp error");
        _exit (127); /* report failure; do not return into the parent's code */
    }

    return child;
}

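Solution 1.
Set the disposition of SIGCHLD to SIG_IGN in the parent, so that the parent automatically ignores state changes of its children.
This may not be suitable for a long-running resident program, though, since I cannot rule out that it ends up consuming system resources.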

#include <signal.h>

int main (int argc, char *argv[])
{
    signal (SIGCHLD, SIG_IGN); /* ignore SIGCHLD */
    char *prog_list[] = {"prog1", "prog2", "prog3"};

    for (int i=0; i<3; i++) {
        char *arg_list[] = {prog_list[i], NULL};
        forkExecProc (prog_list[i], arg_list);
    }

    while (true) {
        sleep (2);
    }
    return 0;
}
int forkExecProc (char *prog, char **arg_list)
{
...
}
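
To see what ignoring SIGCHLD actually does, here is a small sketch. It assumes a system that follows POSIX.1-2001 (Linux does); the function name ignore_sigchld_demo is made up. On such systems a child that exits while SIGCHLD is ignored is discarded right away, so a later waitpid() has nothing to collect and fails with ECHILD. The price is that the parent can no longer read the child's exit status.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if waitpid() fails with ECHILD,
   0 if it unexpectedly succeeds, -1 on setup errors. */
int ignore_sigchld_demo(void)
{
    void (*old_handler)(int) = signal(SIGCHLD, SIG_IGN);
    int result = -1;
    pid_t pid;

    if (old_handler == SIG_ERR)
        return -1;

    pid = fork();
    if (pid == 0)
        _exit(3);               /* child: terminate immediately */

    if (pid > 0) {
        int status;
        pid_t r;

        /* With SIGCHLD ignored, this blocks until the child is gone
           and then fails, because no zombie was left to collect. */
        errno = 0;
        r = waitpid(pid, &status, 0);
        result = (r == -1 && errno == ECHILD) ? 1 : 0;
    }

    signal(SIGCHLD, old_handler);   /* restore the previous disposition */
    return result;
}
```

If this returns 1 on your system, children are being cleaned up without leaving zombies. Whether that is acceptable depends on whether the parent needs the exit status.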

Solution 2.
Install a SIGCHLD handler in the parent and call waitpid() inside it to collect the exit status of each child.


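#include <errno.h>
#include <sys/wait.h>

void sig_fork(int signo)
{
    int saved_errno = errno; /* waitpid() may overwrite errno */
    int stat;

    (void)signo;
    /* Reap every child that has already exited. WNOHANG keeps waitpid()
       from blocking when no exited child is left. The loop is needed
       because several exits can be merged into a single SIGCHLD. */
    while (waitpid(-1, &stat, WNOHANG) > 0)
        ;

    errno = saved_errno;
}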


int main (int argc, char *argv[])
{
    /* install the SIGCHLD handler, which calls waitpid() to collect each child's exit status */
    signal (SIGCHLD, sig_fork);

    char *prog_list[] = {"prog1", "prog2", "prog3"};

    ...

    return 0;
}


int forkExecProc (char *prog, char **arg_list)
{
    pid_t child;

    /* fork() failed: no child was created */
    if ((child = fork ()) < 0) {
        perror ("fork error");
    } else if (child == 0) { /* in the child */
        fprintf (stdout, "fork to execute : [%s]\n", arg_list[0]);
        fflush (stdout); /* buffered output would be lost across exec */
        /* replace the current process image with a new process image */
        execvp (prog, arg_list);

        /* execvp() returns only on error */
        perror ("execvp error");
        _exit (127); /* report failure; do not return into the parent's code */
    }

    /* Non-blocking check for children that have already changed state. */
    /* Children that exit later are collected by the handler installed with signal (SIGCHLD, xxx). */
    waitpid (-1, NULL, WNOHANG);

    return child;
}

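Solution 3.
Call fork() twice. The parent calls fork() (first time) to create a child, and the child calls fork() again (second time) to create a grandchild.
The child then terminates at once. The grandchild is now an "orphan process", so init takes it over and becomes its parent,
and init takes care of the SIGCHLD signal on its own when the grandchild later exits.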


int main (int argc, char *argv[])
{
    char *prog_list[] = {"prog1", "prog2", "prog3"};

    for (int i=0; i<3; i++) {
        char *arg_list[] = {prog_list[i], NULL};
        forkExecProc (prog_list[i], arg_list);
    }

    while (true) {
        sleep (2);
    }

    return 0;
}

int forkExecProc (char *prog, char **arg_list)
{
    pid_t child;

    /* first fork() failed: no child was created */
    if ((child = fork ()) < 0) {
        perror ("fork error");
        return -1; /* nothing to wait for */
    } else if (child == 0) { /* in the first child */

        /* second fork() creates the grandchild */
        if ((child = fork ()) < 0) {
            perror ("fork error");
            _exit (1); /* must not fall through into the parent's code */
        }
        else if (child > 0) { /* still in the first child, which is the grandchild's parent */
            /* terminate the first child so that the grandchild's parent becomes init */
            _exit (0);
        }
        else { /* in the grandchild */
            /* the grandchild carries on with the steps below */
            fprintf (stdout, "fork to execute : [%s]\n", arg_list[0]);
            fflush (stdout); /* buffered output would be lost across exec */
            /* replace the current process image with a new process image */
            execvp (prog, arg_list);

            /* execvp() returns only on error */
            perror ("execvp error");
            _exit (127);
        }
    }

    /* parent: block until the first child terminates, and collect its status */
    waitpid (child, NULL, 0);

    /* note: this is the pid of the first child (already reaped), not of the launched program */
    return child;
}
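
The re-parenting that Solution 3 relies on can be observed with a pipe. This is a sketch only: the name double_fork_demo is made up, and the grandchild reports its parent pid instead of calling execvp(). On some modern Linux setups the new parent is a "subreaper" process rather than pid 1, so the check below only verifies that the grandchild no longer belongs to the first child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the grandchild was re-parented away
   from the first child, 0 if not, -1 on error. */
int double_fork_demo(void)
{
    int fds[2];
    pid_t child, reported = -1;
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;

    child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {                       /* first child */
        pid_t first = getpid();
        pid_t grandchild = fork();

        if (grandchild != 0)                /* first child (or fork failure) */
            _exit(grandchild < 0 ? 1 : 0);

        /* Grandchild: wait (about 5 s at most) until the first child is gone. */
        struct timespec one_ms = {0, 1000000};
        for (int i = 0; i < 5000 && getppid() == first; i++)
            nanosleep(&one_ms, NULL);

        pid_t now = getppid();              /* the adopting process */
        close(fds[0]);
        if (write(fds[1], &now, sizeof now) != (ssize_t)sizeof now)
            _exit(1);
        _exit(0);
    }

    /* Original parent: reap the first child, the only child it has. */
    close(fds[1]);
    waitpid(child, NULL, 0);

    n = read(fds[0], &reported, sizeof reported);
    close(fds[0]);
    if (n != (ssize_t)sizeof reported)
        return -1;

    return reported != child ? 1 : 0;
}
```

After the single waitpid() the original parent has no children left, so nothing it started can turn into a zombie under it. The cost is one extra fork() per launched program.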
