Fishtrap

php and other stuff I know

Parallel PHP processes with pcntl_fork


Forking a PHP process requires the pcntl extension, which is only available on *nix operating systems. Once it is installed, the key function is pcntl_fork(). The manual contains this excellent example of its use:

$pid = pcntl_fork();
if ($pid == -1) {
        die('could not fork');
} else if ($pid) {
        // we are the parent
        pcntl_wait($status); // Protect against zombie children
} else {
        // we are the child
}

If we insert a known delay into both branches and time the script, we can get an idea of whether the two processes are running in parallel.

$pid = pcntl_fork();
if ($pid) {
        // parent
        sleep(10);
        pcntl_wait($status);
} else {
        // child
        sleep(10);
}

Mr-McHughs-MacBook-Pro:$ time php pcntl_fork.php
real	0m10.060s
user	0m0.020s
sys	0m0.025s

Excellent: both branches sleep for 10 seconds, yet the script finishes in about 10 seconds rather than 20, so it would appear we are running two PHP processes in parallel. Using sleep() in this way models an expensive, parallelisable function call, for instance a heavy calculation or an HTTP request. By using a known delay instead of the real call, we can work out how long the script should take to execute and check our logic.
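The same check can be done from inside the script rather than with the shell's time command. The following is only a sketch: fake_work() and time_forked_pair() are hypothetical helper names invented for this illustration, and the delay is shortened to 2 seconds.

```php
<?php
// Hypothetical helper: stands in for an expensive, parallelisable call.
function fake_work(int $seconds): void
{
    sleep($seconds);
}

// Hypothetical helper: runs fake_work() once in the parent and once in a
// forked child, and returns the wall-clock time the parent observed.
function time_forked_pair(int $seconds): float
{
    $start = microtime(true);
    $pid = pcntl_fork();
    if ($pid == -1) {
        die('could not fork');
    } else if ($pid) {
        // we are the parent: do our share, then reap the child
        fake_work($seconds);
        pcntl_wait($status);
    } else {
        // we are the child: do our share and stop here
        fake_work($seconds);
        exit(0);
    }
    return microtime(true) - $start;
}

$elapsed = time_forked_pair(2);
printf("elapsed: %.2fs%s", $elapsed, PHP_EOL);
```

If the two calls really overlap, the elapsed time should be close to the single delay rather than twice that.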

What about more than two processes?

The first thing I tried was to put the previous code in a for loop with a couple of obvious alterations:

for ($i=0; $i < 5; $i++) {
        $pid = pcntl_fork();
        if ($pid) {
                pcntl_wait($status);
        } else {
                echo 'starting child ',$i,PHP_EOL;
                sleep(10);
                die();
        }
}

The major alteration you will notice is that we exit (die) after each child has done its thing. Otherwise each child would carry on through the remaining iterations of the for loop, forking children of its own, and the script would run for a very long time. The other change is that we are no longer doing any work in the parent. There is no real reason for this, other than that we would otherwise always have to remember to add one to the number of calls we were expecting to make. Even with these alterations this script has a major problem, one that you get a good idea about if you watch it run.
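To get a feel for why the exit matters, here is a rough model of what happens without it. It does no real forking; count_processes() is a hypothetical function that only does the arithmetic, assuming every fork succeeds and no process exits before the loop ends.

```php
<?php
// Hypothetical model, no real forking: counts how many processes exist
// at the end of the loop if nobody exits after pcntl_fork().
function count_processes(int $iterations): int
{
    $processes = 1; // the original script
    for ($i = 0; $i < $iterations; $i++) {
        // every live process forks once per iteration, so the total doubles
        $processes *= 2;
    }
    return $processes;
}

echo count_processes(5), PHP_EOL; // 32 processes for the 5-iteration loop
```

So instead of the 5 children we intended, the 5-iteration loop would end up with 31 children plus the original parent.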

Mr-McHughs-MacBook-Pro:$ time php pcntl_fork.php
starting child 0
starting child 1
starting child 2
starting child 3
starting child 4
real	0m50.113s
user	0m0.038s
sys	0m0.056s

Watching it in the terminal, it becomes obvious that each child only gets started after the previous one has finished. The problem is the call to pcntl_wait($status) in the parent section: it blocks the parent until a child has exited, so the parent cannot fork the next child until the current one is done.

The solution is to keep all calls to pcntl_wait in the parent but move them outside the loop that forks each process. The fixed version of the script above is:

for ($i=0; $i < 5; $i++) {
        $pid = pcntl_fork();
        if (!$pid) {
                // we are the child
                echo 'starting child ',$i,PHP_EOL;
                sleep(10);
                die();
        }
}
// back in the parent: wait once for each child we forked
for ($i=0; $i < 5; $i++) {
        pcntl_wait($status);
}

This way we run all 5 sleep calls in parallel:

Mr-McHughs-MacBook-Pro:$ time php pcntl_fork.php
starting child 0
starting child 1
starting child 2
starting child 3
starting child 4

real	0m10.077s
user	0m0.037s
sys	0m0.056s

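The neatest solution is to use an array to register the pid of each child we create, and then check it to see whether any children are still running. The parent registers each pid as it forks and removes it again once pcntl_wait() reports that the child has exited:

<?php
$forks = array();
for ($i=0; $i < 5; $i++) {
        $pid = pcntl_fork();
        if ($pid == -1) {
                die('could not fork');
        } else if (0 === $pid) {
                // we are the child
                echo 'starting child ',$i,PHP_EOL;
                sleep(10);
                die();
        }
        // we are the parent: register the child's pid
        $forks[$pid] = true;
}
while (count($forks) > 0) {
        // pcntl_wait() returns the pid of the child that exited
        $pid = pcntl_wait($status);
        if ($pid == -1) {
                break; // no children left to wait for
        }
        unset($forks[$pid]);
}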
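The pid-registration idea can also be wrapped in a reusable function. This is only a sketch under a few assumptions: run_parallel() is a hypothetical name, each task is a plain PHP callable, and the parent is the one that records every pid returned by pcntl_fork() and removes it when pcntl_wait() reports that child.

```php
<?php
// Hypothetical helper (not from the manual): forks one child per task,
// registers each pid, and waits until every registered child has exited.
// Returns an array mapping child pid => exit code (null if the child did
// not exit normally).
function run_parallel(array $tasks): array
{
    $forks = array();
    foreach ($tasks as $task) {
        $pid = pcntl_fork();
        if ($pid == -1) {
            die('could not fork');
        } else if (0 === $pid) {
            // child: do the work, then exit so we never re-enter the loop
            $task();
            exit(0);
        }
        // parent: remember the child we just created
        $forks[$pid] = true;
    }

    $exitCodes = array();
    while (count($forks) > 0) {
        $pid = pcntl_wait($status);
        if ($pid == -1) {
            break; // nothing left to wait for
        }
        if (isset($forks[$pid])) {
            $exitCodes[$pid] = pcntl_wifexited($status)
                ? pcntl_wexitstatus($status)
                : null;
            unset($forks[$pid]);
        }
    }
    return $exitCodes;
}

$start = microtime(true);
$codes = run_parallel(array(
    function () { sleep(2); },
    function () { sleep(2); },
    function () { sleep(2); },
));
printf("%d children finished in %.2fs%s", count($codes), microtime(true) - $start, PHP_EOL);
```

Keying the array by pid means the parent only has to unset one entry per finished child, and the loop ends as soon as the register is empty.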
