ppss/wiki/Design.wiki

#summary Design and technical overview.
#labels Phase-Design

= Introduction =

This wiki page describes how PPSS is designed, how it works and which techniques are used.

= Design =

There are two main ingredients that must be supplied to PPSS:

  # A list of items that must be processed:
    * either a text file containing one item per line. These items can represent whatever you want;
    * or a directory containing files that must be processed.
  # A command that must be executed for each item.

For every item, the specified command is executed with the item supplied as an argument.

The design has to meet the following requirements:

  * At any given moment, the number of commands running in parallel never exceeds the limit specified on the command line or, if no limit is specified, the detected number of CPU cores.
  * Two processes running in parallel must never interfere or collide with each other by processing the same item.
  * PPSS should not poll; it should wait for events to occur and do nothing when there is nothing to do.

One of the main difficulties for shell scripts is inter-process communication: the shell offers no dedicated mechanism for child and parent processes to exchange messages. One solution might be to use signals, with the 'trap' command catching events. However, tests have shown that this is not reliable. While bash is processing a trap, additional signals that arrive are ignored, so events can be lost. It is therefore not possible to build a reliable mechanism on signals. There is a parallel processing shell script available on the web that is based on signals, and it suffers from exactly this problem.

However, repeated tests have determined that communication between processes using a FIFO named pipe is reliable and can be used for inter-process communication.
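The following is a minimal sketch of this kind of communication, not code taken from PPSS. A background listener blocks on a named pipe until a line arrives, so nothing is polled. The file names, the `listener` function and the `STOP` message are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: names, paths and the STOP message are illustrative assumptions.

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
fifo="$workdir/fifo"
log="$workdir/received.log"
mkfifo "$fifo"

# Listener: 'read' blocks until a full line is available on the FIFO.
listener() {
    local msg
    while read -r msg; do
        [ "$msg" = "STOP" ] && break
        echo "got: $msg" >> "$log"
    done < "$fifo"
}

listener &
listener_pid=$!

# Writer: open the FIFO once and send a few messages.
{
    echo "item-1"
    echo "item-2"
    echo "STOP"
} > "$fifo"

wait "$listener_pid"
cat "$log"
```

The writer keeps the FIFO open for all three messages. If every message were written with a separate open and close, the listener could see end-of-file between two messages, which a real implementation has to account for.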

== Technical design ==

http://home.quicknet.nl/mw/prive/nan1/got/process.png

=== Function: get_all_items ===

The first step of PPSS is to read all items that must be processed into an array. This array will be used by the get_item function.
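As an illustration, reading the items into an array could look roughly like the sketch below. The array name `ITEMS`, the argument handling and the choice to store full paths for directory input are assumptions of this example, not details confirmed by PPSS.

```shell
#!/usr/bin/env bash
# Sketch only: ITEMS and the argument handling are illustrative assumptions.

ITEMS=()

# get_all_items <file-or-directory>
get_all_items() {
    local input=$1 line path
    ITEMS=()
    if [ -d "$input" ]; then
        # Directory: every regular file in it becomes an item.
        for path in "$input"/*; do
            if [ -f "$path" ]; then
                ITEMS+=("$path")
            fi
        done
    elif [ -f "$input" ]; then
        # Text file: one item per line (a last line without newline is kept).
        while IFS= read -r line || [ -n "$line" ]; do
            ITEMS+=("$line")
        done < "$input"
    else
        echo "no such file or directory: $input" >&2
        return 1
    fi
    return 0
}

# Usage example with a temporary item list.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf '%s\n' alpha beta gamma > "$workdir/items.txt"
get_all_items "$workdir/items.txt"
echo "loaded ${#ITEMS[@]} items, first is ${ITEMS[0]}"
```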

=== Function: listen_for_job ===

The second step is to start the listener. This is a process running in the background that listens on a FIFO special file (named pipe).

For every message that is received, the listener executes a function (called 'commando') as a background process (with '&'), with the received message supplied as an argument to this function.

The whole function is executed as a background process, not just the user-supplied command.
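A listener of this kind might be sketched as follows. The body of `commando` is a stand-in that only writes a marker file, and the `STOP` message used to end the demo is an assumption; this page does not say how the real listener is stopped.

```shell
#!/usr/bin/env bash
# Sketch only: the commando body and the STOP message are illustrative assumptions.

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
FIFO="$workdir/fifo"
mkfifo "$FIFO"

# Stand-in for the function that processes one item.
commando() {
    local item=$1
    echo "processed $item" > "$workdir/$item.done"
}

listen_for_job() {
    local msg
    while read -r msg; do
        [ "$msg" = "STOP" ] && break
        # The whole function is put in the background, not just a command in it.
        commando "$msg" &
    done < "$FIFO"
    # Let the background jobs finish before the listener exits.
    wait
}

listen_for_job &
listener_pid=$!

# Send two items and the stop message through a single open of the FIFO.
printf '%s\n' a b STOP > "$FIFO"
wait "$listener_pid"
ls "$workdir"
```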

=== Function: start_all_workers ===

For every available CPU core, a parallel thread will be started. If the user specifies a number of threads manually, that number overrides the detected number of cores.

To start each thread, the start_single_worker function is called. This function requests an item with the get_item function and sends the item to the FIFO. There, it is picked up by the listener process, which executes the commando function to process the item.
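The sketch below shows one way the number of threads could be chosen and the workers started. It assumes `getconf _NPROCESSORS_ONLN` is available (it is on Linux and macOS); the helper names `detect_cores` and `number_of_workers` are made up for this example, and `start_single_worker` is replaced by a stub that only counts calls.

```shell
#!/usr/bin/env bash
# Sketch only: detect_cores, number_of_workers and the stub are illustrative assumptions.

# Detect the number of online CPU cores, falling back to 1.
detect_cores() {
    local n
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case $n in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

# A value given by the user overrides the detected number of cores.
number_of_workers() {
    local requested=$1
    if [ -n "$requested" ]; then
        echo "$requested"
    else
        detect_cores
    fi
}

# Stub: the real function would request an item and send it to the FIFO.
STARTED=0
start_single_worker() {
    STARTED=$((STARTED + 1))
}

start_all_workers() {
    local count=$1 i
    for ((i = 0; i < count; i++)); do
        start_single_worker
    done
}

start_all_workers "$(number_of_workers 3)"
echo "workers started: $STARTED"
echo "detected cores: $(detect_cores)"
```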

=== commando function ===

The commando function performs the following tasks:

  * Check whether the supplied item has already been processed; if so, skip it.
  * Execute the user-supplied command with the item as an argument.
  * Execute the start_single_worker function to start a new job for a new item.

The third task is the most relevant: after the command finishes, commando calls the start_single_worker function, which keeps the cycle going.
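These three tasks can be sketched as below. How PPSS actually records that an item has been processed is not described on this page; the sketch assumes one lock directory per item, relying on `mkdir` succeeding for only one caller. `LOCK_DIR`, `COMMAND` and the stubbed `start_single_worker` are likewise assumptions, and item names containing a slash would need extra handling.

```shell
#!/usr/bin/env bash
# Sketch only: the mkdir-based check, LOCK_DIR and COMMAND are illustrative assumptions.

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
LOCK_DIR="$workdir/locks"
mkdir -p "$LOCK_DIR"

COMMAND="echo"     # stand-in for the user-supplied command
STARTED=0

# Stub: the real function would request the next item and send it to the FIFO.
start_single_worker() {
    STARTED=$((STARTED + 1))
}

commando() {
    local item=$1
    # mkdir either creates the directory or fails, so only the first caller
    # for a given item gets to run the command.
    if mkdir "$LOCK_DIR/$item" 2>/dev/null; then
        $COMMAND "$item" >> "$workdir/output.log"
    fi
    # Always hand over to the next item, whether this one was run or skipped.
    start_single_worker
}

commando "photo.jpg"
commando "photo.jpg"    # second call is skipped
commando "notes.txt"
cat "$workdir/output.log"
echo "start_single_worker calls: $STARTED"
```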

=== start_single_worker function ===

The start_single_worker function requests an item with the get_item function. If the get_item function returns an item, this item is echoed to the FIFO special file that the listener is reading from. The listener then picks up that item, and the whole cycle repeats.

Once all items have been processed, the get_item function returns no value and exits with a non-zero return code. No new item is echoed to the FIFO special file and the cycle stops. One by one, all cycles that were initiated by the start_all_workers function die out.
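The behaviour described above can be tried out in a single shell with the sketch below. To keep the demo free of background processes, the FIFO is opened read-write on file descriptor 3 (this works on Linux and macOS), so the script can both queue items and read them back itself. The variable names are assumptions, and the sketch does not address how the array position is shared between separate processes.

```shell
#!/usr/bin/env bash
# Sketch only: single-shell demo; variable names and fd 3 are illustrative assumptions.

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
FIFO="$workdir/fifo"
mkfifo "$FIFO"
exec 3<> "$FIFO"    # read-write open, so small writes do not block

ITEMS=("one" "two")
ARRAY_POINTER=0
ITEM=""

# Put the next item in ITEM, or return non-zero when the list is exhausted.
get_item() {
    [ "$ARRAY_POINTER" -lt "${#ITEMS[@]}" ] || return 1
    ITEM=${ITEMS[$ARRAY_POINTER]}
    ARRAY_POINTER=$((ARRAY_POINTER + 1))
}

# Send the next item to the FIFO; do nothing and fail if there is none.
start_single_worker() {
    if get_item; then
        echo "$ITEM" >&3
    else
        return 1
    fi
}

for attempt in 1 2 3; do
    if start_single_worker; then
        echo "attempt $attempt: queued $ITEM"
    else
        echo "attempt $attempt: no items left"
    fi
done

# Read back what a listener would have received.
read -r -u 3 first
read -r -u 3 second
echo "received: $first $second"
```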

=== get_item function ===

All items are read from the user-supplied file or directory into an array. When an item is requested, it is read from the array and an array_pointer is incremented, so that the next time the function is executed, the next item on the list is returned.
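A minimal sketch of such a function is shown below. The names `ITEMS`, `ARRAY_POINTER` and `ITEM` are assumptions. The item is handed back in a global variable rather than on standard output, because calling the function as `$(get_item)` would run it in a subshell and the pointer update would be lost.

```shell
#!/usr/bin/env bash
# Sketch only: ITEMS, ARRAY_POINTER and ITEM are illustrative assumptions.

ITEMS=("alpha" "beta" "gamma")
ARRAY_POINTER=0
ITEM=""

# Store the next item in ITEM and advance the pointer.
# Return non-zero once every item has been handed out.
get_item() {
    if [ "$ARRAY_POINTER" -ge "${#ITEMS[@]}" ]; then
        return 1
    fi
    ITEM=${ITEMS[$ARRAY_POINTER]}
    ARRAY_POINTER=$((ARRAY_POINTER + 1))
    return 0
}

while get_item; do
    echo "next item: $ITEM"
done
echo "pointer: $ARRAY_POINTER"
```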