From 67835f0c97fa25509473bfa6163c8d8d7970fe71 Mon Sep 17 00:00:00 2001
From: Louwrentius
Date: Sun, 18 Jul 2010 13:03:11 +0000
Subject: [PATCH] Edited wiki page through web user interface.

---
 wiki/Manual1.wiki | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/wiki/Manual1.wiki b/wiki/Manual1.wiki
index faf481c..58d6d74 100644
--- a/wiki/Manual1.wiki
+++ b/wiki/Manual1.wiki
@@ -261,16 +261,31 @@ Next, just run PPSS as usual.

 == Daemon mode (2.63 and onward) ==

-PPSS can be run as a daemon, monitoring a file or directory for new items. If (new) input is found, it is processed.
+PPSS can be run as a daemon, monitoring a file or directory for new items. If (new) input is found, it is processed. If multiple items are put into the directory at once, they are processed in parallel.

 === IMPORTANT ===

-Before you put a file in a directory monitored by PPSS, you must create a directory (mkdir) called "INPUT_LOCK" inside this directory. Once the file transfer is complete, you can remove this directory. This is required to prevent race conditions. It can get messy if PPSS starts processing a file while the file in question is still being copied into the directory.
+There is a risk that as soon as PPSS detects a new file, it starts processing it while the file has not yet been fully written to disk. To prevent this, PPSS uses the 'stat' command to determine the time since the file was last modified. By default, a file must have an age of 4 seconds before it is processed. To wait for a longer or shorter period, use the --file-age (seconds) parameter. The --polling-interval option allows you to specify how often PPSS should check for new files within the directory. The default is to check for new files every 10 seconds.

-Thus, in order to use the daemon feature of PPSS, you must insert a mkdir and remove dir command into your own scripts. But it gets nasty.
+Please note that checking for new files in a directory with many files will stress the CPU, as PPSS must determine for each file found whether it has already been processed. So it is advised to remove items from the directory once they are processed. Also, don't set the polling interval too short, or the system will be busy polling and unable to do any actual work. If a short polling interval is required, consider using the Linux inotify option as described below.

-PPSS also locks the directory while processing items, to prevent race conditions. This implies that your script may fail if the lock of PPSS is present. So your script must contain something like this (almost identical code from PPSS):
+*Linux inotify*
+A regular daemon just polls every x seconds for new files, but this polling is not very efficient. A robust and fast mechanism for monitoring file system events is [http://en.wikipedia.org/wiki/Inotify inotify]. By default, the inotify program does nothing and just waits for a file system event to occur. Thus, when using inotify, PPSS will do absolutely nothing unless a file system event occurs. Only 'close' events are noticed by PPSS, making dead sure that only files that have been closed and are no longer being operated upon are processed.
+
+Inotify is enabled by default if PPSS detects that inotify is installed and PPSS is run as a daemon.
+
+To use inotify on a Linux system, you must install it first. For Debian-based operating systems, this can be done with:
+
+ apt-get install inotify-tools
+
+Inotify is regarded as the best option for running the daemon mode; however, it requires additional software. The standard mechanism, which just polls the directory at a regular interval and verifies the modification date of a file, may be sufficient for many, so inotify is not required. The benefit of inotify is that it makes PPSS fast to respond to file system events. PPSS doesn't need to wait for the next polling event to pick up new items. They are processed as soon as they arrive.
+
+Inotify can be explicitly disabled with the --disable-inotify option.
+
+*locking mechanism*
+
+If you want to be dead sure that no race condition can occur and 'inotify' cannot be used, use the additional locking mechanism that is built into PPSS. The --enable-input-lock option forces PPSS to claim the input directory with a lock directory called INPUT_LOCK. If this directory exists, PPSS will not process items. Once this directory is removed, PPSS will start processing. This way, you can lock the input directory in your script and make sure that all operations on files are finished before PPSS starts processing items. For this feature, your script needs some additional logic like this (almost identical code from PPSS):

 {{{
 # 1 - try to obtain lock.