Look for file changes and kick off actions with Watchman
On Guard
Watchman is an open source tool developed by Facebook and released under the terms of Apache License 2.0. The Watchman website [1] states: "Watchman exists to watch files and record when they change. It can also trigger actions (such as rebuilding assets) when matching files change."
Written in C, Watchman is multiplatform: It works on Linux, Mac OS X, BSD, and Illumos/Solaris. Windows is not listed as a supported operating system. Watchman might not have been designed primarily for the use I describe in this article, but its core job, triggering actions when a file changes, fits a variety of needs.
Watchman might seem complex at first glance, with its options and configuration directives (yes, it is), but I'll start with some simple examples; then, you can delve deeper into the program on your own.
Replicating Files
Working with distributed or replicated systems is a common task these days. The magic word nowadays – the cloud – is an unclear term, but if you want to simplify the concept, the cloud is what years ago was called a "cluster." If you want to replicate a directory on a cluster of servers, a lot of software, besides shared storage like NFS, is at your disposal. However, the learning curve is often steep, even when the goal is simple.
The basic scenario is as follows: You have a directory on a server and you want to replicate its content on another server. The simple way to achieve this is with a copy command (maybe using the rsync tool) triggered by a job scheduled in a crontab. Obviously this solution is not optimal: The scheduled job runs even if the directory has not changed; the interval between one run and the next could be too long or too short; and the copy might start while a file is in a transitional state, leading to some sort of inconsistency at a given time. Also, why wait until night to perform a backup? Sometimes it is useful to copy a file as soon as possible.
In such a case, a handy tool like Watchman, which watches the tree of a directory (or a number of directories) and triggers an action when a change is detected, is a good solution.
Triggering Rsync
Install Watchman as instructed in the "Installation" box and start it with the command:
watchman watch /opt/repos
The path /opt/repos is an example of a directory that you might want to replicate to another server. In Watchman it is called the "root."
Installation
No prepackaged versions of Watchman are available, but it is pretty straightforward to compile. On CentOS, enter the following commands:
yum install git autogen autoconf automake gcc
cd /var/tmp
git clone https://github.com/facebook/watchman.git
cd watchman
./autogen.sh
./configure
make
sudo make install
The path of the executable program will be /usr/local/bin/watchman. Watchman also has some compile-time options (e.g., installation paths) you can use to customize your installation [2].
Watchman has two ways in which you can tell it which directories it must monitor and what actions it has to take: via the command line while the daemon is running in the background or via a configuration file. The most convenient way to configure it the first time is via the command line, which will write a configuration file in JSON format. The resulting configuration file will be /usr/local/var/run/watchman/<username>.state; in the same place you can find the logfile <username>.log.
At this point, the /usr/local/var/run/watchman/<username>.state configuration file will look like Listing 1. The Watchman process is up and running, but right now, it is simply watching the directory and reporting changes in the logfiles. (You can also query the daemon for changed files from the command line.)
Listing 1
Configuration File
{
  "version": "3.0.0",
  "watched": [
    {
      "path": "/opt/repos",
      "triggers": []
    }
  ]
}
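The command-line query mentioned above can be sketched as follows. This is a minimal example, assuming Watchman's since command with a named cursor; the function name changed_since_last_check and the default cursor name repos_cursor are hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# Print the files under a watched root that changed since the previous call.
# Watchman remembers the position of a named cursor ("n:<name>") between
# calls, so each run should report only what changed since the last run.
# The function name and the default cursor name are illustrative assumptions.
changed_since_last_check() {
    local root="$1"
    local cursor="${2:-repos_cursor}"
    watchman since "$root" "n:${cursor}"
}
```

Calling changed_since_last_check /opt/repos should print a JSON document describing the changed files; the first call with a new cursor will likely report every file under the root.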
The next step is to define what action to trigger. The syntax is:
watchman -- trigger /opt/repos 'repos-sync' \
    -- /usr/local/sbin/sync.sh
where 'repos-sync' is the name of the trigger and /usr/local/sbin/sync.sh is the script to invoke. Now the configuration file looks like Listing 2.
Listing 2
Configuration File with Trigger Action
{
  "version": "3.0.0",
  "watched": [
    {
      "path": "/opt/repos",
      "triggers": [
        {
          "name": "repos-sync",
          "command": [
            "/usr/local/sbin/sync.sh"
          ],
          "append_files": true,
          "stdin": [
            "name",
            "exists",
            "new",
            "size",
            "mode"
          ]
        }
      ]
    }
  ]
}
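The "append_files": true entry in Listing 2 indicates that Watchman appends the names of the changed files to the trigger command's arguments. As a minimal sketch, a trigger script could report them as shown here; the function name report_changed_files is hypothetical, and the file names are assumed to be relative to the watched root.

```shell
#!/bin/bash
# Report the changed files that Watchman appended to the command line.
# Hypothetical helper for illustration; a trigger script would call it
# with its own arguments: report_changed_files "$@"
report_changed_files() {
    local f
    printf 'changed files: %d\n' "$#"
    for f in "$@"; do
        printf '  %s\n' "$f"
    done
}
```

Quoting "$@" keeps file names that contain spaces intact, which matters because the names come straight from the watched directory.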
The script that performs the rsync operation, sync.sh, will look like Listing 3. Your script might be more complex (e.g., by adding some sort of logging or notification via email). Please note that the environment variable WATCHMAN_ROOT is set by Watchman itself and contains the "root" (i.e., the watched directory). Make sync.sh executable by entering:
chmod +x /usr/local/sbin/sync.sh
You also need to set up SSH key authentication so that rsync can run over SSH without prompting for a password. If you don't know how to do this, you'll find several articles on the Internet (e.g., at the TeachMeJoomla site [3]) describing this procedure.
Listing 3
sync.sh
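#!/bin/bash

# Remote server that receives the replica
twinserver="remoteserver.your.domain"

# WATCHMAN_ROOT is set by Watchman; mirror it to the same path on the remote server
rsync -avp --delete -e "ssh -i ~/.ssh/id_rsa_rsync" \
    "${WATCHMAN_ROOT}/" "${twinserver}:${WATCHMAN_ROOT}/"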
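As an illustration of the "more complex" script mentioned earlier, the following sketch adds simple logging around the rsync call. It is one possible variant, not a definitive implementation: the TWINSERVER and LOGFILE defaults and the function names log and sync_root are assumptions you would adapt to your own setup.

```shell
#!/bin/bash
# Sketch of a sync.sh variant that logs each run.
# The TWINSERVER and LOGFILE defaults are assumptions; adjust them as needed.
TWINSERVER="${TWINSERVER:-remoteserver.your.domain}"
LOGFILE="${LOGFILE:-/var/log/repos-sync.log}"

# Append a timestamped message to the logfile.
log() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOGFILE"
}

# Mirror one root to the same path on the twin server and log the outcome.
sync_root() {
    local root="$1"
    local rc
    log "sync started: ${root}"
    rsync -a --delete -e "ssh -i ${HOME}/.ssh/id_rsa_rsync" \
        "${root}/" "${TWINSERVER}:${root}/"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        log "sync finished: ${root}"
    else
        log "sync FAILED (rsync exit code ${rc}): ${root}"
    fi
    return "$rc"
}

# Run only when Watchman (or the caller) has set WATCHMAN_ROOT.
if [ -n "${WATCHMAN_ROOT:-}" ]; then
    sync_root "$WATCHMAN_ROOT"
fi
```

Returning rsync's exit code from sync_root lets the caller tell a failed replication from a successful one, and the logfile keeps a record of when each sync ran.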
Watching Multiple Directories
If you need to monitor more than one directory, you can reuse the same script, because the aforementioned WATCHMAN_ROOT variable contains the correct path. To do so, you need to run the commands as follows:
watchman watch /opt/repos2
watchman -- trigger /opt/repos2 'repos2-sync' \
    -- /usr/local/sbin/sync.sh
Now run the commands
watchman watch-list
watchman trigger-list <root>
to see the list of watched directories (without going to the .state file) and the list of triggers.
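If you have many directories to replicate, the per-directory watch and trigger commands can be wrapped in a small loop. The following is a minimal sketch; the function name watch_and_sync and the convention of deriving the trigger name from the directory's base name are assumptions made for illustration.

```shell
#!/bin/bash
# Register a watch and a sync trigger for each directory given as an argument.
# The trigger name is derived from the directory's base name plus "-sync"
# (an illustrative convention; any name that is unique per root works).
watch_and_sync() {
    local script="/usr/local/sbin/sync.sh"
    local root name
    for root in "$@"; do
        name="$(basename "$root")-sync"
        watchman watch "$root" || return 1
        watchman -- trigger "$root" "$name" -- "$script" || return 1
    done
}
```

For example, watch_and_sync /opt/repos /opt/repos2 issues the same watch and trigger commands shown in this article for both directories and stops at the first command that fails.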