Cron is a Linux/Unix utility that lets you schedule scripts to run at specified times; each scheduled entry is called a cron job. This makes it a perfect tool for automating maintenance and synchronisation scripts. Here I will share some of my experiences with Cron and how I use it.

Setting up a Cron job

The crontab

To set up a cron job you need to edit the crontab. That is where you specify which scripts to run and when. To edit the crontab for your current user, type this in your terminal:

crontab -e

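This should open the crontab file for editing in your default editor, which is often nano. The editor is taken from the VISUAL or EDITOR environment variable, and some systems ask you to choose one the first time.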

On Linux systems you can set up a crontab for each user, which means you can have a personal crontab and use whatever system accounts are available for more general purposes. One of the benefits of using the system accounts is that the crontab will be available to other maintainers as well. To edit the crontab for the root user, type:

sudo crontab -e

To edit the crontab for a specific user, let's say deploy, use:

sudo -u deploy crontab -e

The cron jobs

In the crontab you can add as many cron jobs as you desire. Each cron job is a single line in the crontab and consists of an expression and a command. The expression specifies when and how often the command should run. The command is the command line to execute whenever the expression matches the current time. You can use any terminal command in a cron job.

It could look like this:

45 23 * * * wget http://parentnode.dk

In this case 45 23 * * * is the expression and wget http://parentnode.dk is the command. In effect it will perform a wget, fetching http://parentnode.dk every day at 23:45.
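If you would rather add a job from a script than through the editor, one possible approach is to pipe the current crontab through a small filter and feed the result back to crontab. The sketch below is only an illustration: add_job_line is a hypothetical helper name, and it assumes your crontab command accepts - to read the new table from standard input, as the common Linux implementations do.

```shell
#!/bin/sh
# add_job_line JOB
# Hypothetical helper: reads an existing crontab on stdin and prints it back,
# appending JOB only if that exact line is not already present.
add_job_line() {
    job=$1
    existing=$(cat)
    # Re-emit whatever was already in the crontab.
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -x matches whole lines, -F treats the job as a literal string (so * is safe).
    if ! printf '%s\n' "$existing" | grep -qxF -- "$job"; then
        printf '%s\n' "$job"
    fi
}

# Example (not run here, since it changes the current user's crontab):
# crontab -l 2>/dev/null | add_job_line '45 23 * * * wget http://parentnode.dk' | crontab -
```

Because the helper skips lines that already exist, running the same installation script twice should not duplicate the job.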

The expression details

The cron expression consists of five fields: Minute (m), Hour (h), Day of month (dom), Month (mon) and Day of week (dow), in that order. The * is a wildcard that can be used in any field, so * * * * * will execute your command every minute around the clock.

Each field can also express a frequency, a list or a range: "/" (slash) sets a step, such as */20 for every 20th value, "," (comma) separates a list of values, and "-" (hyphen) defines a range.

For example, to run a script every 20 minutes from 10:00 to 10:59, effectively at 10:00, 10:20 and 10:40:

*/20 10 * * *

Every third hour, on the hour (0:00, 3:00, 6:00 and so on):

0 */3 * * *

Every 10 minutes, from 4:00 to 6:59:

*/10 4-6 * * *

Every 30 minutes during the hours 5, 7 and 9 on Fridays, effectively 5:00, 5:30, 7:00, 7:30, 9:00 and 9:30:

*/30 5,7,9 * * 5

At 01:00 on the 7th every month:

0 1 7 * *
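To check how a single field is interpreted, it can help to expand it by hand. The following is a minimal sketch, not how cron itself is implemented: expand_field is a hypothetical helper that takes one field plus the lowest and highest value allowed for it, and prints the values it would match. It only covers the *, step, list and range forms described above.

```shell
#!/bin/sh
# expand_field FIELD MIN MAX
# Hypothetical helper: prints the values matched by one cron field.
# The body runs in a subshell ( ... ) so its variables and "set -f"
# do not leak into the calling shell.
expand_field() (
    field=$1; min=$2; max=$3
    result=""
    set -f   # keep a bare * from being expanded as a filename glob
    for part in $(printf '%s' "$field" | tr ',' ' '); do
        step=1
        # Split off an optional /step suffix, e.g. */20 -> part "*", step 20.
        case $part in
            */*) step=${part#*/}; part=${part%%/*} ;;
        esac
        # Work out the first and last value of this part.
        case $part in
            '*') first=$min; last=$max ;;
            *-*) first=${part%-*}; last=${part#*-} ;;
            *)   first=$part; last=$part ;;
        esac
        value=$first
        while [ "$value" -le "$last" ]; do
            result="$result $value"
            value=$((value + step))
        done
    done
    printf '%s\n' "${result# }"
)

expand_field '*/20' 0 59    # minutes: prints 0 20 40
expand_field '5,7,9' 0 23   # hours:   prints 5 7 9
expand_field '4-6' 0 23     # hours:   prints 4 5 6
```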

The command details

Any shell command can be run as a cron command, for example creating a gzipped tarball of your /srv/sites folder:

tar -czvf /backup/sites.tar.gz /srv/sites

If you want to perform some more complex task, like synchronising data with another host, you can also execute a custom shell script using the cron job.
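As an illustration of such a custom script, here is a minimal sketch of a date-stamped variant of the tar backup. The function name backup_dir and the /backup destination are assumptions made for the example. Keeping the date call inside a script also sidesteps the fact that % has a special meaning in a crontab line and would need escaping there.

```shell
#!/bin/sh
# backup_dir SOURCE_DIR DEST_DIR
# Hypothetical helper: writes DEST_DIR/<name>-YYYY-MM-DD.tar.gz containing
# SOURCE_DIR and prints the path of the archive it created.
backup_dir() {
    source_dir=$1
    dest_dir=$2
    name=$(basename "$source_dir")
    archive="$dest_dir/$name-$(date +%Y-%m-%d).tar.gz"
    mkdir -p "$dest_dir" || return 1
    # -C switches to the parent folder so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$source_dir")" "$name" || return 1
    printf '%s\n' "$archive"
}

# Example call, assuming the folders from the tar example above:
# backup_dir /srv/sites /backup
```

Saved as a script that ends with the example call, it could then be referenced from the crontab by its full path instead of the one-line tar command.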

In many cases my synchronisation tasks are already implemented in the backend of my projects. This is nice because I can allow administrators to invoke the synchronisation manually via the backend, as well as execute it via a cron job. I use wget for that, requesting pages available on my webserver. It could look like this:

wget http://domain/some-page

If the task is extensive you might want to add the --timeout=0 parameter to the wget command, which disables wget's network timeouts. Like this:

wget http://domain/some-page --timeout=0

When setting up a cron job like that I recommend saving the output, which makes it easier to check later whether the webserver execution went as planned. It also gives you a kind of log of the operation. To do that, add the --directory-prefix parameter to the wget command, like this:

wget http://domain/some-page --directory-prefix=/srv/crons/project/some-page --timeout=0

This wget command requests the specified URL and saves the response to a new file in the /srv/crons/project/some-page/ folder. The first execution creates a file named some-page, and each later execution generates a new file postfixed with an incrementing number, like some-page.1, some-page.2, etc. It also means you should probably clean up this folder once in a while.
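The clean-up itself can also be left to cron. The following is a rough sketch under a few assumptions: clean_old_output is a hypothetical helper name, the folder you pass in is the one wget writes to, and 30 days is just an example retention period.

```shell
#!/bin/sh
# clean_old_output DIR DAYS
# Hypothetical helper: removes regular files in DIR (and its subfolders)
# that were last modified more than DAYS days ago.
clean_old_output() {
    dir=$1
    days=$2
    # -mtime +N selects files older than N days; rm -f then deletes them.
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}

# Example call, assuming the wget output folder used above:
# clean_old_output /srv/crons/project 30
```

Since it deletes files without asking, it is worth running the find part without -exec first to see what would be removed.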

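If you are requesting a page which is protected by HTTP authentication (for example set up via .htaccess), remember to add the username and password to the wget command, replacing #username# and #password# with your own credentials: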

wget http://project-domain/some-page --http-user=#username# --http-password=#password#

That should allow the wget command to log in seamlessly.

The Cron log

By default, cron log entries are written to /var/log/syslog (this is the case on Debian and Ubuntu). You can update the syslog configuration (/etc/rsyslog.d/50-default.conf) to enable a separate cron log, typically by uncommenting the cron.* line in that file. Remember to restart the syslog and cron services after updating the configuration file.
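When cron shares the syslog with everything else, a small filter makes its entries easier to find. This is a minimal sketch, assuming the entries are tagged CRON[pid] as on Debian and Ubuntu; cron_log_lines is a hypothetical helper name, and other distributions may use a different tag or log file.

```shell
#!/bin/sh
# cron_log_lines [LOG_FILE]
# Hypothetical helper: prints only the lines written by cron.
# Assumes entries are tagged like "CRON[1234]:" (Debian/Ubuntu style).
cron_log_lines() {
    log_file=${1:-/var/log/syslog}
    grep 'CRON\[' "$log_file"
}

# Example: show the last few cron entries (reading the log may require sudo)
# cron_log_lines /var/log/syslog | tail -n 5
```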

Cron permissions

You can control which users may use cron with /etc/cron.allow and /etc/cron.deny. These files may not exist on your system, but you can safely create them if you need to customise cron permissions. Also, if your cron job isn't running, check these files to make sure your user is allowed to run cron jobs.
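The way the two files are usually combined can be sketched as a small check: if cron.allow exists, only the users listed in it may use cron; otherwise anyone not listed in cron.deny may. The helper name cron_allowed is hypothetical, and the fallback when neither file exists (everyone allowed) is an assumption that matches Debian and Ubuntu but not necessarily every system.

```shell
#!/bin/sh
# cron_allowed USER [ALLOW_FILE] [DENY_FILE]
# Hypothetical helper: succeeds (exit status 0) if USER may use cron.
cron_allowed() {
    user=$1
    allow_file=${2:-/etc/cron.allow}
    deny_file=${3:-/etc/cron.deny}
    if [ -f "$allow_file" ]; then
        # An allow file exists: the user must be listed in it.
        grep -qxF -- "$user" "$allow_file"
    elif [ -f "$deny_file" ]; then
        # Only a deny file exists: the user must not be listed in it.
        ! grep -qxF -- "$user" "$deny_file"
    else
        # Assumption: with neither file present, all users are allowed.
        return 0
    fi
}

# Example:
# cron_allowed deploy && echo "deploy may use cron"
```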

How do I know if a cron job is running?

As long as a cron job is running you can see the process by using:

ps aux

This command will list all current processes and if your cron command is listed, the cron job is still running. 
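On a busy server the full process list is long, so it can be handy to filter it. The following is one possible sketch: matching_processes is a hypothetical helper that reads ps aux output on standard input and keeps the header row plus the rows containing a given text. The text is passed through an environment variable rather than as an argument, so the filter normally does not show up in its own results.

```shell
#!/bin/sh
# matching_processes TEXT
# Hypothetical helper: reads "ps aux" output on stdin and prints the header
# row plus every row that contains TEXT as a literal string.
matching_processes() {
    PATTERN=$1 awk 'NR == 1 || index($0, ENVIRON["PATTERN"]) > 0'
}

# Example, assuming the wget job from earlier:
# ps aux | matching_processes 'wget http://domain/some-page'
```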

To find out whether the cron job has run, check the log file or, in the case of a wget command, check the output file generated by wget.