Table of contents
- 1 Problem
- 2 Goal
- 3 Example configuration of scheduled
- 4 Example job configuration
You never know when a job has finished...
Backups are often done with rsync or some other file copy tool, and rsync is very useful for that. In many cases, however, jobs do not run efficiently in parallel. The storage system may become slow when two processes write to it at the same time, or the network may come under too heavy a load when more than N jobs run on the same subnet.
- cron is not reliable because jobs have no dependencies
- You never know when a job is done, so new jobs may start before a job they depend on has finished. This could be the job itself, if it runs daily and takes more than 24 hours, or another job that must finish first because running without its result makes no sense. You could of course write wrapper scripts around that, but you would need one script per process to determine its status.
- No maximum execution time
The maximum usage of CPU cores in systems doing scheduled tasks!
The goal is to create an application scheduler with more intelligence than cron. cron is wonderful at what it was created for, but you cannot maximize CPU utilization with it. This is the goal of scheduled!
Control the execution of jobs, _not_inside_the_jobs_! A job can be a simple shell command like "cp", and controlling it from inside would mean writing a wrapper around that simple command.
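To illustrate controlling a job from the outside, here is a minimal Python sketch. It is not the implementation of scheduled; the function name `run_with_limit` and its return shape are assumptions made for this example. The parent process starts the command, waits up to a limit, and kills it if the limit is exceeded, so the command itself needs no wrapper.

```python
import subprocess
import sys


def run_with_limit(argv, max_seconds):
    """Run argv as a child process and kill it if it exceeds max_seconds.

    Returns (finished, returncode); returncode is None if the job was killed.
    """
    proc = subprocess.Popen(argv)
    try:
        returncode = proc.wait(timeout=max_seconds)
        return True, returncode
    except subprocess.TimeoutExpired:
        proc.kill()  # the scheduler, not the job, enforces the limit
        proc.wait()  # reap the child so no zombie process is left behind
        return False, None


if __name__ == "__main__":
    # A short job that finishes well inside its limit.
    print(run_with_limit([sys.executable, "-c", "pass"], 10))
```

The same call works for any plain command, for example `run_with_limit(["cp", "a", "b"], 3600)`.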
Example configuration of scheduled
The configuration file is case insensitive.
RCPT=root,firstname.lastname@example.org,email@example.com
USERS=hanez
UPDATE=5
MAX_PROC=16
MAX_PROC_COMMUNITY=4
USERS: Users with their own .schedule.d/ directory in $HOME
UPDATE: Update the queue (reread /etc/schedule.d/) every N minutes, or NONE. When NONE, update only at restart
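A configuration in this style could be read with a few lines of Python. This is only a sketch under assumptions: one KEY=VALUE pair per line, `#` starting a comment, and case insensitivity applied to keys by upper-casing them. The function name `parse_config` is hypothetical.

```python
def parse_config(text):
    """Parse KEY=VALUE lines into a dict with upper-cased keys."""
    config = {}
    for raw in text.splitlines():
        # Assumption: '#' starts a comment that runs to the end of the line.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not KEY=VALUE
        key, value = line.split("=", 1)
        config[key.strip().upper()] = value.strip()
    return config


if __name__ == "__main__":
    sample = "rcpt=root,email@example.com\nUSERS=hanez\nUpdate=5 # minutes\n"
    print(parse_config(sample))
```

Splitting on the first `=` only keeps values that themselves contain `=` intact. Treating `#` as a comment everywhere would break a CMD value that contains `#`, so a real parser would need a stricter rule there.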
Example job configuration
Job configuration files are case insensitive.
CMD=rsync -a hanez.org:/ /tmp/backups/hanez.org/
START=2200:2300
STOP=0300:0300
KILL=FALSE
DEPENDS=NONE
FORCE=TRUE
COMMUNITY=10.0.1.0

CMD=rsync -a example.org:/ /tmp/backups/example.org/
START=2200:2400
STOP=0300:0300
KILL=FALSE
DEPENDS=NONE
FORCE=FALSE
COMMUNITY=10.0.2.0

CMD=rsync -a test.example.org:/ /tmp/backups/test.example.org/
START=2200:2200
STOP=0300:0300
KILL=TRUE
DEPENDS=NONE
FORCE=FALSE
WARN=TRUE
RCPT=firstname.lastname@example.org
COMMUNITY=10.0.2.0

GROUP=backup-hanez.org \
      backup-example.org backup-test.example.org

CMD=gz /tmp/backups | tar -foo -bar /dev/tape0
START=2200:2200
STOP=0600:0600
MAXSTOP=0900:0900
KILL=FALSE
DEPENDS=group-servers,hanez@my-user-job
WARN=TRUE
RCPT=email@example.com,firstname.lastname@example.org
FORCE=TRUE
FORCE_AFTER=1500
FORCE_DEPENDS=maybe-some-other-group-or-individual-job
MIN_EXEC_TIME=60 # Run minimum N seconds, else report

RCPT=hanez
CMD=rsync -a home.hanez.org:/ /home/hanez/tmp/backups/home.hanez.org/
START=0900:1800
STOP=0900:1800
KILL=FALSE
DEPENDS=NONE
FORCE=TRUE
REPEAT=3600 # repeat every N seconds within $START and $STOP
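The START and STOP values above look like pairs of HHMM times. The exact meaning of the two values is not spelled out here, so the following Python sketch only assumes that each value is a 24-hour HHMM time (with 2400 meaning end of day) and that a pair delimits a window which may wrap past midnight. The names `parse_hhmm`, `parse_window` and `in_window` are hypothetical.

```python
def parse_hhmm(value):
    """Convert 'HHMM' to minutes since midnight; '2400' means end of day."""
    if len(value) != 4 or not value.isdigit():
        raise ValueError(f"expected HHMM, got {value!r}")
    hours, minutes = int(value[:2]), int(value[2:])
    if hours > 24 or minutes > 59 or (hours == 24 and minutes != 0):
        raise ValueError(f"time out of range: {value!r}")
    return hours * 60 + minutes


def parse_window(value):
    """Split 'HHMM:HHMM' into a (first, last) pair of minutes since midnight."""
    first, last = value.split(":")
    return parse_hhmm(first), parse_hhmm(last)


def in_window(now, window):
    """True if now (minutes since midnight) lies inside the window."""
    first, last = window
    if first <= last:
        return first <= now <= last
    # The window wraps past midnight, e.g. 22:00 until 03:00.
    return now >= first or now <= last


if __name__ == "__main__":
    print(parse_window("2200:2300"))
```

Working in minutes since midnight avoids the problem that a value like 2400 is not a valid clock time in most date libraries.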
Job Configuration Options
RCPT: Users who should always be informed
CMD: The command to be executed
KILL: Kills the job when it has not finished at $STOP
DEPENDS: Another job located in /etc/schedule.d/ with the name job.xyz, a user job given as user@job, or NONE. User jobs can never block system jobs, but jobs will report when user jobs they depend on have not finished!
MAXSTOP: Alerts when the job has not finished by $MAXSTOP and kills it
WARN: Alerts when the job has not finished by $STOP
FORCE: Start before $START when all $DEPENDS are finished
FORCE_AFTER: Do not force before $FORCE_AFTER
COMMUNITY: A community identification string such as the IP address, the hostname, or something self-defined
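How DEPENDS, MAX_PROC and MAX_PROC_COMMUNITY could interact can be sketched in Python. This is an illustration under assumptions, not the actual logic of scheduled: MAX_PROC is taken as the limit on all running jobs, MAX_PROC_COMMUNITY as the limit on running jobs that share one COMMUNITY string, and a job may start only when all its DEPENDS are finished. The function `pick_runnable` and the job dictionary layout are hypothetical.

```python
from collections import Counter


def pick_runnable(jobs, finished, running, max_proc, max_proc_community):
    """Return the names of waiting jobs that may start now.

    jobs: dict mapping job name -> {"depends": [names], "community": str or None}
    finished, running: sets of job names
    """
    # Count the running jobs per community string.
    per_community = Counter(
        jobs[name]["community"] for name in running if jobs[name]["community"]
    )
    slots = max_proc - len(running)
    started = []
    for name, job in jobs.items():
        if slots <= 0:
            break  # global process limit reached
        if name in finished or name in running:
            continue
        if not all(dep in finished for dep in job["depends"]):
            continue  # at least one dependency is still unfinished
        community = job["community"]
        if community and per_community[community] >= max_proc_community:
            continue  # this community is already at its limit
        started.append(name)
        slots -= 1
        if community:
            per_community[community] += 1
    return started


if __name__ == "__main__":
    jobs = {
        "backup-example.org": {"depends": [], "community": "10.0.2.0"},
        "backup-test.example.org": {"depends": [], "community": "10.0.2.0"},
    }
    print(pick_runnable(jobs, set(), set(), 16, 1))
```

With a per-community limit of 1, the second job in the same community stays in the queue until the first has finished, which matches the motivation of not overloading one subnet.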