scheduled
A job scheduling daemon
Problem
You never know when a job has finished...
Backups
Backups are often done with rsync or some other file copy tool; rsync is very useful for that. In many cases, though, jobs do not run efficiently in parallel: the storage system may become slow when two processes are writing to it at the same time, or the network may come under too heavy a load when more than N jobs are running on the same subnet.
cron
- Is not reliable because jobs cannot have dependencies
- You never know when a job is done, so you end up starting new jobs before a job they depend on has finished. That could be the job itself, if it runs daily but takes more than 24 hours, or a job that needs another job to be finished first because running it earlier makes no sense. Of course you could write wrapper scripts around that, but you would need to write one script per process just to determine its status (see the sketch after this list).
- No maximum execution time
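To illustrate the wrapper problem: a minimal sketch of the kind of script you would have to maintain for every single cron job, just to record its status and avoid overlapping runs. The lock and status file paths are made up for this example.

#!/bin/sh
# Hypothetical wrapper for ONE cron job; with plain cron you would need a
# script like this per job just to know whether it is still running.
LOCK=/var/run/backup-hanez.org.lock
STATUS=/var/run/backup-hanez.org.status

# Refuse to start while a previous run is still active.
if [ -e "$LOCK" ]; then
    echo "previous run still active, skipping" >&2
    exit 1
fi

touch "$LOCK"
echo "RUNNING" > "$STATUS"

rsync -a hanez.org:/ /tmp/backups/hanez.org/
RC=$?

# Record the result so dependent jobs (or a human) can check it.
echo "FINISHED rc=$RC" > "$STATUS"
rm -f "$LOCK"
exit "$RC"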
Goal
Maximize the usage of CPU cores on systems running scheduled tasks!
The goal is to create a job scheduler with more intelligence than cron. cron is wonderful for what it was created for, but you are not able to maximize CPU utilization with cron. That is the goal of scheduled!
Control the execution of jobs, _not inside the jobs_! This is because a job could be a simple shell command like "cp", and you would otherwise need to write a wrapper around even such a simple command to control it (see the sketch below).
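For example, even a plain "cp" becomes a controlled job with nothing more than a small job file. A minimal sketch; the file name, paths and time windows below are made up:

/etc/schedule.d/copy-reports
CMD=cp -a /srv/reports /tmp/reports-copy
START=0100:0200
STOP=0500:0500
KILL=TRUE
DEPENDS=NONE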
Example configuration of scheduled
The configuration file is case insensitive.
/etc/scheduled.conf
RCPT=root,you@hanez.org,someone@hanez.org
USERS=hanez
UPDATE=5
MAX_PROC=16
MAX_PROC_COMMUNITY=4
Configuration Options
RCPT
TODO!
USERS
USERS=hanez,mom,dad
Users with their own .schedule.d/ directory in $HOME
UPDATE
UPDATE=5
Update the queue every N minutes (reread /etc/schedule.d/), or NONE. With NONE, the queue is only updated at restart
MAX_PROC
TODO!
MAX_PROC_COMMUNITY
TODO!
Example job configuration
Job configuration files are case insensitive.
/etc/schedule.d/backup-hanez.org
CMD=rsync -a hanez.org:/ /tmp/backups/hanez.org/
START=2200:2300
STOP=0300:0300
KILL=FALSE
DEPENDS=NONE
FORCE=TRUE
COMMUNITY=10.0.1.0
/etc/schedule.d/backup-example.org
CMD=rsync -a example.org:/ /tmp/backups/example.org/
START=2200:2400
STOP=0300:0300
KILL=FALSE
DEPENDS=NONE
FORCE=FALSE
COMMUNITY=10.0.2.0
/etc/schedule.d/backup-test.example.org
CMD=rsync -a test.example.org:/ /tmp/backups/test.example.org/
START=2200:2200
STOP=0300:0300
KILL=TRUE
DEPENDS=NONE
FORCE=FALSE
WARN=TRUE
RCPT=test@hanez.org
COMMUNITY=10.0.2.0
/etc/schedule.d/group-servers
GROUP=backup-hanez.org \
      backup-example.org backup-test.example.org
/etc/schedule.d/write-backups-to-tape
CMD=tar -czf /dev/tape0 /tmp/backups
START=2200:2200
STOP=0600:0600
MAXSTOP=0900:0900
KILL=FALSE
DEPENDS=group-servers,hanez@my-user-job
WARN=TRUE
RCPT=backup@hanez.org,tapes@hanez.org
FORCE=TRUE
FORCE_AFTER=1500
FORCE_DEPENDS=maybe-some-other-group-or-individual-job
MIN_EXEC_TIME=60 # run a minimum of N seconds, else report
/home/hanez/.schedule.d/my-user-job
RCPT=hanez
CMD=rsync -a home.hanez.org:/ /home/hanez/tmp/backups/home.hanez.org/
START=0900:1800
STOP=0900:1800
KILL=FALSE
DEPENDS=NONE
FORCE=TRUE
REPEAT=3600 # repeat every N seconds within $START and $STOP
Job Configuration Options
RCPT
RCPT=root
Users who should always be informed
CMD
CMD=df -h
The command to be executed
START
START=0000:0100
The time window (HHMM:HHMM) in which the job is started
STOP
STOP=2359:2359
The time window (HHMM:HHMM) by which the job should have finished; see KILL, WARN and MAXSTOP
KILL
KILL=TRUE/FALSE
Kills the job when not finished at $STOP
DEPENDS
DEPENDS=job.xyz/user@job/NONE
Another job located in /etc/schedule.d/ with the name job.xyz, a user job (user@job), or NONE. User jobs can never block system jobs, but a job will report when a user job it depends on has not finished!
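A minimal sketch of a two-job dependency chain; the job names and commands are made up. The second job only starts once the first one has finished:

/etc/schedule.d/dump-database
CMD=pg_dump -f /tmp/backups/db.sql mydb
START=0100:0200
STOP=0300:0300
KILL=FALSE
DEPENDS=NONE

/etc/schedule.d/archive-database
CMD=tar -czf /tmp/archives/db.tar.gz /tmp/backups/db.sql
START=0200:0300
STOP=0500:0500
KILL=FALSE
DEPENDS=dump-database # only runs after dump-database has finished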
MAXSTOP
MAXSTOP=TIME
Alerts when the job has not finished by $MAXSTOP and kills it
WARN
WARN=TRUE/FALSE
Alerts when the job has not finished by $STOP
FORCE
FORCE=TRUE/FALSE
Start before $START when all $DEPENDS are finished
FORCE_AFTER
FORCE_AFTER=1500
Do not force before $FORCE_AFTER
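A sketch of how FORCE and FORCE_AFTER combine; the job name, host and times are made up. The job may start before its $START window once its dependency has finished, but never before 15:00:

/etc/schedule.d/sync-mirror
CMD=rsync -a mirror.example.org:/ /tmp/backups/mirror.example.org/
START=2200:2300
STOP=0400:0400
KILL=FALSE
DEPENDS=backup-example.org
FORCE=TRUE # may start before $START once backup-example.org is finished
FORCE_AFTER=1500 # but never before 15:00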
COMMUNITY
COMMUNITY=foo
A community identification string, such as an IP address, a hostname, or something self-defined
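How the community is used for throttling presumably relates to MAX_PROC_COMMUNITY in /etc/scheduled.conf (still a TODO above), so the following is only an assumption: two jobs on the same subnet share a COMMUNITY string and are therefore counted against the same per-community limit. Hosts and paths are made up.

/etc/schedule.d/backup-host-a
CMD=rsync -a host-a.example.org:/ /tmp/backups/host-a.example.org/
START=2200:2300
STOP=0300:0300
KILL=FALSE
DEPENDS=NONE
COMMUNITY=10.0.3.0 # same subnet as backup-host-b

/etc/schedule.d/backup-host-b
CMD=rsync -a host-b.example.org:/ /tmp/backups/host-b.example.org/
START=2200:2300
STOP=0300:0300
KILL=FALSE
DEPENDS=NONE
COMMUNITY=10.0.3.0 # counted against the same per-community limit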
REPEAT
REPEAT=3600
Repeat the job every N seconds within $START and $STOP