[mdlug] Better job scheduler
Michael ORourke
mrorourke at earthlink.net
Mon Mar 16 23:46:14 EDT 2015
Lug Nuts,
Our company has a requirement for a job scheduler that runs on Linux
and has a simple web UI that can show the status of running jobs and
display a list of scheduled jobs, along with a way to add/change/delete
jobs. No advanced ACLs are needed at this time, just something that is
easy to use, much like the Windows Task Scheduler.
However, one feature that was requested is job crash notification.
Sounds simple enough, but what constitutes a job "crash"? I'm glad you
asked...
Say you have a scheduled job that runs at 1:00pm every day, and it takes
around 15 mins to execute. Suppose at 1:03pm, an admin logs into the
box and does a quick reboot, not realizing the 1:00pm job was currently
running. Because the job normally spams the support staff, they
don't even notice that the 1:00pm job never finished, until someone
calls the support line the next day and complains that we did not
process their 1:00pm file from yesterday. Okay, maybe that's not the
exact scenario, but you get the idea.
A better scenario: after the box is rebooted, the scheduler wakes up and
looks for leftover PID files under /var/run or similar. When it can't
find the corresponding running job, it pages the on-call support person
with "job #???? was terminated unexpectedly", or something like that.
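
To make the idea concrete, here is a rough sketch of the stale PID file
check, not a finished implementation. The directory name, the
one-file-per-job "<job>.pid" layout, and the notify_on_call() function
are all made up for illustration. Note that on many distributions
/var/run is a tmpfs that is emptied at boot, so the sketch assumes the
PID files live in a directory that survives a reboot.

```python
import os
from pathlib import Path

# Hypothetical location; must be on persistent storage, not a tmpfs.
PID_DIR = "/var/lib/myscheduler/running"


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def find_crashed_jobs(pid_dir):
    """Return (job_name, pid) for each PID file whose process is gone."""
    directory = Path(pid_dir)
    if not directory.is_dir():
        return []
    crashed = []
    for pid_file in sorted(directory.glob("*.pid")):
        try:
            pid = int(pid_file.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or malformed PID file, skip it
        if not pid_is_running(pid):
            crashed.append((pid_file.stem, pid))
    return crashed


def notify_on_call(job_name, pid):
    # Placeholder: a real scheduler would page the on-call person here.
    print(f"job {job_name} (pid {pid}) was terminated unexpectedly")


if __name__ == "__main__":
    for job_name, pid in find_crashed_jobs(PID_DIR):
        notify_on_call(job_name, pid)
```

One caveat with this approach: after a reboot, an old PID may have been
reused by an unrelated process, so a real implementation would probably
need to record something extra in the PID file (the job's start time,
for example) and compare that as well.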
Sending a success/fail email is just not reliable enough, and these
notifications tend to get ignored anyway.
Any suggestions/recommendations/ideas?
Thanks,
Mike