[mdlug] Better job scheduler

Michael ORourke mrorourke at earthlink.net
Mon Mar 16 23:46:14 EDT 2015


Lug Nuts,

Our company needs a job scheduler that runs on Linux and has a simple 
web UI that can show the status of running jobs and list the scheduled 
jobs, with a way to add/change/delete them.  No advanced ACLs are needed 
at this time, but it should be easy to use, much like the Windows Task 
Scheduler.  However, one feature that was requested is job crash 
notification.
Sounds simple enough, but what constitutes a job "crash"?  I'm glad you 
asked...
Say you have a scheduled job that runs at 1:00pm every day, and it takes 
around 15 mins to execute.  Suppose at 1:03pm, an admin logs into the 
box and does a quick reboot, not realizing the 1:00pm job was currently 
running.  Because the job normally spams the support staff, they 
don't even notice that the 1:00pm job never finished, until someone 
calls the support line the next day and complains that we did not 
process their 1:00pm file from yesterday. Okay, maybe that's not the 
exact scenario, but you get the idea.
A better scenario: after the box is rebooted, the scheduler wakes up and 
looks for leftover PID files under /var/run or similar.  When it can't 
find the corresponding running job, it pages the support on-call with 
"job #???? was terminated unexpectedly", or something like that.  
Sending a success/fail email is just not reliable enough, and these 
notifications tend to get ignored anyway.
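
To make the idea concrete, here is a rough Python sketch of that startup 
check.  It is only an illustration under some assumptions: each job 
writes a file named <job>.pid into a directory of its own and removes it 
on a clean exit, and the names pid_is_running, find_crashed_jobs and 
notify_on_call are made up for this example (the last one just prints 
where a real pager hook would go).

```python
import os
from pathlib import Path


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def find_crashed_jobs(pid_dir):
    """Return (job_name, pid) for every leftover PID file whose process is gone."""
    crashed = []
    for pid_file in sorted(Path(pid_dir).glob("*.pid")):
        try:
            pid = int(pid_file.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or malformed PID file: skipped in this sketch
        if not pid_is_running(pid):
            crashed.append((pid_file.stem, pid))
    return crashed


def notify_on_call(job_name, pid):
    """Hypothetical placeholder: a real setup would page the on-call here."""
    print(f"job {job_name} (pid {pid}) was terminated unexpectedly")
```

The scheduler would call find_crashed_jobs on its PID directory at 
startup and pass each result to notify_on_call.  Two caveats with this 
approach: on many current distributions /var/run is a tmpfs that is 
emptied at boot, so the PID files would likely have to live somewhere 
persistent, and after a reboot an unrelated process can end up with the 
same PID, so a PID check alone can miss a crash.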
Any suggestions/recommendations/ideas?

Thanks,
Mike


