Lead Image © homestudio, 123RF.com

Enterprise job scheduling with schedulix

Computing with a Plan

Article from ADMIN 31/2016
Cron is the simplest job scheduler on Unix and Linux systems, but if you are looking for an enterprise-level solution that offers unique, sophisticated sequencing and monitoring functions, schedulix might just fit the bill.

When IT people hear "enterprise job scheduling," they think of software tools for planning, controlling, monitoring, and automating the execution of mutually independent programs or processes. Although job scheduling has always been indispensable on mainframes and midrange systems, automatically controlled workflows are also quite popular on servers.

A job scheduling system can do far more than the Cron service, which simply acts as a timer to start processes. More than half of all mission-critical operations – starting with archiving, through backups and reports, to managing inventories – in companies throughout all branches of industry are designed to run as batch processes. According to a study by BMC, the vendor of an "agentless" scheduling solution, every single web transaction generates on average more than 10 batch processes [1].

To ensure stable operation of independent jobs, however, you need more functionality – for example, the ability to pass control information, to choose a monitoring option, or to request operator intervention. On top of this, resource control, parallel task processing, and distributed execution are all desirable. Little wonder, then, that products optimized in this way exist on the market, including IBM Tivoli Workload Scheduler [2], Entire Operations [3] by Software AG, or BMC's Control-M Suite [4]. All told, the number of available solutions with and without enterprise resource planning (ERP) support is not exactly small [5]. The programs cited here typically work across operating systems and can monitor and control the execution of programs on Windows, Unix, and Linux.

The product discussed in this article, schedulix [6], is a free enterprise job scheduling system targeting small to medium-sized enterprises, designed for Linux environments, and available under an open source license.

Widespread Scripting

Many small to medium-sized enterprises use scripts (Bash, Perl, Python, etc.) for workflow control, coordinating and synchronizing processes so that they complete in the right order. If you want two processes (e.g., A and B) to run consecutively, you could simply bundle them into a shell script; however, to make sure process B works with valid data, you would need to ensure that B does not launch if A returns an error. Fielding these errors makes the script more complex to test and more difficult to read and maintain.
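The A-then-B coupling described above boils down to checking the exit status of the first program before starting the second. A minimal sketch in plain shell, where `process_a` and `process_b` are hypothetical stand-ins for the real programs:

```shell
#!/bin/sh
# Run B only if A succeeds; the names are illustrative placeholders.
process_a() { echo "A: exporting data"; }   # hypothetical job A
process_b() { echo "B: loading data"; }     # hypothetical job B

if process_a; then
    process_b                  # B starts only if A exited with status 0
else
    echo "A failed; skipping B" >&2
    exit 1
fi
```

Even this trivial case already mixes workflow logic (the `if`) with the jobs themselves, which is exactly the maintenance burden the article describes.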

At the same time, you would need to extend the script so that it remembers which parts have been processed. Without this kind of history, the person responsible for job management would need to restart the script manually if, for example, program B terminates with an error after program A has been running for four hours. The manager could then decide whether to live with hours of lost work or to comment out the parts of the scripts that have already been processed, although this process is extremely prone to error.

You could implement a script history with a little help from a step file, which would need to be initialized when called the first time and after each termination. Additionally, you would need error handling when reading and writing progress reports.
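The step-file bookkeeping just outlined can be sketched as follows; the file location and step names are illustrative assumptions, not part of any real product:

```shell
#!/bin/sh
# Sketch of restartable script history via a step file.
STEPFILE=/tmp/workflow.step

step_done() { grep -qx "$1" "$STEPFILE" 2>/dev/null; }  # step already recorded?
mark_done() { echo "$1" >> "$STEPFILE" || exit 1; }     # record progress

run_step() {    # skip steps that an earlier, aborted run already completed
    step=$1; shift
    if step_done "$step"; then
        echo "skipping $step (already done)"
    else
        "$@" && mark_done "$step" || exit 1
    fi
}

run_step A echo "running program A"
run_step B echo "running program B"
rm -f "$STEPFILE"              # all steps done: reset for the next run
```

If program B fails, the step file survives, and a rerun skips A automatically instead of repeating hours of work.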

This example clarifies how script-based process control can turn into a genuine programming project, even for very simple tasks, and it would be difficult to maintain, as well.

Practical applications show that script-based process control is manageable for a handful of jobs, but it becomes a very complex development task if you need to manage thousands of processes. This scenario is precisely the occasion for enterprise job scheduling.

schedulix and BICsuite

The open source schedulix has a commercial counterpart named BICsuite [7]. Both were created from the job specifications of many IndependIT Integrative Technologies GmbH [8] customers. IndependIT was founded in 1997 as a service provider for consulting projects in the database landscape. Since 2001, it has looked after BICsuite and schedulix exclusively.

According to the vendor, the current software is not just a byproduct of the business, but a totally new development created exclusively on Linux. Here, "new" is obviously relative given a development period of a decade. The company's new major customers include an international telecommunications group and a social network provider in Germany.

The schedulix software is designed for Linux environments and is exclusively delivered as source code. However, if you want to run a schedulix server and agents on a Windows system, you can register for a three- to four-day workshop with IndependIT. The company then installs a version of the Basic edition of BICsuite free of charge in the environment of your choice, be it Windows or Solaris.

Approach

In contrast to legacy job scheduling systems, schedulix takes a dynamic approach and computes the processes to execute on the basis of boundary conditions such as priorities or the availability of resources. More specifically, schedulix works with a user-defined exit status model. In addition to freely defined exit states, job definitions can declare dependencies on other jobs, so that the execution order can be controlled in stages.

When modeling workflows, users can draw on variables and parameters, sequences, branches to alternative partial sequences, or loops. The loop or branch condition can be an exit status or configurable trigger. Batches or jobs can be statically or dynamically parameterized when submitted. The tool also supports users in handling synchronization and exceptions. Two special features of the software are worthy of note: hierarchic workflow modeling and the ability to break down programs or scripts into smaller units with clearly segregated functionality (see the box "Process Decomposition").

Process Decomposition

The principle of process decomposition envisages small and thus easily manageable programs with a short run time to ensure flexible use. Isolating the functionality of these programs reduces the overhead for development and maintenance and simplifies monitoring, troubleshooting, and restarting. Process decomposition also has more potential for parallel processing and load distribution over multiple systems than a multifunctional application.

Process decomposition can only be leveraged fully if the code of the sub-processes is free of process control aspects. If you manage the individual programs with traditional operating system tools (i.e., with Cron or a simple job scheduling framework), you need to control the workflow with a superordinate script or implement the process controls in the programs themselves. This would compromise many of the benefits of process decomposition and make the task of centralizing process control more difficult.

When it comes to parallel processes, schedulix looks to be pretty well equipped. The dynamic submit feature submits (partial) job workflows on the fly and parallelizes them. It is also possible to automate dynamic batch job submits via triggers that depend on the exit status. For example, you can implement messaging for automated responses to sequence events in this way. Beyond this, users can break down their resources into units with the help of system resources and assign them to match the execution environment. A Resource Requirement lets you define the load level for a resource for each job (load control).
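To see what the scheduler automates here, consider the hand-rolled shell equivalent of submitting partial workflows in parallel and synchronizing on their exit status; the function names are illustrative placeholders:

```shell
#!/bin/sh
# Run two independent partial workflows concurrently,
# then synchronize before the dependent final step.
partial_a() { sleep 1; echo "partial A done"; }   # hypothetical sub-workflow
partial_b() { sleep 1; echo "partial B done"; }   # hypothetical sub-workflow

partial_a & pid_a=$!
partial_b & pid_b=$!

wait "$pid_a" || exit 1        # fail the run if either part fails
wait "$pid_b" || exit 1

echo "merge step runs only after both parts succeed"
```

Everything schedulix adds on top of this pattern, such as resource-aware scheduling, priorities, and restartability, would otherwise have to be coded by hand.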

Jobs can be assigned priorities relative to other jobs for cases in which resources are low. The documentation describes an option for automatically distributing jobs to multiple execution environments, depending on current resource availability, through the interaction of static and system resources; in other words, it is possible to set up job load balancing.

Additionally, schedulix supports synchronizing resources, which request different lock modes (No Lock, Shared, Exclusive) and thus retroactively synchronize what were originally independent workflows. Administrators can also assign a state model to a synchronizing resource, thus defining the resource requirement in a status-dependent manner. This means that automatic state changes can be set up depending on the exit status of a job.
