Optimizing utilization with the EDF scheduler
Mr. Efficient
Linux kernel 3.14 added an optional scheduling method known as Earliest Deadline First (EDF). EDF scheduling uses a scheme where the process closest to its deadline is the next process scheduled for execution.
EDF is mainly used in systems that need to meet time requirements, and it has been known to handle tasks on time even in cases for which common priority-driven scheduling fails (see Figure 1 and the "Utilization" box).
Utilization
The utilization of a single-core system may usually only be at a maximum of 69 percent for priority-driven scheduling and still process tasks on time. The Earliest Deadline First scheduler, on the other hand, allows almost 100 percent utilization of the system.
Utilization is a measure of whether the tasks of a real-time system comply with time limits in all circumstances. Utilization is similar to load, but it is calculated according to a slightly different formula,
where the numerator is the maximum processing time (Worst Case Execution Time, WCET) of task j , t _{Dmax ,j } is its maximum completion time (maximum deadline), and t _{Pmin ,j } is its minimum time interval before waking the same task (period).
If, with priority-controlled scheduling, utilization is smaller or equal as a function of the number of tasks,
it will also adhere to all above-mentioned time limits in the worst-case scenario. There is a utilization limit of approximately 69 percent for a large number of tasks (i.e., a large n ). On-time processing is still possible, even if the utilization is above the given limits (for EDF on a single-core system at 100 percent). In such cases, however, a more detailed study would be necessary.
Existing interfaces, which select a scheduling process and assign parameters, do not yet support the EDF process. The development in Userland might be lagging behind the kernel, but admins can still use Earliest Deadline First scheduling if they put in a little manual work. This work involves defining required data structures and interface functions. The code from the kernel documentation [1] in Listing 1 shows how to set up the necessary structures.
Listing 1
dl_test.c
001 #define _GNU_SOURCE 002 #include <unistd.h> 003 #include <stdio.h> 004 #include <stdlib.h> 005 #include <string.h> 006 #include <time.h> 007 #include <linux/unistd.h> 008 #include <linux/kernel.h> 009 #include <linux/types.h> 010 #include <sys/syscall.h> 011 #include <pthread.h> 012 013 #define gettid() syscall(__NR_gettid) 014 015 #define SCHED_DEADLINE 6 016 017 /* XXX use the proper syscall numbers */ 018 #ifdef __x86_64__ 019 #define __NR_sched_setattr 314 020 #define __NR_sched_getattr 315 021 #endif 022 #ifdef __i386__ 023 #define __NR_sched_setattr 351 024 #define __NR_sched_getattr 352 025 #endif 026 #ifdef __arm__ 027 #define __NR_sched_setattr 380 028 #define __NR_sched_getattr 381 029 #endif 030 031 static volatile int done; 032 033 struct sched_attr { 034 __u32 size; 035 036 __u32 sched_policy; 037 __u64 sched_flags; 038 039 /* SCHED_NORMAL, SCHED_BATCH */ 040 __s32 sched_nice; 041 042 /* SCHED_FIFO, SCHED_RR */ 043 __u32 sched_priority; 044 045 /* SCHED_DEADLINE (nsec) */ 046 __u64 sched_runtime; 047 __u64 sched_deadline; 048 __u64 sched_period; 049 }; 050 051 int sched_setattr(pid_t pid, const struct sched_attr *attr, unsigned int flags) 052 { 053 return syscall(__NR_sched_setattr, pid, attr, flags); 054 } 055 056 int sched_getattr(pid_t pid, struct sched_attr *attr, unsigned int size, unsigned int flags) 057 { 058 return syscall(__NR_sched_getattr, pid, attr, size, flags); 059 } 060 061 void *run_deadline(void *data) 062 { 063 struct sched_attr attr; 064 int x = 0, ret; 065 unsigned int flags = 0; 066 067 printf("deadline thread start %ld\n", 068 gettid()); 069 070 attr.size = sizeof(attr); 071 attr.sched_flags = 0; 072 attr.sched_nice = 0; 073 attr.sched_priority = 0; 074 075 /* creates a 10ms/30ms reservation */ 076 attr.sched_policy = SCHED_DEADLINE; 077 attr.sched_runtime = 10 * 1000 * 1000; 078 attr.sched_period = 30 * 1000 * 1000; 079 attr.sched_deadline= 30 * 1000 * 1000; 080 081 ret = sched_setattr(0, &attr, flags); 082 if (ret < 0) { 083 done = 0; 084 perror("sched_setattr"); 085 exit(-1); 086 } 087 088 while (!done) { 089 x++; 090 } 091 return NULL; 092 } 093 094 int main (int argc, char **argv) 095 { 096 pthread_t thread; 097 098 printf("main thread [%ld]\n", gettid()); 099 pthread_create(&thread, NULL, run_deadline, NULL); 100 sleep(10); 101 done = 1; 102 pthread_join(thread, NULL); 103 printf("main dies [%ld]\n", gettid()); 104 return 0; 105 }
The block of code starting with line 17 defines the required system call numbers for three platforms (ARM, 32 bit, and 64 bit). The central data structure that follows, struct sched_attr
(line 33), establishes the scheduling parameters [2]. The third part implements the system call selections (line 51).
The new system call
sched_setattr(pid_t pid,const structsched_attr * attr,unsigned int flags)
activates both EDF and the other scheduling method (line 51). It gets three parameters along the way. The first specifies the thread for which the scheduling applies. The second parameter selects the scheduling process and any required parameters. The third parameter is intended for potential enhancements;
is currently passed.
The return value
indicates that the scheduling was successfully selected and parameterized. If an error like -1
is returned, the global variable errno
contains an error code, (e.g., for a Permission denied
). In the last case, the job is presumably not running with the root privileges required to select the scheduling. The call of the function sched_setattr()
takes place in the example within the separate function run_deadline()
, which the code starts in line 99 as a separate thread.
In total, four values are important for parameterizing the EDF: the scheduling process itself (Policy), the processing time in the worst-case scenario (WCET), the shortest period in which the computing process is reactivated after the last activation (period), and finally the actual deadline. These times are stored in 64-bit variables and are specified in nanoseconds.
The task programmed in Listing 1 is active at a minimum distance of 30 milliseconds (line 78) for a maximum of 10 milliseconds (line 77). The calculations (the cycle) must be completed after 30 milliseconds at the latest (line 79). The Linux EDF scheduler recalculates the deadline every time a task is woken.
No Cheating
Specifying the deadline for parameterizing the EDF should be sufficient, but the Linux kernel has supplemented the classical EDF with a workload control (bandwidth control) that requires knowledge of the WCET and period. The quotient of the two results in utilization.
Linux ensures that the utilization of jobs selected by EDF by default does not add up to more than 95 percent of the available computing time. That leaves at least 5 percent for tasks that manage another scheduling algorithm. This setting can be changed by using the proc files:
/proc/sys/kernel/sched_rt_period_us
/proc/sys/kernel/sched_rt_runtime_us
The admin writes a -1
in the /proc/sys/kernel/sched_rt_runtime_us
proc file to decommission this utilization control.
Linux very drastically stalls jobs that make false claims in the WCET or period and take longer to compute than requested. CPU time is even withdrawn. It also refuses to specify overly large (more than 2^63) deadlines or periods. Finally, this causes an error in calculating time differences. Linux will also prevent the admin from cheating the system by calling fork()
, whereby the child process normally inherits the attributes of the parent process. The
int sched_getattr(pid_t pid, struct sched_attr *attr,unsigned int size, \ unsigned int flags)
(line 56) system call newly introduced with kernel 3.14 also reads the once-set scheduling parameters. The parameter pid
references the task again, where
represents the calling task. attr
concerns a memory area's address of size size
, from which the program is supposed to store the parameters. A
must be passed for flags
.
Kernel Exchange
The command
make LDLIBS="-lpthread" dl_test
makes an executable program from the source code in Listing 1. This command requires a kernel version 3.14 or beyond. By default, the kernel version only comes in with Ubuntu 14.10 (Utopic Unicorn). However, a current kernel can also be installed on older versions, especially version 14.04 with Long Term Support.
You can install kernel 3.18.1 for a 64-bit Ubuntu quite easily via the command shown in Listing 2. A developer who wants to use the 32-bit version can simply exchange the text amd64
for i386
in the filenames from Listing 2.
Listing 2
Installing the Kernel from a PPA
01 wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.18.1-vivid/ linux-image-3.18.1-031801-generic_3.18.1-031801.201412170637_amd64.deb 02 sudo dpkg -i linux-image-3.18.1-031801-generic_ 3.18.1-031801.201412170637_amd64.deb
The commands that represent active tasks are not yet familiar with the deadline scheduling and therefore display the kernel internal number, #6
(Figure 2). This number is also visible if the admin reads a job's sched
file in the directory /proc/PID/
, which EDF manages. The entry in Figure 3 indicates a priority of -1
. This entry is also indicative of deadline scheduling because to indicate and simplify internal processes, EDF jobs are assigned a the priority -1
.
Class Society
The whole scheduling problem has been on a new footing thanks to scheduling classes (Figure 4) – especially since the Completely Fair Scheduler (CFS), which makes it possible to use various scheduling techniques simultaneously in one system, was merged into kernel version 2.6.23.
EDF is embedded within the new deadline scheduling class [3]. Linux then processes all classes one after another: first – hence the highest priority – the new deadline scheduling class, then the real-time class and, finally, the Completely Fair Scheduler.
Buy this article as PDF
(incl. VAT)
Buy ADMIN Magazine
Subscribe to our ADMIN Newsletters
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Most Popular
Support Our Work
ADMIN content is made possible with support from readers like you. Please consider contributing when you've found an article to be beneficial.