Visualizing kernel scheduling
Behind Time
Going Further: NUMA Issues
Google has also highlighted SchedViz's ability to do more than just show how processing resources are shared among threads according to the CPU time given to each. SchedViz also provides a powerful way to visualize how larger systems work with NUMA nodes.
Larger servers often have several NUMA nodes, each a portion of DRAM assigned to a particular set of cores, which those cores can access more quickly than the general memory pool. Cores can access memory on NUMA nodes assigned to other cores, and often do if the cores are overworked; however, this process is much slower than if the cores stuck to their own NUMA node. This nonuniformity is a practical consequence of growing core counts, but it brings challenges.
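To get a feel for this nonuniformity on your own machine, the following minimal sketch reads the relative node distances that Linux exposes under `/sys/devices/system/node`. It assumes a kernel with NUMA sysfs entries; the helper names (`parse_distance_row`, `read_distance_matrix`, `remote_penalty`) are made up for illustration. The values are relative costs reported by the firmware (a node's distance to itself is conventionally 10), not measured latencies.

```python
from pathlib import Path

# Standard sysfs location for NUMA topology on Linux (assumed to be present).
NODE_ROOT = Path("/sys/devices/system/node")


def parse_distance_row(text):
    """Parse one node's 'distance' file, e.g. '10 21' -> [10, 21]."""
    return [int(field) for field in text.split()]


def read_distance_matrix(root=NODE_ROOT):
    """Return {node_id: [distance to node 0, node 1, ...]}.

    Returns an empty dict if the machine exposes no NUMA nodes in sysfs.
    """
    matrix = {}
    for node_dir in sorted(Path(root).glob("node[0-9]*")):
        distance_file = node_dir / "distance"
        if distance_file.is_file():
            node_id = int(node_dir.name[len("node"):])
            matrix[node_id] = parse_distance_row(distance_file.read_text())
    return matrix


def remote_penalty(row, local_index):
    """Ratio of the largest distance in the row to the node's local distance."""
    return max(row) / row[local_index]
```

Calling `read_distance_matrix()` on a two-node server might return something like `{0: [10, 21], 1: [21, 10]}`, in which case `remote_penalty([10, 21], 0)` reports that remote memory is rated roughly twice as expensive as local memory.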
If a thread moves to a core on a different NUMA node, performance can be affected significantly, because it will then have to pay an extra tax for each access to the DRAM it left behind. SchedViz can help identify cases like this, making it clear when a thread has had to migrate across NUMA boundaries. Moreover, SchedViz can show you which NUMA nodes are in use at any one time, helping to identify situations in which one part of your machine is overtaxed while the other part sits idle. A typical trace of that situation will look like Figure 4. SchedViz can identify an unbalanced system like this, so the sysadmin can adjust the NUMA behavior [11] to fix the issue.
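One way to adjust this behavior is to restrict a process to the cores of a single NUMA node with `sched_setaffinity` [6]. The sketch below is one possible approach rather than a recommended recipe: it reads a node's CPU list from sysfs and applies it through Python's `os.sched_setaffinity` wrapper. The helper names (`parse_cpulist`, `node_cpus`, `pin_to_node`) are hypothetical, and the code assumes a Linux system that exposes `/sys/devices/system/node/nodeN/cpulist`.

```python
import os
from pathlib import Path


def parse_cpulist(text):
    """Expand a sysfs CPU list such as '0-3,8-11' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def node_cpus(node, root="/sys/devices/system/node"):
    """Return the CPUs that belong to the given NUMA node (assumed sysfs layout)."""
    return parse_cpulist(Path(root, f"node{node}", "cpulist").read_text())


def pin_to_node(pid, node):
    """Restrict pid (0 means the calling process) to the CPUs of one NUMA node."""
    cpus = node_cpus(node)
    os.sched_setaffinity(pid, cpus)  # Linux-only call
    return cpus
```

For example, `pin_to_node(0, 1)` would confine the current process to node 1's cores. Note that this only limits where the scheduler may run the process; it does not move memory that was already allocated on another node.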
Further Resources
If you want to explore the features that come packaged with SchedViz, take a look at the detailed features walkthrough [12] provided by Google. This document will show you how to collect various types of traces and how to use the tools available for analyzing them.
Another very useful feature provided by the kernel is ftrace [13], a built-in tracing facility that records kernel events into a buffer for later analysis, giving you a quick way to highlight problems with scheduling or scheduling rules.
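As a rough illustration of what such later analysis can look like, the sketch below parses `sched_switch` lines in ftrace's default text format and reports when a task is switched in on a different CPU than the last time it ran. It is not how SchedViz processes traces; the function names are invented for this example, and it assumes the event has been enabled as root (typically by writing `1` to `events/sched/sched_switch/enable` under `/sys/kernel/tracing`) and that the lines come from the `trace` file there. Task names containing spaces are not handled.

```python
import re

# Matches the CPU in brackets, the timestamp, and the event payload of a line such as:
#   bash-4321  [001] d..2  100.000500: sched_switch: prev_comm=bash prev_pid=4321 ...
LINE_RE = re.compile(
    r"\[(?P<cpu>\d+)\].*?(?P<ts>\d+\.\d+): sched_switch: (?P<fields>.*)"
)
FIELD_RE = re.compile(r"(\w+)=(\S+)")


def parse_sched_switch(line):
    """Return a dict describing one sched_switch line, or None for other lines."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = dict(FIELD_RE.findall(match["fields"]))
    if "prev_pid" not in fields or "next_pid" not in fields:
        return None
    return {
        "cpu": int(match["cpu"]),
        "timestamp": float(match["ts"]),
        "prev_pid": int(fields["prev_pid"]),
        "next_pid": int(fields["next_pid"]),
    }


def cpu_migrations(lines):
    """Return (timestamp, pid, old_cpu, new_cpu) for each observed CPU change."""
    last_cpu = {}
    moves = []
    for line in lines:
        event = parse_sched_switch(line)
        if event is None or event["next_pid"] == 0:
            continue  # skip non-matching lines and the per-CPU idle task
        pid, cpu = event["next_pid"], event["cpu"]
        if pid in last_cpu and last_cpu[pid] != cpu:
            moves.append((event["timestamp"], pid, last_cpu[pid], cpu))
        last_cpu[pid] = cpu
    return moves
```

Combined with a CPU-to-node mapping, the tuples returned by `cpu_migrations` could be filtered down to the moves that cross a NUMA boundary.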
At the moment, SchedViz is primarily useful for tracking scheduling errors and fine-tuning the way you assign computing resources. Although that's pretty useful in itself, plans are in progress to make it even more powerful in the future. Beyond using SchedViz for figuring out kernel scheduler defects, Google is also looking at using it to visualize other kernel tracepoints to analyze other kernel behavior that could be optimized for better efficiency, so watch this space.
Infos
[1] Understanding scheduling behavior with SchedViz: https://opensource.googleblog.com/2019/10/understanding-scheduling-behavior-with.html
[2] "Command Line – at, cron, and anacron" by Bruce Byfield, Linux Magazine, issue 225, December 2019, pg. 50: http://www.linux-magazine.com/index.php/Issues/2019/225/Command-Line-at-cron-anacron/
[3] Scheduling policy: https://www.oreilly.com/library/view/understanding-the-linux/0596005652/ch07s01.html
[4] NUMA: https://en.wikipedia.org/wiki/Non-uniform_memory_access
[5] "What is AES Encryption?" by Will Ellis: https://privacyaustralia.net/complete-guide-encryption/
[6] sched_setaffinity: https://linux.die.net/man/2/sched_setaffinity
[7] cgroups: https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
[8] SchedViz on GitHub: https://github.com/google/schedviz
[9] Yarn: https://www.yarnpkg.com/en/
[10] "30 Linux Commands Every User Should Know" by Arturas B.: https://www.hostinger.com/tutorials/linux-commands
[11] "NUMA overview" by Christoph Lameter: https://queue.acm.org/detail.cfm?id=2513149
[12] SchedViz features and usage walkthrough: https://github.com/google/schedviz/blob/master/doc/walkthrough.md
[13] ftrace: https://www.kernel.org/doc/Documentation/trace/ftrace.txt