Data Compression as a CPU Benchmark
Cuckoo Clock
At the Dragon Propulsion Laboratory, we have long been looking for a meaningful CPU benchmark, beyond pure computational muscle displays and those architecture-bound demonstrations of number-crunching prowess usually favored by chip vendors. We have long relied on convenient load generators like stress [1] and its modern descendant stress-ng [2], but putting load on the system and summarizing its performance in a few numbers are not really the same thing: The latter helps compare systems, whereas the former helps test them. I think the search is finally over, and I am settling on data compression as a more realistic compute benchmark that does not look like a stress exercise for a vintage math coprocessor or a GPU.
Comparing systems should always involve a real workload, and comparing an ARM system with an x86 system on the grounds of how fast each can compress data feels more truthful than doing so on isolated CPU operations. Of course, if you know the exact workload the system will run, you should test with that workload; otherwise, compression represents a sensible default.
Enter LZMA
The Lempel-Ziv-Markov chain algorithm (LZMA) [3] is a dictionary-based, lossless compression algorithm used by the 7-Zip archiver [4]. It achieves higher compression ratios than the original LZ77 [5] algorithm and is generally expected to have comparable decompression performance. More important for this purpose, 7-Zip's author is very methodical with his performance testing, providing a reference library of results for many CPU types [6], and prebuilt binaries of the tool exist for both Linux and Windows. One sample from the online results library is the benchmark for the recent Apple M1 processor, captured in Figure 1 [7], which includes both test results and CPU architecture notes.
Per my usual custom, I test on the latest Ubuntu LTS, 20.04 "Focal Fossa," with 7-Zip installed from the Universe repository:
$ sudo apt install p7zip-full
After completing the install, you have access to a straightforward benchmark requiring no setup or configuration through the single command 7z b. In this variation, MIPS results are normalized against a "standard" Intel Core 2. Running the test on a single virtual CPU (vCPU) Digital Ocean droplet in the NYC1 availability zone yielded the results found in Listing 1. The results size this core at about 3 billion instructions per second (see the MIPS rating on the Tot line). On a multicore processor, the benchmark would repeat, doubling the number of cores in use each time up to the maximum available, as in the Apple M1 [7] example with eight cores in Listing 2. The results there approach 50 billion instructions per second.
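If you want to collect these figures from several machines, the table can be scraped with a few lines of Python. The following is a minimal sketch, not part of 7-Zip: the function names (parse_7z_benchmark, run_benchmark) and the dictionary keys are hypothetical, and the regular expressions assume the column layout seen in Listing 1, which may differ in other 7-Zip versions.

```python
import re
import subprocess

# A data row, e.g. "22:  3062  100  2987  2979  |  35411  100  3028  3023".
# Assumption: four numeric columns on each side of the "|" separator.
ROW = re.compile(
    r"^(\d+):\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+\|"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)$"
)
# The summary line, e.g. "Tot:  100  2862  2852" (usage, R/U, rating).
TOT = re.compile(r"^Tot:\s+(\d+)\s+(\d+)\s+(\d+)$")


def parse_7z_benchmark(text):
    """Return (rows, total) parsed from the text output of `7z b`."""
    rows, total = [], None
    for line in text.splitlines():
        line = line.strip()
        m = ROW.match(line)
        if m:
            v = [int(x) for x in m.groups()]
            rows.append({
                "dict_log2": v[0],          # dictionary size is 2**dict_log2 bytes
                "compress_kib_s": v[1],
                "compress_rating": v[4],    # MIPS
                "decompress_kib_s": v[5],
                "decompress_rating": v[8],  # MIPS
            })
            continue
        m = TOT.match(line)
        if m:
            usage, r_u, rating = (int(x) for x in m.groups())
            total = {"usage": usage, "r_u": r_u, "rating": rating}
    return rows, total


def run_benchmark():
    """Run `7z b` and return its output (assumes 7z is on the PATH)."""
    result = subprocess.run(["7z", "b"], capture_output=True, text=True, check=True)
    return result.stdout


# Table portion of Listing 1, used here instead of a live run.
SAMPLE = """\
22:       3062   100   2987   2979  |      35411   100   3028   3023
23:       2733   100   2787   2785  |      34100   100   2954   2952
24:       2482    99   2702   2669  |      33606   100   2957   2950
25:       2312    99   2657   2640  |      31616   100   2822   2814
----------------------------------  | ------------------------------
Avr:              99   2783   2768  |              100   2940   2935
Tot:             100   2862   2852
"""

rows, total = parse_7z_benchmark(SAMPLE)
print("dictionary sizes tested:", [r["dict_log2"] for r in rows])
print("overall rating (MIPS):", total["rating"])
```

On a machine with 7-Zip installed, parse_7z_benchmark(run_benchmark()) should produce the same structure from a live run, provided the output format matches the assumption above.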
Listing 1
7z b Output
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,1 CPU Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz (50654),ASM,AES-NI)

Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz (50654)
CPU Freq: - - - - - - - - -

RAM size:    1987 MB,  # CPU hardware threads:   1
RAM usage:    435 MB,  # Benchmark threads:      1

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       3062   100   2987   2979  |      35411   100   3028   3023
23:       2733   100   2787   2785  |      34100   100   2954   2952
24:       2482    99   2702   2669  |      33606   100   2957   2950
25:       2312    99   2657   2640  |      31616   100   2822   2814
----------------------------------  | ------------------------------
Avr:              99   2783   2768  |              100   2940   2935
Tot:             100   2862   2852
Listing 2
7z b Apple M1 Output
7-Zip (z) 21.03 beta (arm64) : Copyright (c) 1999-2021 Igor Pavlov : 2021-07-20
 64-bit arm_v:8 locale=en_US.UTF-8 Threads:8, ASM

Compiler: Apple LLVM 12.0.5 (clang-1205.0.22.9) GCC 4.2.1 CLANG 12.0
Darwin : 20.4.0 : Darwin Kernel Version 20.4.0:
PageSize:16KB
Apple M1 8C8T

RAM size:   16384 MB,  # CPU hardware threads:   8
RAM usage:   1779 MB,  # Benchmark threads:      8

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:      51020   750   6559  49633  |     538251   795   5762  45898
23:      46106   727   6402  46977  |     529572   795   5757  45809
24:      45006   749   6452  48391  |     515399   788   5722  45221
25:      44111   759   6616  50365  |     505839   793   5664  45009
----------------------------------  | ------------------------------
Avr:     46561   747   6507  48841  |     522265   792   5726  45484
Tot:             770   6117  47163
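To put the two listings side by side, a little arithmetic helps. The sketch below copies the Rating columns from Listing 1 and Listing 2 and checks an observation, not a documented rule: In both listings, the Tot rating lands within 1 MIPS of the mean of the compression and decompression averages. The helper name overall and the dictionary layout are hypothetical, chosen only for this illustration.

```python
from statistics import mean

# Rating columns (MIPS) copied from Listing 1 (one vCPU) and Listing 2 (Apple M1).
xeon_vcpu = {
    "compress": [2979, 2785, 2669, 2640],
    "decompress": [3023, 2952, 2950, 2814],
    "tot": 2852,
    "threads": 1,
}
apple_m1 = {
    "compress": [49633, 46977, 48391, 50365],
    "decompress": [45898, 45809, 45221, 45009],
    "tot": 47163,
    "threads": 8,
}


def overall(run):
    """Mean of the compression and decompression average ratings."""
    return (mean(run["compress"]) + mean(run["decompress"])) / 2


for name, run in (("Xeon vCPU", xeon_vcpu), ("Apple M1", apple_m1)):
    print(f"{name}: computed {overall(run):.1f} MIPS, reported Tot {run['tot']} MIPS")

# How the two systems compare overall and per benchmark thread.
ratio = apple_m1["tot"] / xeon_vcpu["tot"]
per_thread = apple_m1["tot"] / apple_m1["threads"]
print(f"M1 total rating is {ratio:.1f}x the single vCPU")
print(f"M1 rating per thread: {per_thread:.0f} MIPS")
```

Keep in mind that the per-thread figure is only a rough guide: It compares a shared cloud vCPU with eight cores of mixed type, so it says more about the systems as deployed than about the individual cores.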
Infos
[1] stress: https://githubmemory.com/repo/resurrecting-open-source-projects/stress
[2] stress-ng: https://github.com/ColinIanKing/stress-ng
[3] LZMA: https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Markov_chain_algorithm
[4] 7-Zip: https://www.7-zip.org/
[5] Ziv, Jacob, and Abraham Lempel. A Universal Algorithm for Sequential Data Compression. IEEE Transactions on Information Theory, May 1977, 23:3
[6] 7-Zip CPU benchmark library: https://www.7-cpu.com/
[7] Apple M1 7z benchmark results: https://www.7-cpu.com/cpu/Apple_M1.html