Benchmarking a new architecture
Risky Business
As I do at least twice every year, I changed my plans for this column at the very last minute. This time, some interesting hardware came into the lab rightfully deserving of analysis: a new single-board computer (SBC) that is part of the first production run of the BeagleV-Ahead [1], the latest addition to the BeagleBone family [2] (Figure 1). This new entry is built around a CPU based on the up-and-coming RISC-V architecture [3] (Figure 2). RISC-V is a relatively recent open source, royalty-free CPU instruction set that has drawn interest from more than a dozen chip suppliers so far. It cannot yet compete with x86 chips from Intel and AMD, or even the best many-core ARM offerings, but it shows great promise and is therefore worth tinkering with.
What's in the Box?
Despite its different architecture, the BeagleV-Ahead is designed around the well-known BeagleBone Black form factor. It features a quad-core Xuantie C910 processor, designed by Alibaba's team and later released as open source [4]; this out-of-order, pipelined core is the fastest RISC-V CPU made available commercially to date. A modern 64-bit CPU, the chip is a product of T-Head, Alibaba's semiconductor unit, and it is apparently subject to commercial restrictions by the federal government, because I had to confirm that my order was not intended for export.
Modern politics aside, the CPU itself has some interesting extensions for artificial intelligence (an NPU delivering 4 TOPS INT8 at 1GHz), includes a 50 GFLOPS GPU, and is accompanied by 4GB of RAM and 16GB of embedded multimedia card (eMMC) flash storage. Interfaces include USB 3.0, micro-HDMI, microSD card, display serial interface (DSI), and camera serial interface (CSI) alongside the standard Beagle "cape" GPIO connectors. Connectivity options include Gigabit wired Ethernet and WiFi. The eMMC can be flashed to a different distribution over USB, and serial debugging interfaces are accessible. Power is supplied over a 2.1mm barrel connector at 5V, or directly over USB. See Figure 3 for a layout diagram and Table 1 for the full specifications.
Table 1
BeagleV-Ahead Specs
Feature | Spec |
---|---|
CPU | T-Head TH1520 at 2GHz |
 | Quad-core Xuantie C910 |
 | 64KB+64KB data/instruction caches per core |
 | 1MB shared L2 cache |
GPU | 50GFLOPS BXM-4-64 |
NPU | 4TOPS INT8 at 1GHz |
Memory | 4GB LPDDR4 |
Storage | 16GB eMMC flash |
 | MicroSD |
Networking | 802.11n, Bluetooth |
 | Realtek RTL8211F-VD-CG Gigabit Ethernet |
USB | 3.0 (OTG and flash support) |
Video | Micro-HDMI |
Power | 5V, USB or 2.1mm barrel connector |
Other | 2 CSIs, 1 DSI |
 | I2C, UART, SPI, ADC, PWM, GPIO |
Once a piece of new hardware works, the immediate next challenge is the completeness and maintainability of the board support package. I cannot speak to the latter as it pertains to the future, but the availability of a Yocto distribution (2023-06 preloaded in flash) [5] suggests one could self-support the board with custom builds easily enough once it is no longer the focus of the vendor's attention. For the former, the available port of Ubuntu (2023-07 based on Lunar Lobster) [6] makes the case nicely. It is worth noting that the Yocto image has no valid HDMI configuration, and switching to a text terminal (Ctrl+Alt+F1) is necessary.
To the Moon!
The Yocto image in built-in flash is a convenient, albeit sparse, development environment. For benchmarks, Ubuntu is the obvious choice between the two options. Flashing the much larger Ubuntu base image is a relatively painless process that I will not detail, for the sake of brevity (Figure 4). After flashing Ubuntu 23.04 "Lunar" to the on-board eMMC, I have convenient online access to the full Main and Universe repositories to further my system exploration. With access to Ubuntu's RISC-V Universe repository, you can just install 7-Zip [7], as discussed in a previous article [8]:
$ sudo apt install p7zip-full
As I explained then, comparing systems should always involve a real workload, and comparing RISC-V with ARM on the grounds of how fast it can compress data feels more truthful than doing so on pure disjoint CPU operations. The benchmark has many switches, of course, but a first pass can be executed with just
$ 7z b
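As a minimal sketch of a slightly more structured first pass, the script below keeps a copy of each run for later comparison and adds a single-threaded run next to the default multithreaded one. The helper names, the log file naming scheme, and the `RUN_BENCH` guard variable are hypothetical; the `-mmt1` switch is, to the best of my knowledge, how p7zip limits the benchmark to one thread, so check `7z --help` on your build before relying on it.

```shell
#!/bin/sh
# Build a log file name that records the machine architecture and a label,
# for example 7z-riscv64-mt.log (the naming scheme is hypothetical).
bench_log_name() {
    printf '7z-%s-%s.log\n' "$(uname -m)" "$1"
}

# Run the benchmark twice and keep a copy of each output.
run_benchmarks() {
    if ! command -v 7z >/dev/null 2>&1; then
        echo "7z not found; install p7zip-full first" >&2
        return 1
    fi
    7z b       | tee "$(bench_log_name mt)"   # default: all hardware threads
    7z b -mmt1 | tee "$(bench_log_name st)"   # assumed: one benchmark thread
}

# Only start the (lengthy) benchmark when explicitly asked to.
if [ "${RUN_BENCH:-0}" = "1" ]; then
    run_benchmarks
fi
```

Saved as a script, it would be started with `RUN_BENCH=1 sh bench.sh`; the resulting logs can then be compared across boards.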
Listing 1 shows the RISC-V speeds posted by the SBC. The previous article [8] included results for a single virtualized Xeon core and an Apple M1 ARM desktop, the latter extracted from the exceptional online library of CPU benchmarks that 7-Zip hosts [9]. A more apt comparison is found in Listing 2, with the results posted by a Raspberry Pi 400 [10], which is essentially a Raspberry Pi 4 (Broadcom BCM2711 Cortex-A72, ARMv8 quad-core running at 1.8GHz). The ARM chip in the Pi posts an overall rating of 6,232 MIPS (about 6.2 billion instructions per second) compared with 4,575 MIPS (about 4.6 billion) for the RISC-V in the Beagle.
Listing 1
7z b Output on Xuantie C910
    7-Zip 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
    p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,4 CPUs LE)
    LE
    CPU Freq: 64000000 64000000         -         -         -         -         -         -         -
    RAM size:    2923 MB,  # CPU hardware threads:   4
    RAM usage:    882 MB,  # Benchmark threads:      4
                           Compressing  |                  Decompressing
    Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
             KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS
    22:       3252   300   1054   3164  |      73071   398   1566   6234
    23:       3170   314   1029   3230  |      68302   399   1482   5910
    24:       3086   320   1037   3318  |      66423   399   1463   5831
    25:       2904   327   1014   3316  |      62838   397   1407   5593
    ----------------------------------  | ------------------------------
    Avr:             315   1033   3257  |              398   1480   5892
    Tot:             357   1256   4575
Listing 2
7z b Output on Pi 400 BCM2711
    7-Zip [32] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
    p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,32 bits,4 CPUs LE)
    LE
    CPU Freq:   975  1181  1798  1798  1796  1798  1798  1798  1798
    RAM size:    3838 MB,  # CPU hardware threads:   4
    RAM usage:    882 MB,  # Benchmark threads:      4
                           Compressing  |                  Decompressing
    Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
             KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS
    22:       3787   361   1021   3685  |     109383   399   2341   9332
    23:       3656   363   1025   3725  |     106167   399   2304   9186
    24:       3214   337   1027   3456  |      97910   383   2243   8595
    25:       3294   362   1040   3762  |      91207   375   2165   8117
    ----------------------------------  | ------------------------------
    Avr:             356   1028   3657  |              389   2263   8808
    Tot:             372   1646   6232
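To put the two totals side by side without reading them off by eye, a small sketch along the following lines pulls the overall rating out of saved benchmark output and expresses the gap as a percentage. The function names are hypothetical, and the script assumes the overall rating is the last column of the `Tot:` line, as it is in both listings.

```shell
#!/bin/sh
# Extract the overall rating (last column of the "Tot:" line, in MIPS)
# from `7z b` output supplied on standard input.
tot_rating() {
    awk '$1 == "Tot:" { print $NF }'
}

# Print by how many percent the first rating exceeds the second,
# rounded to a whole number.
percent_faster() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (a / b - 1) * 100 }'
}

# Totals from the two listings: Pi 400 (6232) and BeagleV-Ahead (4575).
percent_faster 6232 4575
```

With the totals from the two listings, this prints 36: the Pi 400 rates roughly a third higher than the BeagleV-Ahead on this workload. Given a saved log, `tot_rating < 7z-riscv64-mt.log` (a hypothetical file name) would supply the figure directly.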
Infos
1. BeagleV-Ahead: https://www.beagleboard.org/boards/beaglev-ahead
2. BeagleBone boards: https://www.beagleboard.org/boards
3. RISC-V architecture: https://riscv.org
4. OpenXuantie OpenC910 core: https://github.com/T-head-Semi/openc910
5. Xuantie Yocto: https://www.beagleboard.org/distros/beaglev-ahead-yocto-npi-2023-06-10
6. Xuantie Ubuntu: https://www.beagleboard.org/distros/beaglev-ahead-ubuntu-2023-07-05
7. 7-Zip: https://www.7-zip.org
8. "Data Compression as a CPU Benchmark" by Federico Lucifredi, ADMIN, issue 66, 2021, https://www.admin-magazine.com/Archive/2021/66/Data-Compression-as-a-CPU-Benchmark
9. 7-Zip CPU benchmark library: https://www.7-cpu.com/
10. Raspberry Pi 400: https://www.raspberrypi.com/products/raspberry-pi-400/