Lead Image © Igor Kovalchuk, 123RF.com

Analysis tour with Binary Ninja

Martial Arts

Article from ADMIN 73/2023
Binary analysis is an advanced technique used to work through cyberattacks and malware infestations and is also known as reverse engineering. We show you how to statically analyze binary programs with Binary Ninja, an interactive binary analysis platform.

If you want to know exactly what operations a program on your computer performs, you have several ways to find out. The most obvious method is to read the source code, if it is available. That is easy with scripts written in languages like Python or with open source software, but if a program is compiled into machine code before delivery, your job is a little more complicated.

Various tools are involved in converting code into an executable program: in most cases, at least a compiler and a linker. From the commands in a programming language, the compiler creates the machine language. In doing so, it optimizes the execution sequence or individual operations, depending on the configuration, to a greater or lesser extent, which can ultimately have a major effect on the resulting machine code. During linking, the libraries are statically or dynamically linked to the program. Static linking adds the code from the libraries to the resulting program. If libraries are linked dynamically, the code is located in external files and is only added to the process's working memory when the program is started.
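
One way to tell which kind of linking a finished program relies on is to look for a PT_INTERP entry in its program header table, because that entry names the dynamic loader. The following Python sketch assumes a 64-bit little-endian ELF file, as GCC typically produces on x86-64 Linux; the function name is made up for illustration.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def requests_dynamic_loader(data: bytes) -> bool:
    """Return True if a 64-bit little-endian ELF image has a PT_INTERP entry."""
    if len(data) < 64 or data[:6] != b"\x7fELF\x02\x01":
        raise ValueError("not a 64-bit little-endian ELF image")
    # ELF64 header fields: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (phoff,) = struct.unpack_from("<Q", data, 32)
    phentsize, phnum = struct.unpack_from("<HH", data, 54)
    for index in range(phnum):
        # p_type is the first 32-bit field of every program header entry.
        (p_type,) = struct.unpack_from("<I", data, phoff + index * phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

To try it, read a binary with open(path, "rb").read() and pass the bytes to the function. A program that loads its libraries at startup should have the entry, whereas a fully statically linked one should not.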

Binary Ninja [1] is a tool for static program analysis. Originally designed for use in capture-the-flag competitions, Binary Ninja is now being developed commercially. For initial insights into the functionality, as covered by this article, it's fine to use the free trial version; download and install the version for your operating system to try it out.

Create a Test Program

To get started with a program that is as simple as possible for analysis, it's a good idea to write your own. Name the following code admin.c:

#include <stdio.h>
int main(int argc, char **argv)
{
    printf("Hello World!");
    return 0;
}

You can compile the source code with:

gcc -O0 admin.c -o admin

The -O0 option switches off compiler optimizations to keep the machine code closer to the source code, which is especially useful for debugging during development.

Analyzing Applications

To analyze the program you created, launch Binary Ninja and open the admin file. The analysis interface shows an overview of the symbols included in the binary, a code view, and the structure of the program in a feature map, among other options. Binary Ninja offers different levels of the generated program code, which can significantly facilitate the analysis. The default High Level IL output shows generated code that already looks very similar to the original C code. IL stands for "intermediate language"; Binary Ninja offers different language levels up to C code. In this example, the main function will look very similar to the C code.

The call to printf is replaced by the underlying function __printf_chk (Figure 1), the hardened variant that glibc uses in fortified builds. It takes an additional flag argument and performs extra safety checks on the format string before producing output. If you want to get closer to the machine code, click through the different language layers and look at the different derivations.

Figure 1: Underlying printf function.

Note that the displayed code is generated from the machine code. It's more of an approximation of programming in a high-level language like C or C++ than an actual representation of the underlying code. This is also evident in the generic and mostly not very meaningful variable names, which are derived from CPU registers or memory locations. You cannot compile this code again without further editing.

Now select Disassembly in the top bar, and you will see the program in assembler code. The main function then consists of only a handful of instructions. In addition to operations on the stack pointer rsp, you can see how lea is used to load the address of the Hello World! string into the rsi register (Figure 2); after all, the format string resides in the data area of the binary. Next, the edi register is initialized with 1 and eax with 0. Instead of xor eax, eax, the compiler could also have used mov eax, 0x0; the xor form is already a compiler optimization, because its encoding is shorter (two bytes instead of five). These instructions prepare the arguments for __printf_chk, and call then invokes the function. Afterward, the return value 0 is stored in eax, the stack pointer is reset, and the function is exited.

Figure 2: Hello World! program assembler code.
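
The size difference between the two ways of zeroing eax can be checked by comparing their standard x86-64 encodings. The byte sequences in this small Python sketch are written out by hand rather than extracted from the admin binary, and the helper name is hypothetical.

```python
# Standard x86-64 encodings of two instructions that both set eax to zero.
XOR_EAX_EAX = bytes.fromhex("31c0")         # xor eax, eax
MOV_EAX_ZERO = bytes.fromhex("b800000000")  # mov eax, 0x0 (opcode plus 32-bit immediate)


def saved_bytes(short: bytes, long: bytes) -> int:
    """Return how many bytes the shorter encoding saves."""
    return len(long) - len(short)


print("xor eax, eax :", XOR_EAX_EAX.hex(" "))
print("mov eax, 0x0 :", MOV_EAX_ZERO.hex(" "))
print("bytes saved  :", saved_bytes(XOR_EAX_EAX, MOV_EAX_ZERO))
```

In the Hex view of Binary Ninja, you can look for the same byte pairs at the addresses shown in the disassembly.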

You are currently in executable and linkable format (ELF) mode in the code display. ELF is the Linux binary format that is evaluated directly by Binary Ninja and displayed accordingly. If you open the menu where ELF is currently selected and switch to RAW mode, you can view the ELF headers themselves, which is where you will find, for example, the pointers to dynamic libraries or the string and symbol tables.
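
To get a feel for what the raw ELF header contains, you can decode its first few fields yourself. The following Python sketch reads only the identification bytes, the file type, the machine ID, and the entry point. The class and function names are illustrative and not part of Binary Ninja.

```python
import struct
from typing import NamedTuple

ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object or PIE", 4: "core"}


class ElfHeader(NamedTuple):
    bits: int
    little_endian: bool
    file_type: str
    machine: int
    entry: int


def parse_elf_header(data: bytes) -> ElfHeader:
    """Decode the leading fields of an ELF header from raw bytes."""
    if len(data) < 32 or data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic number")
    ei_class, ei_data = data[4], data[5]  # 1/2 = 32/64 bit, 1/2 = little/big endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unknown ELF class or byte order")
    order = "<" if ei_data == 1 else ">"
    # e_type and e_machine follow the 16 identification bytes.
    e_type, e_machine = struct.unpack_from(order + "HH", data, 16)
    # e_entry starts at offset 24 and is 4 or 8 bytes wide depending on the class.
    entry_format = "I" if ei_class == 1 else "Q"
    (entry,) = struct.unpack_from(order + entry_format, data, 24)
    return ElfHeader(
        bits=32 if ei_class == 1 else 64,
        little_endian=ei_data == 1,
        file_type=ELF_TYPES.get(e_type, "other"),
        machine=e_machine,
        entry=entry,
    )
```

Passing the first bytes of the admin file to parse_elf_header() should give you values that you can compare with the RAW view; on x86-64, the machine ID is 62.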

To display the flow graph instead of the linear representation of the code, open the menu on the right that currently shows Linear and select Graph. In this small example, only the main function is displayed at first. If you select another function from the symbol list on the left, such as deregister_tm_clones, a compiler-generated helper related to transactional memory, the graph becomes a little larger. If you open a real program with Binary Ninja, the flow graph helps you understand the structure and relationships of its code paths.

If you select Hex as the display format instead, the program file will be displayed in a hex editor. Besides the address on the left and the hexadecimal representation in the middle, you can see the printable characters on the right. For example, here you can find the hex values of the string Hello World!. However, changes to the program are not possible in this mode.
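
The three-column layout of such a hex view is easy to reproduce. The following Python sketch formats arbitrary bytes with the address on the left, the hex values in the middle, and the printable characters on the right; it is a generic illustration and not the code Binary Ninja uses.

```python
from typing import List


def hexdump(data: bytes, base: int = 0, width: int = 16) -> List[str]:
    """Format bytes as address, hex values, and printable ASCII columns."""
    pad = width * 3 - 1  # width of a full hex column, including separators
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Non-printable bytes are shown as dots, as in most hex editors.
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f"{base + offset:08x}  {hex_part:<{pad}}  {text}")
    return lines


for line in hexdump(b"Hello World!\x00"):
    print(line)
```

The terminating null byte of the C string shows up as 00 in the middle column and as a dot on the right.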

Besides ELF programs, Binary Ninja supports many other executable file formats. You can analyze the portable executable (PE) binaries common on Windows systems, as well as the Mach-O binaries used by Apple's operating system. You are not limited to x86 or x64 platforms and can disassemble programs compiled for architectures such as ARM, MIPS, or PowerPC. Because Binary Ninja is not a debugger or decompiler, but disassembles binary data in line with the assembler for the respective platform, interpreted languages or those with an intermediate representation of the code, such as with the Java Virtual Machine (VM) or .NET, cannot be meaningfully analyzed.
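
The container formats mentioned here can usually be told apart by the first bytes of the file. The following Python sketch checks a few well-known magic numbers. The table is deliberately incomplete, and the function name is hypothetical; Binary Ninja's own format detection is more thorough.

```python
# Well-known magic numbers at the start of executable files (not exhaustive).
MAGIC_NUMBERS = [
    (b"\x7fELF", "ELF"),
    (b"MZ", "DOS/PE"),
    (b"\xcf\xfa\xed\xfe", "Mach-O, 64-bit little-endian"),
    (b"\xce\xfa\xed\xfe", "Mach-O, 32-bit little-endian"),
    # This value is ambiguous: Java class files start with the same bytes.
    (b"\xca\xfe\xba\xbe", "Mach-O universal binary or Java class file"),
]


def guess_format(data: bytes) -> str:
    """Return a rough guess of the file format based on its leading bytes."""
    for magic, name in MAGIC_NUMBERS:
        if data.startswith(magic):
            return name
    return "unknown"
```

The shared 0xCAFEBABE value is a reminder that a magic number alone only narrows down the format; Java bytecode behind it is a different matter than native machine code.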

Extensions and Legal

If you have a valid license, you can use the Binary Ninja Python API. With its help, you can automate operations that you perform regularly or control the program (e.g., change settings or displays, move around the generated code, or launch plugins). If you are missing a function, Binary Ninja lets you add plugins. Of course, you can also develop these yourself. Interfaces to C/C++, Rust, and Python are available for this purpose. The Plugin Manager also lets you install extensions written by other users.
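
As a minimal sketch of such automation, the following Python function lists the functions Binary Ninja finds in a file. It assumes that the binaryninja module from your installation is importable, that your license allows using the API outside the GUI, and that your version provides the load() call; the wrapper function itself is hypothetical.

```python
try:
    import binaryninja  # provided by a Binary Ninja installation
except ImportError:
    binaryninja = None  # lets the sketch be imported without Binary Ninja


def list_functions(path: str):
    """Return (start address, name) pairs for the functions found in a binary."""
    if binaryninja is None:
        raise RuntimeError("the binaryninja module is not available")
    view = binaryninja.load(path)  # opens the file and runs the analysis
    if view is None:
        raise RuntimeError(f"could not open {path}")
    try:
        return [(func.start, func.name) for func in view.functions]
    finally:
        view.file.close()  # release the resources held by the view
```

Called with the path to the admin binary, the function should return main along with helpers such as deregister_tm_clones, matching the symbol list in the GUI.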

To analyze malware, experts regularly use tools such as Binary Ninja or Ghidra [2], developed by the US National Security Agency (NSA), to convert binary code into other representations. Copyright law sets narrow limits for this activity. For example, you are only allowed to convert the code if you hold the rights to it yourself or if it is necessary to ensure the interoperability of an existing program. Analyzing malware to protect your own infrastructure is presumably not critical here, but whether parts of the code may be used in the context of public reporting is something that needs to be examined on a case-by-case basis. Of course, it seems very unlikely that the malware developers, who are themselves criminals, will enforce their rights in a court of law in this case.
