The Cuckoo sandboxing malware analysis tool

Cuckoo, Cuckoo

Installation

The examples in this article are all based on an installation of Cuckoo 2.0 RC1 on Fedora 23 and KVM/libvirt. The archive with the sandboxing software is available for download [3]. After unpacking, you need to create a cuckoo user and add the account to the libvirt group. Because Cuckoo must be able to create virtual machines through the libvirt framework at any time, the Polkit rule in Listing 1 grants all members of the libvirt group access to the framework.

Listing 1

Polkit Rule for libvirt

polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.unix.manage" &&
        subject.isInGroup("libvirt")) {
        return polkit.Result.YES;
    }
});

The conf/ folder contains numerous configuration files that are worth a look, including cuckoo.conf, kvm.conf, auxiliary.conf, memory.conf, processing.conf, and reporting.conf. For a first test, the cuckoo.conf, auxiliary.conf, and kvm.conf files are the most important. If you use a virtualization solution other than KVM/libvirt, such as VMware or VirtualBox, suitable configuration files are available for them, too.

Basic settings for operating Cuckoo are defined in the cuckoo.conf file. The parameters, all of which are very well documented, define, for example, which virtualization solution to use (e.g., machinery = kvm), the host to which reports and logfiles are sent (ip), and whether a memory dump file of the virtual machine is to be created (memory_dump).

If you use KVM/libvirt on the management system, you need to customize the settings in the kvm.conf file to ensure that the name (machines), label, IP address (ip), and operating system (platform) of the virtual machine are specified correctly. Several machines can be defined at this point: Each machine has its own section, and all statements within that section apply to just this machine. If you have, say, virtual machines named fed01 and win01, then the entries in the kvm.conf file might look like Listing 2.

Listing 2

Sample Configuration

[kvm]
machines = fed01, win01
interface = virbr0
[fed01]
label = fed01
platform = linux
ip = 192.168.122.10
[win01]
label = win01
platform = windows
ip = 192.168.122.110
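A typo in kvm.conf usually only surfaces when Cuckoo tries to acquire a machine, so a quick consistency check can save time. The following Python 3 sketch is not part of Cuckoo; the function name check_kvm_conf is hypothetical. It assumes that the global keys sit under a [kvm] section header and that all guests live in libvirt's default 192.168.122.0/24 network.

```python
import configparser
import ipaddress

# Sample text modeled on Listing 2. The [kvm] section header is an
# assumption about how the global keys are grouped in the file.
SAMPLE = """
[kvm]
machines = fed01, win01
interface = virbr0

[fed01]
label = fed01
platform = linux
ip = 192.168.122.10

[win01]
label = win01
platform = windows
ip = 192.168.122.110
"""

REQUIRED_KEYS = ("label", "platform", "ip")


def check_kvm_conf(text, network="192.168.122.0/24"):
    """Return a list of human-readable problems found in a kvm.conf text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    net = ipaddress.ip_network(network)
    problems = []
    # Every name in the machines list needs a section of its own.
    names = [n.strip() for n in parser.get("kvm", "machines").split(",") if n.strip()]
    for name in names:
        if not parser.has_section(name):
            problems.append(f"{name}: no [{name}] section")
            continue
        for key in REQUIRED_KEYS:
            if not parser.has_option(name, key):
                problems.append(f"{name}: missing '{key}'")
        if parser.has_option(name, "ip"):
            if ipaddress.ip_address(parser.get(name, "ip")) not in net:
                problems.append(f"{name}: ip outside {network}")
    return problems


if __name__ == "__main__":
    print(check_kvm_conf(SAMPLE) or "kvm.conf looks consistent")
```

To check a real file, read conf/kvm.conf into a string and pass it to check_kvm_conf(); an empty list means that every machine named in machines has a complete section.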

Additional services can be integrated from the auxiliary.conf file. For example, this is where you determine whether to dump the virtual machine's network traffic. Each service is implemented as a separate module in the modules/auxiliary/ folder and can be extended if necessary. Generally, this customization approach applies to all configurations in Cuckoo and is one of the great strengths of the software.

The memory.conf file defines what type of tests to perform on the memory dump of the virtual machine. For example, you can define whether the memory should be searched for specific kernel modules. In the processing.conf file, you can specify what exactly the analysis of the malware samples should look like. Among other things, integration with the VirusTotal online service is possible, or you can specify that a Python script is generated dynamically based on the malware process dump; the script can then be downloaded for further analysis in IDA Pro.

Finally, the entries in the reporting.conf file determine what form the Cuckoo reports should take. The JSON format is a useful choice for automated processing of the results downstream. HTML reports are fine if you want an overview of the analysis results. If you team Cuckoo with MongoDB, you can use the Django-based web interface to access the results.
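To give an idea of what downstream processing of a JSON report might look like, the following Python 3 sketch reduces a report to a few fields. The sample data is invented, and the key names (info, target, signatures, severity) are assumptions that you should compare against a report produced by your own installation.

```python
import json

# Hypothetical report excerpt for illustration only; the key names are
# assumptions and are not taken from a real Cuckoo report.
SAMPLE_REPORT = json.dumps({
    "info": {"id": 6},
    "target": {"category": "file", "file": {"name": "readme.txt"}},
    "signatures": [
        {"name": "example_signature", "severity": 2},
        {"name": "another_signature", "severity": 3},
    ],
})


def summarize(report_text, min_severity=2):
    """Return task ID, target, and signature names at or above min_severity."""
    report = json.loads(report_text)
    target = report.get("target", {})
    # Use the file name if present, otherwise fall back to a URL entry.
    name = target.get("file", {}).get("name") or target.get("url")
    hits = [s.get("name") for s in report.get("signatures", [])
            if s.get("severity", 0) >= min_severity]
    return {"task": report.get("info", {}).get("id"),
            "target": name,
            "signatures": hits}


if __name__ == "__main__":
    print(summarize(SAMPLE_REPORT, min_severity=3))
```

Because every lookup goes through get(), the function returns None or an empty list for missing fields instead of failing, which is convenient as long as the exact report layout is unverified.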

Even though we have not yet created a virtual machine to analyze the malware samples, Cuckoo should launch with these settings. The call to ./cuckoo.py from the installation directory welcomes the user with a nice display of ASCII art (Figure 2). You still can't do much at this point, because the virtual machines and their snapshots, which Cuckoo uses as the basis for analyzing the malware samples you pass in, are not generated until the next step.

Figure 2: After the ASCII art welcome message, Cuckoo waits for incoming malware analysis jobs.

Generating Machine Templates

Cuckoo does not have its own procedure for generating virtual machine templates; instead, it relies on existing tools and mechanisms. For this article, I used a single virtual machine based on Fedora. You can use the graphical virt-manager tool for the installation, or virt-install in the shell. The connection to the host system running the Cuckoo Management Framework is controlled by a bridge. KVM/libvirt uses the private IP address space 192.168.122.0/24, where the address 192.168.122.1 is assigned to the host system. Hard drives should be created as LVM or QCOW2 volumes within the virtual machine; otherwise, no snapshots of the machine can be generated. A description of the complete installation is beyond the scope of this article, which is why I refer you to the existing installation instructions [6]. You also need to ensure that Python 2.7 is installed on the virtual machine, because the Cuckoo analysis software requires this version.

Assuming that the installation of the machine is successful, you should update the kvm.conf configuration file on the Cuckoo host system with the correct data for the machine. As already mentioned, this includes the IP address, the name, and the label of the virtual system. Finally, you need to copy to the machine the Cuckoo agent (agent.py), which you will find on the host system in the Cuckoo installation folder agent/.

You can start the agent on the virtual machine with the help of the shell script found in the same folder. The folder in which the agent is stored doesn't really matter. The agent implements an XMLRPC server that waits for incoming connections from the host system. The malware samples are then sent to the system through these connections. Before creating a snapshot of the machine in the next step, make sure that the agent starts automatically when the virtual machine boots. You can create such a snapshot using libvirt's own tools:

# virsh snapshot-create fed01
# virsh snapshot-list fed01
Name       Creation Time    State
-----------------------------------
1469460006 2016-07-25 [...] running

To avoid problems, only one snapshot should ever exist per virtual machine. After the snapshot is created, the virtual machine can be turned off. From now on, Cuckoo restores the system from this snapshot as soon as a sample is received and analyzes the sample within that virtual system instance.
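The one-snapshot rule is easy to check from a script. The following Python 3 sketch relies on virsh snapshot-list with the --name option, which prints one snapshot name per line; the helper names are hypothetical, and snapshot_names() needs the same libvirt access as the cuckoo user.

```python
import subprocess


def parse_snapshot_names(listing):
    """Extract snapshot names from `virsh snapshot-list <domain> --name` output."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


def snapshot_names(domain):
    """Ask virsh for the snapshot names of a domain (requires libvirt access)."""
    result = subprocess.run(
        ["virsh", "snapshot-list", domain, "--name"],
        check=True, capture_output=True, text=True,
    )
    return parse_snapshot_names(result.stdout)


def has_single_snapshot(names):
    """True only if exactly one snapshot exists for the machine."""
    return len(names) == 1
```

A call such as has_single_snapshot(snapshot_names("fed01")) then tells you whether the machine is in the state recommended here.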

At this point, note that existing virtual systems can also serve as a basis for Cuckoo. If you already have such a system and only need to change the disk type, you can easily convert such an image with the command:

# qemu-img convert -O qcow2 fed01.raw fed01.qcow2

Then you need to enter the new disk type (<driver name='qemu' type='qcow2'/>) in the XML definition of the virtual machine and state the storage location of the image file (<source file='/var/lib/libvirt/images/fed01.qcow2'/>). The easiest way to do this is to use the command:

# virsh edit fed01

Again, replace the fed01 label with the label of your own virtual machine, save the file, and start the virtual system; it should now be possible to create a snapshot.
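If you want to confirm that the edit took effect, you can inspect the domain XML programmatically. The following Python 3 sketch lists the driver type for each disk; the sample XML is a shortened, hypothetical domain definition (the target element in particular is an assumption), and disk_formats is not a libvirt function.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical domain definition for illustration.
SAMPLE_XML = """<domain type='kvm'>
  <name>fed01</name>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/fed01.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>"""


def disk_formats(domain_xml):
    """Map each disk source (image file or block device) to its driver type."""
    root = ET.fromstring(domain_xml)
    formats = {}
    for disk in root.findall("./devices/disk[@device='disk']"):
        driver = disk.find("driver")
        source = disk.find("source")
        if driver is not None and source is not None:
            # File-backed disks use the file attribute, block devices use dev.
            path = source.get("file") or source.get("dev")
            formats[path] = driver.get("type")
    return formats


if __name__ == "__main__":
    print(disk_formats(SAMPLE_XML))
```

The XML for a real machine can be obtained with virsh dumpxml fed01 and passed to disk_formats() as a string.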

Cuckoo Operation

For a first test, your best bet is to use the European Institute for Computer Antivirus Research (EICAR) test file, which is detected as a virus by most malware analysis systems [7]. To keep the file name from triggering Cuckoo, I renamed the file readme.txt for my tests.

Several options allow you to send the fake malware to Cuckoo (e.g., the Django web interface lets you upload files). Cuckoo itself offers a feature-rich API that lets you send malware samples to the management system from your own applications. The easiest approach, though, is to use submit.py from Cuckoo's utils folder. The tool has many options, but in the simplest case, calling it with the file path of the malware as a parameter will suffice:

$ utils/submit.py tests/readme.txt
Success: File "/home/tscherf/cuckoo/test/readme.txt" added as task with ID 6

For Cuckoo to accept the file, the management framework must be started up front by running python cuckoo.py from the installation folder – if you have not done this already. Immediately after posting a sample, Cuckoo outputs appropriate messages on the console (Listing 3).

Listing 3

Cuckoo Output

2016-07-25 17:37:00,192 [lib.cuckoo.core.scheduler] INFO: Starting analysis of FILE "readme.txt" (task #6, options "")
2016-07-25 17:37:00,207 [lib.cuckoo.core.scheduler] INFO: File already exists at "/home/tscherf/cuckoo/cuckoo/storage/binaries/275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f"
2016-07-25 17:37:00,335 [lib.cuckoo.core.scheduler] INFO: Task #6: acquired machine fed01 (label=fed01)
2016-07-25 17:37:00,345 [modules.auxiliary.sniffer] INFO: Started sniffer with PID 9360 (interface=virbr0, host=192.168.122.10, pcap=/home/tscherf/cuckoo/cuckoo/storage/analyses/6/dump.pcap)
tcpdump: listening on virbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
2016-07-25 17:37:02,801 [lib.cuckoo.core.guest] INFO: Starting analysis on guest (id=fed01, ip=192.168.122.10)
2016-07-25 17:39:14,711 [lib.cuckoo.core.guest] INFO: fed01: analysis completed successfully
31 packets captured
31 packets received by filter
0 packets dropped by kernel
2016-07-25 17:39:16,637 [lib.cuckoo.core.scheduler] INFO: Task #6: reports generation completed (path=/home/tscherf/cuckoo/cuckoo/storage/analyses/6)
2016-07-25 17:39:16,755 [lib.cuckoo.core.scheduler] INFO: Task #6: analysis procedure completed

On the basis of the output, you can easily follow the tool's workflow. After the readme.txt sample file has been received, a new analysis task starts, and a virtual machine is launched from the snapshot created previously. The --machine option lets you define which virtual machine Cuckoo should use, for example, if you previously created several systems because you want to use different operating systems for the analysis.

Once the system is running, the file is transferred, and a network sniffer launches to grab the network traffic off the bridge. In this example, the analysis takes about two minutes. Subsequently, Cuckoo stores the reports in the storage/analyses/<report>/ folder. For this test, I defined in the reporting.conf file that I wanted to produce an HTML report that could be viewed easily in a web browser (Figure 3).
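The figure of about two minutes can be read directly from the timestamps in the console output. The following Python 3 sketch parses two lines taken from Listing 3 and computes the time the guest spent on the analysis; the function names are hypothetical, and the regular expression assumes the log format shown in the listing.

```python
from datetime import datetime
import re

# Two lines copied from the console output in Listing 3.
LOG = """\
2016-07-25 17:37:02,801 [lib.cuckoo.core.guest] INFO: Starting analysis on guest (id=fed01, ip=192.168.122.10)
2016-07-25 17:39:14,711 [lib.cuckoo.core.guest] INFO: fed01: analysis completed successfully
"""

LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[([^\]]+)\] (\w+): (.*)$"
)


def parse(line):
    """Split one log line into (timestamp, logger, level, message), or None."""
    match = LINE.match(line)
    if not match:
        return None  # for example, tcpdump output mixed into the console
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return stamp, match.group(2), match.group(3), match.group(4)


def guest_runtime(log_text):
    """Seconds between the guest analysis start and its completion message."""
    start = end = None
    for line in log_text.splitlines():
        parsed = parse(line)
        if parsed is None:
            continue
        stamp, _logger, _level, message = parsed
        if message.startswith("Starting analysis on guest"):
            start = stamp
        elif message.endswith("analysis completed successfully"):
            end = stamp
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


if __name__ == "__main__":
    print(guest_runtime(LOG))
```

For the two lines shown, the difference is roughly 132 seconds, which matches the estimate above.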

Figure 3: An initial test with the EICAR virus is successful.

The actual investigation of malware samples is conducted in Cuckoo by the analysis packages residing in the installation directory under analyzer/modules/packages/. Cuckoo tries to discover which of these packages to use according to the file type. Alternatively, the corresponding analysis package can be stated when uploading the samples. If you are inspecting a PDF file, the call might look like:

$ utils/submit.py --package pdf --machine win01 evil.pdf

Cuckoo can also inspect entire websites for malicious code. To do so, you need to call the submit.py tool with the --url option:

$ utils/submit.py --url http://lexu.goggendorf.at/nukgfr2.html
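If you submit samples from your own scripts, it helps to assemble the submit.py call in one place and to pick the task ID out of its output. The following Python 3 sketch uses only the options shown in this article (--package, --machine, --url) and the success message format from the earlier example; the helper names are hypothetical.

```python
import re


def submit_command(target, package=None, machine=None, is_url=False):
    """Build the argument list for utils/submit.py from the options shown above."""
    argv = ["utils/submit.py"]
    if package:
        argv += ["--package", package]
    if machine:
        argv += ["--machine", machine]
    if is_url:
        argv.append("--url")
    argv.append(target)
    return argv


# Matches the success message printed by submit.py in the earlier example.
TASK_ID = re.compile(r"added as task with ID (\d+)")


def task_id_from_output(output):
    """Return the task ID from submit.py's success message, or None."""
    match = TASK_ID.search(output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(submit_command("evil.pdf", package="pdf", machine="win01"))
```

The resulting list can be handed to subprocess.run() from the Cuckoo installation directory, and the captured standard output can be passed to task_id_from_output().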

In addition to the tool for uploading malware samples, Cuckoo provides some more interesting utilities. For example, the latest modules for reporting, analyzing, and processing samples can be downloaded with the community.py tool. The stats.py tool shows statistics for the completed tasks (Listing 4).

Listing 4

Reporting Modules

$ python utils/community.py --reporting
Downloading modules from
https://github.com/cuckoosandbox/community/archive/master.tar.gz
Installing REPORTING
$ python utils/stats.py
4 samples in db
11 tasks in db
pending 0 tasks
running 0 tasks
completed 0 tasks
recovered 0 tasks
reported 8 tasks
failed_analysis 3 tasks
failed_processing 0 tasks
failed_reporting 0 tasks
roughly 0 tasks an hour
roughly 1 tasks a day

Professional users of the software will appreciate the Cuckoo API [8], which allows many jobs carried out manually to be automated and integrated with existing tools.
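As a rough idea of what such automation might look like, the following Python 3 sketch queries the status of a task over HTTP. The base address (localhost, port 8090), the /tasks/view/<id> path, and the shape of the JSON reply are assumptions here; check them against the API documentation [8] before relying on them.

```python
import json
import urllib.request

# Assumed address of the API server; adjust to your setup.
API_BASE = "http://localhost:8090"


def task_url(task_id, base=API_BASE):
    """Build the URL for querying one task (path is an assumption)."""
    return f"{base}/tasks/view/{int(task_id)}"


def task_status(payload):
    """Pull the status string out of a decoded reply, or None if absent."""
    # Assumes a reply of the form {"task": {"status": "..."}}.
    return payload.get("task", {}).get("status")


def fetch_status(task_id, base=API_BASE, timeout=10):
    """Query the API server and return the status of the given task."""
    with urllib.request.urlopen(task_url(task_id, base), timeout=timeout) as resp:
        return task_status(json.load(resp))
```

A script could call fetch_status(6) in a loop and start its own post-processing once the status matches one of the states listed by stats.py, such as reported.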

At this point, I should note that Cuckoo can also run in a cluster, which is very useful if you need to run a large number of tasks simultaneously. Because Cuckoo starts a virtual machine for each analysis, running Cuckoo on a single machine no longer scales beyond a certain number of parallel tasks; at this point, a cluster setup becomes worthwhile. The software provides a REST API for this purpose that can be used to submit samples. The distributed/app.py tool then forwards the incoming samples to one of the previously registered Cuckoo nodes. For more information about this topic, check out the quite detailed Cuckoo documentation [8].
