The Cuckoo sandboxing malware analysis tool
Cuckoo, Cuckoo
Installation
The examples in this article are all based on an installation of Cuckoo 2.0 RC1 on Fedora 23 and KVM/libvirt. The archive with the sandboxing software is available for download [3]. After unpacking, you need to create a cuckoo user and add the account to the libvirt group. Because Cuckoo must at all times be in a position to create a virtual machine using the libvirt framework, the Polkit rule from Listing 1 ensures that access to the framework is possible for all members of the libvirt group.
Listing 1
Polkit Rule for libvirt
polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.unix.manage" &&
        subject.isInGroup("libvirt")) {
        return polkit.Result.YES;
    }
});
In the conf/ folder, you will want to check out the configuration files, of which there are many, including cuckoo.conf, kvm.conf, auxiliary.conf, memory.conf, processing.conf, and reporting.conf. For a first test, the cuckoo.conf, auxiliary.conf, and kvm.conf files are the most important. If you use a virtualization solution other than KVM/libvirt, such as VMware or VirtualBox, suitable configuration files are available for them, too.
Basic settings for operating Cuckoo are defined in the cuckoo.conf file. The parameters, all of which are very well documented, define, for example, which virtualization solution to use (e.g., machinery = kvm), the host to which reports and logfiles are sent (ip), and whether a memory dump file of the virtual machine is to be created (memory_dump).
If you use KVM/libvirt on the management system, you need to customize the settings in the kvm.conf file to ensure that the name (machines), label, IP address (ip), and operating system (platform) of the virtual machine are specified correctly. Several machines can be defined at this point; each machine gets its own section, and the statements in that section apply to that machine only. If you have, say, virtual machines named fed01 and win01, then the entries in the kvm.conf file might look like Listing 2.
Listing 2
Sample Configuration
machines = fed01, win01
interface = virbr0

[fed01]
label = fed01
platform = linux
ip = 192.168.122.10

[win01]
label = win01
platform = windows
ip = 192.168.122.110
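A quick consistency check of the machine definitions can catch typos before Cuckoo starts. The following standalone Python 3 sketch parses a configuration like Listing 2 and reports machines that lack a section or a required key. The [kvm] section header in the sample and the helper name check_machines are assumptions made for illustration; they are not taken from the listing above.

```python
import configparser

# Sample mirroring Listing 2. The [kvm] section header is an assumption
# about how the global options are grouped in the real file.
SAMPLE = """
[kvm]
machines = fed01, win01
interface = virbr0

[fed01]
label = fed01
platform = linux
ip = 192.168.122.10

[win01]
label = win01
platform = windows
ip = 192.168.122.110
"""

REQUIRED_KEYS = ("label", "platform", "ip")


def check_machines(text, main_section="kvm"):
    """Return a list of problems found in the machine definitions."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    problems = []
    raw_names = parser.get(main_section, "machines").split(",")
    for name in (n.strip() for n in raw_names if n.strip()):
        if not parser.has_section(name):
            problems.append("no section for machine %s" % name)
            continue
        for key in REQUIRED_KEYS:
            if not parser.has_option(name, key):
                problems.append("machine %s lacks %s" % (name, key))
    return problems


# An empty list means every listed machine has a complete section.
print(check_machines(SAMPLE))
```

To check a real file, the text could be read from conf/kvm.conf and passed to the same function.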
Additional services can be integrated from the auxiliary.conf file. For example, this is where you determine whether to dump the virtual machine's network traffic. The services are implemented via separate modules in the modules/auxiliary/ folder and can be extended if necessary. Generally, this customization approach applies to all configurations in Cuckoo and is one of the great strengths of the software.
The memory.conf file defines what type of tests to perform on the memory dump of the virtual machine. For example, you can define whether the memory should be searched for specific kernel modules. In the processing.conf file, you specify what exactly the analysis of the malware samples should look like. Among other things, integration with the VirusTotal online service is possible, or you can specify that a Python script is generated dynamically based on the malware process dump; the script can then be downloaded for further analysis in IDA Pro. Finally, the entries in the reporting.conf file determine what form the Cuckoo reports should take. The JSON format is a useful choice for automated processing of the results downstream. HTML reports are fine if you want an overview of the analysis results. If you team Cuckoo with MongoDB, you can use the Django-based web interface to access the results.
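To illustrate what downstream processing of a JSON report might look like, the following Python 3 sketch condenses a report into a few fields. The sample report is hypothetical and heavily trimmed, and the key names (info, target, signatures) are assumptions about the report layout; check them against a report produced by your own installation before relying on them.

```python
import json

# Hypothetical, heavily trimmed report. The key names are assumptions
# and the values are placeholders, not real analysis results.
SAMPLE_REPORT = json.dumps({
    "info": {"id": 6, "score": 0.4},
    "target": {"category": "file", "file": {"name": "readme.txt"}},
    "signatures": [{"name": "example_signature", "severity": 2}],
})


def summarize(report_text):
    """Condense a JSON report into a small dictionary.

    Missing keys yield None (or an empty list) instead of an exception,
    because the exact layout is an assumption here.
    """
    report = json.loads(report_text)
    info = report.get("info", {})
    target_file = report.get("target", {}).get("file", {})
    return {
        "task_id": info.get("id"),
        "score": info.get("score"),
        "file": target_file.get("name"),
        "signatures": [s.get("name") for s in report.get("signatures", [])],
    }


print(summarize(SAMPLE_REPORT))
```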
Even though we have not yet created a virtual machine to analyze the malware samples, Cuckoo should launch with these settings. The call to ./cuckoo.py from the installation directory welcomes the user with a nice display of ASCII art (Figure 2). You still can't do much at this point, because the virtual machines and snapshots that Cuckoo uses as the basis for analyzing the malware samples have yet to be generated; that is the next step.
Generating Machine Templates
Cuckoo does not have its own procedure for generating virtual machine templates; instead, it relies on existing tools and mechanisms. For this article, I used a single virtual machine based on Fedora. You can use the graphical virt-manager tool for the installation, or virt-install in the shell. The connection to the host system running the Cuckoo Management Framework is controlled by a bridge. KVM/libvirt uses the private IP address space 192.168.122.0/24, where the address 192.168.122.1 is assigned to the host system. Hard drives should be created as LVM or QCOW2 volumes within the virtual machine; otherwise, no snapshots of the machine can be generated. A description of the complete installation is beyond the scope of this article, which is why I refer you to the existing installation instructions [6]. You also need to ensure that Python 2.7 is installed on the virtual machine, because the Cuckoo analysis software requires this version.
Assuming that the installation of the machine is successful, you should update the kvm.conf configuration file on the Cuckoo host system with the correct data for the machine. As already mentioned, this includes the IP address, the name, and the label of the virtual system. Finally, you need to copy the Cuckoo agent (agent.py) to the machine; you will find it on the host system in the agent/ folder of the Cuckoo installation.
You can start the agent on the virtual machine with the help of the shell script also found in the same folder. The folder in which the agent is stored doesn't really matter. The agent implements an XMLRPC server that waits for incoming connections from the host system. The malware samples are then sent to the system through these connections. Ensure that the agent starts automatically after rebooting the virtual machine before creating a snapshot of the machine in the next step. You can create such a snapshot using libvirt's own tools:
# virsh snapshot-create fed01
# virsh snapshot-list fed01
 Name         Creation Time      State
-----------------------------------
 1469460006   2016-07-25 [...]   running
To avoid problems, only one snapshot should ever exist per virtual machine. After the snapshot is created, the virtual machine can be turned off. Cuckoo will now access the system's snapshot as soon as a sample is received and analyzed within a virtual system instance.
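The one-snapshot rule is easy to verify from a script. The following Python 3 sketch counts the snapshots of a domain by calling virsh snapshot-list with the --name option, which prints one snapshot name per line. The helper names are hypothetical, and the virsh call itself is only made when list_snapshots() is invoked on a host with libvirt installed.

```python
import subprocess


def snapshot_names(listing):
    """Extract snapshot names from `virsh snapshot-list --name` output."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


def list_snapshots(domain):
    """Ask virsh for the snapshot names of a domain (requires libvirt)."""
    result = subprocess.run(
        ["virsh", "snapshot-list", domain, "--name"],
        check=True, capture_output=True, text=True,
    )
    return snapshot_names(result.stdout)


def has_single_snapshot(names):
    """True if exactly one snapshot exists."""
    return len(names) == 1


# Offline demonstration with the snapshot name from the output above.
print(has_single_snapshot(snapshot_names("1469460006\n\n")))
```

On the Cuckoo host, has_single_snapshot(list_snapshots("fed01")) would perform the check against the live domain.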
At this point, note that existing virtual systems can also serve as a basis for Cuckoo. If you already have such a system and only need to change the disk type, you can easily convert the image with the command:
# qemu-img convert -O qcow2 fed01.raw fed01.qcow2
Then you need to enter the new disk type (<driver name='qemu' type='qcow2'/>) in the XML definition of the virtual machine and state the storage location of the image file (<source file='/var/lib/libvirt/images/fed01.qcow2'/>). The easiest way to do this is to use the command:
# virsh edit fed01
Again, replace the fed01 label with the label of your own virtual machine, save the file, and start the virtual system; it should now be possible to create a snapshot.
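If you prefer to script the change rather than edit the definition by hand, the two XML elements can be rewritten programmatically. The following Python 3 sketch is one possible approach, not the procedure used in this article: the sample domain definition is a simplified, hypothetical one, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical domain definition with a raw disk image.
SAMPLE_XML = """
<domain type='kvm'>
  <name>fed01</name>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/fed01.raw'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def switch_disk_to_qcow2(domain_xml, image_path):
    """Return the domain XML with the first disk set to a QCOW2 image."""
    root = ET.fromstring(domain_xml.strip())
    disk = root.find("./devices/disk[@device='disk']")
    if disk is None:
        raise ValueError("no disk device in domain definition")
    # Set the driver type and point the source at the converted image.
    disk.find("driver").set("type", "qcow2")
    disk.find("source").set("file", image_path)
    return ET.tostring(root, encoding="unicode")


print(switch_disk_to_qcow2(SAMPLE_XML, "/var/lib/libvirt/images/fed01.qcow2"))
```

The input could come from virsh dumpxml, and the result could be loaded again with virsh define; for a one-off change, virsh edit remains the simpler route.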
Cuckoo Operation
For a first test, your best bet is to use the European Institute for Computer Antivirus Research (EICAR) test file, which is detected as a virus by most malware analysis systems [7]. To keep the file name from triggering Cuckoo, I renamed the file readme.txt for my tests.
Several options allow you to send the fake malware to Cuckoo (e.g., the Django web interface lets you upload files). Cuckoo itself offers a feature-rich API that lets you send malware samples to the management system from your own applications. The easiest approach, though, is to use submit.py from Cuckoo's utils folder. The tool has many options, but in the simplest case, calling it with the file path of the malware as a parameter will suffice:
$ utils/submit.py tests/readme.txt
Success: File "/home/tscherf/cuckoo/test/readme.txt" added as task with ID 6
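When submit.py is called from your own scripts, the task ID in the success message is the piece of information you usually need afterward. The following Python 3 sketch extracts it from the output shown above. The wrapper function submit() and its default script path are assumptions for illustration; only the parsing is exercised here, so no Cuckoo installation is required to try it.

```python
import re
import subprocess

# Matches the success message printed by submit.py in the example above.
TASK_ID_RE = re.compile(r"added as task with ID (\d+)")


def parse_task_id(output):
    """Return the task ID from submit.py output, or None if absent."""
    match = TASK_ID_RE.search(output)
    return int(match.group(1)) if match else None


def submit(sample_path, submit_script="utils/submit.py"):
    """Hypothetical wrapper: run submit.py and return the new task ID."""
    result = subprocess.run(
        ["python", submit_script, sample_path],
        capture_output=True, text=True,
    )
    return parse_task_id(result.stdout)


message = ('Success: File "/home/tscherf/cuckoo/test/readme.txt" '
           "added as task with ID 6")
print(parse_task_id(message))
```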
For Cuckoo to accept the file, the management framework must be started up front by running python cuckoo.py from the installation folder, if you have not done this already. Immediately after a sample is submitted, Cuckoo outputs appropriate messages on the console (Listing 3).
Listing 3
Cuckoo Output
2016-07-25 17:37:00,192 [lib.cuckoo.core.scheduler] INFO: Starting analysis of FILE "readme.txt" (task #6, options "")
2016-07-25 17:37:00,207 [lib.cuckoo.core.scheduler] INFO: File already exists at "/home/tscherf/cuckoo/cuckoo/storage/binaries/275a021bbfb6489e54d471899f7db9d1663fc695ec2fe2a2c4538aabf651fd0f"
2016-07-25 17:37:00,335 [lib.cuckoo.core.scheduler] INFO: Task #6: acquired machine fed01 (label=fed01)
2016-07-25 17:37:00,345 [modules.auxiliary.sniffer] INFO: Started sniffer with PID 9360 (interface=virbr0, host=192.168.122.10, pcap=/home/tscherf/cuckoo/cuckoo/storage/analyses/6/dump.pcap)
tcpdump: listening on virbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
2016-07-25 17:37:02,801 [lib.cuckoo.core.guest] INFO: Starting analysis on guest (id=fed01, ip=192.168.122.10)
2016-07-25 17:39:14,711 [lib.cuckoo.core.guest] INFO: fed01: analysis completed successfully
31 packets captured
31 packets received by filter
0 packets dropped by kernel
2016-07-25 17:39:16,637 [lib.cuckoo.core.scheduler] INFO: Task #6: reports generation completed (path=/home/tscherf/cuckoo/cuckoo/storage/analyses/6)
2016-07-25 17:39:16,755 [lib.cuckoo.core.scheduler] INFO: Task #6: analysis procedure completed
On the basis of the output, you can easily follow the tool's workflow. After the readme.txt sample file has been received, a new analysis task is started and a virtual machine is restored from the snapshot created previously. The --machine option lets you define which virtual machine Cuckoo should use, for example, if you previously created several systems because you want to use different operating systems for the analysis.
Once the system is running, a network sniffer launches to grab the network traffic off the bridge, and the file is transferred to the guest. In this example, the analysis takes about two minutes. Subsequently, Cuckoo publishes the reports in the storage/analyses/<report>/ folder. For this test, I defined in the reporting.conf file that I wanted to produce an HTML report that could be viewed easily in a web browser (Figure 3).
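The duration of the guest analysis can be read directly from the timestamps in the console output. The following Python 3 sketch computes it from the two guest messages in Listing 3; the function names are hypothetical, and the approach assumes that every relevant line starts with a timestamp in the format shown in the listing.

```python
from datetime import datetime

# The two guest messages from Listing 3.
LOG = """\
2016-07-25 17:37:02,801 [lib.cuckoo.core.guest] INFO: Starting analysis on guest (id=fed01, ip=192.168.122.10)
2016-07-25 17:39:14,711 [lib.cuckoo.core.guest] INFO: fed01: analysis completed successfully
"""


def timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def guest_analysis_seconds(log_text):
    """Seconds between analysis start and completion, or None."""
    start = end = None
    for line in log_text.splitlines():
        if "Starting analysis on guest" in line:
            start = timestamp(line)
        elif "analysis completed successfully" in line:
            end = timestamp(line)
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


# Roughly 132 seconds for the run in Listing 3.
print(guest_analysis_seconds(LOG))
```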
The actual investigation of malware samples is conducted in Cuckoo by the analysis packages residing in the installation directory under analyzer/modules/packages/. Cuckoo tries to discover which of these packages to use according to the file type. Alternatively, the corresponding analysis package can be stated when uploading the samples. If you are inspecting a PDF file, the call might look like:
$ utils/submit.py --package pdf --machine win01 evil.pdf
Cuckoo can also inspect entire websites for malicious code. To do so, you need to call the submit.py tool with the --url option:
$ utils/submit.py --url http://lexu.goggendorf.at/nukgfr2.html
In addition to the tool for uploading malware samples, Cuckoo provides some more interesting utilities. For example, the latest modules for reporting, analyzing, and processing samples can be downloaded with the community.py tool. The stats.py tool shows statistics for the completed tasks (Listing 4).
Listing 4
Reporting Modules
$ python utils/community.py --reporting
Downloading modules from https://github.com/cuckoosandbox/community/archive/master.tar.gz
Installing REPORTING
$ python utils/stats.py
4 samples in db
11 tasks in db
pending 0 tasks
running 0 tasks
completed 0 tasks
recovered 0 tasks
reported 8 tasks
failed_analysis 3 tasks
failed_processing 0 tasks
failed_reporting 0 tasks
roughly 0 tasks an hour
roughly 1 tasks a day
Professional users of the software will appreciate the Cuckoo API [8], which allows many jobs carried out manually to be automated and integrated with existing tools.
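As a rough idea of what such automation could look like, the following Python 3 sketch builds a request that asks the API to analyze a URL. The base address http://localhost:8090, the /tasks/create/url endpoint, the url form field, and the task_id key in the response are assumptions based on my reading of the Cuckoo documentation [8]; verify them for your version. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_url_task_request(target_url, api_base="http://localhost:8090"):
    """Build a POST request asking the API to analyze a URL.

    The endpoint path and the 'url' form field are assumptions.
    """
    data = urllib.parse.urlencode({"url": target_url}).encode("ascii")
    return urllib.request.Request(
        api_base + "/tasks/create/url", data=data, method="POST"
    )


def submit_url(target_url, api_base="http://localhost:8090"):
    """Send the request and return the task ID from the JSON reply.

    Needs a running API server; the 'task_id' key is an assumption.
    """
    request = build_url_task_request(target_url, api_base)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8")).get("task_id")


# Building the request does not touch the network.
request = build_url_task_request("http://example.com/")
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the sketch testable without a running Cuckoo instance.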
At this point, I should note that Cuckoo can also run in a cluster, which is very useful if you need to run a large number of tasks simultaneously. Because Cuckoo starts a virtual machine for each analysis, a single machine stops scaling beyond a certain number of tasks running in parallel; at this point, a cluster setup becomes worthwhile. The software provides a REST API for this purpose that can be used to submit samples. The distributed/app.py tool then forwards the incoming samples to one of the previously registered Cuckoo nodes. For more information about this topic, check out the quite detailed Cuckoo documentation [8].