Detecting malware with Yara
Yara is a useful open source tool for finding, and acting on, text strings or binary patterns within a file. The project website [1] calls Yara "the pattern matching swiss knife for malware researchers (and everyone else)."
You can download Yara onto your Linux system using RPM, apt-get, or any other package manager. Windows users can download the executable from the Yara main web page. Source code is also available.
Yara, which received some attention for its role in finding and defeating a Trojan called BlackEnergy, may have had its 15 minutes of fame around 2013 or 2015. But malware attacks have been on the rise. Plus, a lot has been written over the past couple of years about "threat hunting," the practice in which a security professional proactively hunts for probable threats on the network. Threat hunting requires more than just reviewing logfiles or waiting for signature-based Intrusion Detection System (IDS) tools to send alerts. A threat hunter looks deeply into systems and system files. Yara is an important tool for this kind of proactive malware detection.
I've also seen security professionals use Yara during an actual attack. Once they've determined that a system has been compromised, they'll use Yara to quickly determine if the attack has spread to other systems.
How Does Yara Work?
Yara uses rule files, written in Yara's own C-like rule syntax, to look for patterns in a file. The basic structure of a rule is as follows:
rule NameOfRule
{
    strings:
        $test_string1 = "James"
        // hex string; ?? is a wildcard that matches any single byte
        $test_string2 = { 8C 9C B5 ?? }
    condition:
        $test_string1 or $test_string2
}
In the preceding code, you start by naming the rule. You can use any name you wish, as long as it is a valid identifier (letters, digits, and underscores, not starting with a digit). After the name, supply an opening brace to start the rule body. You can then list the strings you wish to find within the file. The $test_string1 definition tells Yara to look for the actual text string James within the file. The $test_string2 definition tells Yara to look for a sequence of bytes, written in hexadecimal, rather than a text string. The condition: section tells Yara what to match. In this case, Yara looks for either string.
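To make the matching logic concrete, here is a minimal sketch in plain Python of how a rule with one text string, one hex string, and an "or" condition could be evaluated against a file's bytes. It is an illustration only: it is not Yara, it does not parse rule files, and the function names and the three-byte hex pattern are assumptions chosen for the example.

```python
# Illustration only: mimics how a two-string rule with an "or" condition
# evaluates. This is not Yara and it does not read Yara rule files.


def find_all(data: bytes, pattern: bytes) -> list[int]:
    """Return every offset at which pattern occurs in data."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


def evaluate_rule(data: bytes) -> dict[str, list[int]]:
    """Return the matching offsets per string; an empty dict means no match."""
    strings = {
        "$test_string1": b"James",                   # text string
        "$test_string2": bytes.fromhex("8C 9C B5"),  # hypothetical hex bytes
    }
    hits = {name: find_all(data, pat) for name, pat in strings.items()}
    # condition: $test_string1 or $test_string2
    # Keeping only strings that were found means a non-empty result
    # is equivalent to the "or" condition being true.
    return {name: offs for name, offs in hits.items() if offs}


if __name__ == "__main__":
    print(evaluate_rule(b"hello James"))
```

A non-empty result corresponds to the rule firing; the offsets show where each string was found.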
Once you've defined the patterns, Yara can go out and look for problems.
A Very Simple Example
Figure 1 is a very simple example that tells Yara to search for the word Stanger. I've named the rule StangerWorld. If Yara finds a match, it prints the rule name, StangerWorld, along with the name of the matching file. The next section defines the strings to look for. In this case, I look for the text string Stanger.
The condition section tells Yara when the rule counts as a match. In this case, the rule tells Yara to report a match whenever it finds the string.
To test if Yara is working, I create three files named badfile1.txt, badfile2.txt, and badfile3.txt. I put random words in each of the files, but I put the word Stanger inside only one of them.
I then tell Yara to read my ambitious little rule file and look inside every file within the current directory:
yara -s yararule1.yar .
In Figure 2, you see that I've run the preceding command. Notice that Yara reported a match in only one file: badfile1.txt. Because of the -s option, which prints the matching strings, the report also tells you that it found a match for variable $a, the word Stanger. This simple command demonstrates how to create a rule and run it against a file or set of files. More sophisticated examples will show you how Yara is used today by security professionals.
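The test described above can be sketched in plain Python to show what a directory scan of this kind does. This is a rough imitation, not Yara itself: the file contents are made up, the scan is non-recursive, and the report layout (rule name and file, then offset, string identifier, and matched text) is only an approximation of what a scanner might print.

```python
import tempfile
from pathlib import Path

RULE_NAME = "StangerWorld"
PATTERN = b"Stanger"  # plays the role of the string $a


def scan_directory(directory: Path) -> list[str]:
    """Return report lines for each regular file that contains PATTERN."""
    report = []
    for path in sorted(directory.iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        offset = data.find(PATTERN)
        if offset == -1:
            continue  # no match in this file, so nothing is reported
        report.append(f"{RULE_NAME} {path.name}")
        while offset != -1:
            report.append(f"0x{offset:x}:$a: {PATTERN.decode()}")
            offset = data.find(PATTERN, offset + 1)
    return report


def demo() -> list[str]:
    """Create three test files (hypothetical contents) and scan them."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "badfile1.txt").write_text("apple Stanger pear\n")
        (base / "badfile2.txt").write_text("random words only\n")
        (base / "badfile3.txt").write_text("more random words\n")
        return scan_directory(base)


if __name__ == "__main__":
    print("\n".join(demo()))
```

Only the file that contains the word produces report lines; the other two files are skipped silently, which matches the behavior described in the text.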
What to Look For
Yara can search for patterns inside of any file, either as text or in binary form. Suppose you suspect that someone has distributed text files that contain a suspect URL. You can configure Yara to automatically search for files that have that URL embedded within them.
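One practical wrinkle with URL hunting is that the same URL can be stored as plain ASCII or as two-byte-per-character text, which is common in Windows binaries. Yara handles this with the ascii and wide string modifiers. The following plain-Python sketch shows the idea behind searching for both encodings; the URL and the function names are hypothetical, and the code is an illustration rather than a substitute for a real Yara rule.

```python
# Hypothetical indicator; replace with the URL you are actually hunting for.
SUSPECT_URL = "http://malicious.example/payload"


def url_encodings(url: str) -> dict[str, bytes]:
    """Return the byte sequences to search for, keyed by encoding label."""
    return {
        "ascii": url.encode("ascii"),     # one byte per character
        "wide": url.encode("utf-16-le"),  # each character followed by 0x00
    }


def find_url(data: bytes, url: str = SUSPECT_URL) -> list[tuple[str, int]]:
    """Return (encoding, first offset) for each encoding found in data."""
    hits = []
    for label, needle in url_encodings(url).items():
        offset = data.find(needle)
        if offset != -1:
            hits.append((label, offset))
    return hits


if __name__ == "__main__":
    sample = b"see http://malicious.example/payload now"
    print(find_url(sample))
```

Searching for both forms avoids missing a URL simply because the file stores its strings in a different encoding.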
Or, you can search for a binary file that has a hard-coded instruction in it. A friend of mine once told me how she detected an attack using Yara. She was asked to take a look at a few Industrial Control System (ICS) implementations at a power plant. She noticed a couple of things about the Supervisory Control and Data Acquisition (SCADA) console applications there. One of these console applications kept making Domain Name System (DNS) queries.
She found this a bit odd, because most SCADA systems don't use Internet-based DNS or Internet-based time systems. To help determine if the system had been compromised, she ran Yara with a rule file that contained the following:
rule DNS
{
    strings:
        $test_string1 = " (#cmd='whoami')"
        $test_string2 = " (#cmd='nslookup')"
        // hex string; each ?? is a wildcard for one unspecified byte
        $test_string3 = { 9D ?? ?? ?? }
    condition:
        $test_string1 or $test_string2 or $test_string3
}
By searching for embedded commands, such as whoami and nslookup (a DNS lookup tool), the ruleset was able to find that one of the SCADA system's files had been compromised.
Yara can also search for installed code (e.g., Apache Struts, a Linux kernel, or a Windows DLL) to identify suspicious code running inside of your (supposedly) secure daemons, services, and applications.