How to Write YARA Binary Pattern Matching Rules to Enhance Threat Hunting and Cybersecurity Ops

Level Up Your Threat Hunting: A Guide to Writing YARA Binary Rules

Welcome back to our corner of the web where we talk all things threat hunting and cybersecurity. Last week, we took a deep dive into the fundamentals of YARA and how to craft basic string-based rules. If you’re new to YARA or need a refresher on string matching and conditionals, we recommend checking out that post first.

This week, we’re building on that foundation and venturing into the powerful world of YARA binary rules. Get ready to roll up your sleeves and learn how to hunt for threats at the byte level.

Beyond Plain Text: What Are YARA Binary Rules?

While string rules are fantastic for finding readable text within files, malware often hides its most telling indicators in non-textual data. This is where binary rules come into play. They allow you to search for specific hexadecimal (hex) values or byte patterns, giving you a much more granular way to identify malicious files.

Think of it like this: a string rule might look for the filename “evil.exe,” but a binary rule can look for the specific sequence of bytes that defines a file as an executable, regardless of its name.

Getting Started: Matching Exact Hex Values

The simplest form of a binary rule is one that looks for a precise sequence of hex bytes. In YARA, you define these hex patterns within curly braces {}.

Let’s say we’re looking for the hex value DEADBEEF, a common placeholder used by programmers. Here’s how you would write a YARA rule to find it:

rule Find_DEADBEEF
{
    strings:
        $A = { DE AD BE EF }

    condition:
        $A
}

In this example, the $A string is defined as the hex sequence DE AD BE EF. The condition section simply states that for a file to be a match, it must contain the $A string. It’s that straightforward.

Embracing Uncertainty: Wildcards and Alternations

Threat actors are constantly changing their malware to evade detection. A hardcoded value like DEADBEEF might be slightly different in a new variant. Fortunately, YARA provides ways to handle this uncertainty.

Alternations: The “Either/Or” Scenario

Imagine you’re looking for a pattern that could be either DEADBBBEEF or DEADDDBEEF. Instead of writing two separate rules, you can use parentheses () with a pipe | between the options to specify alternative byte sequences.

rule Find_DEAD_or_DEADD
{
    strings:
        $A = { DE AD ( BB | DD ) BE EF }

    condition:
        $A
}

Now, this rule will flag a file if it contains either DEADBBBEEF or DEADDDBEEF. This flexibility is crucial when dealing with evolving malware.
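Alternatives do not have to be the same length, and they can contain wildcards. The rule below is a minimal sketch of that idea; the rule name and the byte values are made up for illustration, not taken from a real sample.

```yara
rule Find_DEAD_Alternatives_BEEF
{
    strings:
        // Three alternatives between DE AD and BE EF:
        //   BB       one fixed byte
        //   DD EE    two fixed bytes
        //   CC ??    CC followed by any single byte
        $A = { DE AD ( BB | DD EE | CC ?? ) BE EF }

    condition:
        $A
}
```

This would match DEADBBBEEF, DEADDDEEBEEF, or DEADCC followed by any one byte and then BEEF.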

Wildcards: When You Don’t Know the Value

What if you know a byte will be there, but you have no idea what its value will be? For this, you can use the wildcard character, which is a question mark ?. Each ? stands for one unknown nibble (half a byte), so ?? represents a single unknown byte.

 
rule Find_DEAD_Anything_BEEF
{
    strings:
        $A = { DE AD ?? BE EF }

    condition:
        $A
}

This rule will match DEAD00BEEF, DEADC0BEEF, or any other variation where exactly one byte separates DEAD and BEEF. It will not match a plain DEADBEEF, because ?? requires a byte to be present.
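Wildcards also work on half a byte when you know one nibble but not the other. Here is a small sketch, with an illustrative rule name and made-up byte values:

```yara
rule Find_DEAD_Nibble_BEEF
{
    strings:
        // ?0 : high nibble unknown, low nibble must be 0 (00, 10, ... F0)
        $A = { DE AD ?0 BE EF }
        // C? : high nibble must be C, low nibble unknown (C0, C1, ... CF)
        $B = { DE AD C? BE EF }

    condition:
        $A or $B
}
```

DEADC0BEEF satisfies both strings, DEAD10BEEF satisfies only $A, and DEADCFBEEF satisfies only $B.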

The Power of Jumps: Searching Across Unknown Distances

Sometimes, you’re not just dealing with a few unknown bytes, but a variable-length chunk of data between two known patterns. This is where jumps come in handy. Jumps allow you to specify a range of bytes to skip over.

Let’s say we want to find DE AD and BE EF, but there’s an unknown amount of data in between.

 
rule Find_DEAD_Jump_BEEF
{
    strings:
        $A = { DE AD [4-16] BE EF }

    condition:
        $A
}

This rule will look for DE AD, then “jump” over anywhere from 4 to 16 bytes, and then look for BE EF.

Modern versions of YARA (2.0 and later) even support unbounded jumps. For example, [10-] means a jump of at least 10 bytes with no upper limit. A simple [-] signifies a jump of zero or more bytes with no upper limit. This can be incredibly powerful, but use it with caution: wide or unbounded jumps can be resource-intensive to scan for.
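As a rough sketch, the two unbounded forms described above would look like this in a rule. The rule name is hypothetical, and in practice you would likely pair patterns like these with other conditions to keep scanning cost and false positives down.

```yara
rule Find_DEAD_Unbounded_BEEF
{
    strings:
        // At least 10 bytes, with no upper limit, between DE AD and BE EF
        $A = { DE AD [10-] BE EF }
        // Any number of bytes, including zero, between the two markers
        $B = { DE AD [-] BE EF }

    condition:
        $A or $B
}
```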

Pinpointing Threats with Offset Matching

In many file formats, specific values are expected at precise locations, or offsets. For example, the first two bytes of a Windows executable file are always MZ (or 0x4D 0x5A in hex). YARA allows you to check for values at specific offsets using the uint functions in the condition section.

 
rule Is_Windows_Executable
{
    condition:
        uint16(0) == 0x5a4d
}

Let’s break this down:

uint16(0) tells YARA to read an unsigned 16-bit integer (2 bytes) starting at offset 0 (the very beginning of the file).
== 0x5a4d compares that value to the hex representation of “MZ”. Notice the byte order is reversed (5A 4D instead of 4D 5A). This is because uint16 reads the two bytes as a little-endian integer, so the bytes 4D 5A on disk become the value 0x5A4D.

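You can also use uint8 for a single byte and uint32 for a 4-byte integer. These functions read little-endian values; the variants ending in be (such as uint16be and uint32be) read big-endian values, which helps when matching network-order data or file formats that store integers big-endian.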
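To make the wider reads and the byte-order variants concrete, here is a short sketch with two rules. The first extends the MZ check by following the pointer stored at offset 0x3C to the PE signature. The second uses uint32be to test for CA FE BA BE, the magic number used by Java class files. The rule names are illustrative.

```yara
rule Is_PE_File
{
    condition:
        // "MZ" at offset 0, read as a little-endian 16-bit value
        uint16(0) == 0x5A4D and
        // Offset 0x3C holds the file offset of the PE header;
        // the bytes 50 45 00 00 ("PE\0\0") read little-endian are 0x00004550
        uint32(uint32(0x3C)) == 0x00004550
}

rule Starts_With_CAFEBABE
{
    condition:
        // Big-endian read: the constant keeps the on-disk order CA FE BA BE
        uint32be(0) == 0xCAFEBABE
}
```

With the big-endian variant, the constant in the rule reads the same way the bytes appear in a hex editor, which can make rules easier to review.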

Putting It All Together

By combining these elements, you can create highly effective and resilient YARA rules to uncover even the most elusive threats. Start with the knowns, build in flexibility with wildcards and jumps, and use offset matching to target specific file structures.
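Here is one way those pieces might fit together in a single rule. Treat it as a sketch: the rule name and the byte pattern are invented for illustration and do not describe any real malware family.

```yara
rule Combined_Binary_Example
{
    strings:
        // Known marker, one unknown byte, known marker,
        // then a gap of 0 to 32 bytes before one of two possible trailers
        $A = { DE AD ?? BE EF [0-32] ( CA FE | BA BE ) }

    condition:
        // Only consider files that start with the MZ header
        uint16(0) == 0x5A4D and $A
}
```

Putting the cheap offset check first in the condition documents the intent clearly: the byte pattern only matters in files that look like Windows executables.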

We hope this gives you a solid starting point for writing your own YARA binary rules. The more you practice, the more you’ll see how these building blocks can be combined to create powerful detection logic.

Thanks for tuning in this week. We’ll see you next time with more cybersecurity tips and tricks!
