Patch Diffing

A patch diff compares a vulnerable version of a binary with a patched one. The intent is to highlight the changes, helping to discover new, missing, and interesting functionality across various versions of a binary.

Photo by jareeign on Unsplash

About

Binary code similarity approaches compare two or more pieces of binary code to identify their similarities and differences. The ability to compare binary code enables many real-world applications on scenarios where source code may not be available such as patch analysis, bug search, and malware detection and analysis. A Survey of Binary Code Similarity - IrfanUlHaq2019

Binary diffing refers to the comparison of two binaries. Whether the binaries differ by architecture, OS version, or update level, it can be useful to analyze the differences to determine:

specific compiler changes
abnormal or unexpected behavior
new features
details of a patched vulnerability

Patch diffing (a specific form of binary diffing) is a technique to identify changes across versions of binaries as related to security patches. A patch diff compares a vulnerable version of a binary with a patched one. The intent is to highlight the changes, helping to discover new, missing, and interesting functionality across versions of a binary.

Discovery of new, missing, and interesting functions:


flowchart LR;
linkStyle default interpolate basis

funcA1234<--Match 85%-->funcB2348
funcA1235<--Match 100%-->funcB2347
funcA1236<--Match 100%-->funcB2345
funcA1237<--Match 100%-->funcB2349

subgraph Binary v2 - patched
	funcB2345	
	subgraph NewFunc
		funcB2346
	end
	funcB2347
	subgraph Interesting
		funcB2348		
	end
	funcB2349
end

subgraph Binary v1 - vulnerable

	funcA1234
	funcA1235
	funcA1236
	funcA1237
	subgraph Missing
		funcA1238
	end
end
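The triage shown in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the output format of any particular tool: the `triage` helper, the tuple layout, and the `threshold` parameter are assumptions made for this example, and the function names and scores are copied from the diagram.

```python
from typing import Dict, Iterable, List, Tuple

Match = Tuple[str, str, float]  # (v1 function, v2 function, similarity 0.0-1.0)


def triage(v1_funcs: Iterable[str], v2_funcs: Iterable[str],
           matches: List[Match], threshold: float = 1.0) -> Dict[str, List[str]]:
    """Bucket functions into missing, new, and interesting (changed) sets."""
    matched_v1 = {old for old, _, _ in matches}
    matched_v2 = {new for _, new, _ in matches}
    return {
        # Present in v1 but unmatched in v2.
        "missing": sorted(set(v1_funcs) - matched_v1),
        # Present in v2 but unmatched in v1.
        "new": sorted(set(v2_funcs) - matched_v2),
        # Matched, but with a similarity below the threshold.
        "interesting": sorted(new for _, new, score in matches if score < threshold),
    }


# Data copied from the diagram above.
V1 = ["funcA1234", "funcA1235", "funcA1236", "funcA1237", "funcA1238"]
V2 = ["funcB2345", "funcB2346", "funcB2347", "funcB2348", "funcB2349"]
MATCHES = [
    ("funcA1234", "funcB2348", 0.85),
    ("funcA1235", "funcB2347", 1.00),
    ("funcA1236", "funcB2345", 1.00),
    ("funcA1237", "funcB2349", 1.00),
]

result = triage(V1, V2, MATCHES)
print(result)
```

Functions that match at 100% are the noise. The three buckets are where a patch diffing session usually starts.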

Ecosystem

Overall, the risk of post-patch vulnerability exploitation is inevitable for software which can be freely reverse-engineered, and is thus accepted as a natural part of the ecosystem. Mateusz Jurczyk -P0

Patch diffing is a reality of the modern-day update process. For vendors of closed-source software, a dichotomy exists between the release of updates to improve security while simultaneously providing malicious attackers and security researchers a map to vulnerable code. This same tension is present within the vulnerability disclosure debate.

Patch diffing is an often overlooked part of the perpetual vulnerability disclosure debate, as vulnerabilities become public knowledge as soon as a software update is released, not when they are announced in release notes. Skilled researchers can quickly determine the vulnerability that was fixed by comparing changes in the codebase between old and new versions. If the vulnerability is not publicly disclosed before or at the same time that the patch is released, then this could mean that the researchers who undertake the patch diffing effort could have more information than the defenders deploying the patches. Maddie Stone - P0

Whether public disclosure of vulnerabilities is beneficial remains contested. One side argues that public disclosure raises awareness of security issues and pressures vendors to fix them. The counterargument is that disclosure provides a shortcut for attackers. The premise of groups like Project Zero releasing vulnerabilities for the “greater good” is hotly contested. Whether or not you agree, a security patch is a form of vulnerability disclosure that is always public.

Benefits

Single Source of Truth

The blog post included details about the exploit, but only included partial details on the vulnerability. My end goal was to do variant analysis on the vulnerability, but without full and accurate details about the vulnerability, I needed to do a root cause analysis first. I tried to get my hands on the exploit sample, but I wasn’t able to source a copy. Without the exploit, I had to use binary patch diffing in order to complete root cause analysis. Maddie Stone -P0

Patch diffing can be a deep well of vulnerability information on its own. Often, without a CVE blog post or sample POC, a patch diff is the only source of information to determine the changes made and deduce the original issue. The skill in patch diffing is separating the signal from all the noise (followed by a dash of skill in reverse engineering).

Vulnerability Discovery through Re-Discovery

It’s also interesting to see that the graphical subsystem had fewer changes detected in general, but more than the core kernel specifically in the syscall handlers. Once we knew the candidates, we manually investigated each of them in detail, discovering two new vulnerabilities in the win32k!NtGdiGetFontResourceInfoInternalW and win32k!NtGdiEngCreatePalette system services. Mateusz Jurczyk - P0

On your path to understanding the original issue, what you find may surprise you. A CVE combined with the latest patch to fix it has just transported you to an area of troubled code. The question is “Do bugs congregate?” Where you find one bug, others are likely to gather.

Practice Makes Perfect

Patch diffing is an excellent exercise to add to a researcher's training regimen to hone their skills and keep them sharp. Patch diffing gives you both a goal (understanding the patch) and focus (reducing the scope of the whole binary to a few functions). Not only can you see more clearly, but along the way you gain experience in reverse engineering and clarity on what actual security issues look like. As you build out mental models for various vulnerability classes, the energy needed to find them decreases, and your ability to see them through all the noise increases.

Now that you are convinced that you should be patch diffing, what are the chances of all these benefits coming to fruition?

Feasibility

Asymmetry

It takes relatively little work to change source code and recompile, while the analysis of the object code will have to be completely redone to detect the changes. - Halvar Flake - Structural Comparison of Executable Objects

In Halvar Flake’s epic whitepaper (from 2004!) on the benefits of using structural comparison in binary diffing, he introduced a fast and reliable way to identify functions across two different versions of a binary that detects logic changes in a function rather than simple byte changes or compiler optimizations. Along with teaching us structural comparison, he describes the asymmetric problem of reverse engineering two different versions of a binary. Essentially, a small code change in source can drastically affect the composition of a compiled binary.

Typical changes from a minor source code modification include:

Registers used to hold specific variables (using RAX instead of RDX)
Basic block arrangement and branches (flowgraph and callgraph respectively)
Instruction selection for equivalent operations (xor eax, eax instead of mov eax, 0)

Binary diffing tools attempt to bring symmetry to the asymmetric problem of reverse engineering the changes between two versions of a binary.
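To make the asymmetry concrete, here is a small sketch of why a naive byte comparison breaks down on the changes listed above. The two byte strings are the standard x86 encodings of the instructions named in the comments. The `normalize` helper and its register pattern are toy assumptions for illustration only, and real diffing tools use far richer normalization.

```python
import re

# x86 machine code for two ways of zeroing EAX.
XOR_EAX_EAX = bytes.fromhex("31c0")      # xor eax, eax
MOV_EAX_0 = bytes.fromhex("b800000000")  # mov eax, 0

# Toy pattern covering only the a/b/c/d general-purpose registers.
GP_REGISTER = re.compile(r"\b[re][a-d]x\b")


def normalize(instruction: str) -> str:
    """Lowercase the instruction and replace register names with a placeholder."""
    return GP_REGISTER.sub("REG", instruction.lower())


# Same effect on EAX, different bytes: a byte-level diff flags a change.
byte_identical = XOR_EAX_EAX == MOV_EAX_0

# Same logic, different register allocation: equal once registers are abstracted.
same_after_normalizing = normalize("mov rax, [rbp-8]") == normalize("MOV RDX, [rbp-8]")

print(byte_identical, same_after_normalizing)
```

Abstracting away details that the compiler is free to change is one way a tool can avoid reporting a recompile as a logic change.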

Security patches are often made to applications, libraries, driver files, etc. When a new version is released it can be difficult to locate what changes were made
Some are new features or general application changes
Some are security fixes
Some changes are intentional to thwart reversing
Some vendors make it clear as to reasoning for the update to the binary
Binary diffing tools can help us locate the changes
bruh? Do you even diff - RSAcon2016

Security fixes are typically not the only changes included in an update of a binary. Brand new non-security features often appear in software updates, and you will need to rule them out. A single executable with thousands of functions may also pull in libraries and external dependencies that update at the same time. Your software component might consist of several binaries, increasing the difficulty of detecting changes, or at least multiplying the number of patch diffing sessions required to detect changes across builds. Finally, nothing prevents a single update from patching multiple bugs.

Patch diffing is difficult. Even if there were only one change, understanding how to reach that code within a binary can be an arduous task. Is all hope lost?

To help you find your way, here are a couple of tips to keep in mind.

Minimize the Noise

Diff the Correct Binaries

But as I alluded to above, it turns out I analyzed and wrote a crash POC for not CVE-2019-1458, but actually CVE-2019-1433. Maddie Stone admitting user error -P0

Patch diffing is not easy, and you can get it wrong. There is no promise of success. In this particular case, the researcher diffed the wrong set of patch files. Following the aforementioned path from CVE -> relevant patch file should keep you on the straight and narrow for the correct patch.

Reduce the Time Delta

A file such as mshtml.dll is patched almost every month. If you diff a version of the file from several months earlier with a patch that just came out, the number of differences between the two files will make analysis very difficult. Gray hat hacking: the ethical hacker’s handbook 2015

This one is obvious. The fewer changes you have to look at, the less chance there is of spending analysis time on unrelated changes.
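One way to keep the time delta small is to always diff a build against its immediate predecessor. The sketch below pairs adjacent versions. The helper names `version_key` and `diff_sessions` are made up for this example, and the version strings are the localspl.dll builds listed in the CVE North Stars Map later in this section.

```python
from typing import List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '6.1.7601.24554' into (6, 1, 7601, 24554) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def diff_sessions(versions: List[str]) -> List[Tuple[str, str]]:
    """Pair each version with the next one, giving the smallest possible deltas."""
    ordered = sorted(versions, key=version_key)
    return list(zip(ordered, ordered[1:]))


# Deliberately unsorted input.
LOCALSPL_VERSIONS = [
    "6.1.7601.24559",
    "6.1.7601.24383",
    "6.1.7601.24562",
    "6.1.7601.24554",
]

for old, new in diff_sessions(LOCALSPL_VERSIONS):
    print(f"diff {old} -> {new}")
```

Sorting on integer tuples rather than raw strings matters once a version component gains a digit, for example 9999 versus 10000.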

Cut the Chatter

James Forshaw advised me to patch diff the Windows 7 win32k.sys files rather than the Windows 10 versions. He suggested this for a few reasons:

The signal to noise ratio is going to be much higher for Windows 7 rather than Windows 10. This “noise” includes things like Control Flow Guard, more inline instrumentation calls, and “weirder” compiler settings.
On Windows 10, win32k is broken up into a few different files: win32k.sys, win32kfull.sys, win32kbase.sys, rather than a single monolithic file.
Kaspersky’s blog post stated that not all Windows 10 builds were affected. Source

This argument for examining the Windows 7 rather than the Windows 10 security updates, alluded to in Windows 10 vs Windows 7 Cumulative patches, follows the same vein of reducing complexity and increasing the signal-to-noise ratio. The basic idea is to perform a diff between two binaries with the fewest changes.

Symbols - Prescription Lenses for SRE Tools

Luckily, Microsoft, as well as some other vendors, provide symbols. These symbols are extremely useful because we can often correlate the information provided in the patch bulletin with obvious symbol names. GrayHatHacking2015

For all of the available SRE tools, binary symbols add necessary precision to the comparison. For Ghidra, they allow for a much quicker analysis, as the matching heuristics can take advantage of function names and types.

Because we know that the vulnerability has to do with IPv6 and that route advertisements using prefixes is involved, let’s take a look at the symbol names showing as changed after the diff. GrayHatHacking2015

Besides aiding SRE tools, the function names often relate to the CVE description field and can help narrow down which functions, of those changed, to review.
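As a rough sketch of that narrowing step, the snippet below ranks changed function names by how many keywords from a CVE description they contain. Everything here is an assumption for illustration: the `rank_by_keywords` helper, the keyword list, and the function names are hypothetical placeholders and are not taken from a real diff.

```python
from typing import Iterable, List


def rank_by_keywords(changed_functions: Iterable[str], keywords: Iterable[str]) -> List[str]:
    """Return changed functions that mention a keyword, best matches first."""
    lowered = [keyword.lower() for keyword in keywords]
    scored = []
    for name in changed_functions:
        hits = sum(1 for keyword in lowered if keyword in name.lower())
        if hits:
            scored.append((hits, name))
    # Most keyword hits first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]


# Hypothetical symbol names reported as changed by a diffing tool.
CHANGED = [
    "TcpOpenListener",
    "Ipv6UpdatePrefixList",
    "NetLogEvent",
    "Ipv6HandleRouteAdvertisement",
]
# Hypothetical keywords pulled from a CVE description.
KEYWORDS = ["ipv6", "route", "advertisement", "prefix"]

ranked = rank_by_keywords(CHANGED, KEYWORDS)
print(ranked)
```

A substring match this simple will miss abbreviations and unrelated naming schemes, so treat the ranking as a starting order for manual review rather than a filter.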

Tools

For closed source or proprietary software, updates come in binary form. There is no source code to analyze or run a textual diff on to gain insight into what has changed. For this, we turn to software reverse engineering (SRE) tools that take much of the heavy lifting out of discovering differences across binaries that contain tens of thousands of functions. These SRE tools help navigate the complexity of these binaries, separating the important differences from all the code and data (aka noise) that hasn’t changed.

There was a recent systematic analysis of modern binary diffing tools comparing 61 tools and providing considerable insight on binary diffing. The analysis compared the tools' matching heuristics, speed, and several other features.

There are three primary ways to detect similarity:

syntax
Two (or more) pieces of binary code are identical if they have the same syntax, i.e., the same representation. The binary code can be represented in different ways such as an hexadecimal string of raw bytes, a sequence of disassembled instructions, or a control-flow graph. Determining if several pieces of binary code are identical is a Boolean decision (either they are identical or not) that it is easy to check: simply apply a cryptographic hash (e.g., SHA256) to the contents of each piece.
semantics
Two pieces of binary code are equivalent if they have the same semantics, i.e., if they offer exactly the same functionality
structure
Structural similarity compares graph representations of binary code (e.g., control flow graphs, callgraphs). It sits between syntactic and semantic similarity.

Each tool relies on these three kinds of comparison to a different degree.
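The difference between the syntactic and structural approaches can be sketched as follows. This is a simplified illustration under stated assumptions: `FunctionInfo` and its fields are hypothetical, the graph counts are supplied by hand rather than computed by a disassembler, and the (basic blocks, edges, calls) tuple is only a stand-in for the richer graph matching that real tools perform. Semantic equivalence is not shown because it requires actually reasoning about behavior.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FunctionInfo:
    name: str
    code: bytes        # raw bytes of the function
    basic_blocks: int  # nodes in the control-flow graph
    edges: int         # edges in the control-flow graph
    calls: int         # calls made to other functions


def syntactic_id(func: FunctionInfo) -> str:
    """Syntactic identity: hash of the raw bytes."""
    return hashlib.sha256(func.code).hexdigest()


def structural_id(func: FunctionInfo) -> Tuple[int, int, int]:
    """Structural signature: shape of the function's graphs, ignoring the bytes."""
    return (func.basic_blocks, func.edges, func.calls)


# 'xor eax, eax; ret' versus 'mov eax, 0; ret': same shape, different bytes.
old = FunctionInfo("example_func", bytes.fromhex("31c0c3"), 1, 0, 0)
recompiled = FunctionInfo("example_func", bytes.fromhex("b800000000c3"), 1, 0, 0)
# Hypothetical patched build that adds a check, a branch, and a call.
patched = FunctionInfo("example_func", b"\x90" * 16, 3, 3, 1)

print(syntactic_id(old) == syntactic_id(recompiled))    # bytes differ
print(structural_id(old) == structural_id(recompiled))  # shape is the same
print(structural_id(old) == structural_id(patched))     # shape changed
```

The hash flags the recompiled function as different even though nothing meaningful changed, while the structural signature only moves when the control flow or call pattern does.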

Which Diffing Tool To Use?

This is a good question. There are several, and for brevity here is a list from WhatsUpWithWhatsApp 2019:

IDA
- DarunGrim
- Diaphora
Radare2 (radiff2)
BinDiff
- Depends on analysis output of SRE tools like IDA or Ghidra.
Ghidra
- Version Tracking Tool

Which tool to use depends on your needs. The linked presentation walks through the pros and cons of each of the tools listed (except Ghidra). Check out the presentation and get an idea for yourself. In this course, we will focus on Ghidra and its Version Tracking tool.

Batteries Not Included

No binary diffing tool out of the box will highlight *which* changes you’re likely to care about. That will still take learning the tools to optimize their findings and doing some RE of your own.
Using a variety of different RE techniques can help you get to the answer faster. Jailbreak2019.WhatsUpWithWhatsApp.pdf

No binary diffing tool will highlight the differences that matter out of the box. There are too many variables in the equation. Each tool has a separate workflow that will need to be tried, tested, and evolved. Reverse engineering skill still has to be added to the analysis to understand the underlying vulnerability.

Conclusion

Patch diffing is not a panacea, a silver bullet, or trivial. It requires skilled reverse engineering to pull the signal of real differences out of the noise of semantically unchanged functions. It can be a source of truth. Did the change actually meet its mark? Did a security patch actually cure the root cause?

To answer these questions, this tutorial provides a walkthrough with a single tool and a few scripts you can use in the Ghidra Version Tracking (aka Ghidra Patch Diffing) Tool section.

Our path for the course is to analyze the differences across the previously identified Windows Print Spooler CVEs (CVE-2020-1048, CVE-2020-1337, CVE-2020-17001). We will look at each diff individually through the lens of patch diffing to get some clarity as to why it took so many attempts to get the fix right.

gantt

    title Patch diffing sessions comparing N-1,1048,1337,17001
    dateFormat  YYYY-MM-DD
    axisFormat %Y-%m

    section Relevant CVEs
    N-1 :a1, 2020-04-14, 2020-05-11
    CVE-2020-1048 :a2, 2020-05-12, 2020-06-08
    CVE-2020-1337 :a3, 2020-08-11, 2020-09-07
    CVE-2020-17001 :a4, 2020-11-10, 2020-12-07

    section Patch Diffing Sessions
    Session1 :l1, 2020-04-14, 2020-06-08
    Session2 :l2, 2020-06-08, 2020-09-07
    Session3 :l3, 2020-09-07, 2020-12-07

CVE North Stars Map


graph TD;

classDef current fill:#00cc66;

F:::current;
A1[N-1] --> |"localspl.dll (6.1.7601.24383)"| F;
A[CVE-2020-1048] --> |"localspl.dll (6.1.7601.24554)"| F;
B[CVE-2020-1337] --> |"localspl.dll (6.1.7601.24559)"| F;
C[CVE-2020-17001] --> |"localspl.dll (6.1.7601.24562)"| F;
F[Patch Diffing];

Next section: Ghidra Patch Diffing