A patch diff compares a vulnerable version of a binary with a patched one. The intent is to highlight the changes, helping to discover new, missing, and interesting functionality across various versions of a binary.
Binary code similarity approaches compare two or more pieces of binary code to identify their similarities and differences. The ability to compare binary code enables many real-world applications on scenarios where source code may not be available such as patch analysis, bug search, and malware detection and analysis. A Survey of Binary Code Similarity - IrfanUlHaq2019
Binary diffing refers to the comparison of two binaries. Whether the binary is of a different architecture, a variant of OS version, or the latest update, it can be useful to analyze the differences to determine:
- specific compiler changes
- abnormal or unexpected behavior
- new features
- details of a patched vulnerability
Patch diffing (a specific form of binary diffing) is a technique to identify changes across versions of binaries as related to security patches. A patch diff compares a vulnerable version of a binary with a patched one. The intent is to highlight the changes, helping to discover new, missing, and interesting functionality across versions of a binary.
Discovery of new, missing, and interesting functions:
flowchart LR
linkStyle default interpolate basis
funcA1234<--Match 85%-->funcB2348
funcA1235<--Match 100%-->funcB2347
funcA1236<--Match 100%-->funcB2345
funcA1237<--Match 100%-->funcB2349
subgraph Binary v2 - patched
funcB2345
subgraph NewFunc
funcB2346
end
funcB2347
subgraph Interesting
funcB2348
end
funcB2349
end
subgraph Binary v1 - vulnerable
funcA1234
funcA1235
funcA1236
funcA1237
subgraph Missing
funcA1238
end
end
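The classification in the diagram can be sketched in a few lines of Python. This is a minimal illustration, assuming the diffing tool has already produced a list of `(old, new, similarity)` matches. The helper name `classify_functions` and the data layout are hypothetical and do not correspond to any particular tool's output format.

```python
from typing import Dict, List, Tuple

Match = Tuple[str, str, float]  # (old function, new function, similarity in [0, 1])


def classify_functions(
    old_funcs: List[str],
    new_funcs: List[str],
    matches: List[Match],
) -> Dict[str, list]:
    """Sort diff results into missing, new, and interesting buckets."""
    matched_old = {old for old, _, _ in matches}
    matched_new = {new for _, new, _ in matches}
    return {
        # Present in the vulnerable binary, no counterpart in the patched one.
        "missing": [f for f in old_funcs if f not in matched_old],
        # Present in the patched binary, no counterpart in the vulnerable one.
        "new": [f for f in new_funcs if f not in matched_new],
        # Matched, but not identical: the usual place to start reading.
        "interesting": [(o, n, s) for o, n, s in matches if s < 1.0],
    }


# Function names and scores taken from the diagram above.
old = ["funcA1234", "funcA1235", "funcA1236", "funcA1237", "funcA1238"]
new = ["funcB2345", "funcB2346", "funcB2347", "funcB2348", "funcB2349"]
matches = [
    ("funcA1234", "funcB2348", 0.85),
    ("funcA1235", "funcB2347", 1.0),
    ("funcA1236", "funcB2345", 1.0),
    ("funcA1237", "funcB2349", 1.0),
]

result = classify_functions(old, new, matches)
print(result["missing"])      # ['funcA1238']
print(result["new"])          # ['funcB2346']
print(result["interesting"])  # [('funcA1234', 'funcB2348', 0.85)]
```

In practice the hard part is producing the matches, not sorting them. That is the job of the diffing tool's heuristics.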
Overall, the risk of post-patch vulnerability exploitation is inevitable for software which can be freely reverse-engineered, and is thus accepted as a natural part of the ecosystem. Mateusz Jurczyk -P0
Patch diffing is a reality of the modern day update process. For vendors of closed source software, an interesting dichotomy exists between the release of updates to improve the security of software while simultaneously providing malicious attackers and security researchers a map to vulnerable code. This same tension is present within the vulnerability disclosure debate.
Patch diffing is an often overlooked part of the perpetual vulnerability disclosure debate, as vulnerabilities become public knowledge as soon as a software update is released, not when they are announced in release notes. Skilled researchers can quickly determine the vulnerability that was fixed by comparing changes in the codebase between old and new versions. If the vulnerability is not publicly disclosed before or at the same time that the patch is released, then this could mean that the researchers who undertake the patch diffing effort could have more information than the defenders deploying the patches. Maddie Stone -P0
Whether public disclosure of vulnerabilities is ethical remains an open debate. One side argues that public disclosure raises awareness of security issues and pressures vendors to fix them. The counterargument is that disclosure provides a shortcut for attackers. The entire premise of groups like Project Zero releasing vulnerabilities for the “greater good” is hotly contested. Whether or not you agree, a security patch is a form of vulnerability disclosure that is always public.
The blog post included details about the exploit, but only included partial details on the vulnerability. My end goal was to do variant analysis on the vulnerability, but without full and accurate details about the vulnerability, I needed to do a root cause analysis first. I tried to get my hands on the exploit sample, but I wasn’t able to source a copy. Without the exploit, I had to use binary patch diffing in order to complete root cause analysis. Maddie Stone -P0
Patch diffing can be a deep well of vulnerability information on its own. Often, without a CVE blog post or sample POC, a patch diff is the only source of information to determine the changes made and deduce the original issue. The skill in patch diffing is separating the signal from all the noise (followed by a dash of skill in reverse engineering).
It’s also interesting to see that the graphical subsystem had fewer changes detected in general, but more than the core kernel specifically in the syscall handlers. Once we knew the candidates, we manually investigated each of them in detail, discovering two new vulnerabilities in the win32k!NtGdiGetFontResourceInfoInternalW and win32k!NtGdiEngCreatePalette system services. Mateusz Jurczyk - P0
On your path to understanding the original issue, what you find may surprise you. A CVE combined with the latest patch to fix it has just transported you to an area of troubled code. The question is “Do bugs congregate?” Where you find one bug, others are likely to gather.
Patch diffing is an excellent exercise to add to a researcher's training regimen to hone their skills and keep them sharp. Patch diffing gives you both a goal (understanding the patch) and focus (reducing the scope of the whole binary to a few functions). Not only can you see more clearly, but along the way you also gain experience in reverse engineering and clarity on what actual security issues look like. As you build mental models for various vulnerability classes, the energy needed to find them decreases, and your ability to see them through all the noise increases.
Now that you are convinced that you should be patch diffing, what are the chances of all these benefits coming to fruition?
It takes relatively little work to change source code and recompile, while the analysis of the object code will have to be completely redone to detect the changes. - Halvar Flake - Structural Comparison of Executable Objects
In Halvar Flake’s epic whitepaper (from 2004!) on the benefits of using structural comparison in binary diffing, he introduced a fast and reliable way to identify functions across two different versions of a binary, one that detects logic changes in a function rather than simple byte changes or compiler optimizations. Along with teaching us structural comparison, he describes the asymmetric problem of reverse engineering two different versions of a binary. Essentially, a small code change in source can drastically affect the composition of a compiled binary.
Typical changes from a minor source code modification include:
- Registers used to hold specific variables (using RAX instead of RDX)
- Basic block arrangement and branches (flowgraph and callgraph respectively)
- Instruction selection: the compiler might emit different instructions that perform the same operation (for example, `xor eax, eax` in place of an equivalent zeroing instruction)
Binary diffing tools attempt to bring symmetry to the asymmetric problem of reverse engineering the changes between two versions of a binary.
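One way to picture how a tool can look past register allocation changes is operand normalization: replace concrete register names with a placeholder before comparing. The sketch below is a simplified illustration, not the algorithm of any specific tool. The register pattern is deliberately incomplete, and the instruction listings are made up for the example.

```python
import re
from typing import List

# Assumption: a small, incomplete set of x86-64 general-purpose register names.
_REGISTER = re.compile(
    r"\b(?:[re]?[abcd]x|[re]?[sd]i|[re]?[sb]p|r(?:[89]|1[0-5])d?)\b"
)


def normalize(instructions: List[str]) -> List[str]:
    """Replace concrete registers with REG so a register rename is not a change."""
    return [_REGISTER.sub("REG", ins.lower()) for ins in instructions]


# Hypothetical listings: same logic, but the compiler picked RDX instead of RAX.
v1 = ["mov rax, [rcx+8]", "add rax, 1", "mov [rcx+8], rax"]
v2 = ["mov rdx, [rcx+8]", "add rdx, 1", "mov [rcx+8], rdx"]

print(v1 == v2)                        # False: the raw text differs
print(normalize(v1) == normalize(v2))  # True: the normalized form matches
```

Normalization is lossy by design. It hides harmless register renames, but it can also hide a real change that only swaps one register for another, so tools typically combine it with other heuristics.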
Security patches are often made to applications, libraries, driver files, etc. When a new version is released, it can be difficult to locate what changes were made:
- Some are new features or general application changes
- Some are security fixes
- Some changes are intentional to thwart reversing
- Some vendors make the reasoning for the update to the binary clear
- Binary diffing tools can help us locate the changes
Security fixes are typically not the only changes included in an update of a binary. Brand new non-security related features often appear in software updates that you will need to rule out. A single executable having 1000s of functions also includes other libraries and external dependencies that can simultaneously update. Your software component might consist of several binaries, increasing the difficulty of detecting changes, or at least multiplying the number of patch diffing sessions required to detect changes across builds. Last, there is nothing preventing multiple patched bugs within a single update.
Patch diffing is difficult. Even if there was only one change, understanding how to reach that code within a binary can be an arduous task. Is all hope lost?
To help you find your way, here are a couple of tips to keep in mind.
But as I alluded to above, it turns out I analyzed and wrote a crash POC for not CVE-2019-1458, but actually CVE-2019-1433. Maddie Stone admitting user error -P0
It is not easy, and you can get it wrong. There is no promise of success in patch diffing. In this particular case, the researcher diffed the wrong set of patch files. Following the aforementioned path from CVE to relevant patch file should keep you on the straight and narrow toward the correct patch.
A file such as mshtml.dll is patched almost every month. If you diff a version of the file from several months earlier with a patch that just came out, the number of differences between the two files will make analysis very difficult. Gray hat hacking: the ethical hacker’s handbook 2015
This one is obvious. The fewer changes you have to look at, the less chance there is of confusing your analysis with unrelated changes.
James Forshaw advised me to patch diff the Windows 7 win32k.sys files rather than the Windows 10 versions. He suggested this for a few reasons:
- The signal to noise ratio is going to be much higher for Windows 7 rather than Windows 10. This “noise” includes things like Control Flow Guard, more inline instrumentation calls, and “weirder” compiler settings.
- On Windows 10, win32k is broken up into a few different files: win32k.sys, win32kfull.sys, win32kbase.sys, rather than a single monolithic file.
- Kaspersky’s blog post stated that not all Windows 10 builds were affected. Source
This argument for diffing the Windows 7 rather than the Windows 10 security updates, alluded to in Windows 10 vs Windows 7 Cumulative patches, follows the same vein of reducing complexity and increasing the signal-to-noise ratio. The basic idea is to perform a diff between two binaries with the fewest changes.
Luckily, Microsoft, as well as some other vendors, provide symbols. These symbols are extremely useful because we can often correlate the information provided in the patch bulletin with obvious symbol names. GrayHatHacking2015
For all of the available SRE tools, binary symbols add necessary precision to the comparison. For Ghidra, they allow for a much quicker analysis, as the matching heuristics can take advantage of function names and types.
Because we know that the vulnerability has to do with IPv6 and that route advertisements using prefixes is involved, let’s take a look at the symbol names showing as changed after the diff. GrayHatHacking2015
Besides aiding SRE tools, the function names often relate to the CVE description field and can help narrow down which functions, of those changed, to review.
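The narrowing step can be approximated with a simple keyword filter over the changed symbol names. This is a rough sketch: the function names and keywords below are illustrative and not taken from a real diff, and `rank_by_keywords` is a hypothetical helper rather than part of any SRE tool.

```python
from typing import Iterable, List


def rank_by_keywords(changed_functions: Iterable[str], keywords: Iterable[str]) -> List[str]:
    """Return changed functions whose name contains a keyword, most hits first."""
    lowered = [k.lower() for k in keywords]
    scored = []
    for name in changed_functions:
        hits = sum(1 for k in lowered if k in name.lower())
        if hits:
            scored.append((hits, name))
    # Most keyword hits first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]


# Illustrative symbol names reported as changed by a diff.
changed = [
    "Ipv6UpdatePrefixList",
    "Ipv6HandleRouteAdvertisement",
    "TcpCreateConnection",
    "UdpSendDatagram",
]
# Keywords pulled by hand from a CVE description.
keywords = ["ipv6", "prefix", "advertisement"]

print(rank_by_keywords(changed, keywords))
# ['Ipv6HandleRouteAdvertisement', 'Ipv6UpdatePrefixList']
```

A filter like this only suggests where to look first. A fix can just as easily land in a helper whose name shares nothing with the advisory text.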
For closed source or proprietary software, updates come in binary form. There is no source code to analyze or to run a textual diff on to see what has changed. For this, we turn to software reverse engineering (SRE) tools that take much of the heavy lifting out of discovering differences across binaries that contain tens of thousands of functions. These SRE tools help navigate the complexity of these binaries, separating the important differences from all the code and data (aka noise) that hasn’t changed.
A recent systematic analysis of modern binary diffing tools compared 61 tools and provides considerable insight into binary diffing. It compares the tools' matching heuristics, speed, and several other features.
There are three primary ways of doing detection: syntactic, semantic, and structural.
Two (or more) pieces of binary code are identical if they have the same syntax, i.e., the same representation. The binary code can be represented in different ways such as an hexadecimal string of raw bytes, a sequence of disassembled instructions, or a control-flow graph. Determining if several pieces of binary code are identical is a Boolean decision (either they are identical or not) that it is easy to check: simply apply a cryptographic hash (e.g., SHA256) to the contents of each piece.
Two pieces of binary code are equivalent if they have the same semantics, i.e., if they offer exactly the same functionality
Structural similarity compares graph representations of binary code (e.g., control flow graphs, callgraphs). It sits between syntactic and semantic similarity.
Each tool performs this to a different degree.
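To make the difference between syntactic identity and structural similarity concrete, here is a minimal sketch. The SHA-256 check follows the definition quoted above. The structural fingerprint (counts of basic blocks, edges, and calls) is a simplified stand-in in the spirit of structural comparison, not the exact heuristic of any tool. The control-flow graphs are hand-written, hypothetical examples.

```python
import hashlib
from typing import Dict, List, Tuple


def identical(code_a: bytes, code_b: bytes) -> bool:
    """Syntactic identity: a Boolean check on the raw bytes."""
    return hashlib.sha256(code_a).digest() == hashlib.sha256(code_b).digest()


def structural_signature(cfg: Dict[str, List[str]], calls: List[str]) -> Tuple[int, int, int]:
    """Simplified structural fingerprint: (basic blocks, edges, calls)."""
    blocks = len(cfg)
    edges = sum(len(successors) for successors in cfg.values())
    return (blocks, edges, len(calls))


# Two encodings of a function that returns zero.
xor_version = bytes.fromhex("31c0c3")        # xor eax, eax ; ret
mov_version = bytes.fromhex("b800000000c3")  # mov eax, 0   ; ret
single_block = {"entry": []}

print(identical(xor_version, mov_version))  # False: the bytes differ
print(structural_signature(single_block, []) == structural_signature(single_block, []))  # True

# Hypothetical patch that adds a bounds check before a copy.
vulnerable_cfg = {"entry": ["copy"], "copy": ["exit"], "exit": []}
patched_cfg = {"entry": ["copy", "fail"], "copy": ["exit"], "fail": ["exit"], "exit": []}

print(structural_signature(vulnerable_cfg, ["memcpy"]))  # (3, 2, 1)
print(structural_signature(patched_cfg, ["memcpy"]))     # (4, 4, 1)
```

The byte hash flags the harmless instruction swap as a difference, while the structural fingerprint ignores it and still changes when the patch adds a new branch. Semantic equivalence is not attempted here, since proving that two pieces of code behave the same is a much harder problem.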
Which binary diffing tools are available? This is a good question. There are several, and for brevity here is a list from WhatsUpWithWhatsApp 2019:
- Radare2 (radiff2)
- Depends on analysis output of SRE tools like IDA or Ghidra.
- Version Tracking Tool
Which tool to use depends. The linked presentation walks through the pros and cons of each of the tools listed (except Ghidra). Check out the presentation and get an idea for yourself. In this course, we will focus on Ghidra and its Version Tracking tool.
- No binary diffing tool out of the box will highlight *which* changes you’re likely to care about. That will still take learning the tools to optimize their findings and doing some RE of your own.
- Using a variety of different RE techniques can help you get to the answer faster Jailbreak2019.WhatsUpWithWhatsApp.pdf
No binary diffing tool will highlight the differences out of the box. There are too many variables in the equation. Each tool has a separate workflow that will need to be tried, tested, and evolved. The skill of reverse engineering will need to be added to the analysis to get the value required for understanding the underlying vulnerability.
Patch diffing is not a panacea or a silver bullet, and it is not trivial. It requires skilled reverse engineering to pull the signal of real differences out of the noise of semantically unchanged functions. It can be a source of truth. Did the change actually meet its mark? Did a security patch actually cure the root cause?
To answer these questions, this tutorial will provide a walk through with a single tool and a few scripts you can use in the Ghidra Version Tracking (aka Ghidra Patch Diffing) Tool section.
Our path for the course is to analyze the differences across the previously identified Windows Print Spooler CVEs (1048, 1337, 17001). We will look at each diff individually through the lens of patch diffing, in an attempt to get some clarity as to why it took so many attempts to get the fix right.
gantt
title Patch diffing sessions comparing N-1,1048,1337,17001
dateFormat YYYY-MM-DD
axisFormat %Y-%m
section Relevant CVEs
N-1 :a1, 2020-04-14, 2020-05-11
CVE-2020-1048 :a2, 2020-05-12, 2020-06-08
CVE-2020-1337 :a3, 2020-08-11, 2020-09-07
CVE-2020-17001 :a4, 2020-11-10, 2020-12-07
section Patch Diffing Sessions
Session1 :l1, 2020-04-14, 2020-06-08
Session2 :l2, 2020-06-08, 2020-09-07
Session3 :l3, 2020-09-07, 2020-12-07
graph TD; classDef current fill:#00cc66; F:::current; A1[N-1] --> |"localspl.dll (6.1.7601.24383)"| F; A[CVE-2020-1048] --> |"localspl.dll (6.1.7601.24554)"| F; B[CVE-2020-1337] --> |"localspl.dll (6.1.7601.24559)"| F; C[CVE-2020-17001] --> |"localspl.dll (6.1.7601.24562)"| F; F[Patch Diffing];
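The three sessions amount to diffing each `localspl.dll` build against the build immediately before it. A small sketch of that pairing, using the version numbers from the graph above, could look like the following. The helper names `version_key` and `diff_sessions` are hypothetical.

```python
from typing import Dict, List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '6.1.7601.24554' into (6, 1, 7601, 24554) for numeric sorting."""
    return tuple(int(part) for part in version.split("."))


def diff_sessions(versions: List[str]) -> List[Tuple[str, str]]:
    """Pair each build with its immediate predecessor to keep each diff small."""
    ordered = sorted(versions, key=version_key)
    return list(zip(ordered, ordered[1:]))


# localspl.dll builds from the graph above.
builds: Dict[str, str] = {
    "N-1": "6.1.7601.24383",
    "CVE-2020-1048": "6.1.7601.24554",
    "CVE-2020-1337": "6.1.7601.24559",
    "CVE-2020-17001": "6.1.7601.24562",
}

sessions = diff_sessions(list(builds.values()))
for number, (older, newer) in enumerate(sessions, start=1):
    print(f"Session{number}: {older} -> {newer}")
# Session1: 6.1.7601.24383 -> 6.1.7601.24554
# Session2: 6.1.7601.24554 -> 6.1.7601.24559
# Session3: 6.1.7601.24559 -> 6.1.7601.24562
```

Sorting on the parsed tuple rather than the raw string avoids ordering mistakes when build numbers have different digit counts.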