Reverse Engineering in Malware Analysis
Written by: Malware Analysis Team, Ensign Labs
Malware intrusion is the world's leading type of cyberattack on systems and computers. Malicious software such as viruses, spyware, and adware has evolved over the years. With increased levels of sophistication, malicious software can inflict greater damage, as well as disable and disrupt the operations of an organisation. Malware detection must therefore be done as early as possible to prevent damage that could be costly for organisations.
The dramatic rise in malware attacks in recent years has resulted in organisations spending more resources to analyse these types of software. The objective is to understand the malware's impact on the organisation's IT assets. The act of analysing malicious software is called malware analysis.
What is Malware Analysis?
Malware Analysis is the process by which a suspicious, potentially malicious file is dissected for incident responders to better understand its behaviour and capability. This helps the incident responders mitigate possible threats.
There are various reasons as to why Malware Analysis is performed:
- Incident Management (Investigation & Response): Understand how the malware works, in order to triage and react accordingly
- Better Malware Detection: Uncover Indicators of Compromise (IoCs) which are especially useful when dealing with threats that were never seen before
- Malware Research: Understand the malware's modus operandi to better detect and counter it
While the most obvious use of the output from Malware Analysis is to support Incident Response and Triaging, it also uncovers IoCs that security analysts can use in threat hunting. This improves the efficacy of the alerts and notifications raised by security tools, which could potentially deter future threats.
What is Reverse Engineering and how is it conducted?
Reverse Engineering (RE) is the process where the malware file is taken apart using mainly Static and Dynamic Analysis techniques:
- Static Analysis: This is the extraction of key malware attributes and features, based on official file format specifications, using disassemblers like IDA Pro and Windows Portable Executable (PE) format viewers like CFF Explorer.
- Dynamic Analysis: This is the detonation of a malware sample in a controlled environment to observe its behaviour. This is usually performed in a sandbox, using debuggers like x64dbg and Network Protocol Analyzers like Wireshark.
Ensign’s Approach for Static Analysis
Before spending the time to delve deeper into the malware’s code, it is wise to first ascertain the objective and resources committed to the task, followed by a preliminary assessment of the malware under investigation.
The first check is to identify the file type. For instance, a file type reported as PE32+ executable (console) x86-64 (stripped to external PDB), for MS Windows indicates that the malware file is most likely a 64-bit Windows executable binary. This frames the context so that analysts know what they are dealing with and can recall the relevant knowledge, since CPU registers, calling conventions, and Operating System (OS) internals differ by architecture and platform.
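The file type check described above can be approximated by reading a few header fields directly. The following is a minimal sketch, assuming the sample is already loaded into memory as bytes; the function name describe_pe is hypothetical, and the field offsets and constants are those of the PE/COFF format.

```python
import struct

# Machine field values from the PE/COFF format.
MACHINES = {0x014C: "x86 (32-bit)", 0x8664: "x86-64 (64-bit)", 0xAA64: "ARM64"}
# Optional-header magic values distinguish PE32 from PE32+.
MAGICS = {0x10B: "PE32", 0x20B: "PE32+"}


def describe_pe(data: bytes) -> dict:
    """Return basic format facts for a PE image held in memory (hypothetical helper)."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ/PE file")
    # e_lfanew at offset 0x3C holds the offset of the 'PE\0\0' signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The COFF header follows the 4-byte signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    # The optional header starts after the 20-byte COFF header.
    (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "format": MAGICS.get(magic, hex(magic)),
    }
```

For a file on disk, the bytes could be read with open(path, "rb").read(), where path is whatever location the sample is stored at. The sketch only inspects headers and never executes the sample.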
The next check is to determine the programming language or compiler. During the analysis process, analysts should also look out for any obfuscator, protector, or packer being employed, as this could greatly hinder the outcome of the overall analysis. A rules-based identification tool that may be useful is 'Detect It Easy' (see: https://github.com/horsicq/Detect-It-Easy).
Example output from Detect-It-Easy:
- Packer: UPX 3.96
- Compiler: MinGW
As a follow-up, it is advisable to quickly search online for existing solutions to speed up the analysis process. In the example above, the malware used the UPX packer. The official UPX tool features a command-line switch (-d) to unpack the file directly, although this can fail or produce an inaccurate result, for example when the packed file's headers have been tampered with. It is thus important to know how to unpack the file manually and find the Original Entry Point (OEP) as a fallback, especially because RE is typically done manually.
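To illustrate what a single rule in a rules-based identification tool might look like, here is a minimal sketch that lists PE section names and flags the default UPX names (UPX0, UPX1). It is not how Detect It Easy is implemented; the function names are hypothetical, and because section names are trivially renamed, a negative result proves nothing.

```python
import struct


def section_names(data: bytes) -> list[str]:
    """List the section names of a PE image held in memory (hypothetical helper)."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    coff = pe_offset + 4  # COFF header follows the 'PE\0\0' signature
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    # The section table follows the 20-byte COFF header and the optional header.
    table = coff + 20 + opt_size
    names = []
    for i in range(num_sections):
        start = table + i * 40  # each section header is 40 bytes
        raw = data[start:start + 8]  # the name is the first 8 bytes, NUL-padded
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic only: true if any section uses a default UPX name."""
    return any(name.startswith("UPX") for name in section_names(data))
```

A positive result is a hint to try the packer's own unpacking switch first, and to fall back to manual unpacking if that fails.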
Additionally, if the malware is not customer-sensitive, meaning it can be submitted online or is already public, we can submit it to VirusTotal (VT) and check how the various Anti-Virus (AV) vendors classify it. The assigned label might indicate the nature of the malware, and will dictate the next steps in the analysis. A label such as HEUR:Trojan-Ransom.Win32.Generic probably means the sample is ransomware, a type of malware that has gained popularity in recent years. The strings embedded in the malware, its original filename, its compilation timestamp, and how recently it was first observed are also useful parameters to keep in mind.
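Two of the preliminary checks above are easy to sketch: computing a file hash (a hash can be searched on services such as VirusTotal without uploading the file itself) and pulling out embedded printable strings. The helper names below are hypothetical, and the four-character minimum length is an assumed default, not a value from this article.

```python
import hashlib
import re


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of the sample, as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Extract runs of printable ASCII characters at least min_len long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]
```

This sketch covers ASCII only; Windows binaries often store strings as UTF-16LE as well, which would need a second pattern.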
Static Analysis provides hints on the code's behaviour that analysts can focus on to figure out answers to valuable, malware-related questions that support Incident Response operations. For instance, analysts can establish if it is possible to decrypt the locked documents and files after a ransomware attack.
Ensign’s Approach for Dynamic Analysis
Dynamic Analysis complements Static Analysis in gaining a more holistic picture of the nature of the malware sample. Dynamic Analysis is often preferred over Static Analysis, especially when important parts of the code are obfuscated (e.g., packed), which makes it extremely tedious to understand what the malware is doing, or when the malware needs to download a next-stage payload. Dynamic Analysis is achieved by detonating (or executing) the binary file within a sandbox, which is an isolated, controlled environment (such as a Virtual Machine). This environment comes equipped with instrumentation to monitor various key indicators (e.g., File, Network, and Process activity).
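As one illustration of file-activity monitoring, the sketch below snapshots a directory tree before and after a detonation and reports what changed. It is a simplified stand-in for real sandbox instrumentation, which typically records events as they happen; the function names are hypothetical, and the code is assumed to run only inside the isolated environment.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to the SHA-256 of its content."""
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and list created, deleted, and modified files."""
    common = before.keys() & after.keys()
    return {
        "created": sorted(after.keys() - before.keys()),
        "deleted": sorted(before.keys() - after.keys()),
        "modified": sorted(name for name in common if before[name] != after[name]),
    }
```

Taking one snapshot before execution and one after gives a coarse view of file activity. A burst of modified documents alongside newly created note files, for example, would be consistent with ransomware behaviour.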
Some examples of Dynamic Analysis techniques include:
- Function call monitoring
- Information (control/data) flow tracking
1. Function call monitoring
A function is a reusable, self-contained block of code that accomplishes a specific task. Functions accept input, process it, and produce (return) a result. Functions can be invoked ("called") by other functions, and can provide a level of abstraction to facilitate the understanding of the entire programme.
During execution, malware exhibits function-calling behaviour that is often characteristic of its family. Monitoring the type and sequence of such function calls can help us associate the malware with the correct family. It also helps in better understanding what the malware may be doing on the system. Function call monitoring is achieved by intercepting calls between functions, with the aim of identifying the critical parts of the programme to focus the analysis on, and the "junk code" to ignore. The process of capturing the input arguments and return result(s) of function calls is known as hooking. Typical candidates for hooking are standard Windows Application Programming Interface (API) functions or System Calls. Hooking can reveal cryptographic keys and/or decoded/decrypted data automatically, without going through the manual process of tracing individual instructions.
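The idea of hooking can be shown with a small in-process analogy. The sketch below wraps a function so that its arguments and return value are recorded on every call. Real hooks on Windows API functions or system calls are installed with debuggers or instrumentation frameworks rather than Python wrappers; xor_decode is a hypothetical stand-in for a decoding routine inside a sample.

```python
import functools

call_log = []  # records (function name, args, kwargs, result) for each hooked call


def hook(func):
    """Wrap func so each call's inputs and return value are captured."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        call_log.append((func.__name__, args, kwargs, result))
        return result
    return wrapper


def xor_decode(data: bytes, key: int) -> bytes:
    """Hypothetical stand-in for a single-byte XOR decoder in a sample."""
    return bytes(byte ^ key for byte in data)


# Install the hook by replacing the original reference with the wrapper.
xor_decode = hook(xor_decode)
xor_decode(bytes([0x2B, 0x26, 0x2F, 0x2F, 0x2C]), 0x43)

name, args, kwargs, result = call_log[0]
print(name, hex(args[1]), result)  # the key and the decoded bytes, captured without tracing
```

The log exposes both the key (0x43) and the decoded output (b"hello") without stepping through the decoding loop, which is the practical benefit described above.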
2. Information (control/data) flow tracking
(Figure source: "DroidEcho: an in-depth dissection of malicious behaviors in Android applications")
Information flow tracking is used to monitor how a programme processes its data. We make use of this to extract decoded/decrypted data from malware. This is especially useful when dealing with ransomware or obfuscated malware. During analysis, specific key data of interest are "tainted", and their propagation throughout the rest of the code is then observed. Subsequently, the recorded execution trace is examined to infer logical properties of the data relationship between various states of the programme execution lifecycle. This allows the analyst to discover data dependencies and unravel the algorithm to decode or decrypt the data.
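Taint propagation can be illustrated with a toy model. In the sketch below, each value carries a set of labels naming the data sources it was derived from, and the XOR operation merges the labels of its operands. The Tainted class and the label names are hypothetical; production taint trackers work at the instruction level inside an emulator or instrumentation framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A byte value plus the set of source labels it was derived from."""
    value: int
    labels: frozenset = frozenset()

    def __xor__(self, other):
        # Untainted operands (plain ints) are wrapped with an empty label set.
        other = other if isinstance(other, Tainted) else Tainted(other)
        # The result depends on both operands, so it inherits both label sets.
        return Tainted(self.value ^ other.value, self.labels | other.labels)

    __rxor__ = __xor__  # XOR is commutative, so int ^ Tainted behaves the same


# Mark the data of interest: a key byte and two ciphertext bytes.
key = Tainted(0x43, frozenset({"key"}))
cipher = [Tainted(byte, frozenset({"ciphertext"})) for byte in b"\x2b\x26"]

plain = [byte ^ key for byte in cipher]
print(bytes(p.value for p in plain), sorted(plain[0].labels))
```

Each output byte carries both the "ciphertext" and "key" labels, which is the kind of data dependency an analyst looks for in a recorded trace to identify the decoding routine and its key.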
Malware is constantly employing novel ways to evade detection. It also evolves its defence mechanisms against analysis. Organisations need to stay ahead of attackers by being proactive in detecting and identifying malware. This is where malware analysis comes into play.
We hope this article gives you a better understanding of what malware analysis is, and why it is necessary. While this article has presented a whirlwind overview of Reverse Engineering in Malware Analysis, through the lens of Static and Dynamic Analysis, it is by no means exhaustive. We did not discuss other techniques such as Program Analysis and Hybrid Analysis which we hope to cover in future articles.