Analyzing Malware By Example: Part 1
In this tutorial you will learn how to perform basic static analysis on a malicious sample. Please make sure to prepare a safe analysis environment on your machine before you start.
I strongly encourage you to actually do the things explained here in your analysis environment. Merely reading the tutorial is not enough.
Our Sample:
Download the zip file with the sample here: sample.zip
The password is "infected".
File Type Analysis
You now have a sample that you want to analyse, but you don't know what kind of file it is. The file type, or the format the file is made of, is the most important thing to start with. Once you know the file format, you can decide which tools are suitable for analysis.
If you are only used to a Windows environment, you may be skeptical about the usefulness of file type analysis. After all, the file extension will show the correct type in most cases, e.g., .exe for executable programs, .doc for Microsoft Word files etc.
But it is not that easy.
- The file extension can be spoofed.
- With the right command you can execute any file regardless of the file extension.
- Temporary files often have the file extension .tmp regardless of their file type.
- Therefore, malware often has the wrong file extension.
- Depending on where you get the samples, they will likely not have any file extension. Samples shared by researchers often just have the hash as their filename.
- You should be able to detect the type of embedded files.
File types are determined by file signatures, which are usually at the very beginning of the file. A signature is a specific sequence of bytes that was defined by the file format specification so the correct file type can be verified by applications that parse these files.
A large list of file type signatures is at http://www.garykessler.net/library/file_sigs.html
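To make the idea of a signature check concrete, here is a minimal Python sketch. It is not how file or TrID work internally; it only compares the first bytes of a file against a tiny, hand-picked table. The function names identify and identify_file are my own for this illustration, and the table is nowhere near complete.

```python
# Minimal sketch: identify data by its leading "magic" bytes.
# The table is a tiny illustrative subset, not a full signature database.
SIGNATURES = [
    (bytes.fromhex("D0CF11E0A1B11AE1"), "OLE2 compound file (legacy .doc/.xls/.ppt)"),
    (b"PK\x03\x04", "ZIP archive (also used by .docx/.xlsx)"),
    (b"%PDF", "PDF document"),
    (b"MZ", "DOS/Windows executable"),
]


def identify(data: bytes) -> str:
    """Return a description for the first matching signature, else 'unknown'."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

You would call identify_file with the path to the sample inside your analysis environment. A match only tells you what the file claims to be at its start, so treat it as a first hint.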
Most malware analysts know the file type signatures and other typical strings of the most common file types by heart, because they see them every day. So they often just open the file in a hex editor and can tell what is inside.
However, if you don't know what certain files typically look like, you can also run a file type parser or scanner. Linux has a built-in file type parser, the file command:
Code
file <sample>
On both Linux and Windows you can use TrID, which has over 6000 file type definitions. Download TrID now and try it with our sample.
Tip for command prompt usage on Windows: Navigate to the trid_w32 folder, hold Shift, then right-click the folder and choose "Open command window here" (see also link).
A command prompt will open in that folder. Type trid.exe, then drag the sample into the command prompt and press Enter.
(Note: I take as a given that you can navigate your command prompt if you use Linux.)
If you get a message that says "No definitions available!", you need to download the newest definition database from here (scroll down to the Download section and choose the TrIDDefs.TRD package). Unpack the ZIP file and put triddefs.trd into the same folder as trid.exe. Run trid.exe on our sample and you should get output like this:
Code
TrID/32 - File Identifier v2.20 - (C) 2003-15 By M.Pontello
Definitions found: 6108
Analyzing...
Collecting data from file: 048714ed23c86a32f085cc0a4759875219bdcb0eb61dabb2ba03de09311a1827
45.7% (.DOC) Microsoft Word document (32000/1/3)
42.8% (.XLS) Microsoft Excel sheet (30000/1/2)
11.4% (.) Generic OLE2 / Multistream Compound File (8000/1)
That means our sample is most likely a Word or Excel document.
If you use the Linux file command on the sample, you will get even more detailed output, because it is able to parse a lot of file types.
Now that we have an idea, let's use a hex editor. Any hex editor of your choice will do.
Scroll a bit through the file and see if you recognise any strings.
At some point you might see this:
This tells you that our sample is a Microsoft Word document.
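If scrolling through a hex editor gets tedious, you can also pull printable strings out of the file with a few lines of Python. This is only a rough sketch: the function name printable_strings and the minimum length of 6 are my own choices, and it only finds plain ASCII runs. Strings stored as UTF-16, which are common in Office documents, will not show up here.

```python
import re


def printable_strings(data: bytes, min_length: int = 6):
    """Yield (offset, text) for each run of printable ASCII bytes."""
    # Printable ASCII is the byte range 0x20 (space) to 0x7e (~).
    pattern = re.compile(
        rb"[\x20-\x7e]{" + str(min_length).encode("ascii") + rb",}"
    )
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def strings_in_file(path: str, min_length: int = 6):
    """Convenience wrapper that reads a whole file into memory first."""
    with open(path, "rb") as handle:
        return list(printable_strings(handle.read(), min_length))
```

The offsets let you jump to the same position in your hex editor and look at the surrounding bytes.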
Another avenue of research: Check if the file is listed on VirusTotal. For that you need the file hash. Linux again has a built-in command, sha256sum, to calculate a hash value. On Windows you may use a program like HashCheck.
VirusTotal does not only list detections, it also shows lots of additional information about the file, depending on the file type.
You can of course upload the file, but often there are reasons not to do so. The file might contain private information that shouldn't be available on the web. You must be aware that every file you upload to VirusTotal is available to everyone who pays for file access.
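If you have neither sha256sum nor HashCheck at hand, the hash for a lookup can also be computed with Python's standard hashlib module. The following is a small sketch; the function name sha256_of_file and the chunk size are my own choices.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 of a file as a lowercase hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large samples do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Searching for the resulting hex string shows whether the file is already known, without uploading the file itself.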
Looking at the Code
Our sample is a Microsoft Office document, most likely a Word document. There are great tools out there to analyse these documents.
Download OfficeMalScanner, extract the ZIP file and run OfficeMalScanner.exe from the command line. You will see usage information.
Code
+------------------------------------------+
| OfficeMalScanner v0.61 |
| Frank Boldewin / www.reconstructer.org |
+------------------------------------------+
Usage:
------
OfficeMalScanner <PPT, DOC or XLS file> <scan | info> <brute> <debug>
Options:
scan - scan for several shellcode heuristics and encrypted PE-Files
info - dumps OLE structures, offsets+length and saves found VB-Macro code
inflate - decompresses Ms Office 2007 documents, e.g. docx, into a temp dir
Switches: (only enabled if option "scan" was selected)
brute - enables the "brute force mode" to find encrypted stuff
debug - prints out disassembly resp hexoutput if a heuristic was found
Examples:
OfficeMalScanner evil.ppt scan brute debug
OfficeMalScanner evil.ppt scan
OfficeMalScanner evil.ppt info
Malicious index rating:
Executables: 20
Code : 10
STRINGS : 2
OLE : 1
----------------------------------------------------------------------------
I strongly suggest you to scan malicious files in a safe environment
like VMWARE, as this tool is written in C and might have exploitable bugs!
----------------------------------------------------------------------------
Apply the scan mode with the following command:
Code
OfficeMalScanner.exe <samplename> scan
It will verify that this is a Word document:
Code
[*] Ms Office OLE2 Compound Format document detected
[*] Format type Winword
And it will tell you that it found no malicious traces, but this is an automated analysis. Always check the file yourself. Run the info mode to extract any Macro code from the file.
Code
OfficeMalScanner.exe <samplename> info
The program tells you that it found VB-Macro Code in the file and where the Macro code is saved to. Navigate to that location.
I strongly recommend that you use Notepad++ to open the extracted VB code. In the menu choose Language --> V --> VB to get proper syntax highlighting.
You will see a lot of code that does not look useful. Adding clutter is a common obfuscation technique.
Press Ctrl + F to open the search window and search for the string "environ". A description of the function is here: https://msdn.microsoft.com/en-us/library/office/gg264486.aspx
Quote
Returns the String associated with an operating system environment variable.
A lot of malware authors use this function to determine the location of the Temp folder.
Other typical functions you might search for in unknown Macro scripts are:
Code
Shell
StrReverse
Chr
Put
Write
.exe
Open
ResponseBody
Binary
These will lead you to the relevant code parts if you have a lot of clutter in the code.
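If the extracted macro is long, you can let a small script do the searching for you. The sketch below is just one way to do it: find_keywords is a name I made up, and it simply reports every line that contains one of the keywords listed above (plus Environ). The match is case-insensitive because VBA itself does not care about case.

```python
KEYWORDS = ["Environ", "Shell", "StrReverse", "Chr", "Put", "Write",
            ".exe", "Open", "ResponseBody", "Binary"]


def find_keywords(source: str, keywords=KEYWORDS):
    """Return (line_number, keyword, line) for every case-insensitive hit."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        lowered = line.lower()
        for keyword in keywords:
            if keyword.lower() in lowered:
                hits.append((number, keyword, line.strip()))
    return hits


# Hypothetical macro fragment, only for demonstration.
example = 'x = StrReverse("abc")\nShell x, 0\nDim y'
for number, keyword, line in find_keywords(example):
    print(number, keyword, line)
```

Short keywords like Chr or Open are plain substring matches and will also hit unrelated identifiers, so expect some noise and read the surrounding code yourself.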
In this part of the code you can see some interesting hex strings. To get the meaning of these hex strings, open a terminal and start the Python interpreter (or use another language you are more comfortable with).
We save one of the strings in a variable:
Code
unknown = "568756E2E69626F237A6F2D6F636E24756E6F686361666F2F2A307474786"
The VBA macro reverses the string, so we do the same:
Code
reversed = unknown[::-1]
The last step is to transform this hex representation into a readable string. In Python 3 this is done with bytes.fromhex (the older Python 2 form was reversed.decode("hex")):
Code
bytes.fromhex(reversed).decode("ascii")
The result will show you a download path for an executable. Warning: Even if it is tempting, you must not visit a website found in malicious files! But you may do some additional research with whois.
The other strings can be obtained the same way:
Code
bytes.fromhex("05D45445"[::-1]).decode("ascii")
You will get the following strings:
Code
hxxp://fachonet.com/js/bin.exe
\\YEWZMJFAHIB.exe
TEMP
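If you want to decode several strings in one go, the two steps (reverse, then hex-decode) can be wrapped in a small helper. This is a sketch for Python 3; decode_reversed_hex and defang are names I chose for this example, and only the two hex strings shown in this tutorial are used as input.

```python
def decode_reversed_hex(obfuscated: str) -> str:
    """Reverse the string, then interpret it as hex-encoded ASCII."""
    return bytes.fromhex(obfuscated[::-1]).decode("ascii")


def defang(text: str) -> str:
    """Replace the first 'http' with 'hxxp' so the URL is not clickable."""
    return text.replace("http", "hxxp", 1)


samples = [
    "568756E2E69626F237A6F2D6F636E24756E6F686361666F2F2A307474786",
    "05D45445",
]
for obfuscated in samples:
    print(defang(decode_reversed_hex(obfuscated)))
```

Defanging the URL before you paste it into notes or reports helps to make sure nobody opens it by accident.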
Search for some of the other keywords that I listed above and explore the code. You will find the code that writes the file to disk and the part that runs it.
Obviously, this document downloads a file from hxxp://fachonet.com/js/bin.exe, saves it as YEWZMJFAHIB.exe in the TEMP directory and runs it. This kind of malware is called a macro downloader.
That was the first malware analysis tutorial. Macro malware seemed dead for a while, but a new wave of it has popped up. Office malware samples are usually droppers or downloaders that are spread via email. That makes them the initial carriers of infections.
I hope you all understood it.
Bye!