One of the modules in our new Rapid Reverse Engineering class is artifact extraction. For this section of the class, students use a Python module we created to extract artifacts and metadata from samples. One of the more interesting pieces of metadata that attackers leave behind is the software used to create the malicious file. In this case I was looking at some PDFs. I realized that although I extract this information for individual samples, I had never run a test on a large set of known APT malware to see what comes out. So I set out on a quick adventure, and wow, was I surprised by the results.
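To make the idea concrete, here is a minimal sketch of pulling the creating software out of a PDF. It is not the class module; the function name `extract_pdf_tools` is hypothetical, and it assumes the /Producer and /Creator entries are stored as plain literal strings in the raw file. Metadata that sits inside compressed object streams, encrypted files, or hex strings will not be found this way.

```python
import re

# Matches /Producer or /Creator followed by a PDF literal string "( ... )".
# Allows backslash escapes and one level of balanced, unescaped parentheses.
_INFO_RE = re.compile(
    rb"/(Producer|Creator)\s*\(((?:\\.|[^\\()]|\([^()]*\))*)\)",
    re.DOTALL,
)

def extract_pdf_tools(data: bytes) -> dict:
    """Return the first /Producer and /Creator values found in raw PDF bytes.

    Hypothetical helper: a regex scan, not a real PDF parser.
    """
    found = {}
    for key, raw in _INFO_RE.findall(data):
        name = key.decode("ascii")
        if name in found:
            continue  # keep only the first occurrence of each key
        # Undo the backslash escaping of "(", ")" and "\".
        raw = re.sub(rb"\\([()\\])", rb"\1", raw)
        found[name] = raw.decode("latin-1")
    return found
```

Usage would look something like `extract_pdf_tools(open(path, "rb").read())`, where `path` points at a sample on disk.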
I ended up with the following pie graph
The sample set was 300+ known APT samples from our collection. It wasn't our whole set of PDFs, but it was a decent size for a start. The list (top 10) looked like this:
Acrobat Web Capture 8.0 (15%)
Adobe LiveCycle Designer ES 8.2 (15%)
Acrobat Web Capture 9.0 (8%)
Python PDF Library - http://pybrary.net/pyPdf/ (7%)
Acrobat Distiller 9.0.0 (Windows) (7%)
Acrobat Distiller 6.0.1 (Windows) (7%)
Adobe Acrobat 9.2.0 (4%)
Adobe PDF Library 9.0 (4%)
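A breakdown like the one above can be produced with a simple tally. The sketch below assumes you already have one producer string per sample; `summarize_producers` is a hypothetical name, and this is not necessarily how the numbers above were generated.

```python
from collections import Counter

def summarize_producers(producers, top_n=10):
    """Count producer strings and return (name, count, percent) tuples.

    Empty or missing values are skipped; percent is relative to the
    samples that had a producer string at all.
    """
    counts = Counter(p.strip() for p in producers if p and p.strip())
    total = sum(counts.values())
    return [
        (name, count, round(100.0 * count / total, 1))
        for name, count in counts.most_common(top_n)
    ]
```

Feeding it the producer strings from a directory of samples gives a ranked list that can go straight into a chart.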
From the defensive side, this points to an opportunity for early detection. I doubt that most organizations actually track or analyze what the clean, business-case PDFs coming through their front doors look like. What do the normal, clean PDFs coming through your front doors actually look like? Are they being created by the "Python PDF Library - http://pybrary.net/pyPdf/" software? This is a piece of software that is no longer maintained. If you have a standard set of PDFs that come through your front doors and they aren't using unusual libraries such as pyPdf, then it might be time to create a nice little Snort signature and alert on it. I wouldn't recommend blocking at that level (unless you are up for it), but alerting on something simple like that can pay extremely large dividends for response/defense teams. Imagine telling your CIO/CISO that you detected and remediated an APT* attack coming through the front door with a simple Snort sig.
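The same check can be prototyped offline before writing a network signature. The sketch below is not a Snort rule; it is a Python stand-in for the idea, with a hypothetical `classify_producer` function. The baseline and watchlist values are assumptions you would replace with what your own clean traffic actually looks like.

```python
def classify_producer(producer, baseline, watchlist=("pypdf",)):
    """Label a producer string as 'alert', 'known', or 'unusual'.

    baseline:  prefixes of producer strings seen in your clean PDFs.
    watchlist: lowercase substrings that should always raise an alert.
    Both lists are illustrative, not recommendations.
    """
    value = (producer or "").strip().lower()
    if any(term in value for term in watchlist):
        return "alert"
    if value and any(value.startswith(b.lower()) for b in baseline):
        return "known"
    return "unusual"
```

Alerting only, rather than blocking, matches the advice above: an "unusual" result is a prompt to look closer, not proof of anything.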
Acrobat Web Capture 6.0 (wow that is old)
¦ d o P D F V e r 6 . 2 B u i l d 2 8 8 ( W i n d o w s X P x 3 2 ) *Ya, that is the way it shows up
PDFlib 7.0.3 (C++/Win32)
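One possible explanation for the spaced-out doPDF entry, which is an assumption and not something confirmed here, is that the string is stored as UTF-16BE. PDF text strings that start with the bytes FE FF use that encoding, and dumping them byte by byte leaves a gap between every character. A minimal sketch of decoding such a value (`decode_pdf_text` is a hypothetical helper; Latin-1 is used as a rough stand-in for PDFDocEncoding):

```python
def decode_pdf_text(raw: bytes) -> str:
    """Decode a PDF text string, honoring a UTF-16BE byte order mark."""
    if raw.startswith(b"\xfe\xff"):
        # Strip the two-byte BOM, then decode the rest as UTF-16BE.
        return raw[2:].decode("utf-16-be", errors="replace")
    # Approximation: PDFDocEncoding is close to Latin-1 for common characters.
    return raw.decode("latin-1")
```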
At the end of April (25-26th) we are debuting Rapid Reverse Engineering in New York City with Trail Of Bits (http://www.trailofbits.com/training/#rapidre). Rapid Reverse Engineering is a class designed to help students learn how to rapidly assess files in incident response scenarios.