Monday, January 16, 2012

Timeline Analysis: The Hybrid Approach

Harlan Carvey recently blogged about approaches to conducting timeline analysis:

"So, anyway...I've been thinking about some of the things that I put into pretty much all of my timeline analysis presentations.  When it comes to creating timelines, IMHO there are essentially two "camps", or approaches.  One is what I call the "kitchen sink" approach, which is basically, "Give me everything and let me do the analysis."  The other is what I call the "layered" or "overlay" approach, in which the analyst is familiar with the system being analyzed and adds successive "layers" to the timeline.  When I had a chance to chat with Chad Tilbury at PFIC 2011, he recommended a hybrid of the two approaches...get everything, and then view the data a layer at a time, using something he referred to as a "zoom" capability.  This is something I think is completely within reach...but I digress."

I very much agree with the various approaches outlined above and their respective descriptions. Well put, Harlan and Chad Tilbury.

Over the years I have observed the traditional "kitchen sink" approach evolve into a "layered - overlay" approach. Fundamentally, these approaches have been the building blocks of timeline analysis. Harlan, Rob Lee, and The Sleuth Kit have been primary drivers of this transformation with contributions such as "", "mac_daddy", and "fls". These contributions have allowed us to take the "kitchen sink", an entire hard drive image, and break it up into different "layers", each layer representing a specific artifact type such as registry or file system.

What I appreciate about the "layered - overlay" approach is that it is an effective method of "removing the noise". This is my way of saying: home in on specific areas of interest. In contrast, the "kitchen sink" approach can result in overwhelming volumes of data that can easily lead to distraction.

For example, if I'm only interested in reviewing USB connections, there are only a few specific "data points" that I need to look at. As such, I would apply only the relevant layers to my timeline (i.e. registry, setupapi.log) to identify the connections. Then, if needed, I could double check my results by adding a third layer to the timeline, ".evtx" files (in Win7, the System event log records USB connections), which should essentially overlay my existing USB connections and confirm my results.

Perhaps I then want to see whether any ".lnk" files were created on the hard drive image, showing files being accessed from the USB device around the date/time of a USB connection. A fourth layer, file system activity, could then be added to the timeline for review and quickly filtered by ".lnk" files. In summary, this process of building a timeline one layer at a time is the concept of the "layered - overlay" approach.
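To make the layering idea concrete, here is a minimal Python sketch of the USB example. It is not how any particular tool works; the `Event` class, the layer labels, and the sample events are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    when: datetime
    layer: str   # hypothetical label for the artifact type
    detail: str


def overlay(events, layers):
    """Return only the events from the chosen layers, in time order."""
    return sorted((e for e in events if e.layer in layers), key=lambda e: e.when)


# Made-up sample events, for illustration only.
events = [
    Event(datetime(2012, 1, 10, 9, 0, 5), "filesystem", "C:/Users/bob/Recent/report.lnk"),
    Event(datetime(2012, 1, 10, 9, 0, 0), "registry", "USBSTOR key updated"),
    Event(datetime(2012, 1, 10, 9, 0, 1), "setupapi", "device install logged"),
    Event(datetime(2012, 1, 10, 9, 0, 2), "evtx", "USB device connection event"),
    Event(datetime(2012, 1, 10, 9, 0, 7), "filesystem", "C:/Users/bob/notes.txt"),
]

# Layers one and two: identify the USB connections.
usb_view = overlay(events, {"registry", "setupapi"})
# Layer three: overlay the event logs to confirm.
confirmed = overlay(events, {"registry", "setupapi", "evtx"})
# Layer four: file system activity, filtered down to ".lnk" files.
lnk_files = [e for e in overlay(events, {"filesystem"})
             if e.detail.lower().endswith(".lnk")]

for e in confirmed + lnk_files:
    print(e.when.isoformat(), e.layer, e.detail)
```

Each call to `overlay` starts from the full set of events, so adding or dropping a layer never throws data away; it only changes what is in view.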

Adobe Photoshop (a graphic design application) is a good example of putting this concept to use. For anyone not familiar with the product, multiple layers are used to represent and control each part of an image: background, shading/coloring, objects, etc. All of the individual layers merged together (overlaid) make up the "entire picture."

However, as Harlan alluded to, not using the "kitchen sink" approach will dilute visibility into the context of specific artifacts, limiting your analysis to specific layers instead of looking at the "entire picture":

"the more data we have, the more context there is likely to be available.  After all, a file modification can be pretty meaningless, in and of itself...but if you are able to see other events going on "nearby", you'll begin to see what events led up to and occurred immediately following the file modification."

So how does Dav Nads combine the best of the two approaches into one, the Hybrid Approach?

To my knowledge there's no "out of the box" or "push button" solution for this. It's a manual but comprehensive process of using multiple tools and applications. The process, like all processes should be, is constantly being redefined to adapt to technology and needs.

It all starts with owning a lot of real estate: 2 x 24" monitors :-) Having tall and wide monitors is key for any type of timeline analysis. It allows you to see more data (and context) at a glance and increases efficiency by reducing clicking n' scrolling.

I use one monitor to display the timeline output from log2timeline-sift in SPLUNK. This process is described in detail by Klein&Co. Why do I use SPLUNK to display my log2timeline-sift output?
  • Running log2timeline-sift on a 120 GB hard drive image can easily result in 2-3 GB of output, never mind running log2timeline on a 500 GB hard drive image. Microsoft Excel ain't going to work for reviewing all of that data. It has limitations, period.
  • Sure, you can use "l2t_process" to cull your log2timeline output by criteria such as date range, but this still does not guarantee that the result will be a manageable volume. It also takes away context and turns building the timeline into an iterative process if you need to adjust later on.
  • Most people know enough Python, SQL, GREP or PERL to be dangerous but not productive. Therefore, a GUI-based platform similar to Excel tends to be the preference when reviewing timeline data.
  • SPLUNK indexes timeline data, providing the ability to search, filter, and sort data on the fly. It's also scalable, in the sense that it's an enterprise tool designed to work with GBs of data. With the click of a button I can easily refine my timeline to only show certain data types. Note, DAV NADS does not work for SPLUNK, it's just the best solution I have found.
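
For those who do want to script it, the kind of date-range and data-type filter described in the bullets above can be sketched in a few lines of Python. This is neither SPLUNK nor l2t_process, just an illustration. The column names ("date", "time", "sourcetype", "desc"), the MM/DD/YYYY date format, and the sample rows are assumptions for the example; check them against your own output before relying on anything like this.

```python
import csv
import io
from datetime import datetime


def filter_rows(lines, start, end, sourcetypes=None):
    """Yield rows inside [start, end], optionally limited to some source types.

    Rows are streamed one at a time, so the whole file never sits in memory.
    """
    for row in csv.DictReader(lines):
        # Assumed layout: separate "date" (MM/DD/YYYY) and "time" columns.
        stamp = datetime.strptime(row["date"] + " " + row["time"],
                                  "%m/%d/%Y %H:%M:%S")
        if not (start <= stamp <= end):
            continue
        if sourcetypes is not None and row["sourcetype"] not in sourcetypes:
            continue
        yield row


# Made-up sample standing in for a multi-GB CSV file on disk.
sample = io.StringIO(
    "date,time,sourcetype,desc\n"
    "01/09/2012,23:59:59,filesystem,old.txt modified\n"
    "01/10/2012,09:00:00,registry,USBSTOR key updated\n"
    "01/10/2012,09:00:05,filesystem,report.lnk created\n"
    "01/11/2012,08:00:00,registry,unrelated key updated\n"
)

hits = list(filter_rows(sample,
                        start=datetime(2012, 1, 10, 0, 0, 0),
                        end=datetime(2012, 1, 10, 23, 59, 59),
                        sourcetypes={"registry"}))
for row in hits:
    print(row["date"], row["time"], row["sourcetype"], row["desc"])
```

With a real file you would pass an open file handle instead of the `io.StringIO` sample. Note that every change of criteria means another full pass over the data, which is exactly the iteration cost that an indexed tool avoids.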

Harlan raises an excellent point, "That leads me to this question...if you're running a tool that someone else designed and put together, and you're just pushing a button or launching a command, how do you know that the tool got everything?  How do you know that what you're looking at in the output of the tool is, in fact, everything?"

If I were to rely solely on the output of log2timeline, with SPLUNK as the review tool, for my analysis, that would be an issue for two reasons:

First, let's be honest: regardless of what tool is used (commercial or open source), they all have, or had at one point, BUGS. As recently as a week ago, a bug in log2timeline was identified on the win4n6 list and subsequently fixed.

Secondly, timelines are what I like to refer to as skeletons. They do not show you the meat on the bones. Reviewing timeline data may reveal that "Top Secret - Receipt for Coke.docx" was created and opened. However, the limitation of timeline data is that you can't view the document. That's when the second monitor comes into the picture...

I use the second monitor to display the hard drive image in a forensic tool (EnCase, FTK, etc.). This allows me to take a look at "Top Secret - Receipt for Coke.docx" and see that it's just a document that discusses how Coke's secret formula is now on exhibit at the World of Coca-Cola in Atlanta! It also allows me to potentially see anything in the context of this event that is not displayed in my timeline as a layer.

Leveraging a second tool simultaneously to view the data from a different perspective also allows me to double check and verify findings. For instance, if my timeline data shows that how_to_kill_the_dog.doc was created on January 1, 2013, I can quickly check whether my forensic tool shows the same thing, or whether this is an odd anomaly and potentially an issue with my timeline.
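
I do this check by eye across the two monitors, but the idea is simple enough to sketch in Python. The function name, the one-second tolerance, and the sample values below are hypothetical, and the sketch assumes you have already pulled creation times out of both tools into plain dictionaries.

```python
from datetime import datetime, timedelta


def find_mismatches(timeline, tool_view, tolerance=timedelta(seconds=1)):
    """Return {path: (timeline_time, tool_time)} where the two sources disagree."""
    mismatches = {}
    for path, timeline_time in timeline.items():
        tool_time = tool_view.get(path)
        if tool_time is None:
            continue  # not present in the tool's view; that is a separate question
        if abs(timeline_time - tool_time) > tolerance:
            mismatches[path] = (timeline_time, tool_time)
    return mismatches


# Made-up creation times, for illustration only.
timeline = {
    "how_to_kill_the_dog.doc": datetime(2013, 1, 1, 0, 0, 0),
    "notes.txt": datetime(2012, 1, 5, 12, 0, 0),
}
tool_view = {
    "how_to_kill_the_dog.doc": datetime(2012, 1, 1, 0, 0, 0),
    "notes.txt": datetime(2012, 1, 5, 12, 0, 0),
}

mismatches = find_mismatches(timeline, tool_view)
for path, (timeline_time, tool_time) in mismatches.items():
    print(path, "timeline:", timeline_time, "tool:", tool_time)
```

A mismatch does not tell you which source is wrong, only that something needs a closer look.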

From my experience, the Hybrid timeline analysis approach is really about finding synergy between the "kitchen sink" and "layered - overlay" approaches. The important thing to understand in order to successfully deploy this approach is the strengths and weaknesses of the tools you use. For instance, identify where the timeline data (output from log2timeline or wherever) may only contain X while the full disk image contains Z, and put a process in place to fill these gaps. This allows you to develop a Hybrid approach, like the one I described above, that fits your needs.
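
As a rough illustration of spotting those gaps, the Python sketch below compares the file paths that made it into the timeline against the paths visible in the full image. The function name and the sample paths are made up, and it assumes you can export a path listing from each tool.

```python
def coverage_gaps(timeline_paths, image_paths):
    """Compare two path listings and report what each side is missing."""
    timeline_set = set(timeline_paths)
    image_set = set(image_paths)
    return {
        "missing_from_timeline": sorted(image_set - timeline_set),
        "missing_from_image": sorted(timeline_set - image_set),
    }


# Made-up listings, for illustration only.
timeline_paths = ["/Users/bob/report.docx", "/Users/bob/notes.txt"]
image_paths = ["/Users/bob/report.docx", "/Users/bob/notes.txt",
               "/Users/bob/secret.zip"]

gaps = coverage_gaps(timeline_paths, image_paths)
print(gaps["missing_from_timeline"])
print(gaps["missing_from_image"])
```

Anything listed as missing from the timeline is a candidate for a closer look in the forensic tool, or for a new layer.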

- DAV NADS,  tweetin' @DAVNADS tweet at me cyber girls!

Wednesday, January 4, 2012

Thank you to all of my #DFIR followers. Hope everyone had a great New Year's. Let 2012 bring many dongles, matching hashes, and cold blowing CPU fans to everyone!