Finding quality tools is tough, particularly if you’re an independent practitioner or a small company. One tool at $1,000 to $2,500 is affordable, but we need an entire toolbox full of tools and they’re all trending towards $1,000 and 20% per year maintenance. Fill a toolbox with twenty of them and you’re out $20,000 up front and then $4,000 per year to stay current. OSS and free tools are awfully welcome.
Thankfully, if you’re a US citizen, your tax dollars paid for the development of an OS X forensics tool called MEGA. (paper) Quoting from the paper: “This project was supported by Award No. 2007-DN-BX-K020 awarded by the National Institute of Justice….” Very cool, right? Alas, MEGA morphed into Mac Marshal and went commercial. (And when did this happen? The MEGA paper includes screenshots of the tool with the label “Mac Marshal” rather than “MEGA”.)
So go to the Mac Marshal web site where you find:
“Because of a special arrangement with the U.S. National Institute of Justice, Mac Marshal is available free of charge to U.S. Law Enforcement personnel. If you qualify, please use the instructions below.
Mac Marshal is available for purchase by the private sector, and law enforcement agencies outside of the United States, from Cyber Security Technologies.”
So, if you’re in law enforcement, you can get a copy of it for free. If you’re not LE, you get to pay $995 to Cyber Security Technologies for it. (order form)
Wait, didn’t I already pay for at least some of this tool through my tax dollars? I can see a private developer deciding to give their product away for free to LE, and corporations discounting the product to the government on GSA schedules. But in this case, the tool was developed using US tax dollars, and the price to the public isn’t just recovering costs, it is making a substantial profit.
It gets more interesting….
I got onto this because I was working on vfcrack (Google Code link, OpenCiphers link), a tool to brute force the encryption on DMGs. It’s a bit out of date, and I thought I’d bring it up to speed. Turns out that this has already been done – as part of Mac Marshal.
“Mac Marshal also include a modified version of vfcrack , which enables fast dictionary-based brute-force password cracking of FileVault sparseimage and sparsebundle images, as well as other encrypted Apple disk image formats (the original distribution of vfcrack does not support sparseimage and sparsebundle images).” (citation)
So there is open source code in Mac Marshal that may have been updated at the taxpayer’s expense but has not been returned to the public domain. The vfcrack license doesn’t explicitly prohibit this, but the Mac Marshal developer’s refusal to put the updated code back in the public domain certainly seems to be bad form.
A couple of suggestions if you accept tax dollars to support the development of your tools:
- Price the resulting product so that the independent practitioner can afford to buy it without having to really think about it too much. A range of $200 – $300 I can see, but $995 is getting greedy. $200 covers distribution costs, the web site, answering questions, and the like.
- If you use open source code in your tool and update it, put the updated code back in the public domain for the rest of us to use. It costs you nothing to do so, it earns you good will, and we (the taxpayers) paid for some of that development.
- Remember that we are all working for the public, not just law enforcement. These tools are obviously used in civil matters, civil matters involving the same taxpayers.
And suggestions to tool vendors in general:
- Price your tools so they are affordable. We (small companies) aren’t going to drop $1,000 on a tool without thinking about it, much less $2,500 or $5,000. My gut (biased, I’ll admit) says that if some vendors dropped their prices significantly, they’d get a boost in sales that covers the decreased per-unit profit, and they’d get their product into more people’s hands, which would lead to more sales. (Or am I being idealistic?)
- Don’t discount the influence of someone who appears “small”. Many of us have clients in larger firms, and all of us talk (a lot) amongst ourselves. Check the CCE and HTCIA lists, look at Forensic Focus, go to the forensics conferences and talk to the smaller companies.
- Invest in the long term. The small customers you win over now, and who you help do better work so they can be more profitable in the future, will be your beta testers, promoters, and recurring customers in the future.
None of what I describe here is against the letter, or even the spirit, of the law. It probably even falls under “good business practices”. But charging a premium for a tool that was funded in part by US tax dollars, and taking public domain code without returning the changes to the public, seems contrary to the spirit of the digital forensics community.
I finally figured out how to build a standalone executable after an Alice in Wonderland run through redistributable libraries, py2exe, and Windows installers. There are still some issues, but it works well for the most part. Check the Download section on www.integriography.com.
Some tools that helped me turn a Python script into something that can run on any (most?) Windows systems are:
- py2exe – http://www.py2exe.org/ – Read the Tutorial page for some really good help with the .dlls
- Dependency Walker – http://dependencywalker.com/ – A great tool for determining what modules your application depends on
- Inno Setup – http://www.jrsoftware.org/isinfo.php – A very simple yet powerful tool to build installation packages
At the request of Harlan Carvey and Rob Lee I made some changes to analyzeMFT and fixed a few bugs along the way.
- Version 1.1: Split parent folder reference and sequence into two fields. I’m still trying to figure out the significance of the parent folder sequence number, but I’m convinced that what some documentation refers to as the parent folder record number is really two values – the parent folder record number and the parent folder sequence number.
- Version 1.2:
- Fixed problem with non-printable characters in filenames. Any Unicode character is legal in a filename, including newlines. This presented some problems in my output. Characters that do not render well are now converted to hex and a note is added to the Notes column indicating this.
- Added “compile time” flag to turn off the inclusion of any GUI related modules and libraries for systems missing tk/tcl support. (Set noGUI to True in the code)
- Version 1.3: Added new column to hold log entries relating to each record. For example, a note stating that some characters in the filename were converted to hex as they could not be printed
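The Version 1.1 change is easier to see in code. Here is a minimal sketch of the split, assuming the usual NTFS layout for an 8-byte file reference: the low 48 bits hold the record number and the high 16 bits hold the sequence number. The function name is made up for illustration and this is not necessarily how analyzeMFT does it.

```python
import struct


def split_file_reference(raw):
    """Split an 8-byte NTFS file reference into (record number, sequence number)."""
    if len(raw) != 8:
        raise ValueError("an NTFS file reference is exactly 8 bytes")
    # The reference is stored as one little-endian 64-bit value.
    (value,) = struct.unpack("<Q", raw)
    record_number = value & 0x0000FFFFFFFFFFFF  # low 48 bits
    sequence_number = value >> 48               # high 16 bits
    return record_number, sequence_number


# Hypothetical parent reference: record 5 with sequence number 5.
example = bytes.fromhex("0500000000000500")
print(split_file_reference(example))
```

Treating the two values as separate columns keeps a reused record from looking like the same parent folder.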
The code and more details are available at www.integriography.com
Quick note on $MFT sequence numbers:
Microsoft tells us that each record in the $MFT has a FILE_RECORD_SEGMENT_HEADER Structure. Within this structure is a sequence number, defined as follows:
“This value is incremented each time that a file record segment is freed; it is 0 if the segment is not used.”
Ok, that’s pretty straightforward. At least until you look at the first 16 entries in any $MFT, as all of their sequence numbers match their record number. I’ve been told that since these files can never be deleted, repurposing the sequence number adds an additional sanity check and disaster recovery option. However, I’ve found one volume where this behavior continues for 12,000 records or more. Still looking into that one.
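Checking this on your own volumes is easy to script. A rough sketch follows, with these assumptions: records are 1,024 bytes, the sequence number sits at offset 0x10 of the record header, and on newer volumes the record number sits at offset 0x2C (which I detect by the update sequence array starting at 0x30 or later). Those offsets come from the Linux NTFS documentation rather than from Microsoft. The function names are mine, purely for illustration.

```python
import struct


def read_record_header(record):
    """Return (sequence_number, record_number) from one raw $MFT record.

    record_number is None when the header uses the older layout that
    does not store it.
    """
    if record[0:4] != b"FILE":
        raise ValueError("not a FILE record (bad signature)")
    (usa_offset,) = struct.unpack_from("<H", record, 0x04)
    (sequence_number,) = struct.unpack_from("<H", record, 0x10)
    record_number = None
    if usa_offset >= 0x30:  # assumed marker of the newer header layout
        (record_number,) = struct.unpack_from("<I", record, 0x2C)
    return sequence_number, record_number


def records_where_sequence_matches(mft_bytes, record_size=1024):
    """Yield the index of each record whose sequence number equals its index."""
    for index in range(len(mft_bytes) // record_size):
        record = mft_bytes[index * record_size:(index + 1) * record_size]
        if record[0:4] != b"FILE":
            continue  # skip unused or damaged slots
        sequence_number, _ = read_record_header(record)
        if sequence_number == index:
            yield index
```

Feed the second function the bytes of an extracted $MFT and count how far the matches run before they stop.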
One of the best sources for NTFS documentation isn’t Microsoft, it comes from the Linux NTFS developers and is available here.
It is considered very good practice to make two copies of any image collected, particularly in the field. On one very long collection trip we did this by collecting to one set of drives during the day and running Robocopy overnight to duplicate the image set. FTK allows writing to two destinations, and the various versions of dd have always allowed this via one means or another. But these all require either time or precious IO bandwidth.
So, I thought, is there any way to create two images in real time without pushing the data down the pipe twice? Isn’t that what RAID1 is supposed to provide? But, are two drives in a hardware RAID 1 *really* identical? Turns out that, at least in my test case, they are.
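For reference, the software approach looks roughly like this. It is only a sketch of the idea, not how FTK or dd do it: the function name and the choice of SHA-256 are my assumptions, and there is no bad-sector handling, so don't treat it as an imaging tool. The source is read once, but every chunk is still written twice by the host, which is where the IO bandwidth goes.

```python
import hashlib


def image_to_two_destinations(source_path, dest_a, dest_b, chunk_size=1024 * 1024):
    """Copy source_path to two destinations in one pass; return the source's SHA-256."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, \
            open(dest_a, "wb") as out_a, \
            open(dest_b, "wb") as out_b:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)  # hash the data as it is read
            out_a.write(chunk)    # first copy
            out_b.write(chunk)    # second copy, same chunk, second write
    return digest.hexdigest()
```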
I bought a vAGE220-SAU two drive, USB 2.0/eSATA, RAID0/1 external enclosure. ($275 @ Amazon.) It’s fairly well constructed, compact, and easy to use. The instructions weren’t clearly translated but were sufficient unto the task. Once I flipped the dip switches correctly and waited a few hours for it to do the initial mirroring, I was good to go.
I hooked my source drive up to one port on my field laptop’s eSATA card and the RAID enclosure up to the other one. Fired off FTK (but dd, or EnCase, or whatever would have done just as well.) Imaged the drive and it ran at near expected speeds. The process finished and the image was verified.
Now the test. I pulled both drives and hashed them via a writeblocker. The hashes matched. I had two identical, forensically sound, images of my source drive. This required less time than imaging to two destinations using the hardware available on my field laptop, and a lot less time than running a copy overnight.
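The comparison itself is simple enough to script. Here is a minimal sketch, assuming each drive (or an image of it) is readable as a file path behind the writeblocker. The function names and the SHA-256 default are my choices for illustration, so swap in whatever algorithm your SOPs call for.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file or device in chunks so large images never sit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(path_a, path_b, algorithm="sha256"):
    """Return True when both copies produce the same digest."""
    return hash_file(path_a, algorithm) == hash_file(path_b, algorithm)
```

Record both digests in your notes rather than just the True/False result, since the values are what go in the report.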
I need to try this a few more times and do some more performance measurements, but I’m pretty happy with the outcome. I wish there was a drop in drive dock with RAID1 capability. That would eliminate the need to open the enclosure up when changing disks.
Four elements combined last week to inspire me to write a tool to deconstruct the Windows NTFS $MFT file:
- I’ve been wanting to learn Python for quite a while. (I found a “Learning Python” book on my shelf published in 1999.)
- Mark Menz’s MFT Ripper started me wondering about the significance of the MFT sequence number.
- I’d been trying to get through the SANS 508.1 book but couldn’t bear to read about NTFS structures yet again.
- I wanted to start building a framework for doing more detailed timeline analysis.
So, last week I sat down and wrote analyzeMFT.py. Please keep in mind that this is a novice Python programmer’s code and is definitely a work in progress. A simple project page and a link to the source can be found here.
If you have any comments, suggestions, or improvements, please do let me know. I’d like to keep building on this and making it as useful as possible.
I bought the Wiebetech USB write blocker (USBWB) months ago as it seemed like something that would be really useful to own. Then I got busy and it sat until someone asked me about doing a review. I pulled the unit out and immediately attached it to a WD 320GB Passport. No go, not enough power. Wiebetech very promptly sent me a very well constructed dual USB cable to provide power and connectivity. Good to go, or so I thought.
I really want this thing to work, and it just doesn’t.
- It doesn’t work at all with OS X. (Correction: It works once, but subsequent drives are not recognized.)
- The drivers failed to install on Windows 7. It looks like this can be addressed with a certain amount of “reset and reboot”. (Rebooting and reinstalling cleared the problem. It works with Windows 7 32bit and 64bit.)
- It requires a lot of “fiddling” to get it working on XP. Sometimes you need to hit the reset button, sometimes the order that you plug things in matters, and all too often you need to reboot your system.
- It doesn’t work with EnCase at all. EnCase locks up trying to read the first two sectors of the device. If you hit the reset button, EnCase returns an error and drops back to the GUI so it isn’t completely locked up, but you cannot add any evidence attached to the write blocker. X-Ways, FTK Imager, and FTK all work, taking the “fiddling” into account. (This is an EnCase v6.15 problem. If you turn off “Detect FastBloc”, EnCase v6.14 and .15 recognize the device.)
I tried it with FAT and NTFS formatted thumb drives as well as the WD 320GB Passport – same results.
The packaging is great and the concept is good but it simply doesn’t work consistently enough or with enough operating systems to make it worth adding to my kit.
Since posting this, I’ve spoken with several other users. Some have experiences very similar to mine, some haven’t had any issues at all. I’m also talking with CRU-DataPort and they’re actively working through identifying and fixing the problems. I’ll hang on to my unit and hope they get these problems sorted out soon.
- The USBWB apparently works with versions of EnCase prior to V6.14.
- Disabling “Detect FastBloc” in the later versions of EnCase allows EnCase to successfully recognize the USBWB attached drive.
- USBWB doesn’t work on OS X 10.4, does work on 10.5, and hasn’t been tested on 10.6. It seems to work once on 10.6 but not with subsequent drives, at least not without a reboot.
- USBWB works on Windows 7 64bit after a reboot. Drivers fail to install, reboot, drivers install.
In several Guidance classes, I’ve heard fellow students ask “Can you suggest a standard workflow for using EnCase?” The exact workflow will vary from case to case, but I’ve put together one possible workflow with some help from other contributors to the Forensic Focus forum. Please bear in mind that this is a guideline, a suggestion, just one possible way to work through a case using EnCase. You should clearly understand what each of these steps entails and adjust the workflow to suit your style, your written processes, and the case you are working on.
- Create case – Ensure that you have all relevant information – custodians, clients, case name, etc.
- Change storage paths as appropriate. I set everything to go to a volume or folder dedicated to the case.
- Save All.
- Add evidence – E01, LEFs, loose files, etc. Each time you add evidence, you should consider rerunning several of the following steps.
- Confirm disk geometry, sector count, partitions. You’re checking to see if everything is accounted for. There may be hidden partitions, for example.
- Run Partition Finder if indicated
- Run Recover Deleted Folders
- Search case – hash and signature analysis. You will probably repeat this each time you add new evidence.
- Run File Mounter – recursive, not persistent, create LEF, add LEF to case.
- Run Case Processor -> File Finder. Export results, add back in as LEF.
- Search case – hash and signature analysis.
- Search for encrypted or protected files. Address as appropriate.
- Extract registry hives. This can happen at any point really and they’ll be fed to RegRipper.
- Index case.
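If the signature analysis step is a black box to you, the idea is just to compare a file’s leading bytes against what its extension claims. The sketch below illustrates that concept and is not EnCase’s implementation. The tiny signature table, the function name, and the three result labels are my assumptions, and a real table is far larger.

```python
import os

# A few well-known header signatures and the extensions that normally carry them.
KNOWN_SIGNATURES = {
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
    b"\x89PNG\r\n\x1a\n": {".png"},
    b"%PDF": {".pdf"},
    b"PK\x03\x04": {".zip", ".docx", ".xlsx", ".pptx"},
}


def check_signature(name, header):
    """Classify a file as 'match', 'mismatch', or 'unknown' from its first bytes."""
    extension = os.path.splitext(name)[1].lower()
    for magic, extensions in KNOWN_SIGNATURES.items():
        if header.startswith(magic):
            return "match" if extension in extensions else "mismatch"
    return "unknown"  # header is not in our table


# A JPEG header hiding behind a .txt extension.
print(check_signature("notes.txt", b"\xff\xd8\xff\xe0\x00\x10JFIF"))
```

A mismatch is only a lead. You still need to open the file and work out why the extension was changed.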
Depending on the case:
- Analyze LNK files and INFO2 records
- Extract browser history and carve browser history from unallocated
- Parse the event logs into a CSV format.
Other tasks performed outside of EnCase:
- Mount image and scan for viruses. Use several different products and never assume that they’re 100% accurate.
- Mount image and run triage tool(s) against it
- Run image in LiveView or VFC to see system as user experienced it
- Run RegRipper and RipXP against registry hives
- Run MFT Ripper against an extracted MFT
- etc, etc, etc
An article on PoliceProfessional.Com (original article has vanished and been replaced with new content) contains the following statement:
“ACPO is currently working on a new software tool that will allow forensic officers to operate locally and uncover information almost instantaneously. “What we’re very keen on doing is looking for a forensic triage tool that police officers or forensic officers can use locally. One that is quite simple, one they can ask questions of, such as, ‘in this computer is there the following…?’,” said Ms Williams. “The triage tool can pull that out for them.” She said the current backlog is one of e-crime’s biggest problems and that ACPO is close to identifying the right product to handle it.”
[Note: I've been told that the ACPO is looking to the vendor community for this solution. Rereading this quote, I suspect I should focus less on "working on new software" and focus more on "identifying the right product". I'll leave the post as originally written but will insert commentary.]
The apparent expectation that a tool will significantly address the backlog is rather disturbing for three reasons:
1) The tool will not provide context. It may indicate the presence of an encrypted file container on the system but cannot determine its contents. Or that file sharing is present, but not what it was used for. Or that seven different chat programs are in use, but not the information going through them. As several people have pointed out, these push button forensics (PBF) tools will get the low hanging fruit and gather disparate facts but cannot do any analysis to show relationships, or lack thereof. Further, we’ll need to err on the conservative side and may well end up with a lot of false positives.
2) Technology, and the criminal’s use of technology, advances rapidly, often more rapidly than the tools. This is why DriveProphet’s author is very willing to add new capabilities as issues are reported to him. It is why Digital Detective Group’s Blade product has plug in modules that they can develop and release as new capability is required. Keeping a triage tool current requires ongoing investment by the developer and ongoing training for the users. A one time investment in the technology and training will quickly lead to a situation where the triage tool is missing relevant information. [Note: ACPO's looking to a vendor solution should address the support issue. Keep in mind maintenance costs when investing in a tool. Some vendors charge upwards of 20% of the initial investment each year for maintenance.]
3) I’ve not seen any well researched study on the LE computer forensics backlog that we can use to determine where resources should be spent. The ACPO and others believe that the backlog is in the triage stage. This appears to be valid, particularly for getting evidence back to the owners, but I suspect that “fixing” the triage stage will simply move the backlog further downstream, even more so if the number of false positives is high.
I also wonder why the ACPO is working on a new tool rather than working with a vendor of an existing tool to tune it to their particular needs. A number of good, well supported, triage tools already exist – DriveProphet, Blade, EnCase Portable, e-fense’s suite (now AccessData’s?), to name a few. The ACPO money might be better spent creating a fund to provide training on these existing tools rather than bringing another tool to an already crowded market. [Note: This point is moot given the feedback I received, noted above.]
Triage is an incredibly valuable process, particularly in time critical situations where limited resources are available. Triage, in the medical environment, is performed by trained specialists using diagnostic tools. Computer forensics triage tools often are designed to be used by anyone with minimal training. Witness the Microsoft press release about COFEE – “According to a Microsoft spokesperson ‘an officer with even minimal computer experience can be tutored—in less than 10 minutes—to use a pre-configured COFEE device.'” I believe there is value in this sort of tool when used as part of a well designed forensics process. I fear that, due to vendor marketing, budget issues, and backlog pressures, these tools will be deployed without the necessary framework to properly support them.
Allow me to close with some questions:
- Why is the ACPO creating a new tool rather than using an existing one? [Note: Addressed by feedback, noted above.]
- Who will use these triage tools and how much training will they get? If they’re designed for lab use to address the backlog will they stay in the lab? Can they safely be deployed earlier in the process?
- Are there any well documented studies on the LE computer forensics backlog?
- What other options are available for addressing the backlog? Anyone who knows me also knows that I’m very interested in finding ways for the private sector to assist LE with computer forensics and this would be one option.
My post about the value of push button forensics produced a number of interesting comments for which I am quite thankful. A common thread in many of the remarks was that someone needs to understand the science, logic, and art behind the PBF tools. I absolutely agree. Anyone depending on a technician and a tool alone is doing a disservice to their clients, and will likely fail spectacularly in court.
As one reader put it:
“I think the point that is being missed is this – at the end of the day the goal is to produce admissible evidence. The fact remains that our system generally looks to an expert to introduce digital evidence into court. “
Harlan made a similar comment, and really got to the heart of the matter:
“The fact is that the questions being asked by customers…was data exfiltrated, did the malware make data exfiltration possible, etc…cannot be answered by a $50/hr “analyst” with a dongle. This approach will work for low hanging fruit, but even a relatively unsophisticated compromise will be improperly and incompletely investigated in this sort of environment.”
A $50/hour analyst with a PBF dongle should not testify in court and their findings alone should not be presented to a client as they lack context and perspective. Their results are only pieces of the larger construct, a construct that should be built and signed off on by people with significantly more experience. A senior examiner can guide a team of less experienced staff using a wide variety of tools, interpret and combine the results into a well constructed report, and sign off on the team’s work product.
Law firms and private investigation firms are but two of many examples of organizations that employ associates to perform many of the simpler tasks involved in preparing cases. Doing so distributes the workload, frees senior staff up for more complex tasks, provides associates with opportunities to learn on the job under the supervision of senior staff, and ensures that work product is reviewed and approved by someone in the firm who is responsible for presenting the case to the court or to the client. The same can hold true in a computer forensics firm, lab, or department. In fact, any firm with more than a few examiners needs to operate in this manner simply for coordination and responsibility purposes. I’m just proposing that the same structure works well to mitigate the risks of using push button forensics.
We build everything from airplanes to software applications to roads out of component parts that are designed to accomplish a specific task but that, standing on their own, have little value. Organizations work in a similar manner, utilizing human components along with their associated skills and tools to streamline many processes and produce better results than one person standing alone could accomplish. Integrate PBF tools and less experienced people into your organization, manage them appropriately, validate the tools, review the results, and let the senior examiners do the heavy lifting with the complex problems, clients, and courts.
Also, I suspect that if most people looked around their organization, they’d see technicians using push button tools as part of the computer forensic process already. Do you have Voom Hard Copy II or a Talon or one of the other hardware imaging solutions? How many button presses does it take to image a drive, and who is usually pushing those buttons? Do you really believe that you’ll need to explain to a client or a court how the Talon creates an E01 image? Your report will say “Imaged the suspect’s drive with a Talon, serial number XXXXX. The hash values reported by the Talon were XXXX and they matched. The Talon was certified to be operating normally during our regular maintenance, conducted per our SOPs.” It is pretty likely that the imaging was performed by a technician, as was the regularly scheduled testing.
Push button forensics tools are here to stay and they’re already in use in most of our organizations. There clearly are risks to using PBF and inexperienced examiners inappropriately but through sound business practices they can safely contribute to our projects and improve our efficiency in the process.