Archive
analyzeMFT – ADS support added
The latest version of analyzeMFT is available on GitHub. I’ve not pushed it out to PyPI and will hold off until I’m sure this new work is free of bugs. The changes are:
Fixed parsing and printing of UTF-16 strings, removed unicodeHack stuff.
My original code took a brute force approach to parsing file names from the MFT records. What I did not know at the time was that they were UTF-16. While working on other things, I took the time to figure that out and replaced about 20 lines of kludge with one line of code.
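To illustrate the idea (this is a minimal sketch, not the actual analyzeMFT code): NTFS stores the name in the $FILE_NAME attribute as UTF-16 little-endian, preceded by a one-byte length counted in characters and a one-byte namespace. The helper name `decode_fn_name` and the fake attribute body are made up for the example, and the snippet is written for Python 3.

```python
def decode_fn_name(fn_attr):
    """Decode the file name from raw $FILE_NAME attribute content (bytes)."""
    name_len = fn_attr[0x40]                 # length in UTF-16 code units, not bytes
    raw_name = fn_attr[0x42:0x42 + name_len * 2]
    return raw_name.decode("utf-16-le")      # one decode call instead of a byte-by-byte kludge


# Fake attribute body: 0x40 bytes of padding, name length, namespace, then the name.
name = "$MFT"
fake_attr = bytes(0x40) + bytes([len(name), 3]) + name.encode("utf-16-le")
decoded = decode_fn_name(fake_attr)
print(decoded)
```

The length byte counts characters, so the slice has to cover twice that many bytes.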
Fixed printing of unicode strings to output files.
While figuring out how to read UTF-16, I figured out how to write UTF-8.
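A rough sketch of the writing side, assuming Python 3 and a CSV output file (the file name and rows here are invented for the example, and this is not how analyzeMFT itself is written): open the output with an explicit encoding so non-ASCII file names survive regardless of the platform default.

```python
import csv
import os
import tempfile

# Hypothetical rows; the second one carries a non-ASCII file name.
rows = [["Filename", "Notes"], ["/docs/r\u00e9sum\u00e9.txt", "non-ASCII name"]]

path = os.path.join(tempfile.mkdtemp(), "mft-output.csv")

# Explicit encoding: do not rely on the locale's default codec.
with open(path, "w", encoding="utf-8", newline="") as handle:
    csv.writer(handle).writerows(rows)

# Read it back the same way to confirm the round trip.
with open(path, "r", encoding="utf-8", newline="") as handle:
    read_back = list(csv.reader(handle))
print(read_back[1][0])
```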
Added ADS support.
This is probably a work in progress but it seems to be working so I’ll push this out. Whenever analyzeMFT encounters a resident $DATA record, it stores a copy of the contents away for later use. If it encounters a named $DATA record, it does two things:
- A duplicate of the parent record is created and the filename is changed to be <parent filename>:<ADS filename>.
- All ADS records, parent and children, get a flag set in the new ADS column
So you might see:
/normal.txt | Normal file |
/file-w-ads.txt | Normal file with ADS |
/file-w-ads.txt:adsfile.txt | The ADS file attached to file-w-ads.txt |
/dir | Directory |
/dir:adsdir.txt | The ADS file attached to dir |
/file-w-large-ads.txt | Normal file with ADS |
/file-w-large-ads.txt:largeads.txt | The (non-resident) ADS file attached to file-w-large-ads.txt |
/file-w-2-ads.txt | Normal file with two ADS files |
/file-w-2-ads.txt:ads1.txt | The first ADS file attached to file-w-2-ads.txt |
/file-w-2-ads.txt:ads2.txt | The second ADS file attached to file-w-2-ads.txt |
Each record that either is an ADS file or has an ADS file attached would have a ‘Y’ in the ADS column.
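The duplication step described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the analyzeMFT implementation: the record is a plain dict, `expand_ads` is a hypothetical helper, and the `filename` and `ads` keys are invented for the example.

```python
import copy


def expand_ads(record, stream_names):
    """Return the parent row plus one duplicated row per named $DATA stream."""
    parent = copy.deepcopy(record)
    # The parent is flagged only if it actually has named streams.
    parent["ads"] = "Y" if stream_names else "N"
    rows = [parent]
    for stream in stream_names:
        child = copy.deepcopy(record)        # duplicate of the parent record
        child["filename"] = "{}:{}".format(record["filename"], stream)
        child["ads"] = "Y"
        rows.append(child)
    return rows


rows = expand_ads({"filename": "/file-w-2-ads.txt"}, ["ads1.txt", "ads2.txt"])
for row in rows:
    print(row["filename"], row["ads"])
```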
As always, please let me know if I broke anything….
Using analyzeMFT from other programs
Now that analyzeMFT is a package, it is much easier to use from other programs. Here’s a quick example.
from analyzemft import mft
input_file = open('MFT-short', 'rb')
options = mft.set_default_options()
raw_record = input_file.read(1024)
mft_record = {}
mft_record = mft.parse_record(raw_record, options)
print "\nRaw MFT record in analyzeMFT format"
print mft_record
csv_record = mft.mft_to_csv(mft_record, False)
print "\nMFT record in CSV format"
print csv_record
l2t_record = mft.mft_to_l2t(mft_record)
print "\nMFT record in L2T format"
print l2t_record
body_record = mft.mft_to_body(mft_record, options.bodyfull, options.bodystd)
print "\nMFT record in bodyfile format"
print body_record
This will produce:
Raw MFT record in analyzeMFT format
{‘f1’: ‘\x00\x00’, ‘seq’: 1, ‘lsn’: 4.365328012e-314, ‘attr_off’: 56, ‘bitmap’: True, ‘alloc_sizef’: 1024, ‘recordnum’: 0, ‘size’: 424, ‘upd_off’: 48, ‘filename’: ”, ‘upd_cnt’: 3, ‘base_seq’: 0, ‘fncnt’: 1, ‘link’: 1, ‘next_attrid’: 6, ‘data’: True, ‘base_ref’: 0, ‘magic’: 1162627398, (‘fn’, 0): {‘par_ref’: 5, ‘ctime’: <analyzemft.mftutils.WindowsTime instance at 0x107864d40>, ‘par_seq’: 5, ‘nlen’: 4, ‘flags’: 3e-323, ‘real_fsize’: 32686080, ‘mtime’: <analyzemft.mftutils.WindowsTime instance at 0x107864830>, ‘alloc_fsize’: 32686080, ‘nspace’: 3, ‘atime’: <analyzemft.mftutils.WindowsTime instance at 0x1078649e0>, ‘crtime’: <analyzemft.mftutils.WindowsTime instance at 0x107864368>, ‘name’: ‘$MFT’}, ‘notes’: ”, ‘si’: {‘maxver’: 0, ‘ver’: 0, ‘ctime’: <analyzemft.mftutils.WindowsTime instance at 0x1078648c0>, ‘class_id’: 0, ‘usn’: 0.0, ‘sec_id’: 256, ‘quota’: 0.0, ‘own_id’: 0, ‘mtime’: <analyzemft.mftutils.WindowsTime instance at 0x1078647e8>, ‘dos’: 6, ‘atime’: <analyzemft.mftutils.WindowsTime instance at 0x1078645a8>, ‘crtime’: <analyzemft.mftutils.WindowsTime instance at 0x10777ac20>}, ‘flags’: 1}
MFT record in CSV format
[0, ‘Good’, ‘Active’, ‘File’, ‘1’, ‘5’, ‘5’, ”, ‘2007-08-15 15:32:29.656248’, ‘2007-08-15 15:32:29.656248’, ‘2007-08-15 15:32:29.656248’, ‘2007-08-15 15:32:29.656248’, ‘2007-08-15 15:32:29.656248’, ‘2007-08-15 15:32:29.656248’, ‘2007-08-15 15:32:29.656248’, ‘2007-08-15 15:32:29.656248’, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ”, ‘True’, ‘False’, ‘True’, ‘False’, ‘False’, ‘False’, ‘True’, ‘False’, ‘False’, ‘True’, ‘False’, ‘False’, ‘False’, ‘False’, ‘False’, ”, ‘N’, ‘N’]
MFT record in L2T format
2007-08-15|15:32:29.656248|TZ|…B|FILE|NTFS $MFT|$FN […B] time|user|host||desc|version||1||format|extra
MFT record in bodyfile format
0|$MFT|0|0|0|0|32686080|1187191949|1187191949|1187191949|1187191949
Simple. Hand it a raw MFT record and then ask for the results to be produced in a string in one of three formats. (Hmm, I suppose I should support JSON, too.)
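If you want to consume the bodyfile string in another program, something like the following would work. It is a small Python 3 sketch that assumes the line follows the Sleuth Kit 3.x body layout (MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime); `parse_body_line` and the field labels are names chosen for the example.

```python
from datetime import datetime, timezone

BODY_FIELDS = ["md5", "name", "inode", "mode", "uid", "gid",
               "size", "atime", "mtime", "ctime", "crtime"]


def parse_body_line(line):
    """Split one bodyfile line into a dict with typed size and timestamps."""
    values = line.rstrip("\n").split("|")
    record = dict(zip(BODY_FIELDS, values))
    record["size"] = int(record["size"])
    for key in ("atime", "mtime", "ctime", "crtime"):
        # Bodyfile timestamps are whole seconds since the Unix epoch (UTC).
        record[key] = datetime.fromtimestamp(int(record[key]), tz=timezone.utc)
    return record


parsed = parse_body_line(
    "0|$MFT|0|0|0|0|32686080|1187191949|1187191949|1187191949|1187191949")
print(parsed["name"], parsed["size"], parsed["mtime"].isoformat())
```

Note that the bodyfile carries whole seconds only, so the microseconds visible in the CSV output are lost, and a naive split like this would mis-parse a file name that contains a pipe character.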
analyzeMFT now available via pip
[Ed Note: Please excuse the formatting. WordPress seems to be doing something funky.]
analyzeMFT just got two major, related upgrades:
- You can install it via PyPI
- It is now a well-behaved (?) package and can more easily be included in other programs.
PyPI:
pip install analyzeMFT
Alternatively:
git clone https://github.com/dkovar/analyzeMFT.git
cd analyzeMFT
python setup.py install
or, just run it from that directory.
The main program is now much simpler:
#!/usr/bin/python
try:
    from analyzemft import mftsession
except:
    from .analyzemft import mftsession

if __name__=="__main__":
    session = mftsession.MftSession()
    session.mft_options()
    session.open_files()
    session.process_mft_file()
    session.print_records()
The main program just opens a session, gets options, opens the files, processes the records, and prints the results. All of the records are available via:
session.mft[seqnum]
Where seqnum is the sequence number of the record you want to reference.
You should also be able to ask it to process a single record and return it in raw, bodyfile, L2T CSV, or normal CSV form. If this would be useful, let me know and I’ll document and confirm the process.
First steps in converting analyzeMFT to a Python module, plus improved error handling
I started rewriting analyzeMFT so that it can be loaded as a module and called from other programs. The primary reason is to enable including it in plaso, but perhaps other programs will find a need for it.
The work isn’t done yet, but it is usable as a standalone program still and it has some improved handling of corrupt MFT records so I decided to release it.
Quick install:
- git clone https://github.com/dkovar/analyzeMFT.git
- cd analyzeMFT/analyzemft
- python analyzeMFT
Once I finish the work I’ll also make a zip file available.
Notes:
- All output between the new and old version is identical except in cases where records are corrupt or incomplete. In those cases, the new output is more accurate.
- There is a lot of strangeness going on in MFT records. In restructuring analyzeMFT, I found a number of conditions that I failed to check for but which accidentally didn’t throw errors. For example, there are MFT records with no Standard Information attributes.
- Detection of Orphan records, my term, has been improved. Additional research is required to determine what causes them to occur.
- Processing time improved slightly
Improved bodyfile support
With more thanks to Jamie for the prompting, I’ve improved bodyfile support in the latest version of analyzeMFT.
- You can now specify just a bodyfile for output and do not need to create a normal output file as well.
- The real (not allocated) file size is included
- If you use the --bodypath option, it writes out the full path to the file rather than just the file name
- If you use the --bodystd option, it uses the STD_INFO timestamps rather than just the FN timestamps. I find STD_INFO to be more interesting….
This is a pretty significant fix and I would suggest upgrading if you create timelines with analyzeMFT.
Links:
Git: git clone https://github.com/dkovar/analyzeMFT.git
Code: https://github.com/dkovar/analyzeMFT/blob/master/analyzeMFT.py
Updated analyzeMFT – fixed MFT record number reporting
When I originally wrote analyzeMFT I assumed that the MFT record numbers would start at zero and politely increase by one for each record so “recordNumber = recordNumber + 1” would be valid. Happily, this worked, apparently for years. That is, until Jamie threw corrupted MFT files at it, such as MFT records extracted from memory.
- The sequence numbers had gaps
- If there was a gap, then the actual sequence number wouldn’t match the reported sequence number
- Determination of the file path might be off as the parent record number pulled from the entry might now point to the wrong entry
Oooops.
This has been fixed.
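The post doesn't say how the fix was made, so treat the following as one possible approach rather than the actual change: on NTFS 3.1 (Windows XP and later) each FILE record header stores its own record number as a 32-bit little-endian value at offset 0x2C, so it can be read directly instead of counted. The helper names and the fake records are made up for the example (Python 3).

```python
import struct

RECORD_SIZE = 1024


def record_number_from_header(raw_record):
    """Return the record number stored in the FILE record header, or None."""
    if len(raw_record) < 0x30 or raw_record[:4] != b"FILE":
        return None                          # corrupt or non-FILE record
    return struct.unpack_from("<I", raw_record, 0x2C)[0]


def fake_record(number):
    """Build a mostly empty record that carries only a magic and a record number."""
    header = b"FILE" + bytes(0x28) + struct.pack("<I", number)
    return header.ljust(RECORD_SIZE, b"\x00")


# Three records with a gap, as you might get from records carved out of memory.
stream = fake_record(0) + fake_record(1) + fake_record(5)
numbers = [record_number_from_header(stream[offset:offset + RECORD_SIZE])
           for offset in range(0, len(stream), RECORD_SIZE)]
print(numbers)
```

A simple counter would have reported 0, 1, 2 for the same input. Older NTFS versions do not carry this header field, so a fallback would still be needed there.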
I also fixed the handling of orphan files, those files that had a null parent or whose parent was a file.
This is a pretty significant fix and I would suggest upgrading.
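As a sketch of the orphan rule described above (null parent, or a parent that is a file), here is a simplified path builder in Python 3. The record layout, the `build_path` helper, and the "Orphan/" prefix are assumptions made for the example, not necessarily what analyzeMFT emits. Record 5 is the NTFS root directory.

```python
ROOT = 5  # NTFS root directory record number


def build_path(records, recordnum, max_depth=255):
    """Walk parent references toward the root; return (path, is_orphan)."""
    parts = []
    current = recordnum
    for _ in range(max_depth):
        rec = records[current]
        parts.append(rec["name"])
        parent_num = rec["parent"]
        if parent_num == ROOT:
            return "/" + "/".join(reversed(parts)), False
        parent = records.get(parent_num)
        # Null or missing parent, or a parent that is a file: treat as orphan.
        if parent is None or not parent["is_dir"]:
            return "Orphan/" + "/".join(reversed(parts)), True
        current = parent_num
    # Too deep, or a cycle in the parent references.
    return "Orphan/" + "/".join(reversed(parts)), True


records = {
    10: {"name": "dir", "parent": 5, "is_dir": True},
    11: {"name": "a.txt", "parent": 10, "is_dir": False},
    12: {"name": "lost.txt", "parent": None, "is_dir": False},
    13: {"name": "odd.txt", "parent": 11, "is_dir": False},
}
for num in (11, 12, 13):
    print(num, build_path(records, num))
```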
Links:
Git: git clone https://github.com/dkovar/analyzeMFT.git
Code: https://github.com/dkovar/analyzeMFT/blob/master/analyzeMFT.py
analyzeMFT has moved to GitHub
Just a quick note to say that analyzeMFT has moved to GitHub:
https://github.com/dkovar/analyzeMFT
I’ve got some other things in the works and was looking for a place that would allow me to neatly consolidate them all. The fact that GitHub allows for private and public repos in one account was a big selling point, but there are other factors.
analyzeMFT 2.0 released – OO’d!
Matt Sabourin created an object-oriented version of analyzeMFT.py. Most of the MFT analysis code and other logic was retained from the original version (along with the comments). The OO version is structured for importing the module directly into the python interpreter to allow for manual interaction with the MFT. The module can also be imported into other python scripts that need to work with an MFT.
Matt also added some new options, and the full list of options is now:
Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -f FILENAME, --filename=FILENAME
                        [Required] Name of the MFT file to process.
  -d, --debug           [Optional] Turn on debugging output.
  -p, --fullpath        [Optional] Print full paths in output (see comments in code).
  -n, --fntimes         [Optional] Use MAC times from FN attribute instead of SI attribute.
  -a, --anomaly         [Optional] Turn on anomaly detection.
  -b BODYFILE, --bodyfile=BODYFILE
                        [Optional] Write MAC information in mactimes format to this file.
  -m MOUNTPOINT, --mountpoint=MOUNTPOINT
                        [Optional] The mountpoint of the filesystem that held this MFT.
  -g, --gui             [Optional] Use GUI for file selection.
  -o OUTPUT, --output=OUTPUT
                        [Optional] Write analyzeMFT results to this file.
The project is now hosted on GitHub, here.
New home for analyzeMFT, now with current binary, source repo, downloads, issue tracker, and wiki
With thanks to Cory Altheide, analyzeMFT has a new home at:
http://code.google.com/p/opensourceforensics/
It is currently the only project there, but I will be adding a new project hopefully this week and others are encouraged to make this their home as well.
The site has all the bells and whistles required to support collaborative development of open source DFIR tools – a wiki, a Mercurial source code repository (and Mercurial really seems easier to grok than git), an issue tracker, and a download page for binaries and other packages.
As part of the move, I finally built a current binary using bb-freeze. (Hat tip to @bbaskin for the pointer to it.)
As mentioned elsewhere, I’m just starting on a new project to build a loose framework of dfir utilities and their supporting libraries in Python. The first release should go up on the site this week.
New version of analyzeMFT
I’ve been awfully busy with real work, but thanks to the gentle prodding of some interested parties, I updated analyzeMFT over the past few weeks.
- Version 1.5:
  - Fixed date/time reporting. I wasn’t reporting microseconds at all.
  - Added anomaly detection, with many thanks to Greg Kelley. Adds two columns:
    - std-fn-shift: If Y, entry’s FN create time is after the STD create time
    - usec-zero: If Y, entry’s STD create time’s usec value is zero
- Version 1.6: Various bug fixes
- Version 1.7: Bodyfile support, with thanks to Dave Hull
The anomaly detection isn’t perfect by any stretch of the imagination, it simply helps reduce the noise a bit.
- On the $MFT from a volume on a workstation with 110593 total records, checking for FN creation times greater than STD creation times resulted in 19649 flagged records. Pretty significant reduction.
- On the same file, checking to see if the STD creation time microseconds are zero resulted in 14571 flagged records.
- Turning both on resulted in 2157 flagged records. Most appear to be benign. (I hope they all are!)
That’s still 2157 (or 19649, or 14571) files that you need to check by other means, but it is a lot less than 110593.
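The two checks are simple enough to sketch. This is an illustration in Python 3 using plain `datetime` values, not the analyzeMFT code; `anomaly_flags` is a hypothetical helper that returns the values for the std-fn-shift and usec-zero columns.

```python
from datetime import datetime


def anomaly_flags(std_crtime, fn_crtime):
    """Return (std-fn-shift, usec-zero) as 'Y'/'N' strings."""
    # FN create time later than STD create time.
    std_fn_shift = "Y" if fn_crtime > std_crtime else "N"
    # STD create time with a zero microsecond component.
    usec_zero = "Y" if std_crtime.microsecond == 0 else "N"
    return std_fn_shift, usec_zero


normal = anomaly_flags(datetime(2007, 8, 15, 15, 32, 29, 656248),
                       datetime(2007, 8, 15, 15, 32, 29, 656248))
suspect = anomaly_flags(datetime(2005, 1, 1, 12, 0, 0, 0),
                        datetime(2007, 8, 15, 15, 32, 29, 656248))
print(normal, suspect)
```

A record is most interesting when both flags are 'Y', which is the combination that produced the 2157 records mentioned above.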
If there’s some feature you’d like to see in analyzeMFT, please, do drop me a note.
You can find the source and more details here….
There’s also a great post on how to install Python and run analyzeMFT’s source code here….