Source: http://blog.csdn.net/yeqihong/archive/2007/09/27/1803762.aspx http://www.securityfocus.com/infocus/1902/1

The basic recovery process

In this section we will go step by step through the data recovery process and describe the tools, and their options, in detail. We start by listing a directory.

[abe@abe-laptop test]$ ls -al
In the listing above we can see that there is a file named weimaraner1.jpg in the test directory. This is a picture of my dog. I don't want to delete it. I like my dog.

[abe@abe-laptop test]$ rm -f *

Here we can see I am deleting it. Whoops! Sorry buddy. Let's gather some basic information about the system so we can begin the recovery process.

[abe@abe-laptop test]$ df -h

Here we see that the full path to the test directory (which is /home/abe/test) is part of the / filesystem, represented by the device file /dev/sda2.

[abe@abe-laptop test]$ su -
Using debugfs 1.40.4 (31-Dec-2007)
After debugfs: imap <1835328>
The next command we want to run is stats:

debugfs: stats
Running the stats command generates a lot of output. The only value we are interested in, however, is the number of blocks per group. In this case, as in most cases, it is 32768. Now we have enough data to determine the specific set of blocks in which the data resided. We're done with debugfs.

[root@abe-laptop ~]# dls /dev/sda2 1835008-1867775 > /media/PUBLIC/block.dat
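The next thing we need to do is pull all unallocated blocks from block group 56 so we can examine their content. A block group spans the blocks from (group number x blocks per group) through ((group number + 1) x blocks per group - 1), so the range for group 56 is:

(56 x 32768) through ((56 + 1) x 32768 - 1)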
This would give us a range of 1835008 through 1867775. It's very important that the destination of the output does not reside on the same partition as the data you're attempting to recover. The output of this command will most likely be large, and writing it to the same partition could overwrite the data you are trying to recover, because the blocks that stored the deleted file have already been marked unallocated. You want as little disk activity as possible on the partition you're working with. In this example, I'm using a USB thumb drive (mounted on /media/PUBLIC) as the location to store this data.

[root@abe-laptop ~]# mkdir /media/PUBLIC/output
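The block range passed to dls follows directly from the block group number and the blocks-per-group value. The arithmetic can be sketched in a few lines of Python. The function name block_group_range is made up for this illustration, and, like the formula in the text, the sketch assumes that group 0 starts at block 0.

```python
def block_group_range(group, blocks_per_group):
    """Return (first, last) block numbers of a block group.

    Assumption for this sketch: group 0 starts at block 0,
    as the formula in the article does.
    """
    first = group * blocks_per_group
    last = (group + 1) * blocks_per_group - 1
    return first, last


# Values taken from the article: block group 56, 32768 blocks per group.
first, last = block_group_range(56, 32768)
print(f"{first}-{last}")  # 1835008-1867775
```

The printed string has the same form as the range argument given to dls in the example session.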
Next we need to attempt to extract the deleted file from the unallocated blocks we pulled out with the dls command above. To do this, we are going to use Foremost. This program recovers files based on header information, footer information, and internal data structures. This is the process, mentioned earlier, called data carving. First we create a directory to store the Foremost output (again, this should be on a separate partition). Next we run the foremost command, giving it the file type jpg (which is an internally recognized type; more on custom types below), the input file, and the output directory. The output from this command is listed below.

Foremost version 1.5.3 by Jesse Kornblum, Kris Kendall, and Nick Mikus
As we can see, Foremost found forty-nine previously deleted jpg files (this output is also saved in a file named audit.txt in the root of the specified output directory). How do we know which is the file we are trying to recover? We could, as is most commonly done, open all of these files and inspect their contents. Another option is to simply compare file sizes. We know from our directory listing above that the jpg file we are looking for is 41k in size. There's only one file that Foremost extracted into the output directory that's 41k, and indeed, 00114144.jpg is the file we are attempting to recover.

Comparing size only works, of course, if you "know your data". Integrity checking programs such as Tripwire play a big role in a recovery operation, as they let you identify the recovered data without ever inspecting the content, as well as verify its integrity. This becomes quite useful if the information you're attempting to recover is confidential and you are not authorized to view the data.

Defining custom types in Foremost

As of Foremost v1.5.3, the internally supported data types that the program will recover without custom rules are jpg, gif, png, bmp, avi, exe, mpg, wav, riff, wmv, mov, pdf, ole, doc, zip, rar, htm, and cpp. If you need to recover data beyond these built-in data types, you will need to define custom types in Foremost's configuration file (foremost.conf). An entry that defines a type in the Foremost configuration file (as explained in the documentation at the beginning of foremost.conf or in the man page) consists of several columns: extension, case sensitivity, maximum size, header, footer (optional), and special keywords (optional). As an example that most should be familiar with, here is the entry for an HTML file:

htm n 50000 <html </html>
We see here that the file extension is htm (NONE can be specified if no file extension should be used when writing out extracted data), the header and footer are not case sensitive, and the maximum file size is 50000 bytes. The size limit means that up to 50000 bytes starting at the header are recovered if no footer is specified, or if that much data is read before the defined footer is found. The recovered file should start with "<html" (header) and end with "</html>" (footer).

The ASCII keyword can also be used when attempting to recover ASCII files. Specifying this keyword at the end of an entry tells Foremost to extract all printable ASCII characters before and after the defined keyword. An example of this would be a type to recover a Perl script. If, for example, you need to recover a Perl script that you know included Crypt::CBC, you could use the following type definition:

pl y 100000 Crypt::CBC Crypt::CBC ASCII
Note that Crypt::CBC is listed in both the header and footer fields. This is done so that Foremost will recognize this as the string to search around when the ASCII keyword is used. A more general type to find Perl scripts could be defined as follows:

pl n 100000 #!/usr/bin/perl #!/usr/bin/perl ASCII
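To make the column layout of these entries concrete, here is a minimal Python sketch that splits the three type definitions shown so far into named fields. It is not Foremost's own parser: the CarveType tuple and the parse_entry function are hypothetical names for this illustration, and the sketch assumes plain whitespace-separated columns with no comment lines or escape handling.

```python
from typing import NamedTuple, Optional


class CarveType(NamedTuple):
    """One type definition, using the column order described in the article."""
    extension: str
    case_sensitive: bool
    max_size: int
    header: str
    footer: Optional[str]
    keyword: Optional[str]


def parse_entry(line):
    """Split one whitespace-separated type definition into named fields.

    Hypothetical helper for illustration only, not Foremost's parser.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 columns, got {len(fields)}")
    return CarveType(
        extension=fields[0],
        case_sensitive=(fields[1].lower() == "y"),
        max_size=int(fields[2]),
        header=fields[3],
        footer=fields[4] if len(fields) > 4 else None,
        keyword=fields[5] if len(fields) > 5 else None,
    )


# The three entries quoted in the article.
entries = [
    "htm n 50000 <html </html>",
    "pl y 100000 Crypt::CBC Crypt::CBC ASCII",
    "pl n 100000 #!/usr/bin/perl #!/usr/bin/perl ASCII",
]
for entry in entries:
    print(parse_entry(entry))
```

Reading the entries this way shows at a glance that the htm type has no special keyword, while both pl types repeat the search string in the header and footer columns and end with ASCII.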
When attempting to recover files that are not ASCII, hexadecimal and octal notation can be used by specifying \x[0-f][0-f] or \[0-3][0-7][0-7], respectively. Below is an example of hexadecimal notation describing the header and footer of a GIF file:

gif y 155000000 \x47\x49\x46\x38\x37\x61 \x00\x3b
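The hexadecimal header in the gif entry decodes to the ASCII signature GIF87a. The Python sketch below decodes the escape notation and then performs a very simplified header/footer carve over an in-memory buffer. The names decode_pattern and carve are invented for this illustration, and the logic is only an approximation of the idea of data carving, not Foremost's implementation: it stops at the first footer it sees and does no format-aware validation.

```python
import re


def decode_pattern(pattern):
    # Convert \xHH (hex) and \OOO (octal) escapes into raw bytes;
    # every other character is kept as-is.
    def repl(match):
        if match.group(1) is not None:
            return bytes([int(match.group(1), 16)])
        return bytes([int(match.group(2), 8)])

    escape = rb"\\x([0-9a-fA-F]{2})|\\([0-3][0-7]{2})"
    return re.sub(escape, repl, pattern.encode("latin-1"))


def carve(data, header, footer, max_size):
    # Yield (offset, chunk) for every header found in data. A chunk ends
    # at the first footer after the header, or after max_size bytes if no
    # footer is found within that window.
    pos = data.find(header)
    while pos != -1:
        window = data[pos:pos + max_size]
        end = window.find(footer, len(header))
        chunk = window if end == -1 else window[:end + len(footer)]
        yield pos, chunk
        pos = data.find(header, pos + len(header))


# Header, footer, and maximum size from the gif entry in the article.
header = decode_pattern(r"\x47\x49\x46\x38\x37\x61")
footer = decode_pattern(r"\x00\x3b")
print(header)  # b'GIF87a'

# Made-up test data: a tiny fake "image" surrounded by filler bytes.
image = header + b"\x01\x02\x03" + footer
blob = b"\xff" * 16 + image + b"\xff" * 16
for offset, chunk in carve(blob, header, footer, max_size=155000000):
    print(offset, len(chunk))  # 16 11
```

Because the two-byte footer can also occur inside real image data, a carver this naive would often truncate files, which is one reason to rely on a dedicated tool for actual recovery work.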
As you may have realized by now, Foremost is a very powerful tool. Learn its intricacies and it can be a wonderfully flexible tool in data recovery and computer security forensic operations. Read the Foremost man page or consult the configuration file for a complete guide to creating custom data types.

ext2 vs ext3 Data Recovery
You may be asking yourself why this process is so much more difficult with ext3 than it is with ext2. This question is answered by one of the ext3 developers in the Linux ext3 FAQ:

Q: How can I recover (undelete) deleted files from my ext3 partition?

Actually, you can't! This is what one of the developers, Andreas Dilger, said about it:

In order to ensure that ext3 can safely resume an unlink after a crash, it actually zeros out the block pointers in the inode, whereas ext2 just marks these blocks as unused in the block bitmaps and marks the inode as "deleted" and leaves the block pointers alone. Your only hope is to "grep" for parts of your files that have been deleted and hope for the best.

The process, as described in this article, is the "grep" that Andreas is referring to. Hopefully, as ext3 is developed further, some effort will be put into making this process easier and more reliable.

Conclusion

While going through this process may be necessary to recover information lost in any number of situations, it's not a process you want to go through on a Monday morning to recover your organization's payroll data after an administrator fat-fingers an