Speed matters. How to make a forensic image as quickly as possible.

The typical method used to create a forensic image is to connect the source disk to a write-blocker. The write-blocker is then connected to a computer and a forensic image is made. This process needs to be updated to keep up with the capacity and speeds of the newest disk drives. By making the process as efficient as possible, forensic imaging times can be substantially reduced.

When making a forensic image of a disk drive, it is necessary to copy every byte available from the source disk and to ensure that nothing is written to the source disk. As the capacity of disk drives has increased, the time required to make a forensic image has also increased. For example, a 20GB disk drive would take approximately 8 minutes to image at best. A 200GB disk drive could take approximately 50 minutes at best, while a 1TB disk drive would take approximately 2.5 hours.

We can calculate how fast a disk drive can be imaged by dividing the total capacity of the disk by the maximum sustained transfer rate (MSTR) of the disk. The MSTR is the manufacturer's figure for how fast data can be read off of a disk drive during a very large transfer. (Note that the maximum burst transfer rate is not of use to us since it only provides information on how quickly data comes out of the disk cache, and it only applies to a small amount of data.)

Let’s look at a 1.5TB Western Digital Caviar Green disk drive as an example. The data for this drive is available here. This disk drive has a capacity of 1,500,301 MB and a maximum sustained transfer rate of 110 MB/s. Thus, it would take 227.3 minutes (almost 4 hours) to forensically copy the entire contents of the disk drive. (A transfer rate of 110 MB/s is 6.6 GB/minute.) To achieve this speed, all parts of the forensic imaging process must be able to process data at a rate of 6.6 GB/minute or greater.

Using a USB 2.0 write-blocker would slow this transfer rate down dramatically, as USB 2.0 has a maximum data transfer rate of approximately 34 MB/s. Using a USB 2.0 write-blocker when imaging the 1.5TB disk drive would require 735.4 minutes (over 12 hours).
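The arithmetic above can be sketched in a few lines of Python. This is only a minimal illustration: the function names are made up for this example, and the only inputs are the capacity and transfer rates quoted in the text. The slowest link in the chain sets the best-case imaging time.

```python
def bottleneck_rate(*rates_mb_per_s: float) -> float:
    """The slowest component in the imaging chain limits the whole transfer."""
    return min(rates_mb_per_s)


def imaging_minutes(capacity_mb: float, rate_mb_per_s: float) -> float:
    """Best-case imaging time in minutes: capacity divided by sustained rate."""
    return capacity_mb / rate_mb_per_s / 60


CAPACITY_MB = 1_500_301   # 1.5TB drive from the example above
DISK_MSTR = 110           # MB/s, maximum sustained transfer rate of the drive
USB2_RATE = 34            # MB/s, approximate USB 2.0 rate quoted above

if __name__ == "__main__":
    # Disk is the only limit: prints 227.3
    print(f"{imaging_minutes(CAPACITY_MB, DISK_MSTR):.1f}")
    # USB 2.0 write-blocker becomes the bottleneck: prints 735.4
    print(f"{imaging_minutes(CAPACITY_MB, bottleneck_rate(DISK_MSTR, USB2_RATE)):.1f}")
```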

Other factors that can alter the efficiency of the disk imaging process include:

  • The buffer size of a data transfer.
  • The filesystem where the data is being written to.
  • Whether compression is used when making the forensic image.

All of the above factors need to be tuned to ensure that forensic images are made as quickly and efficiently as possible.
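As a rough illustration of the first factor, the sketch below times a sequential read of a file using different buffer sizes. It is not one of the scripts mentioned in this post; the function name and the buffer sizes are assumptions chosen for the example, and the path is whatever test file you pass on the command line.

```python
import sys
import time


def read_throughput_mb_s(path: str, buffer_size: int) -> float:
    """Read the whole file sequentially with the given buffer size; return MB/s."""
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffering so buffer_size is what reaches the OS
    with open(path, "rb", buffering=0) as src:
        while True:
            chunk = src.read(buffer_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / 1_000_000 / max(elapsed, 1e-9)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical buffer sizes to compare: 4 KiB, 64 KiB and 1 MiB
    for size in (4 * 1024, 64 * 1024, 1024 * 1024):
        print(size, round(read_throughput_mb_s(sys.argv[1], size), 1))
```

Keep in mind that the operating system caches recently read files, so repeated runs against the same file will look faster than a first read from the disk itself.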

I have recently published a paper in the Journal of Forensic Sciences entitled Characteristic of Forensic Imaging. This article discusses the impacts of different factors on the efficiency of forensic imaging. I am also preparing a web page that will provide simple scripts to allow you to evaluate the efficiency of your forensic imaging setup.

How to find hidden passwords (and how to protect them)

While preparing to teach a computer forensic workshop, I discovered a new live Linux distribution entitled C.A.IN.E. (Computer Aided Investigative Environment). This software is one of a few live Linux distributions that allow a user to boot Linux from a CD or DVD and start a forensic investigation. The distribution includes tools to make and analyze forensic images. Since it is freeware, it is easy to make use of the software as part of the workshop.

In addition to Linux tools, NBCAINE version 2.5 includes WinTaylor, a set of tools that are designed to run on a Windows system. This software can be loaded onto a USB drive through the “dd” utility. (Once it is loaded on the USB drive, a user can either boot the live distro off of the USB drive, without accessing the WinTaylor tools, or plug the USB drive into a running Windows system and access the WinTaylor tools.) Included in the WinTaylor section of the software are Windows-based tools from NirSoft that allow a user to recover passwords saved in popular web browsers, view recent file activity on the Windows system, view information about USB drives attached to the computer and more.

The NirSoft tools include some noteworthy ones that are designed to uncover passwords stored on Windows systems. For example, when you log into a password-protected website, Internet Explorer (and other browsers) gives you the option to save the login information so that you don’t need to enter it the next time. A NirSoft utility, iepv.exe (Internet Explorer Password Viewer), retrieves and displays the userids and passwords. If you use Microsoft Outlook and save your POP3 or IMAP password, the NirSoft utility mailpv.exe will retrieve and display the accounts and passwords saved in Outlook. And WirelessKeyView.exe will display the wireless network names and associated passwords that are stored on your system.

I encourage you to obtain these tools and run them on your system to reveal how many passwords are stored on your system. If you discover sensitive passwords stored on your system and you allow others to use your system, you will want to ensure that you clean out the stored passwords.

While you might not be able to delete all of the saved passwords, at least you will now have a better handle on all of the passwords stored on your system that are recoverable.

Tracking and recovering a stolen iPhone

A few months ago, a friend of mine lost his iPhone in a movie theater.  He noticed it was missing when he got home. At least he thought it was lost, until he noticed that someone was reading and deleting his emails.  It seemed that the iPhone was found by someone, and that someone was using the iPhone.

He contacted AT&T for assistance. It should have been a pretty easy recovery. The iPhone, when turned on, must register on the AT&T cellular network with its unique International Mobile Equipment Identity (IMEI) and the International Mobile Subscriber Identity (IMSI) of its SIM card. AT&T should easily be able to find the cell tower covering the cell phone, right?

Well, technically AT&T can do that, but as a matter of policy, they don’t release this information without a subpoena. And that would need to come from the police.

Were there other options? Well, AT&T offered to turn off the service to the stolen iPhone and (for a fee) send him a new one. He took the offer, since he wanted to get back into the mobile world.

Then, as luck would have it, the thief tried a test application on the iPhone called AirGraffiti. This app logs the GPS coordinates of the cell phone.

Here is a map showing some of the GPS coordinates reported for the cell phone.

GPS map view

Keep in mind that iPhones are both 3G and WiFi capable. So, when AT&T had turned off the stolen phone’s service, the thief just started using the WiFi service.

There were a couple of challenges in this case. Since the phone was stolen, the thief had no expectation of privacy. However, everyone else in the neighborhood still did! So, we needed to be able to search for the stolen phone only. Next, we wanted to make sure that we were only passively listening; we did not want to generate traffic and try to cause the iPhone to respond. And we did not want to listen to content. We only wanted to look for the MAC address of the cell phone. The MAC address should be unique for each iPhone, and it is difficult to spoof the MAC address of an iPhone. These restrictions ruled out tools such as Wireshark, NetStumbler and Kismet.

My company builds AP-Finder, software that can track the location of WiFi devices. Since the owner had the MAC address for the iPhone, all I needed to do was run AP-Finder. I searched for the iPhone’s MAC address and drove through the area reported by the GPS coordinates. Sure enough, I got a hit!
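To make the idea concrete, here is a minimal sketch of searching passively collected sightings for a single MAC address and picking the strongest signal. It is not how AP-Finder is implemented; the record layout, the field names and every MAC address below are hypothetical, invented purely for illustration. Only the MAC address and signal strength are looked at, never any content.

```python
def normalize_mac(mac: str) -> str:
    """Reduce a MAC address to 12 lowercase hex digits so all formats compare equal."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def strongest_sighting(sightings, target_mac):
    """Return the sighting of target_mac with the strongest signal, or None."""
    target = normalize_mac(target_mac)
    # Everything that is not the target device is ignored outright
    hits = [s for s in sightings if normalize_mac(s["mac"]) == target]
    # RSSI is in dBm, so values closer to zero mean a stronger signal
    return max(hits, key=lambda s: s["rssi_dbm"], default=None)


# Hypothetical sightings logged while driving through the area
SIGHTINGS = [
    {"mac": "00:1A:2B:3C:4D:5E", "rssi_dbm": -80, "where": "corner of street"},
    {"mac": "66:77:88:99:AA:BB", "rssi_dbm": -30, "where": "corner of street"},
    {"mac": "00-1a-2b-3c-4d-5e", "rssi_dbm": -48, "where": "third house"},
]

if __name__ == "__main__":
    print(strongest_sighting(SIGHTINGS, "00:1a:2b:3c:4d:5e"))
```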

Using the results of this search, I contacted the State Police and told them about the case and what I had. They came out to do the search using AP-Finder, and sure enough they also got a hit. Using the signal strength feature of AP-Finder, we were able to locate the house containing the cell phone. (Below is a sample of the AP-Finder’s search by MAC feature.)

This technique has promise, but there is still more to do…

The end result: the cell phone was recovered and the thief was charged with fourth degree theft and third degree computer crime violations. All of this was done without issuing a subpoena to the cell phone carrier or ISP for information.

MD5? SHA1? – Some facts and misconceptions about the checksum value

A checksum is a mathematically calculated value that is used to verify data integrity. There are a few well-known checksum algorithms in common use: Cyclic Redundancy Check (CRC), Message Digest 5 (MD5), and Secure Hash Algorithm 1 (SHA-1). While there are more than these three checksum algorithms, let’s just focus on these three for the moment.

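Checksum algorithms take digital data and spit out a number. For example, let’s calculate the checksum value for the word “Hello” using the CRC algorithm. (Strictly speaking, the sum command used here computes a simple 16-bit rotating checksum rather than a true cyclic redundancy check, but it illustrates the same idea.) Using a simple Linux system, we can generate a checksum of the word “Hello” using the following command.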

$ echo "Hello" | sum
36978 1

(In the above, the 36978 is the checksum value, and the “1” is the size of the input in blocks. We can ignore the trailing one.) If we change the capital H to a lowercase h and recalculate the checksum value, we will get a different result.

$ echo "hello" | sum
36979 1

Let’s add a space to the end of the input.

$ echo "Hello " | sum
18510 1

This is what makes the checksum valuable. The output value is different when the input is different. A good checksum algorithm will produce the same value on the same input and, in nearly all cases, different values on different input. And, it will produce the same value on the same input on any computer.

The table below shows the checksum values for the three different variations of the word hello calculated with the three different algorithms.

Sample checksum values for CRC, MD5 and SHA1

Input                  CRC    MD5                               SHA1
“Hello”                36978  09f7e02f1290be211da707a266f153b3  1d229271928d3f9e2bb0375bd6ce5db6c6d348d9
“hello”                36979  b1946ac92492d2347c6235b4d2611184  f572d396fae9206628714fb2ce00f72e94f2258f
“Hello ” (with space)  18510  adb3f07f896745a101145fc3c1c7b2ea  a83f9352aa642ceec0a03b126e453a5984cf68ab

Notice that the checksum values for the same word are different when using a different algorithm. CRC does not produce the same value on the same input as MD5. And, MD5 produces a different value than SHA1. This means that you cannot use MD5 to verify a checksum value calculated with SHA1.
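The same comparison can be sketched in Python. The MD5 and SHA1 values come from the standard hashlib module. The values in the CRC row are reproduced here with a 16-bit rotating checksum (the BSD-style algorithm); that this is what the sum command used is an assumption, although it does reproduce the three values shown. Note that echo appends a newline to its output, so the newline is part of the input in every case.

```python
import hashlib


def bsd_sum(data: bytes) -> int:
    """16-bit rotating checksum (assumed to match what `sum` printed above)."""
    checksum = 0
    for byte in data:
        # Rotate the 16-bit running value right by one bit, then add the next byte
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


if __name__ == "__main__":
    for word in ("Hello", "hello", "Hello "):
        data = (word + "\n").encode("ascii")  # echo adds the trailing newline
        print(repr(word), bsd_sum(data),
              hashlib.md5(data).hexdigest(),
              hashlib.sha1(data).hexdigest())
```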

Myth – Knowing the checksum value, I can regenerate the input.

Checksum values are not easily reversible because the checksum algorithm throws away information during the calculation. Because of this, the checksum value of “36978” can’t be converted back into “Hello”, because Hello is one of many different possible inputs that could create that value. This leads to another myth…

Myth – A good checksum algorithm prevents collision.

A checksum collision happens when two different inputs return the same checksum value. For example, the CRC checksum value for “Hello” is 36978, as is the CRC checksum value for “Jdllo”. (With the CRC algorithm, a collision can be easily generated by raising the first letter of the word by two characters while lowering the second letter by one character.) A checksum collision is always possible no matter how good the checksum algorithm is. This is because a checksum has to take a file of some arbitrary size and reduce it to a fixed-size number. A good checksum algorithm will just make it difficult to predictably manipulate the input to create a known hash value. MD5 and SHA1, since they are cryptographic hash functions, make it more difficult to manipulate input to produce a predictable checksum value.
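Here is a small sketch of such a collision, assuming that the 36978 value comes from a 16-bit rotating checksum (the BSD-style algorithm, which reproduces that value for “Hello” followed by a newline). In this sketch the colliding pair is “Hello” and “Jdllo”: the first letter is raised by two and the second is lowered by one. Because the running value is rotated right by one bit before each byte is added, the change to the first letter is halved by the time the second letter is added, and the two changes cancel.

```python
import hashlib


def bsd_sum(data: bytes) -> int:
    """16-bit rotating checksum (assumed algorithm behind the 36978 value)."""
    checksum = 0
    for byte in data:
        # Rotate the 16-bit running value right by one bit, then add the next byte
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


if __name__ == "__main__":
    a, b = b"Hello\n", b"Jdllo\n"
    print(bsd_sum(a), bsd_sum(b))  # both print 36978
    # The cryptographic hash of the two inputs is not the same
    print(hashlib.md5(a).hexdigest() == hashlib.md5(b).hexdigest())  # False
```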

Myth – A checksum value can be used to prove that data has been read correctly.

Since checksums can be used to detect alterations in digital input, they can be very useful in computer forensics. Checksum values can help to establish a very low probability of alteration of digital evidence once it has been captured. A checksum is extremely effective when it is declared after the acquisition of electronic evidence. The declaration of the checksum should be printed or otherwise stored to prevent potential alteration or tampering. Should the checksum of the evidence later be found to not match the declared checksum, there is a possibility that the evidence or evidence container has been altered. (Note carefully that it is possible, not definite.) Factors such as disk errors and errors in the checksum implementation can also result in a checksum mismatch.

What a checksum cannot do is prove that the correct digital evidence was acquired. Here is an example to consider. My company makes forensic imagers, and forensic imagers undergo validation testing by neutral third parties. Basically, these third parties are checking that the product does not alter the data that it is copying and that it copies all of the data.

During a couple of the different validations by different groups, we were contacted by the testers. The testers told us that they had noticed that the checksum values that we produced were occasionally different from the ones that they produced using their equipment. Follow-up investigation revealed that the checksums were indeed different, and in all cases it was because our system was capturing more disk data than their test system was. Well, that is good news for us, but why were they different? It turns out that when you capture more data, you need to run the additional data into the checksum algorithm. That, in turn, changed the checksum value and led to the difference.

This highlights that the checksum algorithm cannot be used to determine if the original disk drive was read correctly. As happened here, the validation team’s checksum values matched one another even though they had not read all of the data. The checksum value was very useful to determine that the source disk was not modified, but was not useful in determining that the source disk was read completely.
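The effect is easy to reproduce. The sketch below hashes a stand-in “disk” twice with a reader that stops 512 bytes short, and once in full. The data, the 512-byte shortfall and the function name are all assumptions made up for this illustration. The two short reads agree with each other, yet neither matches the checksum of the complete disk.

```python
import hashlib
import io


def sha1_of_stream(stream, limit=None, chunk_size=1024 * 1024):
    """SHA1 of a stream read in chunks; stop after `limit` bytes if given."""
    digest = hashlib.sha1()
    remaining = limit
    while remaining is None or remaining > 0:
        size = chunk_size if remaining is None else min(chunk_size, remaining)
        chunk = stream.read(size)
        if not chunk:
            break
        digest.update(chunk)
        if remaining is not None:
            remaining -= len(chunk)
    return digest.hexdigest()


# Hypothetical 1 MiB "disk" standing in for the source drive
DISK = bytes(range(256)) * 4096

if __name__ == "__main__":
    full = sha1_of_stream(io.BytesIO(DISK))
    short_a = sha1_of_stream(io.BytesIO(DISK), limit=len(DISK) - 512)
    short_b = sha1_of_stream(io.BytesIO(DISK), limit=len(DISK) - 512)
    print(short_a == short_b)  # True: the incomplete reads verify against each other
    print(short_a == full)     # False: but they do not cover the whole disk
```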

Is your online bank account safe?

What would you do if your bank called to verify some suspicious transactions? Well, that recently happened to a company I know. It turns out that the “suspicious” transactions were attempts to transfer approximately $9,000 a person to more than 10 different people.

Good thing that the bank noticed and halted the suspicious transactions. It turns out that when suspicious transactions are completed via a business account, the money is difficult to recover. The bank will not refund the money; the account holder needs to get the money back from the transferee.

After the first transaction was stopped, someone tried it again. Eventually, the bank cancelled all online access to the account. What was happening?

Was it an insider that stole the banking credentials? If not, then what? The initial review of systems with access to the banking credentials showed that their virus scanners were up to date. The review also did not show any signs of suspicious access into the systems.

Things were not adding up. But then, a break. A closer inspection showed that the computers used to access the bank accounts were infected with malware that was not detected by virus scanners.

It turned out that the malware was Clampi, a pretty nasty piece of malware that specializes in silently collecting banking credentials. (Symantec has a great writeup on the malware at Symantec’s inside_trojan_clampi.pdf.)

The malware was able to hide from virus scanners because it hides in the registry. The program is stored as a registry value, not as a file. And, of course, the registry value is encrypted.

There is a way to check your system for a sign of Clampi. Check your Windows registry for a value named “GatesList” under the following key:
HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Settings
(If you have this value, you may have a Clampi infection.)
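For those who prefer a script to regedit, here is a minimal sketch of the same check using Python’s standard winreg module. It assumes that GatesList is a value stored under the Settings key, which is how the quoted name at the end of the path is read here. The function name is made up for this example, and on a non-Windows system it simply returns None. Finding the entry is only a hint that the system deserves a closer look, and not finding it does not prove the system is clean.

```python
try:
    import winreg  # standard library, available on Windows only
except ImportError:
    winreg = None

SETTINGS_KEY = r"Software\Microsoft\Internet Explorer\Settings"


def gateslist_present():
    """True or False on Windows; None where the registry is unavailable."""
    if winreg is None:
        return None
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, SETTINGS_KEY) as key:
            # Assumption: "GatesList" is a value name under the Settings key
            winreg.QueryValueEx(key, "GatesList")
            return True
    except FileNotFoundError:
        # Either the key or the value does not exist
        return False


if __name__ == "__main__":
    print(gateslist_present())
```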

A Clampi infection is concerning as it is silent and very good at collecting banking credentials. There are a couple of tips to help you avoid losing your banking credentials to the Clampi malware:
1- Use a clean computer to access your online bank accounts.
2- Do not use the computer that you use for online banking to access other web sites.
3- Change your online password frequently from a secure computer.

These tips are difficult to implement, but try as best as you can…

Forensics vs proprietary processes

Forensic processing must be able to stand up to inspection and challenge by the defense in a criminal prosecution. The use of “proprietary” techniques should be avoided, since vendors may not allow their techniques to be reviewed.

Here is an example where a vendor allowed their proprietary processes, in particular their product source code, to be examined as part of a criminal trial.

Let’s briefly look at the case of “State v Chun” (New Jersey). This case started off as a DWI (driving while intoxicated) case. Per the NJ statute, one of the criteria for conviction on the DWI charge is evidence that the suspect had a blood alcohol content (BAC) greater than 0.08. The evidence was produced by the prosecution via the Alcotest 7110 unit, a replacement for the traditional Breathalyzer.

During the trial, the defense challenged the validity of the BAC reading as reported by the Alcotest 7110. In support of their defense, they hired a company to perform a review of the source code used by the Alcotest 7110. (A copy of the source code analysis is here.) The source code analysis reported many flaws, but ultimately none that were fatal to the prosecution.

Had the vendor objected to the source code analysis, the usefulness of the Alcotest 7110 would have been put into jeopardy in this case and in other cases.

What is the lesson here? If you are using a product for evidence collection, be sure that the vendor can explain how their product works and will allow examination of the processing when necessary.