Why do people encrypt their data? Well, to protect their information from getting into the wrong hands, of course. But what if the “wrong hands” are law enforcement, the court system or even your boss? Should they have the right to access your data when the law is on their side, or when you are storing it on a company-owned computer? Most people would say that their information is sacred, and that they need to maintain control of it themselves.
When companies and individuals encrypt their data, they typically use software that is easily detectable. They see no need to hide the fact that they are encrypting data. Why should they? It is their data. There’s nothing wrong with encrypting data. But maybe they should hide the data too. Can’t the encryption be broken with decryption software? Sure, depending on how strong the encryption key is and how many days, months or years you want to spend working on it. One further step you can take to secure your data is to hide it as well.
How do you hide encrypted data? Well, you can find some very complicated ways to move it to unused/hidden places on a hard drive, make it look like a different/innocuous type of computer file or make it look like random/unerased data. This may sound pretty complicated, but products like TrueCrypt (9,623,114 downloads/users) actually make this whole process simple.
What does TrueCrypt do? TrueCrypt is a free, open source utility that specializes in encrypting and hiding your data. This tool can create an entire encrypted hard disk partition, or a smaller encrypted file (virtual drive) that is easily seen by any disk utility. Where it differs from most of its competitors is that it also encrypts the parts of its storage file that don’t contain your data. This means that there are no file signatures, magic number IDs or even a common file extension that would let disk utilities identify the encrypted file as made by TrueCrypt, or even as encrypted. It even goes one step further and provides a hidden encrypted partition within an encrypted partition, for the off chance that your encrypted data is discovered and you are forced to provide the encryption key. In that situation, the invading party will see inside the first level of encryption and assume that there is nothing else to find.
How do investigators detect encrypted data? Well, most encryption tools use a file header that can easily be recognized, but tools like TrueCrypt don’t do that. Encrypted data tends to look like random data. So, without a file header, encrypted data should be completely undetectable. Or so we thought…
We recently started analyzing encrypted files, and found a method for detecting headerless encrypted data. Sure, it looks random, but not really. There actually is a pattern to it. You have to know how to extract that pattern. We just released version 2.23 of File Investigator TOOLS. This version detects TrueCrypt Dynamic files as well as most any other headerless encrypted file, as far as we have seen so far. Feel free to try the tool and see if you can find an encrypted file that it can’t identify.
What’s the value in finding encrypted data that you can’t decrypt? It’s up to you how you leverage the information that our tool provides. Use it to entice the encryption key from a suspect, show the withholding of potential evidence in a case or catch your employees hiding data on company computers.
TrueCrypt is now Detectable
This is an obvious scam. The only thing your tool might be able to report is that a file was found to contain solely random data. Nothing more. The tool cannot distinguish a TrueCrypt container from a file containing random data. Therefore, you cannot identify a TrueCrypt volume.
A simple test to prove that you are nothing more than commercially motivated, deliberately misleading, fraudsters:
Create a couple of files on a partition. Each of the files will be 1MB in size and will contain purely random data. Now create a 1MB TrueCrypt container on the same partition. Your software will NOT be able to distinguish the TrueCrypt container from the other files. It will falsely report that all of the files are TrueCrypt containers. Again, it will NOT be able to DISTINGUISH a TrueCrypt volume from random data or to identify it as such. (The fact that the size of a TrueCrypt container is always a multiple of 512 bytes plays no role: it does not distinguish the container from other files containing random data, let alone prove that it is a TrueCrypt container.)
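For readers who want a feel for why the test above is hard to pass, here is a minimal Python sketch that measures Shannon entropy in bits per byte. It is only an illustration under stated assumptions: `shannon_entropy` is a hypothetical helper, `os.urandom` stands in for a 1MB random file, and nothing here reflects how File Investigator TOOLS or TrueCrypt actually work. Good random data scores very close to the maximum of 8 bits per byte, and well-encrypted data is expected to score the same way, so this statistic alone cannot tell the two apart.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# os.urandom stands in for one of the 1MB random files from the test.
random_blob = os.urandom(1024 * 1024)
plain_text = b"The quick brown fox jumps over the lazy dog. " * 1000

print(f"random data: {shannon_entropy(random_blob):.4f} bits/byte")
print(f"plain text:  {shannon_entropy(plain_text):.4f} bits/byte")
```

A high score says "this looks random" and nothing more; it carries no information about which program, if any, produced the bytes.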
There already is a similar scam tool. It’s free (unlike your tool) and the following comment was posted in response to it by one of the moderators on the TrueCrypt Forums. It sums it up well:
http://forums.truecrypt.org/viewtopic.php?p=63217#63217
If your claims were true, you would have broken AES and you would now be really famous. But the only thing you will achieve is that you will be sued by the TrueCrypt Foundation for intentional, commercially motivated damage to the reputation of their product.
So, again, in short:
TrueCrypt is not detectable — random data is (and always has been).
I’m going to quote Beth’s response to your recent threatening email message. Beth is in our sales department:
“You are prematurely casting judgment. Before you jump to conclusions, try our software on your encrypted volumes. You can download our software from http://www.forensicinnovations.com/download/fit223.exe. …
In our tests, we were able to identify all TrueCrypt Dynamic volumes as “TrueCrypt Dynamic File” (type #3175), and all other TrueCrypt volumes as “Encrypted Data (Headerless)” (type #3174). …
We welcome your input after you have tried our tool. We are not decrypting anything, but simply identifying where data is being hidden.
If you decide to post contradictory statements in the media, I recommend that you do so quoting facts (gained from testing our tool), and not theory. … We are simply interested in the truth. If you are able to fool our tool consistently, then I would ask that you discuss that with us calmly and provide some sample files as proof.
We have no incentive to make false claims which could sell some software, but would end with dissatisfied customers and harm our reputation. I realize that you are angry over the prospect of someone being able to work around a feature of TrueCrypt, but technology advances over time and other organizations should be allowed to advance their technologies without negative rhetoric. You could instead view this as a challenge to further improve TrueCrypt’s stealth technology.”
I’d like to add that we identify these types of files (#3174 “Encrypted Data (Headerless)” & #3175 “TrueCrypt Dynamic File”) along with a Medium accuracy level. In our software, this means that we claim to be accurate 90% or more of the time. Our highest accuracy level is High, which means that we claim 99% accuracy in those cases. We never claim 100% accuracy, because it is possible to spoof most every file type.
The TrueCrypt Forum post that you provided a link for only talks about a TCHunt utility that doesn’t do a very good job at detecting TrueCrypt volumes. Unless that TCHunt tool claims to always identify all TrueCrypt files, the only thing they are guilty of is using poor methods, not running a scam. We have no connection to TCHunt, and we use better methods to obtain higher accuracy.
Please test your theory and come back to us when you have some facts to present. The same goes to all of our other readers. Please test our tool. If you find encrypted files that we fail to identify, then feel free to notify us of your findings. If you like being able to identify encrypted files, then we invite you to provide us with sample files (that fail our process) so that we may further improve our methods.
Best regards,
Rob Zirnstein
President
Forensic Innovations, Inc.
Hi,
I just saw these posts and comments, so I took 2 minutes to verify for myself. After a few tests I can report that:
– fitools can correctly detect a Truecrypt Dynamic Volume
– fitools will detect any random file as “Encrypted Data (Headerless)”
– fitools will detect Truecrypt file containers as “Encrypted Data (Headerless)”
So as a conclusion:
– fitools can reliably detect Truecrypt Dynamic Volumes (Truecrypt documentation states that such volumes can be detected)
– fitools cannot distinguish random data from a standard Truecrypt file container and will report both as “Encrypted Data” (which random data obviously is not).
NB: I created my random files in two ways:
– cat /dev/urandom > random.bin under Linux
– with a simple C program.
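For anyone who wants to reproduce such test files without a shell or a C compiler, a rough Python equivalent might look like the sketch below. The helper name `write_random_file` and the 1 MiB size are assumptions made for illustration; the original test files are not available. Unlike a bare `cat /dev/urandom`, which keeps writing until interrupted, this version stops at a fixed size.

```python
import os
import tempfile


def write_random_file(path: str, size: int, chunk: int = 64 * 1024) -> None:
    """Write exactly `size` bytes of OS-provided random data to `path`."""
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n


# Demo: create a 1 MiB random file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "random.bin")
    write_random_file(demo_path, 1024 * 1024)
    demo_size = os.path.getsize(demo_path)
    print(f"wrote {demo_size} bytes")
```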
Regards,
Nicolas.
Hi,
I have tried your tool and it detects only formatted Truecrypt dynamic volumes. If the dynamic volume isn’t formatted (this is a Truecrypt option), your tool can’t detect the file as a Truecrypt dynamic file.
Tchunt detects all Truecrypt files, but it has many false positives; nevertheless, it has no false negatives.
Nicolas,
Thank you for taking the time to try this, and for providing feedback.
The “any” in your statement of “fitools will detect any random file as “Encrypted Data (Headerless)”” is a bit far reaching since you only created the random files using those two methods. Other methods may use different seeding and appear different.
Could you email a few of your test files to support@forensicinnovations.com so that we may confirm your results?
What is the probability of such random files appearing on a typical user’s system? Why would any typical user application create such a random file? None of the systems we have tested contain such files, so it appears that you have to intentionally create such files solely for the purpose of tricking detection tools.
If typical systems do not contain such random files, then our solution is quite useful in the real world.
Mirko,
Your results indicate that the non-formatted TrueCrypt dynamic volume looks just like a static TrueCrypt volume.
Doesn’t Tchunt identify all headerless encrypted files as TrueCrypt, whether they were made by TrueCrypt or not? We just go one step further and label all of those files more accurately as #3174 “Encrypted Data (Headerless)”, unless we can detect that they really are TrueCrypt Dynamic files, in which case we label them as #3175 “TrueCrypt Dynamic File”.
We use a different method than Tchunt, which we believe to be much more accurate, not limited to 512 byte boundaries or file sizes.
Rob
@rzirnste : I totally agree with you, “any” in my statement should have been “any file created using the methods I used” (/dev/urandom or calling random and filling a byte array from a program).
I also agree that probability of finding a random file on a computer is low, unless someone tries to trick detection tools. But as the point of the post was “TrueCrypt is now Detectable” I guess someone who wants to hide Truecrypt volumes will use such deception techniques.
As a forensic researcher I am pleased to see that a tool can quickly highlight possible TC volumes. But maybe the User Interface should contain a tooltip indicating that pure Random Data cannot be distinguished from Headerless Encrypted Data. I don’t want someone to be prosecuted or accused of hiding data only because some tool on his computer has created a random file for any reason 🙂
Anyway now we know that someone who really wants to hide data will have two solutions:
– create a bunch of random files, and hide a real TC volume among them
– Use the “Hidden” volume feature which is the recommended procedure.
PS: I have deleted the sample files, just cat /dev/urandom to reproduce them….
Nicolas,
Thank you for your feedback. I agree that additional documentation (or a note) needs to be added.
We plan to change the description for #3175 from “TrueCrypt Dynamic File” to “TrueCrypt Formatted Dynamic File”. We also plan to add the following line to the notes attached to #3174 “Encrypted Data (Headerless)”: “Some files, containing pure randomized data, may be incorrectly identified as Encrypted Data, but such files are typically only created to interfere with the detection process and are not normally seen in real world environments.” What do you think of this wording?
These notes are visible when the Details Previewer is selected. FI Find, Tools menu >> Options… >> File Find tab >> Fall Back Previewer, set to ‘Details’. Also, put a check in the ‘Force on ALL files’ check box.
The notes appear about halfway down the scrollable page, in the bottom half of the application’s user interface.
Rob
Errm – that’s not the only reason for creating files full of random data. I’m pretty sure I have some around here somewhere that I was using to test network transfer speeds and forgot to delete. (Random data has the handy property of being impossible to compress, which makes for good test files.) I believe they’re also useful in some fields of mathematical modelling – while most people use pseudo-random numbers, they aren’t always random enough. Oh, and if you’re doing any sort of quality analysis of random number sources, you can often end up creating files of random data gigabytes in size.
Also, it’s not just files created by Nicolas’ suggested methods – any sufficiently good-quality random data should have the same effect.
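The incompressibility property mentioned above is easy to check. The following Python sketch, using the standard `zlib` module, compares how well random bytes and repetitive text compress; `compression_ratio` is a hypothetical helper written for this illustration, and the sample inputs are made up.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; near 1.0 means incompressible."""
    if not data:
        return 1.0
    return len(zlib.compress(data, 9)) / len(data)


random_blob = os.urandom(1024 * 1024)            # good-quality random bytes
repetitive = b"network transfer test " * 50000   # highly compressible text

print(f"random data:     {compression_ratio(random_blob):.3f}")
print(f"repetitive text: {compression_ratio(repetitive):.3f}")
```

Random data comes out at roughly its original size, which is exactly why it makes a fair payload for transfer speed tests: compression in the network path cannot make the link look faster than it is.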
Makomk,
That’s good to know. Maybe we should add to the attached notes that random data may be found in network test, mathematical modelling and software development environments. Those should account for less than 1% of the computers that are likely to be searched or audited. That leaves 99% that should have no potential for false positives.
Would you have any of those sample files that cause false positive identifications, and could you send them to us at support@forensicinnovations.com?
I reviewed the “TrueCrypt Volumes Still Undetectable” blog posted on May 1st, and added a comment on their blog page. They still haven’t decided to approve that comment for the public to see, so I’m going to make a comment about it here.
Basically, they created one headerless encrypted file and two files full of random data, then tested FI TOOLS on them. The encrypted file was correctly identified as encrypted data, while the random files were incorrectly identified as encrypted data.
I explained that all that proved is that they know how to create files that can spoof encrypted data. I know how to spoof a Windows Bitmap and cause forensic tools to misidentify the spoofed file as a Windows Bitmap. That doesn’t mean that those tools are incapable of identifying Windows Bitmaps. It only proves that I know how to practice anti-forensics and interfere with investigations and audits. In the real world, 99+% of computers do not contain such spoofing data.
Some people loyal to TrueCrypt and/or encryption theory are acting desperate to prove us wrong. I’m sure that it is frustrating to have your beliefs contradicted, but no one has presented a case where our tool has failed to identify a TrueCrypt encrypted file as either TrueCrypt or, more generally, encrypted data (headerless). All of the files that have caused false positives so far were created solely for the purpose of spoofing encrypted data and are not found in the real world environments that account for 99+% of computers.
Our products are designed for use in the real world, and do not cater to the theorists or TrueCrypt zealots. I personally happen to use TrueCrypt and recommend it, but I don’t care if an investigator or boss can detect my encrypted volumes. I don’t have anything to hide from them, and I respect the law if a court orders me to provide the keys to my data.
Rob
I have found another application which will create pure random data files: Eraser (and a lot of wiping tools which fill disks with random bytes). Most of the time they create such files (which get deleted after completion) when using the “wipe unused space” option.
The process they use is to create one or several big files in a temp folder to fill the disk with random bytes (thus deleting content of previous deleted files).
@rzirnste: I understand your point about detecting Truecrypt files and the analogy with bitmaps. But isn’t this flawed logic?
A human (or any bitmap viewer) could be able to distinguish a crafted bitmap from a real bitmap. But no one can distinguish a TC container from a random file.
So if I just make a tool that detects random files 100% of the time, did I make a TC container detector? I hope you will accept this argument.
So unless you provide details on how your detection algorithm differs from random data recognition, I will just take Fitools as a helper to detect pure random files and, as a result, as a strong indication that this is a TC volume (even more probable if TC is installed on the computer!).
By the way, what about checking the 512-byte boundaries, as we know this is an additional proof of a TC volume compared to pure random files (which of course, one more time, could be crafted to prevent TC detection)?
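The size check suggested here is simple to express. The Python sketch below is only an illustration: `size_is_sector_multiple` is a hypothetical helper, and the 512-byte figure is taken from the comments in this thread rather than from any TrueCrypt specification.

```python
def size_is_sector_multiple(size_in_bytes: int, sector: int = 512) -> bool:
    """True if a non-empty file's size is an exact multiple of `sector` bytes."""
    return size_in_bytes > 0 and size_in_bytes % sector == 0


# A 1 MiB file (container or plain random data) passes the check ...
print(size_is_sector_multiple(1024 * 1024))
# ... while a file of 1,000,001 bytes does not.
print(size_is_sector_multiple(1_000_001))
```

As already noted in the thread, the check can only rule candidates out. Any random file that happens to be, or is deliberately made to be, a multiple of 512 bytes passes it just as a container would.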
Nicolas,
You have some good points.
Wiping tools: That’s interesting that those tools use the simplistic method of writing a large file rather than directly to the disk sectors. This method is not very dependable, because some operating systems will fail when the disk free space goes to zero, or the OS will prevent the final writes and leave some unused space unwiped. I’ve added “disk wiping tools” to the notes on tools that may create/use random data files. Taking the file’s folder location into account can be a further indicator in such cases.
Bitmap comparison to TrueCrypt: Yes, a bitmap would be easy to confirm by opening a supporting application. Many file types are not created to be loaded by a parent application, and are not so easily human detectable. For example, a program overlay may contain the right signatures to be identified, but actually contain hidden data. We know how to spoof all 3,312 file types that we support, but we don’t test our competitors’ tools on the spoof files and claim that they fail to identify those files.
For example, a text file containing the characters “MZ” at offset zero gets identified as an executable by Oracle’s Outside In File ID API, but we don’t claim that they fail to identify executables. Although we might claim that their accuracy is low. One of our customers performed this test, not us.
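To illustrate the kind of spoof described above, here is a small Python sketch contrasting a two-byte magic check with a slightly stricter one that follows the DOS header’s pointer at offset 0x3C to the “PE\0\0” signature used by Windows executables. Both function names are hypothetical, and neither is claimed to be how Outside In or File Investigator identifies files.

```python
import struct


def naive_is_executable(data: bytes) -> bool:
    """Magic-number check only: does the file start with 'MZ'?"""
    return data[:2] == b"MZ"


def stricter_is_pe(data: bytes) -> bool:
    """Also follow the pointer at offset 0x3C and require the PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# A plain text file that merely begins with the letters "MZ".
text_file = b"MZ are my initials. " * 10

# A minimal made-up header: 'MZ', zero padding, pointer to 0x40, PE signature.
fake_pe = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40) + b"PE\x00\x00"

print(naive_is_executable(text_file), stricter_is_pe(text_file))
print(naive_is_executable(fake_pe), stricter_is_pe(fake_pe))
```

The stricter check rejects the text file, but the second sample shows that it can be spoofed as well with a few more bytes, and it would also reject old DOS-only executables. More checks raise the cost of spoofing; they do not eliminate it.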
After obtaining some of the random files that people have been using to spoof headerless encrypted files, we are seeing some subtle differences. This will help us fine-tune our method. 512 byte boundaries are one idea, although that is probably only specific to TrueCrypt. We don’t want to exclude Folder-Lock and other headerless encrypted file types. After adding some additional test conditions, we plan to create an additional file type of “Random Data (possibly encrypted)” as a fallback for the headerless encrypted files that fail our new tests. This will allow us to fine-tune the encrypted data tests and move all failure conditions to the new file type in our database.
Unfortunately, we can’t reveal the methods we are using, because they are proprietary and are part of the value that we provide over competing products.
Rob
It seems that the so-called identification of TrueCrypt containers by File Investigator TOOLS is simply a problem of clearly expressing what that “identification” means. You cannot say “I can see that this is a headerless random data file, so it must be a TrueCrypt container… what else could it be?” Even worse, you can assign some probability to this statement, like “90%”, to make it sound more persuasive. Nice, but this is not any kind of proof; it is a logical fallacy.
Borat,
Our product does not identify headerless random data files, nor conclude that they are TrueCrypt containers. When a file looks like a headerless encrypted file, then it identifies it as such. In the background comments, attached to that identification, we explain that some headerless random data files can cause a false positive for this identification. When we identify a “TrueCrypt” file we have additional means of confirming that identification, and no one has been able to generate a false positive on that identification. The TrueCrypt files that cannot be identified with these additional means are labeled as “Headerless Encrypted Data”.
Some people have identified some application types that may place headerless random data files on a hard drive, for tasks like bench testing a network connection, wiping a disk, mathematical modeling, etc. It’s so easy to generate random data on the fly, so why would a software developer bother to store a file of it on their hard drive? In my opinion, those are sloppy and inefficient development methods. Sure, some lab environments may have some of these headerless random data files, but we haven’t observed any typical user environments (or corporate environments) that include such files.
In the real world (where our tools are typically used) the files that we identify as “Headerless Encrypted Data” have a 90% (or greater) chance of actually being headerless encrypted files. The headerless random files just aren’t common in such environments.
I understand that people loyal to TrueCrypt, and those who study these issues in a lab, are eager to take punches at our claim, but none of them have been able to prove that our tool doesn’t do what it says it does in the real world.
My challenge to you is to download a trial version of our tool (http://www.forensicinnovations.com/downloads.html), search outside your lab environment for real world environments that already contain headerless random data, then notify us of your results. If the issues that you raise are so simple to prove, then this shouldn’t be hard.
If you can’t do this simple task, then stop bothering us with your negative rhetoric.
Best regards,
Rob Zirnstein
Forensic Innovations
After reading the “TC Hunt vs FI TOOLS” post (http://www.artiflo.net/2009/05/detecter-truecrypt-tchunt-vs-fi-tools/comment-page-1/), I left the following comments on that site:
FI TOOLS was developed with no knowledge of TC Hunt. We used our own technologies to accomplish the ability to identify encrypted files and, in some cases, that a headerless encrypted file was created with TrueCrypt. The methods that TC Hunt uses (at least in version 1.0) are clearly described, and they are completely different than our methods. To assume that we copied them is wrong.
TC Hunt identifies files as being TrueCrypt or not. Our software identifies 3,300+ different types of files. One of those is Encrypted Data (Headerless), another is TrueCrypt Data. The two false positives in the TC Hunt test appear to have been incorrectly labeled as TrueCrypt. In the FI TOOLS test, those files appear to have been correctly identified as Encrypted Data (Headerless). Since the author admitted that those files actually are encrypted, with a method other than TrueCrypt, they should be counted as correct positives for FI TOOLS. That moves FI TOOLS to 7 correct identifications out of 8 files, and TC Hunt to 5 correct identifications out of 8.
I do not see any proof of the jouib6.sys file being identified incorrectly by FI TOOLS. Please make all of these test files available for testing by other people and/or provide all of the details about the files tested.
The 5+ minutes required for FI TOOLS to identify these files indicates that the test may have been run on a slow computer (or slow virtual environment). FI TOOLS has to read the entire file for some file types, when configured to read beyond the first megabyte, in order to maintain our high accuracy. I didn’t think that slower configuration was the default, but I will look into it. Would the author/tester please document any configuration changes that they made before the test?
Since these two tools use completely different methods, I recommend that FI TOOLS be used to identify all of the files on a hard drive, then use TC Hunt to get a second opinion on the files that FI TOOLS identifies as Encrypted Data (Headerless). The files that FI TOOLS identifies as TrueCrypt Data are identified with even higher accuracy and should not require the use of TC Hunt. The TrueCrypt identification in FI TOOLS works on Formatted Dynamic TrueCrypt files. All other TrueCrypt files are identified as Encrypted Data (Headerless).
Rob Zirnstein
Forensic Innovations, Inc.
Hey,
Your program is excellent! But the PGP private key detection is (maybe) not complete (or missing? :).
I have some (test) private keys, but FIFF doesn’t recognize those keys.
Here is one sample:
test_bob_sec.asc:
-----BEGIN PGP PRIVATE KEY BLOCK-----
Version: GnuPG v1.4.9 (MingW32)
lQO+BEpBQHEBCACZa53goskhFclIUhyvjaIfIrAwANYRjVjIuU259zVOwFsRwPOq
…
blablabla
…
jJm/2GBdlcWE0dXxL7gh3ZD6
=N79J
-----END PGP PRIVATE KEY BLOCK-----
FIFF says: “This is a simple Text file.” … (It probably isn’t a difficult job to recognize it 🙂)
You should fix it (I think) or add this feature to the list.
Because the PGP private key is a VERY important thing in the analysis.
Regards, and nice job!
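As the commenter suggests, the ASCII-armored form is straightforward to spot because the armor header and footer are fixed strings. The Python sketch below is a minimal illustration; `contains_armored_private_key` is a hypothetical helper and says nothing about how File Investigator implements or will implement this file type. The sample body is a placeholder, not real key material.

```python
ARMOR_BEGIN = "-----BEGIN PGP PRIVATE KEY BLOCK-----"
ARMOR_END = "-----END PGP PRIVATE KEY BLOCK-----"


def contains_armored_private_key(text: str) -> bool:
    """True if text has a BEGIN line followed later by a matching END line."""
    lines = [line.strip() for line in text.splitlines()]
    try:
        start = lines.index(ARMOR_BEGIN)
    except ValueError:
        return False
    return ARMOR_END in lines[start + 1:]


# Placeholder sample: the body between the armor lines is not real key data.
sample = "\n".join([
    ARMOR_BEGIN,
    "Version: GnuPG v1.4.9 (MingW32)",
    "",
    "placeholder-base64-body",
    ARMOR_END,
])

print(contains_armored_private_key(sample))
print(contains_armored_private_key("Just a simple text file."))
```

This only recognizes the text wrapper. The binary form asked about in the reply has no such armor lines and would need actual packet parsing.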
user22,
Thank you and good catch! We actually appreciate when people find files that we identify incorrectly. Providing samples of these files gives us the opportunity to improve our products. I will make sure that we add the PGP Private Key Block (ASCII) to our next release of File Investigator (v2.24.01).
Do you know where we can find a sample of the binary version of the PGP Private Key Block? We don’t support that one either.
If you, or anyone else, find other file types that we aren’t doing a good job with, please send us sample files to reverse engineer and/or the file format specification at Support@ForensicInnovations.com.
Thank you,
Rob Zirnstein
President
Forensic Innovations
So what you are basically saying is that your software detects “headerless random data” in a real world environment, under the assumption that a “real world” user (however that may be defined) won’t have other random files on their computer, because most people around the world have files with headers on their computer. It sounds reasonable in a generalized sense; however, it doesn’t seem more useful than the search tool of whatever OS is being used.
Hi,
I don’t really have the time to test.
But, won’t FI detect any sparse file containing random data as a dynamic TrueCrypt volume?
You might say “Yeah, but who would keep such files around?”; and I might say that I could do this myself just to test the performance of NTFS sparse files.
So, my opinion is (and probably that of any trained lawyer) that anything short of a serious bug in TrueCrypt causing it to output its name (“TrueCrypt”) in the header of the file (or something equally serious) will not constitute a serious proof that someone is using encryption.
If a user would be required to use a complicated procedure to create a file similar to a TrueCrypt dynamic container this might constitute a more serious proof against someone.
But any file that is generated by Truecrypt can also be generated using a simple Windows or bash one-line command.
Mr. “whatever”,
It’s not that simple, and don’t confuse random data files with headerless data files. Just because a file lacks a header doesn’t mean that it contains random data.
Our software identifies thousands of different file types by means of various pattern recognition methods. One of those methods converts the file’s contents into a proprietary pattern that looks something like a histogram. We use this method to identify raw data files that contain no header, magic ID or signatures. When applying this method to headerless encrypted data, we see an identifiable pattern appear. We use this pattern to identify headerless encrypted files.
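The proprietary method is not disclosed, so the following Python sketch is emphatically not it. It only shows one well-known, generic histogram-style test that readers can experiment with: count how often each of the 256 byte values occurs and compute a chi-square statistic against a perfectly flat distribution. The function names are hypothetical.

```python
import os
from collections import Counter


def byte_histogram(data: bytes) -> list:
    """Return a 256-entry list with the count of each byte value."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def chi_square_uniform(data: bytes) -> float:
    """Chi-square statistic of the byte histogram against a uniform distribution."""
    if not data:
        raise ValueError("need at least one byte")
    expected = len(data) / 256
    return sum((obs - expected) ** 2 / expected for obs in byte_histogram(data))


random_blob = os.urandom(1024 * 1024)
text_blob = b"hello world" * 1000

# For uniformly random bytes the statistic stays near 255 (the degrees of
# freedom); structured data such as text produces a far larger value.
print(f"random data: {chi_square_uniform(random_blob):.1f}")
print(f"text data:   {chi_square_uniform(text_blob):.1f}")
```

A flat histogram is what both good random data and well-encrypted data are expected to produce, which is the crux of the disagreement in this thread: a generic test of this kind flags "looks random", not "is encrypted".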
Yes, some random data files can spoof this identification method. We have not encountered any such random data files on our test systems nor in our customers’ production environments. Some of our customers are large eDiscovery service providers that process terabytes of data each day. Their data comes from “real world” cases and investigations. I think it says a lot that none of these customers have reported any problem with this identification method. If they aren’t running into these random data files, then I don’t see a problem with our methods.
Taking all of our customers’ data into account, we have achieved greater than 99% accuracy on detecting encrypted data.
Rob Zirnstein
Forensic Innovations
Jeam,
No, our identification of “TrueCrypt Formatted Dynamic Files” has not been spoofed by any headerless random data files. This identification uses additional methods to confirm its accuracy. What can be spoofed by random data files is our identification of “Encrypted Data (Headerless)”. This identification includes files created by TrueCrypt (not Formatted Dynamic files), Folder-Lock and potentially other encryption applications that encrypt their headers and other file structures.
NTFS Sparse files are regular files that utilize a feature in NTFS that allows their unused portions (zeroed out) to be utilized as unused disk space by other files. It’s like another form of disk fragmentation, but when seeking to a sparse point in one of these “sparse” files the OS feeds all zeros to an application as if the file is still using that disk space. This OS trick doesn’t have any effect on our identification process. If a file is all zeros, then it doesn’t contain random data. When a file contains non-zeroed data, then we identify it as the type of data that it contains. When zeroed out sparse space is encrypted or compressed, then we identify it as such.
Our identification of “Encrypted Data (Headerless)” is intended to identify data that is probably encrypted and being hidden. If it is not actually encrypted data, then it is probable that someone is trying to spoof encrypted data and interfere with the investigation.
As with any file identification method, you cannot obtain 100% accuracy. There are too many similar file types and variations of file types created by software developers. This is why we only claim up to a 99% accuracy rate on any of our methods, and we only claim up to a 90% accuracy rate on our identification of “Encrypted Data (Headerless)”. It is unrealistic to expect better than that.
Rob Zirnstein
Forensic Innovations
I thought the obvious way to decommission large amounts of data, possibly containing privileged information, is to encrypt their container (for instance with TrueCrypt) with a suitably large and uninteresting key that one doesn’t bother to write down or remember. One could of course try to find a reliable source of irreproducible data, but this seems easier, more user-friendly and, for most purposes, secure enough.
Koos Dering
Privately