In this episode of SecuraTip, we demo the use of tekCollect. tekCollect is a Python tool written by @TekDefense that is useful for scraping data (IP addresses, MD5 hashes, SSNs, emails, etc.) from URLs and files. This episode highlights several use cases for the tool, while also exploring advanced features such as custom regex scrapes. To download tekCollect, go to: http://www.tekdefense.com/tekcollect/
As I mentioned in my post over at TekDefense, the number of data dumps being published has increased heavily, mostly because of the Anonymous #OpIsrael activity. With that in mind, I figured this would be a great time to talk about my process for finding, gathering, and cracking hashes. I should note that I am a hobbyist in this arena, so don’t take my word as gospel here. Test my methods and develop a solution that fits your needs.
There are many ways to find database dumps. Here are a few:
- PasteLert: Created by Andrew Mohawk, PasteLert will index Pastebin and alert you to items that match whatever query you have. I have it alert for anything matching the MD5 for 123456 as this is the most common password.
- DumpMon: DumpMon monitors for data dumps and reports them via Twitter.
- Leaks-DB: http://leaks-db.hacktalk.net/ collects data dumps and has them available via their website.
- PastebinDorks: PastebinDorks is another great Twitter account to follow for interesting pastes.
All of these sources are great. Just monitor them and watch for links to the data breaches.
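If you want to build the same kind of alert query I use with PasteLert, you need the MD5 of the password you are watching for. Here is a minimal sketch using Python's standard hashlib module. The helper name md5_hex is my own, and it assumes the dump contains plain unsalted MD5 hashes.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the unsalted MD5 digest of a password as lowercase hex."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# The MD5 of the very common password "123456"
print(md5_hex("123456"))  # e10adc3949ba59abbe56e057f20f883e
```

Any paste containing that string very likely holds unsalted MD5 password hashes, which is why it makes a good search term.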
Collecting the Hashes
Usually the files or URLs that have hashes in them also contain other data. The following screenshot is an example of the typical format:
The way this is formatted, we could pull out the MD5s with the cut command in Linux. In other dumps, though, the format is so inconsistent that we would not be able to grab the MD5s easily. A tool I created (tekCollect) can grab them with ease. Download tekCollect from http://www.tekdefense.com/tekcollect/. tekCollect can grab specified data types from a file or URL. In this case I will use the URL option:
~/workspace/Automater# ./tekCollect.py -u http://pastebin.com/raw.php?i=S6wCigZ5 -t MD5
59101d2acb7cdca8d7c98e352d6c9aae
02f679c21391498bcaf57cb6557971d5
c58893a76460232c87964bae8c377ac9
94f82a0a2cff100088a30cf21e41c171
6603f8a8c488a1c711cf0ee962eb95bf
efa92594115ae50298c1dd62e7e7c4d2
378f6abfbd84c8193ae55bea03b53353
48b345f1f0fa0dec405ff326f97f42e1
e33404ec91666f3202c9453d68a27122
8c9470528fddb355d4b69e5efd8ba373
fd2ea7b90d9472ab0105d397952dc48e
2aeba10361a4cd4ad21a82cec540ad0b
50c6734ea53ea6a0c9a11ec0cfc6f0d3
698a2254130aa105df48fe2efa72cca5
1142b9cb231590a18ec4dd7171888fba
0122c0cdd39f2e74855c099505464842
c9a23c09c48009f7666f1380ce5384e4
d8818583ff93ad4e011289e9aef494bf
afa7bf27ba3fffc9327ed8e6f92215d9
c655c48409c2bda89349e1ec1b823aeb
149372cfa4c2acef25d6b6bc994f9527
1790d869e27df56b802290cb4ca50155
aafaf9248a9a516ed3e5f5ced37094e4
6f0312c2af574711251cb32f31c487a3
5d56aefe59f299e4acee2fb969d0980f
4be96cbe0c926116cc8f1dba9235ccde
fe2956cb48faa3227105b94a3cb7f27d
127a5c68d5a9f0bcadce2e2b6549e6fb
8d5d97185b9b285ebc4078e2b23af7b4
683d7793b4df3e7f67d7dc4b92e0e746
c87962996d8357563a62ce90b4f71aa6
e4606352ee8e08a62f057abc70fcc1e3
93018f4fa70dec793121cf95811349ca
45bdc176003729fe5b908e30eb03cc3d
0afd468ad9eeb023011b600ef848f4d7
dd0cb976264b4160e086ab2831423a15
0b2857bc6c8f8a9040b95c46796d29bc
With the -o option, you can have tekCollect write the results to a file.
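To give a feel for how this kind of scrape works, here is a rough sketch of regex-based MD5 extraction. This is not tekCollect's actual code: the pattern, the extract_md5s function, the de-duplication, and the sample rows (with made-up example.com addresses) are all my own assumptions for illustration.

```python
import re

# Assumed pattern: exactly 32 hex characters that are not part of a
# longer hex run (so 40-character SHA1 hashes are not half-matched).
MD5_PATTERN = re.compile(r"(?<![0-9a-fA-F])[0-9a-fA-F]{32}(?![0-9a-fA-F])")


def extract_md5s(text):
    """Return the unique MD5-looking strings in text, in first-seen order."""
    seen = set()
    results = []
    for match in MD5_PATTERN.findall(text):
        digest = match.lower()  # normalize case before de-duplicating
        if digest not in seen:
            seen.add(digest)
            results.append(digest)
    return results


# Hypothetical dump rows in a typical "id | user | hash | email" layout
sample = (
    "1 | admin | 59101d2acb7cdca8d7c98e352d6c9aae | admin@example.com\n"
    "2 | guest | 02F679C21391498BCAF57CB6557971D5 | guest@example.com\n"
    "3 | admin2 | 59101d2acb7cdca8d7c98e352d6c9aae | dup@example.com\n"
)

for digest in extract_md5s(sample):
    print(digest)
```

Because the regex ignores the surrounding layout, the same approach works whether the dump is pipe-delimited, comma-delimited, or a complete mess, which is where cut falls down.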
My usual process for cracking hashes is to check the hashes against a wordlist, and for those that can’t be found with the wordlist, I attempt to bruteforce them using hashcat masks.
As I am doing my cracking within a VM, and not on a physical machine with hardcore GPUs, I use traditional hashcat instead of oclHashcat.
When you run hashcat it will hash your wordlist and then compare those hashes to the hashes you want to crack:
hashcat -o crackedhashes.out demohashes.out 1aN0rmusWL.txt
With this command, we were able to crack 33 of the 413 passwords in less than 5 seconds.
Now you may be wondering why only 33. The main reason is that this dump was part of #OpIsrael, and my dictionary, being primarily English, will not match passwords based on words from other languages. Notice how all the hashes that were found translated to passwords consisting of only numbers.
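The wordlist comparison described above can be sketched in a few lines of Python. This only illustrates the idea and is nowhere near as fast as hashcat. The dictionary_attack function and the tiny in-memory wordlist are hypothetical, and it assumes unsalted MD5.

```python
import hashlib


def dictionary_attack(target_hashes, wordlist):
    """Hash each candidate word and report the ones found in target_hashes.

    Returns a dict mapping cracked hash -> plaintext.
    """
    targets = {h.lower() for h in target_hashes}  # set gives fast lookups
    cracked = {}
    for word in wordlist:
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        if digest in targets:
            cracked[digest] = word
    return cracked


# Hypothetical targets: two are in the wordlist, one is not
targets = [
    hashlib.md5(b"123456").hexdigest(),
    hashlib.md5(b"letmein").hexdigest(),
    hashlib.md5(b"not-in-the-list").hexdigest(),
]
words = ["password", "123456", "letmein", "qwerty"]

for digest, plain in dictionary_attack(targets, words).items():
    print(f"{digest}:{plain}")
```

The third hash stays uncracked for the same reason most of my 413 did: if the password is not in the wordlist, this approach can never find it.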
No matter how good your wordlists are, you are never going to catch all of the passwords. There may come a time when you will want to bruteforce your way in. This means trying every possible combination in an attempt to determine the password. Luckily, password analysis tools like Pipal show us that most passwords follow certain formats. For instance, we understand that most passwords consist solely of lowercase letters. Using hashcat masks, we can tell hashcat the format we want it to look at.
For this attempt I am going to bruteforce all combinations of 8 digits:
hashcat -a 3 -o crackedhashes2.out demohashes.out ?d?d?d?d?d?d?d?d
Now, we have captured 46 more hashes. By experimenting with the hashcat masks, you will be able to bruteforce your way into a good number of these.
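To show what a mask actually expands to, here is a simplified sketch of a mask attack in Python. It only understands the ?d, ?l, and ?u placeholders, treats every other character as a literal, and does not handle hashcat's other mask features. The function names are my own, and the demo uses a 4-digit mask so it finishes instantly (8 digits means 100,000,000 candidates).

```python
import hashlib
import itertools
import string

# Subset of hashcat's built-in charsets; other placeholders are not handled
CHARSETS = {
    "?d": string.digits,
    "?l": string.ascii_lowercase,
    "?u": string.ascii_uppercase,
}


def expand_mask(mask):
    """Yield every candidate string described by a mask such as '?d?d?d?d'."""
    positions = []
    i = 0
    while i < len(mask):
        token = mask[i:i + 2]
        if token in CHARSETS:
            positions.append(CHARSETS[token])
            i += 2
        else:
            positions.append(mask[i])  # literal character
            i += 1
    for combo in itertools.product(*positions):
        yield "".join(combo)


def mask_attack(target_hashes, mask):
    """Try every candidate from the mask against unsalted MD5 targets."""
    targets = {h.lower() for h in target_hashes}
    cracked = {}
    for candidate in expand_mask(mask):
        digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
        if digest in targets:
            cracked[digest] = candidate
    return cracked


# Hypothetical target: a 4-digit numeric password
target = hashlib.md5(b"2013").hexdigest()
print(mask_attack([target], "?d?d?d?d"))
```

Each extra position multiplies the search space by the size of its charset, so it pays to pick masks that match the formats you expect to see.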
Like I said at the beginning of the post, I am not an expert in password cracking, but more of a hobbyist. Using my wordlists and hashcat masks, I am usually able to get 60% to 80% of the passwords from a dump. What is your method and how successful is it?