.-::Analysis of brute force & dictionary attacks.::-.
by Zapotek
zapotek[at]segfault.gr
Website: http://segfault.gr
[ 0x00 ] Chapters
================================================
[ 0x01 ] Prologue
[ 0x02 ] Terms & Conventions
[ 0x03 ] Tools, Environment & Prerequisites
[ 0x04 ] Basic Theory
[ 0x05 ] ASCII Codes
[ 0x06 ] Hashes
[ 0x07 ] Brute Force Attack
[ 0x08 ] Theory
[ 0x09 ] Source Code Analysis
[ 0x0a ] Dictionary Attack
[ 0x08 ] Theory
[ 0x09 ] Source Code Analysis
[ 0x0d ] Famous Crackers
[ 0x0e ] Pros and Cons of the 2 attacks
[ 0x0f ] Epilogue
================================================
[ 0x01 ] Prologue
===============================================
A lot of discussion has taken place for a long time in password cracking techniques.
In this paper we will analyze the two most commonly used techniques.
The first one being "Brute Force Attacks" and the second "Dictionary Attacks".
We will start from the hardest and most effective one, brute force.
After that, dictionary attacks will look like a piece of cake.
I hope that you will find the paper interesting and enlightening.
[ 0x02 ] Terms & Conventions
============================================
Blocks between:
"**********************************************************************"
contain code.
Blocks between:
"---------------------------------------------------------------------------------------------"
contain preformatted or special text.
Cryptography:
=============
Cryptography (or cryptology; derived from Greek ....... krypt�s "hidden," and ....... gr�fein
"to write") is a discipline of mathematics concerned with information security and related issues,
particularly encryption, authentication, and access control. Its purpose is to hide the meaning
of a message rather than its existence. [ http://en.wikipedia.org/wiki/Cryptography ]
Encryption:
===========
In cryptography, encryption is the process of obscuring information to make it unreadable without
special knowledge. [ http://en.wikipedia.org/wiki/Encryption ]
[ 0x03 ] Tools, Environment & Prerequisites
============================================
In oder for the examples presented here to be functional your system must comply with the following:
Operating System:
=================
Linux with a 2.6.x kernel.
Other versions might work, I can't guarantee anything though.
Get it from: http://www.kernel.org/
Compiler:
=========
gcc 4.1.0 and higher
I'm pretty sure any other version will work, it's just that v.4.1.0 is what I used to develop the examples.
Get it from: http://gcc.gnu.org/
PHP:
====
PHP 5.1.2 will be used for some examples.
I'm pretty sure any other version will work.
Get it from: http://www.php.net/
Chances are you already have all that software installed, but better make sure.
[ 0x04 ] Basic Theory
============================================
Ok, that's fairly easy to grasp, even though I'm a lousy teacher. :P
Let's move on.
[ 0x05 ] ASCII Codes
============================================
You probably already know this but it doesn't hurt to review it.
Computers don't know characters, they only understand binary, "0"s & "1"s.
For our comfort, though, we have fashioned other systems too, Hexadecimal, Decimal, Octal etc.
It would be pretty hard for a human to read binary, even decimal, that's when ASCII comes into play.
ASCII, American Standard for Information Interchange, codes represent characters and are expressed
in Decimal, Hexadecimal etc.
We will work with decimal codes here.
An example follows:
"A"=65, "B"=66 ... "Z"=90
So "ZAPOTEK" = 90 65 80 79 84 69 75 .
That's very useful when it comes to generating passwords, you'll see why in later chapters.
[ 0x06 ] Hashes
============================================
That's significantly harder than the ASCII codes.
Long story short, a hash is a "fingerprint" of some piece of data.
In general, a lot of hash functions exist, the most popular are MD5, SHA1, CRC.
What makes hashes so important is that they are one-way, at least they are supposed to be.
For example, the word "hash" has an MD5 hash of "0800fc577294c34e0b28ad2839435945".
And the fact that it is one-way means that we cannot figure out what the encrypted data was from it's hash.
Hashes are also used as checksums, meaning that we can use them to check if transported data is not
corrupted in some way. Even if a byte is missing from the data the hash will differ from the original one.
That comes in handy in password validation too, you'll see.
[ 0x07 ] Brute Force Attack
============================================
Remember when I said that we cannot recompute the data from the hash?
That's was true, the only thing we can try is bruteforcing.
Say, we have the password "hash", which converts to "0800fc577294c34e0b28ad2839435945".
The only way we can figure out the password from the hash is to brute force it.
[ 0x08 ] Theory
============================================
Brute forcing is the method of testing random strings against a given hash until we find a match.
The way string generation works is that we generate a random number between the ASCII codes "65" to "90",
that's for capital, English, characters, and then convert it to it's ASCII equivalent character
By adding up characters we generate our random string.
Looping through random strings until we exhaust the keyspace is the hard part because the keyspace increases
dramatically depending on the characters in the string, it's length, and the HASH function used.
Generally the following keyspace formula applies:
---------------------------------------------------------------------------------------------
keyspace = characters_used ^ string_length
---------------------------------------------------------------------------------------------
characters_used = the number of different characters used in the string
string_length = the amount of characters in a string
Worst case scenario is:
---------------------------------------------------------------------------------------------
keyspace = 94 ^ a_very_big_number
---------------------------------------------------------------------------------------------
94 ensues from the number of printable ASCII characters.
In case you didn't notice, keyspace is equal to the number of possible generated strings for the
given character set and string length.
So, looping until we exhaust the keyspace ensures that we will find a match.
If however the keyspace is exhausted and we have no match, either the string length is wrong,
or the character set is different from what was expected.
That's pretty much the gist.
[ 0x09 ] Source Code Analysis
============================================
This is a John the Ripper kind of bruteforcer in PHP:
bruteforcer.php:
*********************************************************************************************
<?php
/*
* @author: zapotek <zapotek[at]segfault.gr>
* @version: 0.1
* @name: MD5/SHA1 BruteForcer
* @description: A sample MD5/SHA1 bruteforcer in PHP, inspired from JTR.
*/
error_reporting(0);
echo "MD5/SHA1 Bruteforcer\n".
"by Zapotek <zapotek [at] segfault.gr\n\n";
// is we have insuficient parameters return usage
if ($argc!=3){
echo "Usage:
".$argv[0]." <hash> <lenght>
<hash> The MD5/SHA1 hash
<lenght> The estimated length of the encrypted string\n";
exit(1);
}
array_shift($argv);
$hash = array_shift($argv); // get the hash
$len = (int) array_shift($argv); // get the hash length
$start = strtotime ("now");
$start2 = strtotime ("now");
$keyspace = pow(75,$len); // compute keyspace
// decide the encryption algorithm
switch (strlen($hash)) {
//If the Hash is 32 chars long it's a MD5 Hash
//(Only for HEX encoded hashes)
case 32;
$algo="MD5";
break;
//If the Hash is 40 chars long it's a SHA1 Hash
//(Only for HEX encoded hashes)
case 40;
$algo="SHA1";
break;
//Else print error msg
default;
echo "Could not determine the encryption algorithm.\n";
echo "Ensure that the Hash is correct and try again.\n";
exit();
}
// generate initial key
$key = "";
for ($y=0;$y<$len;$y++){
$key .= "0";
}
// return some info to the user
echo "Specified string length: $len characters\n".str_repeat("-",65).
"\n$algo hash: $hash\n".str_repeat("-",65).
"\nKeySpace: $keyspace characters\n".str_repeat("-",65)."\n";
// loop through the keyspace
for ($x=0;$x<$keyspace;$x++){
// generate random string
for ($y=0;$y<$len;$y++){
// if we haven't reached "z" yet, move on to the next character
if ($key[$y] != "z"){
// create character from number
$key[$y] = chr(ord($key[$y])+1);
// zero the rest of the string out
if ($y > 0){
for ($z = 0; $z < $y; $z++){
$key[$z] = "0";
}
}
break;
}
}
// create hash from random string
$gen_hash = ($algo=="MD5")? md5($key):sha1($key);
// if the hashes match we have a winner!
if($hash==$gen_hash){
// inform the user we cracked the hash
echo "\nDecrypted string: $key\n".
str_repeat("-",65).
"\nOperation took: ".
date("H:i:s",mktime(0,0,strtotime("now")-$start2)).
"\n".str_repeat("-",65)."\n";
// and exit
exit(0);
}
// print out some statistics
if ($x % 24000 == 0){
$x2++;
if ($x2 == 4){
$x2 =0;
$time = strtotime ("now") - $start;
$start = strtotime("now");
if ($time==0) $time=1;
$rate = (24000 *4) / $time;
echo " $x/$keyspace ($key) [$rate Keys/sec]".
" [".round(100-(($keyspace-$x)/$keyspace)*100,3)."%]".
" [".gmdate("H:i:s", round((($keyspace-$x)/$rate),3))." left]\n";
}
}
}
// if the keyspace loop is finished with no match inform the user we failed
echo "\nKeyspace exhausted.\n".
"Please check the provided lentgh and try again.\n"
?>
******************************************************************
Lets give this baby a trial run.
For an MD5 hash:
---------------------------------------------------------------------------------------------
zapotek@lil-z:~/Documents> php bruteforcer.php 900150983cd24fb0d6963f7d28e17f72 3
MD5/SHA1 Bruteforcer
by Zapotek <zapotek [at] segfault.gr
Specified string length: 3 characters
-----------------------------------------------------------------
MD5 hash: 900150983cd24fb0d6963f7d28e17f72
-----------------------------------------------------------------
KeySpace: 421875 characters
-----------------------------------------------------------------
72000/421875 (1l<) [96000 Keys/sec] [17.067%] [00:00:03 left]
168000/421875 (1qM) [96000 Keys/sec] [39.822%] [00:00:02 left]
264000/421875 (1v^) [96000 Keys/sec] [62.578%] [00:00:01 left]
Decrypted string: abc
-----------------------------------------------------------------
Operation took: 00:00:03
-----------------------------------------------------------------
---------------------------------------------------------------------------------------------
For a SHA1 hash:
---------------------------------------------------------------------------------------------
zapotek@lil-z:~/Documents> php bruteforcer.php a9993e364706816aba3e25717850c26c9cd0d89d 3
MD5/SHA1 Bruteforcer
by Zapotek <zapotek [at] segfault.gr
Specified string length: 3 characters
-----------------------------------------------------------------
SHA1 hash: a9993e364706816aba3e25717850c26c9cd0d89d
-----------------------------------------------------------------
KeySpace: 421875 characters
-----------------------------------------------------------------
72000/421875 (1l<) [96000 Keys/sec] [17.067%] [00:00:03 left]
168000/421875 (1qM) [96000 Keys/sec] [39.822%] [00:00:02 left]
264000/421875 (1v^) [96000 Keys/sec] [62.578%] [00:00:01 left]
Decrypted string: abc
-----------------------------------------------------------------
Operation took: 00:00:03
-----------------------------------------------------------------
---------------------------------------------------------------------------------------------
Well, that's easy, lets give it a bit of a challenge.
Let's see what happens with "fb50df9b6b86db51569f2bab6457a24e" (= "zapo")
(I'd give it "zapotek" but the keyspace would get exhausted at around 6hours, so...)
---------------------------------------------------------------------------------------------
zapotek@lil-z:~/Documents> php bruteforcer.php fb50df9b6b86db51569f2bab6457a24e 4
MD5/SHA1 Bruteforcer
by Zapotek <zapotek [at] segfault.gr
Specified string length: 4 characters
-----------------------------------------------------------------
MD5 hash: fb50df9b6b86db51569f2bab6457a24e
-----------------------------------------------------------------
KeySpace: 31640625 characters
-----------------------------------------------------------------
72000/31640625 (1l<0) [96000 Keys/sec] [0.228%] [00:05:28 left]
168000/31640625 (1qM0) [96000 Keys/sec] [0.531%] [00:05:27 left]
264000/31640625 (1v^0) [96000 Keys/sec] [0.834%] [00:05:26 left]
.............
26664000/31640625 (1D?o) [96000 Keys/sec] [84.271%] [00:00:51 left]
26760000/31640625 (1IPo) [48000 Keys/sec] [84.575%] [00:01:41 left]
26856000/31640625 (1Nao) [96000 Keys/sec] [84.878%] [00:00:49 left]
Decrypted string: zapo
-----------------------------------------------------------------
Operation took: 00:03:57
-----------------------------------------------------------------
---------------------------------------------------------------------------------------------
Pretty cool huh?
By the way, the time reported at the right is how much it will take for the script to try all
the possible combinations and not necessarily the exact time it'll take to crack a given hash.
The comments in code make it easy to grasp so I won't explain it line-to-line, so I'll just paint
you the big picture in steps.
1) Read the hash & estimated string length from user input
2) Decide hash type depending on it's length (given that the hash is hexadecimal)
3) Compute the keyspace (75^estimated string length)
4) Create an initial key containing only "0"s
5) Loop until we have reached the end of the keyspace
6) Loop until we have crated a random string
7) Compare the random string's hash with the one provided by the user
8) If we have a match inform the user and exit, else keep looping
9) If we have exhausted the keyspace tell the user we were unable to find a matching hash
That's pretty much it...
[ 0x0a ] Dictionary Attack
============================================
If you understood the previous chapter this will be a walk in the park.
[ 0x08 ] Theory
============================================
Dictionary attacks are like brute force attacks, but significantly easier.
Instead of creating our own random strings we read them from a dictionary and then test them against
a given hash. And, instead of looping until the end of the keyspace, we loop until the end of file.
That's all there is to it.
Now, to some code.
[ 0x09 ] Source Code Analysis
============================================
dicattack.php:
*********************************************************************************************
#!/usr/bin/php
<?php
/*
* @author: zapotek <zapotek[at]segfault.gr>
* @version: 0.1
* @name: DicAttack
* @description: A sample dictionary attacker in PHP.
*/
$supported= "MD5/SHA1";
$name= array_shift($argv);
$wordlst= array_shift($argv);
$hash= array_shift($argv);
// timer
function getmicrotime() {
list ($usec, $sec)= explode(" ", microtime());
return ((float) $usec + (float) $sec);
}
// the function that does the cracking
function crack($algo) {
global $wordlst, $hash;
// start the timer
$time_start= getmicrotime();
// read the dictionary
$wordlist= file_get_contents($wordlst);
// do some error checking
if (!$wordlist) {
echo "Cannot access $wordlst.\n";
echo "Ensure that the file exists and try again.\n";
}else{
// put the words in an array
$words=explode("\n",$wordlist);
// create each word's hash
foreach ($words as $word) {
// create the word's hash
switch ($algo) {
case "md5";
$word_hash= md5($word);
break;
case "sha1";
$word_hash= sha1($word);
break;
}
// compare the hashes
if ($word_hash == $hash) {
// stop the timer
$time_stop= getmicrotime();
echo "Crack Successful!\n"."-----------------\n";
echo "$hash = $word\n"."-----------------\n";
$time= $time_stop - $time_start;
echo "[ Operation took $time seconds ]\n\n";
exit();
}
}
}
}
// in case of insuficient arguments return usage info
if ($argc != 3) {
stamp();
exit();
}
// decide the hash type
// given that the hash is in hexadecimal format
switch (strlen($hash)) {
// if the hash is 32 bytes it must be a MD5 Hash
case 32;
echo "\nGuessing MD5...\n\n";
crack("md5");
break;
// if the hash is 40 bytes it must be a MD5 Hash
case 40;
echo "\nGuessing SHA1...\n\n";
crack("sha1");
break;
// else return an error msg
default;
echo "Could not determine the encryption algorithm.\n";
echo "Ensure that the Hash is correct and try again.\n";
break;
}
// the usage stamp
function stamp() {
global $supported, $name;
echo "\nDicAttack\n-------\n";
echo "Simple Wordlist Based Password Cracker\n";
echo "Currently supporting $supported\n";
echo "Syntax: $name <wordlist file> <hash>\n\n";
}
echo "Couldn't find a match check your dictionary or hash and retry.\n";
?>
****************************************************************
The code is fairly simple and usage examples follow.
We will be using this file as a wordlist.
list.txt:
---------------------------------------------------------------------------------------------
a
b
c
ab
bc
abc
---------------------------------------------------------------------------------------------
For an MD5 hash:
---------------------------------------------------------------------------------------------
zapotek@lil-z:~/Documents> php dicattack.php list.txt 900150983cd24fb0d6963f7d28e17f72
Guessing MD5...
Crack Successful!
-----------------
900150983cd24fb0d6963f7d28e17f72 = abc
-----------------
[ Operation took 0.028094053268433 seconds ]
---------------------------------------------------------------------------------------------
For a SHA1 hash:
---------------------------------------------------------------------------------------------
zapotek@lil-z:~/Documents> php dicattack.php list.txt a9993e364706816aba3e25717850c26c9cd0d89d
Guessing SHA1...
Crack Successful!
-----------------
a9993e364706816aba3e25717850c26c9cd0d89d = abc
-----------------
[ Operation took 0.00048303604125977 seconds ]
---------------------------------------------------------------------------------------------
Once again, these are the steps:
1) Read the hash from user input
2) Decide hash type depending on it's length (given that the hash is hexadecimal)
3) Read the dictionary
4) Put every word in an array
5) Loop through the array of words
6) Create each words hash and compare it with the user given hash
7) If we have a match inform the user and exit, else keep looping
8) If we have exhausted the array tell the user we were unable to find a matching hash
[ 0x0d ] Famous Crackers
============================================
Ok, these scripts are far from optimized and not suited for serious cracking.
Fortunately, a lot of advanced, high-end, applications exist, a list follows:
John The Ripper
===============
Ok, who hasn't heard of that before huh?
Very fast hash cracker supporting a lot of algorithms.
Website: http://www.openwall.com/john/
Cain & Abel
===============
That's more of a security suite than just a cracker.
Sniffer, cracker, decrypter, encrypter you name it, it's got it.
It's only for windows and it can extract VERY interesting information from a system.
Website: http://www.oxid.it/cain.html
RainbowCrack
===============
Now that's a big boy toy!
The difference of RainbowCrack from other crackers is that it reads precomputed hashes
which make password cracking faster. Considering that the precomputed hash tables can
reach terrabytes in size you can imagine how many hashes you have before you even know it.
Problem is though, unless you have a cluster it is better to get the hash tables from
somewhere else than generate them yourself.
Website: http://www.antsight.com/zsl/rainbowcrack/
L0phtCrack5
===============
That's the Cadillac of password recovery tools for Windows.
It's a sad thing that @stake got bought from Symantec and stopped supporting LC.
Search for LC5, though, chances are someone has a copy buried somewhere.
LCP
===============
That's a freeware L0phtCrack clone.
Unfortunately, it runs only on Windows boxes
Website: http://www.lcpsoft.com/english/index.htm
These are the most widely used applications, give them a shot, you'll love them.
[ 0x0e ] Pros and Cons of the 2 attacks
============================================
Brute Force
=================
Pros Cons
----- -----
It is guaranteed that the hash will get cracked It might even take years for the crack to complete
It requires a lot of computation power
Dictionary Attack
=================
Pros Cons
----- -----
Faster than bruteforcing Significantly less chances to crack a hash than bruteforce
A lot of big dictionaries are freely available If the encrypted string is random no "valid" wordlist will crack it
[ 0x0f ] Epilogue
============================================
Ok, this paper got to an end.
It wasn't a big, in depth, analysis of the 2 techniques but I hope that it
served as a good starting point.
Thanks for baring through it,
Zapotek.