Password practices as a provider
Back
Let’s talk about what your password (should) look like when it’s stored by a service. This post will talk about (cryptographic) hashing algorithms, salts, peppers, and key stretching. Let’s start with our first topic, hashes.
What is a hash?
A hashing algorithm is a one-way function, taking any input such as ‘mypassword’ or ‘pass’ and converting it to a fixed-size value such as ‘34819d7beeabb9260a5c854bc85b3e44’ or ‘1a1dc91c907325c69271ddf0c944bc72’ (this is with the md5 algorithm). A hash function is:
- Deterministic: Any time a message m is hashed it creates the same digest h
- Uniform: The output should be as uniform as possible to avoid two messages, m and n, from having the same digest (collisions WILL occur, but we want them to occur as infrequently as possible)
Note: While we want collisions to be rare, there will be collisions since we’re taking a much larger set of inputs and mapping it to a smaller set of outputs
There are some extra features that are desireable in cryptographic hash algorithms as compared to search based algorithms:
- One way function: Given a digest
h
, it should be difficult to reverse the hash and find a messagem
that return the digesth
when passed through the hashing algorithm. - Avalanche effect: Any small change in the message should change the digest drastically. Ideally, changing a single bit would change half the bits in the digest. We can see this with a quick example with the md5 algorithm hashing
b
andc
which yields92eb5ffee6ae2fec3ad71c777531578f
and4a8a08f09d37b73795649038408b5f33
(60 out of 128 bits flipped from 1 bit changing).
Password should never be stored in the clear (e.g. unencrypted). In order to avoid this, the password is hashed with a cryptographic hash, and the digest stored along with the username.
Update for clarity: When you hit enter or submit, the hashed value of the password you entered is what’s compared to the digest that the service has stored. Most of the time on networked services hashing occurs on the server and your password is sent in the clear to the server. If you’re interested in who might be seeing your password in the clear, refer to this helpful article from the EFF. TLDR, don’t log in if you don’t see that HTTPS lock in the top-left of your browser. There are a few services that do hash on the client-side; specifically LastPass which uses your username as a salt, hashes the salt+password, gives that to the server, and hashes that while using an additional salt. Expect more mentions of LastPass in a future post on attacks a provider might face.
Attacking hashes
So, let’s ask ourselves, how do we discover a password given a digest? This is typically done via a rainbow table, which provides a list of precomputed hashes for a number of common passwords or character combinations. You can check out crackstation.net which uses a massive 15GB rainbow table file of a little less than 1.5 billion hashes that you can download along with a smaller 684MB file with around 64 million passwords. If you put in the md5 of any of the three previously mentioned phrases then the specified password will be returned almost immediately. Crackstation isn’t limited to md5, but can work on 16 different types of hashing algorithms. While as an administrator we’d like to imagine that our users will choose a smart password, As a user who doesn’t use a password manager, I know that many (or most) users will not choose a secure password.
Even worse, say we’re Mallory, a user on a service called ConnectedIn for job networking and we manage to get ahold of the hashed passwords of the users. The site protects the users’ password by hashing them with Sha-1, but this doesn’t stop us from running these hashes through a rainbow table which ends up giving us:
User | Digest | pass |
---|---|---|
alice | 1cfbd686487a6b2896248da8c4bbe5d6c6957847 | iloveyou |
David | 7c6a61c68ef8b9b6b061b28c348bc1ed7921cb53 | passw0rd |
Erin | c6922b6ba9e0939583f973bc1682493351ad4fe8 | 1qaz2wsx |
mallory(us) | 7fc7ddef5a0a0fbeb8716affafd202b14610df6e | ConnectedIn17 |
Frank | 7fc7ddef5a0a0fbeb8716affafd202b14610df6e | (unknown) |
Richard | 3c75e7e29ea3d5ccc22d9ab61e47366756634a26 | unknown |
Bob | 3c75e7e29ea3d5ccc22d9ab61e47366756634a26 | unknown |
We obtained Alice’s, David’s, and Erin’s passwords through the rainbow table, but the digests also tell us a password for Frank, and key info about Richard and Bob’s password. Note that the digest for Frank is the same as our(Mallory’s) digest. This means our password ConnectedIn17
will work on Frank’s account. Meanwhile, we can note that Richard and Bob have the same digest and therefore their passwords are interchangeable.
So, how can this attack be avoided? We simply add a piece of text that is unique to the user onto the password before hashing it. This piece of additional text is called a salt. Now if we have two users with the same passwords, they should end up with very different digests. While the text needs to be unique for each user, there’s no need to make it secret. It is usually stored with the hashed passwords. Since there’s no need for secrecy, you might consider simply using the username as the salt. This is generally inadvisable both for allowing easy changing of username without necessitating a password change, and the slight security concern of someone generating rainbow tables for common usernames as salts. Could someone generate rainbow tables for each salt though? Well, the salt for SHA-512 in Linux is 128 bit, meaning or around 3.4e38 salts. This is currently considered difficult enough for modern day computers to stop an attacker from generated rainbow tables for each salt.
Now we have the salt, but what about a little pepper? There is in fact a cryptographic term pepper, although it is seldom used. Pepper is the exact compliment to a salt, it is a piece of text that is the same for all users, but must be kept secret. It is obfuscated and stored separately from the salts and hashes. If an attacker compromises the salts and hashes, but the service utilizes a pepper that isn’t compromised, then an attacker with rainbow tables for every salt will still be useless as they lack the pepper. Again, peppers are an oddity in cryptography and not often seen in actual systems.
Key-stretching and Linux passwords
If you’re running a Linux system, then you probably have a file /etc/shadow
which stores the hashed passwords along with the salt and other key information following a specified pattern.
user:$algorithm$salt$digest:::::::
The algorithms in GNU/Linux are:
id | algorithm | default_number_of_iterations | modifiable_iteration_count? |
---|---|---|---|
1 | md5 | 1000 | no |
2a | blowfish | 64 | yes |
2y | blowfish(8-bit safe) | 64 | yes |
5 | sha-256 | 5000 | yes |
6 | sha-512 | 5000 | yes |
It should be noted that Blowfish isn’t supported by default in GNU/Linux, but certain distros opt to add support for Blowfish. You might notice that we mention number of iterations in this table. So you might ask what is a iteration? It took me way too long to figure out how these iterations worked, but it is a key mechanism for protecting passwords, so let’s discuss it.
How iterations don’t work
If you’re like me you might imagine simply running something like this pseudocode:
result = hash(salt+password);
#Option 1
for 1:num_iterations {
result = hash(result);
}
#Option 2
for 1:num_iterations {
result = hash(salt+result);
}
This would present some problems though. Mainly, the fact that it’s possible to find a value x, such that hash(x)=x or simply looping over the same hashes again and again. If the former is true, and there’s around 63% chance it is, then more iterations lead to a chance of many normal passwords resulting in the same hash when run through a large number of these false iterations. Luckily the above pseudocode isn’t how iterating works.
How iterations do work
You can read this code in GNU’s glibc implementation of libcrypt or the BSD implementation. I personally find the BSD code to be easier to follow. If you’re not a coder, hate C, or simply don’t want to read code, there’s an english explanation for md5 on Wikipedia (link):
First the passphrase and salt are hashed together, yielding an MD5 message digest. Then a new digest is constructed, hashing together the passphrase, the salt, and the first digest, all in a rather complex form. Then this digest is passed through a thousand iterations of a function which rehashes it together with the passphrase and salt in a manner that varies between rounds. The output of the last of these rounds is the resulting passphrase hash.
We’re going to look at the specific method for sha-256 and sha-512 in the table below. First the password is hashed with the corresponding sha algorithm before being running through x number of iterations (defaults to 5000). The value that gets hashed in each iteration is decided by the iteration number. The table below shows you what’s getting hashed in each case.
Iteration multiple of 7 | iteration multiple of 3 | Even Iteration | what gets hashed |
---|---|---|---|
password+salt+password+prev_hash | |||
X | prev_hash+salt+password+password | ||
X | password+password+prev_hash | ||
X | X | prev_hash+password+password | |
X | password+salt+prev_hash | ||
X | X | prev_hash+salt+password | |
X | X | password+previous_hash | |
X | X | X | prev_hash+password |
While it is possible to run into looping hashes while following this plan, it is very unlikely and relies on a perfect combination of password salt and previous hashes. In other words, it’s not a viable attack to look for a looping hash since it’s dependent on the previous hash, the password, and the salt.
Why iterate?
So we can use multiple iterations when storing our passwords, awesome! Why though? Hashing functions like MD5, sha1, and sha512 are extremely fast. While we assume the user wants the fastest response possible. It’s unlikely though that a user is going to care about login time if it’s faster than a second. Someone brute-forcing millions of passwords will notice any delay in the time to do a single password. This is called key-stretching which is used to fight brute-force attacks.
Comments
Brandon Moore -
2017, Sep 05, Tue 20:54:19 UTC
Thanks, Bryon. As you said, that does seem to be a pretty obvious conclusion, but it’s best to try to make everything as explicit as possible. I’ll go ahead and change that.
p.s. the comment system should be a bit better now, but I’ll continue to make it more user-friendly.
Bryon D. Foltz -
2017, Sep 03, Sun 17:16:11 UTC
Awesome article Malide! Was just starting to look into proper password security for a website I am building, and you make hashing much simpler than most explanations I have read. One suggestion for you:
As obvious as it seems to you and me you might want to mention the way you verify a password against a stored hash is to hash the password and compare the hashes since hashing is one way.
Bryon D. Foltz -
2017, Sep 03, Sun 17:18:56 UTC
Also…. you may want to fix your commenting system so it simply reloads the page following submission instead of giving users a JSON response from staticman.net .