Security Design
For better or for worse, passwords are how we handle authentication (no 2FA, at least for now). We follow standard practice by salting and hashing passwords with a cryptographic hash function. Currently, we use PBKDF2 with SHA256 (PBKDF2_SHA256) for key derivation.
We use a custom format for storing the hashes in the database. Historically, we used single-round MD5 for hashing. When we migrated to the more secure SHA256 function, we decided it was not a good idea to invalidate passwords for the vast majority of accounts (alums log in infrequently, if at all). Instead, we compute a new hash by composing the two algorithms: the new algorithm is applied on top of the stored MD5 hash, which does not require knowing the password. When a user's password is later verified successfully, we silently migrate the stored hash to use only the new algorithm. The storage format handles the general case of this problem, so any future migrations should be straightforward.
For the original discussion, see https://github.com/RuddockHouse/RuddockWebsite/pull/65
Key ideas:
- SHA256(MD5(password)) is far better than MD5(password), and is only slightly worse than SHA256(password).
- The salts at each step must be preserved, to be able to replicate how the hash was computed.
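The composition idea can be sketched as follows. This is only an illustration, not the project's actual code: the helper names (`md5_hex`, `pbkdf2_sha256_hex`, `wrap_legacy_hash`), the choice to prepend the salt to the input for MD5, and the choice to pass intermediate results along as hex strings are all assumptions. The point it shows is that an existing MD5 hash can be wrapped with PBKDF2_SHA256 without knowing the password, as long as the old salt is kept.

```python
import hashlib
import secrets
from typing import Tuple


def md5_hex(salt: str, data: str) -> str:
    # Legacy step (assumed layout): single-round MD5 over salt + input.
    return hashlib.md5((salt + data).encode("utf-8")).hexdigest()


def pbkdf2_sha256_hex(salt: str, data: str, rounds: int) -> str:
    # Newer step: PBKDF2 with HMAC-SHA256, returned as a hex string.
    derived = hashlib.pbkdf2_hmac(
        "sha256", data.encode("utf-8"), salt.encode("utf-8"), rounds
    )
    return derived.hex()


def wrap_legacy_hash(old_md5_hex: str, rounds: int = 100_000) -> Tuple[str, str]:
    # Hypothetical migration helper: hash the stored MD5 hex digest again
    # with a freshly generated salt. The password itself is not needed.
    new_salt = secrets.token_hex(8)
    return new_salt, pbkdf2_sha256_hex(new_salt, old_md5_hex, rounds)
```

To check a password against the wrapped hash later, both salts are needed: the old one to recompute the MD5 step and the new one for the PBKDF2 step.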
The format:
$algorithm1|...|algorithmN$cost1|...|costN$salt1|...|saltN$hash
- The algorithm is the hashing algorithm to use. The Nth algorithm is the most current one.
- The cost is used for key derivation (running the input through the algorithm multiple times to make it more costly for password crackers to guess a password). Exactly what the cost means depends on the key derivation algorithm; for PBKDF2 it's the number of rounds, while for bcrypt it's the cost parameter. For algorithms like MD5, where there is no concept of a cost, this should be the empty string.
- The salt is a randomly generated string used to make attacks with precomputed dictionaries of hashes infeasible. A new salt is generated as part of each algorithm migration, so there should be N of them in total.
- The hash is the hex string (could also be base64; it doesn't really matter) that is the result of all the rounds of hashing.
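A minimal sketch of parsing this format is shown below. The names `Step` and `parse_hash_string` are hypothetical, and the sketch assumes that salts and hashes never contain the `$` or `|` delimiter characters.

```python
from typing import List, NamedTuple, Optional, Tuple


class Step(NamedTuple):
    algorithm: str
    cost: Optional[int]  # None when the algorithm has no cost (e.g. MD5)
    salt: str


def parse_hash_string(stored: str) -> Tuple[List[Step], str]:
    # A leading "$" means the first element of the split is empty.
    parts = stored.split("$")
    if len(parts) != 5 or parts[0] != "":
        raise ValueError("malformed hash string")
    _, algorithm_field, cost_field, salt_field, final_hash = parts

    algorithms = algorithm_field.split("|")
    costs = cost_field.split("|")
    salts = salt_field.split("|")
    if not (len(algorithms) == len(costs) == len(salts)):
        raise ValueError("algorithm, cost, and salt counts differ")

    steps = [
        Step(name, int(cost) if cost else None, salt)
        for name, cost, salt in zip(algorithms, costs, salts)
    ]
    return steps, final_hash
```

The steps come back in application order, so the last element describes the most current algorithm.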
To verify a password, the hash is computed:
hash = algorithmN(saltN + algorithmN-1(saltN-1 + ... + algorithm1(salt1 + password)))
and compared to the saved hash for equality.
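The verification procedure above could look roughly like the sketch below. The function names, the `ALGORITHMS` table, and the exact way each step combines salt and input (salt prepended for MD5, salt passed as the PBKDF2 salt parameter, intermediate results passed along as hex strings) are assumptions for illustration and may differ from the real implementation. The comparison uses `hmac.compare_digest`, which is a constant-time alternative to `==`.

```python
import hashlib
import hmac


def _md5(salt: str, data: str, cost: str) -> str:
    # MD5 has no cost parameter, so `cost` is ignored (empty string).
    return hashlib.md5((salt + data).encode("utf-8")).hexdigest()


def _pbkdf2_sha256(salt: str, data: str, cost: str) -> str:
    # For PBKDF2 the cost is the number of rounds.
    derived = hashlib.pbkdf2_hmac(
        "sha256", data.encode("utf-8"), salt.encode("utf-8"), int(cost)
    )
    return derived.hex()


# Hypothetical registry mapping algorithm names to implementations.
ALGORITHMS = {"md5": _md5, "pbkdf2_sha256": _pbkdf2_sha256}


def compute_hash(password, algorithms, costs, salts):
    # Apply algorithm1 first and algorithmN last, feeding each output
    # into the next step.
    result = password
    for name, cost, salt in zip(algorithms, costs, salts):
        result = ALGORITHMS[name](salt, result, cost)
    return result


def verify_password(password: str, stored: str) -> bool:
    # stored looks like: $alg1|alg2$cost1|cost2$salt1|salt2$hash
    _, algorithms, costs, salts, expected = stored.split("$")
    candidate = compute_hash(
        password, algorithms.split("|"), costs.split("|"), salts.split("|")
    )
    return hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))
```

After a successful check against a multi-step hash, the caller has the plaintext password in hand and can store a fresh single-step hash using only the newest algorithm.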
For example, the formatted hash may read $md5|pbkdf2_sha256$|100000$salt1|salt2$deadbeef, meaning that a candidate password must be hashed with MD5 first (using salt1) and then with 100,000 rounds of PBKDF2_SHA256 (using salt2). The first cost field is empty because MD5 has no cost parameter.