Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#21022 Switch to using bcrypt for hashing passwords and security keys #7333

Open
wants to merge 48 commits into
base: trunk
Choose a base branch
from

Conversation

johnbillion
Copy link
Member

@johnbillion johnbillion commented Sep 11, 2024

Latest approach to switching from phpass to bcrypt via the native PHP password hashing functions.

This covers:

  • User passwords
  • Application passwords
  • Post passwords
  • User password reset keys
  • Personal data request keys
  • The recovery mode key

Lots more info about the change can be found in the draft for the make/core announcement post.

Tickets

Notes

  • wp_check_password() remains compatible with phpass hashes, so password checks will continue to succeed when checking a password against its existing hash. There is no need for users to change or resave their password.
  • User passwords are rehashed using bcrypt after successful authentication in wp_authenticate_username_password() or wp_authenticate_email_password().
  • Application passwords are rehashed using bcrypt after successful authentication in wp_authenticate_application_password().
  • wp_check_password() and the rehashing that occurs in the above functions are forward-compatible with future changes to the default bcrypt cost (the default bcrypt cost may get increased in PHP 8.4), so password checks will continue to succeed when checking a password against its existing hash.
  • wp_hash_password() and wp_check_password() retain support for a global $wp_hasher object which can be used to override the password hashing and checking mechanisms.

Elsewhere

FAQs

What about salting?

Salting a bcrypt hash is handled by password_hash() and password_verify(). There is no need to implement salting in userland code.

What about peppering?

Peppering effectively eliminates the ability to crack a password given only its hash by introducing a secret which needs to be stored outside of the database and is used as part of the value that's hashed, as long as the pepper remains secret. This is compelling, however the portability of a password hash is also eliminated and if the secret pepper is lost, changed, or differs between environments, then all password hashes are invalidated. All users will need to reset their passwords in order to log in, and all application passwords and post passwords will need to be changed.

While a secret pepper does prevent an attacker from being able to crack a password hash if they gain access only to the database, the potential usability tradeoff is high. In addition, the intention of switching to bcrypt is to switch to a password hashing algorithm that is highly resistant to cracking in the first place, thus reducing the benefit gained from peppering.

What about layering hashes to immediately protect legacy hashes?

Hash layering is the process of taking an existing password hash (for example one hashed with phpass) and applying the new hashing algorithm on top of it (bcrypt in this case). The intention is to immediately protect stored passwords with the new hashing algorithm instead of waiting for users to log in or change their passwords in order to rehash the password.

This approach carries its own risks including null byte characters present in the raw output of other hashing algorithms, and password shucking. OWASP warns that layering hashes can make the password easier to crack.

An additional consideration is the length of time it could take to rehash all passwords in the database during the upgrade routine. Handling would need to be implemented to cover usage of passwords (for example logging in or changing passwords) while the password upgrade routine runs.

None of this is insurmountable, but it adds complexity for what is in most cases a short term benefit. For this reason, hash layering has not been implemented.

What about the 72 byte limit of bcrypt?

This is an ongoing consideration. There is discussion about this limit in the comments on #21022 and on #50027, and a separate PR at johnbillion#5 implements a workaround. My preference is to retain the vanilla bcrypt implementation. Read on for all the details.

Out of the platforms listed above, only Symfony provides any specific handling for passwords greater than 72 bytes in length (introduced in Symfony 5.3 in 2021). Passwords longer than 72 bytes are hashed with sha512 and then base64 encoded prior to being hashed with bcrypt.

The sha512 hashing is used to retain the full entropy of the original password, regardless of its length. The base64 encoding is to protect against null byte characters. Passwords shorter than 72 bytes do not receive hash layering treatment.

sha512 hashing is also used by Dropbox prior to hashing with bcrypt.

An interesting observation about the Symfony implementation is that a base64 encoded sha512 hash is 88 bytes in length, therefore 16 bytes (19% of the total) still get truncated when the hash is passed to bcrypt. This could mean that a password between 73 and 88 bytes in length actually has less entropy than one 72 bytes in length. A bit of a diversion though.

This PR in its current form does not include any specific handling for the 72 byte limit, but there are options:

  1. Ignore the 72 byte limit like many platforms do.
    • A maximum of 72 bytes can contribute to the password entropy.
    • Any subsequent bytes are discarded both during hashing and during checking.
    • Relatively risk free as we're not layering hashing.
  2. Implement sha384 or sha512 pre-hashing to preserve the entropy of passwords of any length.
  3. Implement a 72 byte password length limit. This could end up being unnecessarily complex to implement:
    • Does the user need to be informed when they have or attempt to use a password longer than 72 bytes?
    • A 72 character limit on password input fields would need active warnings rather than just maxlength attributes because the input value is obscured when typing/pasting/autofilling.
    • 72 bytes does not equal 72 characters and vice versa due to multibyte characters.

What about DoS attacks from long passwords?

The bcrypt implementation in PHP is not susceptible to a DoS attack from long passwords because only the first 72 bytes of the password value are read, therefore there is no need to guard against a long password value prior to it being hashed or checked.

While phpass can be susceptible to a DoS attack via a very long password value, the phpass implementation in WordPress protects against this via a 4096 byte limit when hashing a password and when checking a password. This is unrelated to the password length limit discussed above.

What about the cost factor?

The default cost factor will be used. This is currently 10 and is due to be increased to 12 in PHP 8.4. Hashes remain portable between installations of PHP that use different cost factors because the cost factor is encoded in the hash output.

It's beyond the scope of WordPress to make adjustments to the cost factor used by bcrypt. If you are planning on updating to PHP 8.4 then you should consider whether the default cost is appropriate for the resources available on your server. The wp_hash_password_options filter is available to change the cost factor should it be needed.

What about using PASSWORD_DEFAULT instead of PASSWORD_BCRYPT?

The intention of the PASSWORD_DEFAULT constant in PHP is to take advantage of future changes to the default algorithm that's used to hash passwords. There are currently no public plans to change this algorithm (at least, I haven't found any), but for safety it makes sense to be explicit about the use of bcrypt in WordPress. This can easily be changed in the future.

What about Argon2?

Unfortunately it's not possible to rely on argon2 being available because it requires both libargon2 to be available on the server and for PHP to be built with argon2 support enabled. Using argon2 via sodium_compat still requires the optional libsodium extension to be installed. Conditionally using argon2i, argon2id, or bcrypt depending on what is available on the server would increase complexity and limit the portability of hashes.

What about scrypt?

There is no native support for scrypt in PHP, and using scrypt via sodium_compat still requires the optional libsodium extension to be installed

Is this change compatible with existing plugins that implement bcrypt hashing?

It should be, yes. If you've used such a plugin to hash passwords with bcrypt then those hashes should be compatible with this bcrypt implementation and you should be able to remove the plugin.

What effect will this have on my database when a user logs in?

The first time that each user subsequently logs in after this change is deployed to your site, their password will be rehashed using bcrypt and the value stored in the database. This will result in an additional UPDATE query to update their user_pass field in the users table.

The query will look something like this:

UPDATE wp_users
SET user_pass = '<hash>', user_activation_key = ''
WHERE ID = <id>

This query performs an UPDATE but makes use of the primary key to target the row that needs updating, therefore it should remain very performant. The query only runs once for each user. When they subsequently log in again at a later date their password will not need to be rehashed again.

Note that when a user logs in, WordPress already writes to the database via an UPDATE query on the usermeta table to store their updated user session information in the session_tokens meta field. This happens every time a user logs in.

How do I use an algorithm other than bcrypt on my website?

The wp_hash_password(), wp_check_password(), and wp_password_needs_rehash() functions are all pluggable. See wp-password-bcrypt as an example of overwriting them in a plugin.

Todo

  • Make a decision on the 72 byte limit imposed by bcrypt
  • Decide whether the options array for password_hash() should be filterable.
  • Decide whether wp_hash_password() and wp_check_password() need to retain back-compat support for the global $wp_hasher
  • Fully support changes to the default bcrypt cost
  • Tests for the hash upgrade routine for a user password
  • Tests for the hash upgrade routine for an application password
  • Tests for handling post password hashes
  • Address md5 hashing of passwords in repair.php

Testing

There's good test coverage for this change. Here are some manual testing steps that can be taken.

Remaining logged in after the update

  • Start on the trunk branch
  • Log in to WordPress
  • Switch to this branch
  • Ensure you have remained logged in
  • Log out and back in again
  • Confirm it works as expected
  • Confirm that the user_pass field for your user account in the wp_users table in the database has been updated -- it should be prefixed with $2y$ instead of $P$

Post passwords

  • Start on the trunk branch
  • Publish a post and set a password for it
  • View the post and enter the password to view it
  • Switch to this branch
  • Confirm you can still see the post without having to re-enter its password
  • Edit the post and change its password
  • View the post and confirm that you can re-enter the new password and view it

Password resets

  • Start with the "Lost your password?" link on the login screen
  • Click the confirmation link sent to your email
  • Follow the process of resetting your user password
  • Confirm you can log in with your new password

Personal data requests

  • Start on the trunk branch
  • Log in to WordPress as an Administrator
  • Initiate a data export from Tools -> Export Personal Data
  • Switch to this branch
  • Click the confirmation link sent to the email address and confirm that it still works
  • Initiate another data export from Tools -> Export Personal Data
  • Click the confirmation link sent to the email address and confirm that it works

@johnbillion johnbillion marked this pull request as draft September 11, 2024 18:38
Copy link

github-actions bot commented Sep 11, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Unlinked Accounts

The following contributors have not linked their GitHub and WordPress.org accounts: @soatok.

Contributors, please read how to link your accounts to ensure your work is properly credited in WordPress releases.

Core Committers: Use this line as a base for the props when committing in SVN:

Props johnbillion, haozi.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@johnbillion johnbillion changed the title #21022 Switch to using bcrypt for password and security key hashing #21022 Switch to using bcrypt for hashing passwords and security keys Sep 11, 2024
@johnbillion johnbillion marked this pull request as ready for review October 22, 2024 22:09
*/
$options = apply_filters( 'wp_hash_password_options', array() );

return password_hash( trim( $password ), PASSWORD_BCRYPT, $options );
Copy link

@soatok soatok Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be advisable to prevent the 72-byte truncation by design.

Consider this:

$prehashed = base64_encode( hash_hmac( 'sha512', trim( $password ), CONFIG_SALT_GOES_HERE, true ) );
return password_hash( $prehashed, PASSWORD_BCRYPT, $options );

You can verify that this is safer than a naked password_hash() with a simple unit test:

  • Hash the letter A repeated 72 times.
  • Attempt to verify a password consisting of the letter A repeated 73 times, using the hash from the previous step.
  • This should succeed for password_hash() with BCRYPT, but fail with my strategy.

Why hash_hmac()?

This binds the prehash to a server-side secret, that can exist alongside the usual WordPress secrets. In essence, without this secret, passwords cannot even begin to be cracked.

Why SHA-512 + Base64?

SHA-512 produces 64 bytes of output, which is just shy of the 72 byte limit for bcrypt. However, there can be a NUL character in the midst, so it could truncate early if not encoded.

By base64-encoding the SHA-512 hash (really, HMAC output), you prevent NUL characters. This is the optimal trade-off.

  • By itself, SHA-512 hashes have a collision resistance of 256 bits.
  • The base64-encoded hash will be about 86 bytes (88 with padding).
  • Although bcrypt silently truncates the last 14 (16 with padding) characters, we still have 72 base64-encoded characters (about 432 bits of security).
  • Each WordPress install will have a different HMAC key, which will frustrate crackers tremendously (especially if they can only leak the contents of the SQL database, not the filesystem.

I believe this small bit of complexity is worth the obvious security benefits.

Can we do it later?

I don't believe we can. We either adopt this strategy as part of the phpass migration, or we don't do it ever.

$hasher = new PasswordHash( 8, true );
$check = $hasher->CheckPassword( $password, $hash );
} else {
$check = password_verify( $password, $hash );
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with above, the congruent change here would be:

$prehashed = base64_encode( hash_hmac( 'sha512', trim( $password ), CONFIG_SALT_GOES_HERE, true ) );
$check = password_verify( $prehashed, $hash );

@soatok
Copy link

soatok commented Nov 20, 2024

I see @johnbillion beat me to the general strategy in his other pull request. I think HMAC-SHA512 is a better approach than SHA384, but both are acceptable.

There are libraries that do SHA384 (not HMAC), but then encrypt the password hash with a key. I think the operational security of both approaches is congruent (you need a key to begin password cracking attempts).

In my opinion, doing SHA384 without HMAC, and without encryption, is less beneficial than the other two approaches. However, al 3 are better than raw bcrypt. And bcrypt is a giant leap in security over phpass.

@devhaozi
Copy link

devhaozi commented Nov 21, 2024

Please refer OWASP's documentation https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html#pre-hashing-passwords-with-bcrypt, they don't recommend pre-hashing passwords when using bcrypt, including use base64 to encoding the sha hash result.

Perhaps we can add input length restrictions to prevent users from using too long passwords?

@johnbillion
Copy link
Member Author

I covered that a bit in the FAQ. The main problem is that 72 bytes != 72 characters once you get outside of ASCII, and you'll need active warnings instead of just a maxlength attribute due to the input masking in browsers.

@devhaozi
Copy link

I covered that a bit in the FAQ. The main problem is that 72 bytes != 72 characters once you get outside of ASCII, and you'll need active warnings instead of just a maxlength attribute due to the input masking in browsers.

In this case, I prefer to ignore them, like Laravel framework.
Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

@soatok
Copy link

soatok commented Nov 21, 2024

Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

Can you be specific? What "more trouble" are you envisioning?

@devhaozi
Copy link

devhaozi commented Nov 21, 2024

Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

Can you be specific? What "more trouble" are you envisioning?

Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

Can you be specific? What "more trouble" are you envisioning?

If we upgrade to Argon2 one day, all pre-hashing passwords will need to be re-hashed (Argon2 does not have the 72-byte limit), or we can continue to keep base64+sha384, but this is not necessary in argon2.

If we use PHP's default implementation directly, we won't have these troubles (PHP will handle it automatically in future version).

@soatok
Copy link

soatok commented Nov 21, 2024

Using SHA for pre-hashing will cause more trouble in future upgrades (assuming we need to upgrade to argon2id one day) and will not benefit the vast majority of users and will reduce performance.

Can you be specific? What "more trouble" are you envisioning?

If we upgrade to Argon2 one day, all pre-hashing passwords will need to be re-hashed (Argon2 does not have the 72-byte limit), or we can continue to keep base64+sha384, but this is not necessary in argon2.

If we use PHP's default implementation directly, we won't have these troubles (PHP will handle it automatically in future version).

Consider:

if (hash_equals('$2', substr($pwhash, 0, 2)) {
    $prehash = base64_encode( hash_hmac( /* ... */ ) );
    if (password_verify($prehash, $pwhash)) {
        $newHash = password_hash($password, /* ... */).
    }
}

@devhaozi
Copy link

devhaozi commented Nov 21, 2024

Another example for reference is laravel/framework#49752 , where they seem to have decided to do nothing.

@johnbillion
Copy link
Member Author

johnbillion commented Nov 21, 2024

That's correct. Out of the PHP projects that I listed in the PR, only symfony/password-hasher includes any handling for passwords greater than 72 bytes, and even then it still truncates the resulting hash due to the use of sha512+base64 which outputs 88 bytes.

The discussion around hash layering feels dangerously close to rolling your own crypto in order to provide a small amount of additional benefit to an exceptionally small proportion of users.

@soatok
Copy link

soatok commented Nov 21, 2024

Thankfully, this isn't anywhere near "rolling our own crypto". The SHA2+base64 paradigm has been implemented in several proprietary systems I've worked on over the past decade.

I also cited password_lock above as an open source implementation that does a morally equivalent job. If you need a cryptography expert to sign off on this design, ask the developer of that library to review whatever we propose.

and even then it still truncates the resulting hash due to the use of sha512+base64 which outputs 88 bytes.

SHA-384 and SHA-512 are the exact same algorithm, except SHA-384 has a different initialization vector and is truncated to 384 bits. I don't think it makes any sense to obsess over the truncation here.

If you base64-encode a raw SHA-512 hash to get 88 bytes, and then bcrypt truncates it to a "mere" 72, you still have an input keyspace of 6^72 or 486 bits of entropy. This is still in far excess of the 128 bits of entropy necessary to be considered secure.

The SHA-2 family of hash functions is secure in most usages (exception: if you're building a MAC by truncating a key and the message). Using HMAC eliminates length-extension, and you can treat HMAC with a secret key as a random oracle.

If you want to be extra paranoid, instead of base64_encode, prefer sodium_bin2base64 if it's available.


At the risk of being mildly self-promotional, I'd like to refer any curious parties to some of my previous writing on this subject if anything I've said here (or are likely to say in subsequent comments) is a little over your head:

These algorithms, and this specific way of combining these algorithms, is well-understood.

@devhaozi
Copy link

Thankfully, this isn't anywhere near "rolling our own crypto". The SHA2+base64 paradigm has been implemented in several proprietary systems I've worked on over the past decade.

I also cited password_lock above as an open source implementation that does a morally equivalent job. If you need a cryptography expert to sign off on this design, ask the developer of that library to review whatever we propose.

and even then it still truncates the resulting hash due to the use of sha512+base64 which outputs 88 bytes.

SHA-384 and SHA-512 are the exact same algorithm, except SHA-384 has a different initialization vector and is truncated to 384 bits. I don't think it makes any sense to obsess over the truncation here.

If you base64-encode a raw SHA-512 hash to get 88 bytes, and then bcrypt truncates it to a "mere" 72, you still have an input keyspace of 6^72 or 486 bits of entropy. This is still in far excess of the 128 bits of entropy necessary to be considered secure.

The SHA-2 family of hash functions is secure in most usages (exception: if you're building a MAC by truncating a key and the message). Using HMAC eliminates length-extension, and you can treat HMAC with a secret key as a random oracle.

If you want to be extra paranoid, instead of base64_encode, prefer sodium_bin2base64 if it's available.

As mentioned before, passwords longer than 72 bytes are rarely used, so it is worth considering whether it is worth doing this.

For sodium_bin2base64, it is impossible, WP usually only accepts PHP's own implementation (which is why Argon2 is not used, Argon2 requires PHP to use additional compilation parameters).

There is still a long time before 6.8, let's see what other developers think.

@soatok
Copy link

soatok commented Nov 21, 2024

For sodium_bin2base64, it is impossible,

if (!is_callable('sodium_bin2base64')) {
/**
* @see ParagonIE_Sodium_Compat::bin2base64()
* @param string $string
* @param int $variant
* @return string
* @throws SodiumException
* @throws TypeError
*/
function sodium_bin2base64(
#[\SensitiveParameter]
$string,
$variant
) {
return ParagonIE_Sodium_Compat::bin2base64($string, $variant);
}
}

:)

@devhaozi

This comment was marked as off-topic.

@johnbillion
Copy link
Member Author

I've covered argon2 and scrypt in the PR description. Unfortunately there is not.

@soatok
Copy link

soatok commented Nov 21, 2024

As far as I'm aware, sodium_compat does not include scrypt or argon2.

@johnbillion
Copy link
Member Author

Again, I've covered this already in the PR description. sodium_compat still requires the optional libsodium extension in order to support argon2 or scrypt. There is no polyfill for either in sodium_compat.

I think we should put a pin in this conversation about the 72 byte limit of bcrypt. It's an interesting computer science problem but it's so far removed from any real life problem that it distracts from assessing the underlying switch to bcrypt.

A 72 byte password has 576 bits of entropy. As soon as a password is over 16 bytes in length, you're beyond the widely recommended 128 bits of entropy to consider a password secure. There is no practical need to retain a higher entropy for passwords greater than 72 bytes.

@soatok
Copy link

soatok commented Nov 21, 2024

it's so far removed from any real life problem that it distracts from assessing the underlying switch to bcrypt.

I respectfully disagree, for two reasons:

  1. WordPress has historically bent over backwards to accommodate a small minority of users. This is evident in the minimum PHP bump from 5.2.4 to 5.6 taking so many years after the PHP team stopped supporting it. That only a few users will see a tangible difference has, historically, not been a valid reason to make a security change.
  2. Bcrypt + SHA2 + Base64, as discussed above, materially improves the security and raises the bar for the PHP ecosystem, making WordPress a leader rather than a follower.

The 72-byte limit isn't the only footgun of vanilla bcrypt. Truncating on NUL characters is the other (which is why eschewing the base64 encoding is a bad move).

That being said, consider users with long passphrases. but where each word in the passphrase is low-entropy (e.g. diceware). They're well-served by this change.

That being said:

I think we should put a pin in this conversation about the 72 byte limit of bcrypt.

That's fine. This is my final word on this topic, unless explicitly queried by another user.

@devhaozi
Copy link

devhaozi commented Nov 21, 2024

it's so far removed from any real life problem that it distracts from assessing the underlying switch to bcrypt.

I respectfully disagree, for two reasons:

  1. WordPress has historically bent over backwards to accommodate a small minority of users. This is evident in the minimum PHP bump from 5.2.4 to 5.6 taking so many years after the PHP team stopped supporting it. That only a few users will see a tangible difference has, historically, not been a valid reason to make a security change.
  2. Bcrypt + SHA2 + Base64, as discussed above, materially improves the security and raises the bar for the PHP ecosystem, making WordPress a leader rather than a follower.

The 72-byte limit isn't the only footgun of vanilla bcrypt. Truncating on NUL characters is the other (which is why eschewing the base64 encoding is a bad move).

That being said, consider users with long passphrases. but where each word in the passphrase is low-entropy (e.g. diceware). They're well-served by this change.

That being said:

I think we should put a pin in this conversation about the 72 byte limit of bcrypt.

That's fine. This is my final word on this topic, unless explicitly queried by another user.

The criterion for WordPress to stop supporting PHP is the usage rate (as far as I know, this value is less than 5%).
I think the percentage of users using passwords longer than 72 bytes is much less than 5%.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants