Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offline command to correct all permissions at once #1921

Open
unixpi opened this issue Oct 29, 2020 · 7 comments
Open

Offline command to correct all permissions at once #1921

unixpi opened this issue Oct 29, 2020 · 7 comments

Comments

@unixpi
Copy link
Contributor

unixpi commented Oct 29, 2020

Is your feature request related to a problem? Please describe.
If one upgrades from an older install of nimbus, some permissions are not set correctly. This concerns both the secrets and validators folders (as well as the database?). As it stands, when the user tries to run the beacon node she is presented with error messages of the form

ERR 2020-10-28 14:17:31.504+01:00 File has insecure permissions ... current_permissions="0644 (rw-r--r--)" required_permissions="0600 (rw-------)"
ERR 2020-10-28 14:17:31.504+01:00 Password file has insecure permissions  ... 

Even though permissions are set correctly on new installs, forcing existing users to manually modify permissions in order to get the software working is not great from a UX perspective.

Describe the solution you'd like
An offline command, similar to Dustin's ncli_db pruneDatabase, that corrects all permissions at once. The error messages should reference this command.

Describe alternatives you've considered
Automatically checking and setting the correct permissions when upgrading. I'm open to other suggestions. @cheatfate in particular had some great thoughts yesterday.

@cheatfate
Copy link
Contributor

@sachayves only sensitive files and folder are protected, database files do not contain sensitive information. Here is a full list of all sensitive files in data-dir:

<folder permissions> on Posix systems (Linux, MacOS, BSDs) right now its 0750 but can be changed to 0700. On Windows systems access is bound to current user only (all inherited ACEs are got removed) with container and object inheritance enabled. So folder will have <current username>:(OI)(CI)(F) ACE only.

<file permissions> on Posix systems (Linux, MacOS, BSDs) its 0600. On Windows systems access is bound to current user only (all inherited ACEs are got removed) without container and object inheritance enabled. So file will have <current username>:(F) ACE only.

  1. <data-dir> (folder which is specified using --data-dir option). Must have <folder permissions> permissions set.

  2. <data-dir>/network_keystore.json - network key file (usually present and used only for bootstrap nodes, most of the users are missing this file). Must have <file permissions> permissions set.

  3. <data-dir>/validators/<folder with validator's public key name>/keystore.json - validator's keystore file. Must have <file permissions> permissions set.

Also we have password files, which will be deprecated soon.

  1. <data-dir>/secrets/<file with validator's public key name> - file which holds password to unlock validator's public key keystore. Must have <file permissions> permissions set.

After some investigation i think more checks should be added (right now permissions on this folders are not checked):

  1. <data-dir>/validators - folder which holds validator's data, slashing protection database and validator keystores folders. Must have <folder permissions> permissions set.

  2. <data-dir>/validators/<validator's public key> - folder which holds validator's keystore file. Must have <folder permissions> permissions set.

  3. <data-dir>/secrets - folder which holds password files. Must have <folder permissions> permissions set.

Now i think it's pointless to discuss why these checks are needed, but i dont have strong opinion on 0750 vs 0700 (this could be discussed).

Now about why check is so strict and do not allow beacon_node to be started:

  1. It is mentioned explicitly in audit issue ( [SEC] CLI - Do not create world readable secret files #1702 [SEC] beacon_node CLI - harden folder/file permissions (createDir, ...) #1688 beacon_chain::getPersistentNetKeys - harden file permissions for key material #1320 ).
  2. It is a security incident which should be investigated by user (why permissions on folders/files was changed). And that's why we should not implicitly correct such permissions and hide the problem, and its why set file and dir permissions #1922 is incorrect solution. Imagine you come home and see the front door is broken open. Do you call police to investigate or you call worker to fix the door?

My proposal is to add maintenance command to beacon_node, using which you will be able to correct permissions and any other problems (for example remove password files, compress database and so on).

@zah
Copy link
Contributor

zah commented Oct 30, 2020

Besides what @cheatfate already listed, I would also suggest the following additional check:

  1. <data-dir>/validators/<validator's public key>/keystore.json should have <file permissions> set.

The idea of having a single maintanance command sounds good. They user may be able to specify more specific tasks through a sub-command or by specifying additional flags.

One general question that we should answer in our design guidelines is how to deliver such error messages to the user. We've tried to use the logging framework for this in the expectation that many users will be processing their logs with solutions such as LogStash and Splunkd that should capture the entire output of the program, but there is some genuine concern that the log format is not the best media to explain a complex issue such as the one discussed here. This was also brought up during the audit (e.g. #1692)

@cheatfate
Copy link
Contributor

@zah i think your 8 is copy of my 3 which is already checked.

@zah
Copy link
Contributor

zah commented Oct 30, 2020

Right, sorry for missing that.

@unixpi
Copy link
Contributor Author

unixpi commented Oct 30, 2020

Thanks for taking the time to clarify @cheatfate !

The idea of having a single maintanance command sounds good. They user may be able to specify more specific tasks through a sub-command or by specifying additional flags.

I'm leaning towards this solution too. it's clean, and avoids the concerns raised around #1922.

@arnetheduck
Copy link
Member

any maintenance / lint command should probably have a read-only mode that's default, and a fix-it mode that tries to fix the issues - the latter is important because if we had a good, automatic fix that works in all situations, we wouldn't be putting it in a separate maintenance command - ie this command sounds garbage-bin for anything that we're not sure how to handle automatically and want the users input - "my permissions are wrong" and "my database could use a pruning" don't have much in common really, beyond that.

@unixpi unixpi added the UX label Oct 30, 2020
@cheatfate
Copy link
Contributor

@arnetheduck this command could perform upgrades, for example there should be no such situation when you changing database schema and everybody should start syncing from scratch with a new build, and because very soon we will be forced to upgrade our database engine and/or our file storage engine we will need such tool to keep upgrades painless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants