-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use newer tarfile to handle bundles more safely #197
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch @ekoyle!
It would be great if you could add a new test validating the fix. This project uses BTest and tests are in testing/tests
.
CI is currently failing due to lots of style issues checked in pre-commit hooks. To run this locally you would need to install pre-commit
, e.g., via pipx install pre-commit
; you can then run checks identical to CI with pre-commit run -a
. I suggest below to not fork python-3.12's tarfile
and instead outsource this to a dependency, so you shouldn't need to worry about style issues in zeekpkg/tarfile_fallback.py
.
zeekpkg/tarfile_fallback.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not fork this into our codebase -- could we instead use a dependency like e.g., backports.tarfile (ideally: with a comment in pyproject.toml
that we need it to fix XYZ until we bump to python-N.M.K
)? If you could add a test for the issue this fixes we could validate that it works for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like ci is installing python dependencies anywhere.
This brings up an issue with this approach: up until now, all of the dependencies needed to run zkg
have been available from the OS package manager. This dependency won't be available for the zeek-zkg
package.
Using a library from pypi may require users to find a way to manage a python virtualenv for zkg (recent versions of pip refuse to install packages globally, so users would need to use virtualenv or something like pipx, pipenv, etc).
@@ -69,37 +86,83 @@ def make_symlink(target_path, link_path, force=True): | |||
raise error | |||
|
|||
|
|||
def safe_tarfile_extractall(tfile, destdir): | |||
"""Wrapper to tarfile.extractall(), checking for path traversal. | |||
def zkg_tarfile_create(basedir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add PEP257 docstrings for the functions you are adding?
I'd like to double-check a few things before adding finishing touches on this:
|
We currently require
AFAICT zkg should in principle support installing executable scripts so I'd think we'd want to preserve executable bits if present (but not force it for all installed files). Write we'd only need by the owner. I currently do not see that we would need to preserve setuid bits.
Owner should be the user installing the package. This is to support installing into a user path, e.g., I use a $ zkg config
...
[paths]
state_dir = /Users/bbannier/.zkg
script_dir = /Users/bbannier/.zkg/script_dir
plugin_dir = /Users/bbannier/.zkg/plugin_dir
bin_dir = /Users/bbannier/.zkg/bin
...
Since we'd want to preserve at least some permission bits (see above) we should make sure that both install and subsequent bundle preserve these. Ideally we'd have a test validating that roundtrip. |
Although I was able to get the permissions the way I intended during the extraction process, there can still be issues due to the umask when we build a package. Would it be reasonable to have zkg emit an error or a warning when run with a umask more restrictive than 022, or maybe even set the umask before building? It seems like a package manager should be allowed to ignore the umask if it is going to cause problems, but I'm not sure if anyone would be bothered by such a change. |
I am not exactly following 😅. I think the issue is exactly that zkg's active umask leaks into installed, bundled and unbundled files. Couldn't we preserve permissions on installed files, and pick a reasonable default for generated ones? |
This PR resolves some issues with permissions leaking into and out of the tarball, however zkg is still inheriting the umask from the environment where it runs. If this umask is restrictive, then when zkg (or a subprocess) copies or creates files, they are created with restrictive permissions. This seems to be a major problem if there are compiled extensions in the bundle, as the build is going to copy and create files. The simplest fix for this would be to call |
Use the new `filter` option in tarfile's `extractall()` to mitigate more issues that could arise from loading a malicious tar file. Try to normalize permissions when creating the tar and only pay attention to whether the user `x` bit is set (permissions should be `0o755` if executable and `0o644` otherwise). Omit the original username in the tarball and replace with root:root. When unpacking, the same rules are applied on permission mode, and the user is left as the user executing zkg. Use [tarfile.py](https://github.com/python/cpython/raw/v3.8.19/Lib/tarfile.py) from python 3.8.19 and rename to tarfile_fallback.py so that we can use the newer features. Once our minimum python versions are greater than 3.12, 3.11.4, 3.10.2, 3.9.17, or 3.8.17, we can safely remove this file as well as the conditional import in _util.py.
Add a btest for the new permission handling. This is using output of ls and tar. I'm not positive this will work with variants other than GNU. If this doesn't work, I'm not sure how to accomplish this portably short of writing scripts to print out the desired information.
… awk to filter columns
@ekoyle I pushed added some fixups to make your test pass both with a GNU and a BSD stack and a minor vendoring cleanup. Please feel free to rebase aggressively. |
import types | ||
|
||
import git | ||
import semantic_version as semver | ||
|
||
from .vendor import tarfile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should continue to use tarfile
from the Python distribution if it's modern enough. If anything changes in the distribution version, at least it would be captured.
Something like that might work.
import tarfile as builtin_tarfile
if hasattr(builtin_tarfile, "data_filter"):
tarfile = builtin_tarfile
else
from .vendor import tarfile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like this vendoring all that much, but after staring a bit at the tarfile difference, think it's okay.
|
||
|
||
def zkg_update_perms(new_attrs, member, extract): | ||
# we are doing our own thing with `mode` here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just the inverse of what was discussed in the ticket:
If a user creates a tar by hand or with custom tooling that contains a 0600 file and extracts it as zeek:zeek
user on the target system, today it'll be a somewhat expected 0600
. With this change the file will be world-readable on the target system.
Having a 0600
file with zkg/git may be practically difficult to achieve and I'm not sure there's an actual use case, but not sure putting a world-readable label on everything is great.
If we'd warn users about strict umask settings and the consequences and a possible opt-in to set laxer permissions within the bundle, maybe that would suffice, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think unless there is a really strong use-case against it, we should favor the solution that will work for everyone. I feel like the package manager should be in charge of the permissions rather than carrying over (often accidental) permissions from the system on which the bundle is created (more so if they are creating them by hand) -- especially since there are also unexpected interactions depending on the umask zkg runs with (the permissions of the unpacked files don't have any effect on compiled code, even with this PR). A package should be portable, and if the end-user needs to worry about permissions on both the bundling system and the unbundling system, I don't think the package is very portable.
What is the best practice for running zkg? It seems like it would be best for zkg to run as a user that is different from the user running zeek, as well as a non-root user. It would be great if we could mask permissions with say 0o027
, but that would require zkg to know a group that is shared between the user unpacking the bundle and the user running zeek (or at least for end users to know to set the zkg users primary group to something the zeek user is a member of). Unless zkg is going to be opinionated on users and groups (which I don't think it should), I think it would be better to default to world-readable permissions.
If we want to provide a way for users to control permissions of files written by zkg, maybe it would be better to provide a cli/config option for umask
, and default it to 0o022? Then, we could set that as the umask for the zkg process early on, and also mask the permissions we are calculating.
The only use-case I have come up with for wanting some files to have more restrictive permissions than others is if a user is including sensitive files in a bundle, which I hope they are not -- however even this use-case could be handled by using a more restrictive umask for all files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only use-case I have come up with for wanting some files to have more restrictive permissions than others is if a user is including sensitive files in a bundle, which I hope they are not
Yeah, that was the only thought, and can also see that it's a "don't do that".
I think I'm still leaning on the side of putting the portable/appropriate permission bits into the bundle/tar, but during extraction at least not make them more permissible. Maybe I'm too timid and someone else should chime in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to separate that piece into another ticket/PR?
Also, since the process umask is honored while building (for files created during the build step), should we also honor the process umask while unbundling to be more consistent? I'm not really sure what the best approach is there, but I really don't like the inconsistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to separate that piece into another ticket/PR?
I'd wait and see if @ckreibich or others opine. As said, maybe I'm just too timid and making everything 0644/0755 is okay.
should we also honor the process umask while unbundling to be more consistent?
Hmm, with the current state of the PR, I'd even say you could open-up/force the umask to 022
before building so that compiled files are world readable. In your test, the .so
file produced has 0750
, which wouldn't be usable for non-root or non-zkg-user.
-rwxr-x--- root/root 31448 2024-07-25 16:44 Demo_Rot13/lib/Demo-Rot13.linux-x86_64.so
And even with the stricter permission handling, maybe a warning is appropriate if umask is found to be too strict before building.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest commit (1ff7c78) should set the umask on build based on a new zkg_umask
attribute on the Manager class, as well as applying that umask to files as they are extracted from a bundle. If we want to keep it, I think we could make zkg_umask
user-configurable. It is currently hardcoded to 0o022
, but it seems to behave as I expected when changing that mask.
Remove the default for dest_path is it is not optional. Co-authored-by: Arne Welzel <[email protected]>
Add docstrings and update zkg_update_perms to return a dict rather than accept and modify a dict as the first argument.
Add zkg_umask to Manager and apply that mask when unbundling or building.
@rsmmr - do you have opinions on umask and permissions of files within bundles and those created during the There's much comments here, but in short: Any strong feelings/reactions about making files/directories |
I haven't followed the discussion in detail but generally I agree that a packager manager should be able to control its permissions and set them the way it wants things to work. That said, I think it would be good if a |
Set the umask during `_stage()` in the install process to hopefully give similar results to bundling and unbundling. Needs btests to verify. We may need a custom copy function if this doesn't result in the same behavior as bundling/unbundling. Looking at this, it might be better to make changes to Manager._stage() to enforce more consistent permissions when copying.
I made some changes to try to keep the permissions similar during install, but I think we may want to replace Still needs a btest for the install as well. |
Use tarfile from python 3.12 to handle bundles in a safer way.
Ignore user permissions where possible, and don't copy unsafe permissions/files.
tarfile_fallback.py
was copied from the latest python 3.8 (backported from python 3.12) -- if we aren't on a new enough version of python, use that instead of the built-in one.Fixes #196