Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: protected files redesign #371

Closed
pwmarcz opened this issue Feb 2, 2022 · 47 comments
Closed

RFC: protected files redesign #371

pwmarcz opened this issue Feb 2, 2022 · 47 comments

Comments

@pwmarcz
Copy link
Contributor

pwmarcz commented Feb 2, 2022

Summary

Make protected files configurable per mount, and move handling to LibOS.

Problems

This proposal addresses the following problems:

  • Bug: Protected files do not support renaming ([Pal/Linux-SGX] Fix renaming issue with protected files #31). This is difficult to fix for the following reasons:

    • Protected files are tracked in PAL by path
    • We store the path in file header (so renaming needs to modify the file)
    • Broken reference counting (see below)
    • It's easy to move a file out of list of protected-files paths, effectively "un-protecting" it. The other way is also possible: you could move a non-protected file to protected path.
  • Bug: Protected files in PAL use a broken reference counting scheme

    • I'm not sure what the impact is. Probably memory/FD leaks, as we do not close the files when we should.
  • Bug: If host returns the same FD for multiple files, Gramine segfaults (SGX protected files possible vulnerability graphene#2360)

    • This is difficult to fix because of the way they're tracked by PAL.
  • Bug: Path normalization for protected files is buggy (foo and ./foo are treated differently)

The following problems are NOT addressed by this proposal:

Current design

As an overview, here is the current implementation of protected files.

  1. Manifest: The user specifies sgx.protected_files (as well as protected_{mrenclave,mrsigner}_files). These are lists of paths (files AND directories) in the host.

  2. LibOS: does not know which files are protected, but (based on specified mounts) translates some I/O operations to PAL operations on host paths.

  3. SGX PAL (db_files.c): during file operations, checks if the path corresponds to a protected file. If so, forwards the operations

    • ⚠️ Note that struct protected_file is NOT associated with the PAL handle directly, but looked up in a hash table (indexed by path) every time.
    • We need to ensure a single protected file handle is open for a path, even if there are multiple PAL handles for the file
    • ⚠️ There's a (broken) reference counting scheme that ensures we unload the protected file handle.
  4. SGX PAL adapter to protected files (enclave_pf.[ch]). This is a module responsible for tracking open protected files.

    • struct protected_file: describes a protected file path (:warning: either a single file, or a directory!)
    • Stored in a hash map (by path)
    • If we encounter a new file that should be protected (i.e. path is under a protected directory), we create a new struct protected file
    • Contains pf_context_t, which is set when the file is loaded
    • ⚠️ Stores multiple global keys (g_pf_wrap_key, g_pf_{mrenclave,mrsigner}_key) and decides to use one of them
  5. Protected files module (protected_files.[ch]). This is a generic module for handling protected files.

    • No real dependency on PAL (I think)
    • Operates on pf_context_t handles (no global map)
    • Functions are parametrized by wrap key (no global key)
    • The user registers callbacks for I/O (file read/write, debug log) and cryptography

I marked with ⚠️ the parts that I particularly do not like.

Proposed changes

Basically, I propose scrapping everything except for point 5 above.

  • Configure protected files by mount:
[fs.mount.foo]
type = "chroot"
path = "foo"
uri = "file:/foo"
protected = true
key = "123456"

This sidesteps problems with renaming (it's impossible to rename a file "out of protected files", because it would mean moving it out of a mount). It also allows setting different keys for different path.

  • Move protected files handling to LibOS. If the protected files are configured by mount, then the LibOS filesystem code is the perfect place to determine whether to encrypt/decrypt a file and what key to use. Also, unlike PAL, it already tracks all files, so a protected file handle could be stored in the inode.

    As a bonus, protected files will not be SGX-only, which should make them easier to test.

  • Move the generic module (part 5 above) to Common. It looks generic enough, and probably doesn't need any major changes to implement the above.

@dimakuv
Copy link

dimakuv commented Feb 2, 2022

One immediate comment: we need backwards-compatibility in our manifest file. While I agree that per-mount properties (protected, key) are better, we still need to support sgx.protected_files for some time. I guess LibOS can look for this key in the TOML manifest and automatically construct mount points. Ugly but required...

Another immediate comment: how can we support an libos.entrypoint (the binary to actually run) being a protected file? If possible, I'd like to have this support transparently implemented in the new design.

@boryspoplawski
Copy link
Contributor

We also need to carefully design how keys are deployed - they are never stored in manifest (outside of testing). While PF can be a generic feature, some of the mentioned keys (e.g. mrenclave_key) are SGX specific

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Feb 2, 2022

One immediate comment: we need backwards-compatibility in our manifest file. While I agree that per-mount properties (protected, key) are better, we still need to support sgx.protected_files for some time. I guess LibOS can look for this key in the TOML manifest and automatically construct mount points. Ugly but required...

Yeah, if arbitrary files can be protected, then probably the best way to emulate them is by separate mountpoints.

However, the "right way of doing things" would probably involve a single mount of a whole directory, and optionally a list of allowed files (if we want to only allow some files from that directory):

# old way:
sgx.protected_files = ["file:pf/x", "file:pf/y", "file:pf/z"]

# new way:
[fs.mount.pf]
type = "chroot"
path = "/pf"
uri = "file:/pf"
protected = true

# and, if we want to allow access to only some files from pf/:
[fs.mount.pf]
allowed_files = ["file:pf/x", "file:pf/y", "file:pf/z"]

This suggests that allowed_files (as @dimakuv mentioned elsewhere) can be also moved to LibOS.

Does that make sense? I guess it would make it harder to mix protected and non-protected files from the same directory (since you mount the whole directory as protected). Do people do that? Should they?

Another immediate comment: how can we support an libos.entrypoint (the binary to actually run) being a protected file? If possible, I'd like to have this support transparently implemented in the new design.

libos.entrypoint is being loaded using the filesystem layer of LibOS, same as any other file. So if you mount it as protected, it should work without any specialcasing. The same applies to executing files using execve.

@mkow
Copy link
Member

mkow commented Feb 2, 2022

One immediate comment: we need backwards-compatibility in our manifest file.

Would it be acceptable if instead of keeping the compatibility in Gramine itself we published a tool to automatically translate old manifests to the new format? And mention the tool if parsing fails for someone.
And this is in general, not just for this particular case.

@dimakuv
Copy link

dimakuv commented Feb 3, 2022

  1. Protected files module (protected_files.[ch]). This is a generic module for handling protected files.
    • No real dependency on PAL (I think)
  • Move the generic module (part 5 above) to Common. It looks generic enough, and probably doesn't need any major changes to implement the above.

This is exactly correct, the Protected Files module (https://github.com/gramineproject/gramine/tree/master/Pal/src/host/Linux-SGX/protected-files) was specifically designed in such a way as to be independent from the rest of the Gramine codebase (bar a couple handy macros).

For example, this Protected Files module constitutes the bulk of the pf_crypt ( renamed to gramine-sgx-pf-crypt during install) standalone utility of Gramine: https://github.com/gramineproject/gramine/tree/master/Pal/src/host/Linux-SGX/tools/pf_crypt.

So I'm all in for moving Protected Files under common/. This would be very handy when/if we support new PALs such as for Intel TDX.

@dimakuv
Copy link

dimakuv commented Feb 3, 2022

I created an example dir hierarchy. I think it reflects a typical application well, and it covers all things that we want to support with allowed/trusted/protected files.

.
├── logs
│   ├── al_info_log
│   └── pr_secret_log_key0
├── pr_inputs_key1
│   └── ...files...
├── pr_inputs_key2
│   └── ...files...
├── pr_outputs
│   └── ...files...
├── tr_libs
│   ├── tr_lib1
│   └── tr_lib2
├── pr_privatesecret_mrenclave
├── pr_sharedsecret_mrsigner
└── tr_script

Legend:

  • al_ for allowed files, pr_ for protected files, tr_ for trusted files.
  • Keys for protected files: _mrencave for SGX mrenclave-based key, __mrsigner for SGX mrsigner-based key, _key0 for user-provided key 0, _key1 for user-provided key 1, _key2 for user-provided key 2.

Obviously, files under logs/ are created on the fly by the application, as well as files under pr_outputs/.

Notice how it is typical to mix allowed, trusted and/or protected files under one directory. Also notice how it is typical to mix allowed, trusted and/or protected sub-directories under one directory.

@dimakuv
Copy link

dimakuv commented Feb 3, 2022

Regarding the manifest syntax. I highly dislike the hack of fs.mount.pf.allowed_files proposed by Pawel. I think users must be forced to create per-file mount points in cases like this ("several allowed files in the otherwise protected-files mount").

I would propose the following manifest syntax, using my example dir tree above (I skip type = "chroot" for brevity):

# Gramine must not implicitly add the root mount point, here I describe it explicitly
[fs.mount.root]
path = "/"
uri = "file:./"
shielding= "trusted"   # can be "allowed", "trusted", "protected"

# for protected keys, if `user_defined = false`, then it is the responsibility of PAL to generate the key
[fs.mount.privatesecret]
path = "/pr_privatesecret_mrenclave"
uri = "file:./pr_privatesecret_mrenclave"
shielding= "protected"
protected_key = [user_defined = false, name = "mrenclave"]

[fs.mount.sharedsecret]
path = "/pr_sharedsecret_mrsigner"
uri = "file:./pr_sharedsecret_mrsigner"
shielding= "protected"
protected_key = [user_defined = false, name = "mrsigner"]

# for protected keys, if `user_defined = true`, then user can supply the key and override the "default"
[fs.mount.logs]
path = "/logs/"
uri = "file:./logs/"
shielding= "protected"
protected_key = [user_defined = true, name = "key0", default = "1234567890"]

# note that all newly created files under logs/ become protected, except this one
[fs.mount.logs_allowed_info]
path = "/logs/al_info_log"
uri = "file:./logs/al_info_log"
shielding= "allowed"

[fs.mount.inputs1]
path = "/pr_inputs_key1/"
uri = "file:./pr_inputs_key1/"
shielding= "protected"
protected_key = [user_defined = true, name = "key1", default = ""]  # empty default means key *must* be provided by user

[fs.mount.inputs2]
path = "/pr_inputs_key2/"
uri = "file:./pr_inputs_key2/"
shielding= "protected"
protected_key = [user_defined = true, name = "key2", default = ""]

[fs.mount.outputs]
path = "/pr_outputs/"
uri = "file:./pr_outputs/"
shielding= "protected"
protected_key = [user_defined = true, name = "key1", default = ""]  # only user with key1 will be able to decrypt

# note that `tr_libs` and `tr_script` are covered by root FS mount point

@dimakuv
Copy link

dimakuv commented Feb 3, 2022

Regarding the renaming feature:

  • Renaming of/to Trusted Files doesn't make sense and must be disallowed
  • Renaming of Allowed Files must only be allowed to another Allowed File (i.e., under the path that resolves to the mount point with shielding = "allowed")
  • Renaming of Protected Files must only be allowed to another Protected File (i.e., under the path that resolves to the mount point with shielding = "protected" and with the same protected_key)
    • I know of use cases where Protected Files must be moved (= renamed) from one directory to another, controlled by the same key
    • I do not know of use cases where Protected Files must be moved (= renamed) across directories with different keys (does it make sense? I'm unsure here...)

@mkow
Copy link
Member

mkow commented Feb 3, 2022

While doing such a big rework, let's consider some breaking changes:

  • +1 to enforcing separate mounts (like in Linux) if you want to have a bunch of files in different mode - you just overlay them at the top of another mount.
  • Renaming allowed to passthrough.
  • Renaming trusted to something more meaningful (prehashed? hardcoded? we probably need some better ideas).
  • Renaming uri to host_path (and maybe getting rid of URI prefixes altogether?).
  • I really like shielding = key name, though we'll need to consider it in the context of the values we choose (e.g. shielding = trusted sounds a bit weird to me).
  • name = "mrenclave" - maybe "mode" would be better?
  • then user can supply the key and override the "default" - we need to ensure that people won't ship this to prod, maybe via our warnings at the beginning?

@dimakuv
Copy link

dimakuv commented Feb 3, 2022

I agree that maybe we can just rename everything :)

I dislike the historical allowed, trusted and protected words. If redesigning this from scratch, I would go with two boolean values: shield_integrity = true and shield_confidentiality = true. This would correspond to:

  • allowed files become the ones with shield_integrity = shield_confidentiality = false
  • trusted files become the ones with shield_integrity = true and shield_confidentiality = false
  • protected files become the ones with shield_integrity = shield_confidentiality = true

Obviously, shield_integrity = false and shield_confidentiality = true must be prohibited.

The nice thing about this proposal is that it is obvious and uses well-defined terms. The bad thing is that we won't be able to introduce e.g. another encryption scheme in addition to the one (SGX Protected Files) we have currently. But maybe this is not a bad thing after all -- the less options we have, the better.

But if this proposal is too dramatic, I'm fine with keeping the list of keywords and just renaming them like Michal suggests.

Renaming uri to host_path (and maybe getting rid of URI prefixes altogether?).

Yes. Nobody understands what URIs are and why Gramine needs them, including myself. I cannot imagine that Gramine will at some point implement a tcp: URI (to fetch a file from some web site).

name = "mrenclave" - maybe "mode" would be better?

Yes, "mode" sounds better.

@mkow
Copy link
Member

mkow commented Feb 3, 2022

I like the idea with 2 booleans, although it will be quite a breaking change :) But in the long term it should be more logical for new users.

@boryspoplawski
Copy link
Contributor

allowed -> passthrough
trusted -> hashed/measured
protected -> encrypted
?

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Feb 3, 2022

Some comments (from me, but based on discussion with @mkow @boryspoplawski @woju):

I don't think boolean flags are a good idea, they obscure the fact that we have 3 distinct mechanisms, each with its own quirks (protected files use a key which you can specify, trusted files are read-only). I like the idea of mode = "passthrough|measured|encrypted" parameter for a mount. Or shielding if you want.

I think these mechanisms are confusing right now because 1. the names are bad, 2. you configure them as three separate "whitelists" on top of the mounts, instead of making a decision per mounted file/directory. So maybe converting them to mounts would help.

There's also the matter of compatibility and interpreting the old manifest in a new system. For instance, given the manifest in the current syntax:

[fs.mount.pf]
path = "/usr/share/pf"
uri = "file:pf"

sgx.protected_files = [
  "file:pf/file1.txt",
  "file:pf/file2.txt",
  "file:pf/dir1/",
]

we would have to process sgx.protected_files and translate them to mounts:

[fs.mount.pf]
path = "/usr/share/pf"
uri = "file:pf"

[fs.mount.file1]
path = "/usr/share/pf/file1.txt"
uri = "file:pf/file1.txt"
mode = "encrypted"

[fs.mount.file2]
path = "/usr/share/pf/file2.txt"
uri = "file:pf/file2.txt"
mode = "encrypted"

[fs.mount.dir1]
path = "/usr/share/pf/dir1/"
uri = "file:pf/dir1/"
mode = "encrypted"

Note the following:

  • Interpreting sgx.protected_files as mounts is non-trivial, because we have to convert host path to mount path (based on existing mounts).
  • For backward compatibility, the fs.mount.pf mount itself would have to function as "passthrough" in gramine-direct, and "blocked" in gramine-sgx (i.e. no file accessible directly from this mount is readable). Longer term, we probably want to disallow such mounts with no mode.

Regarding renaming, I would keep the usual Linux rule, i.e. you can rename files within the same mounted filesystem. If the user wants to rename between 2 protected-file paths, they need to be within the same mounted directory, not two individual mounted files.

Would it be acceptable if instead of keeping the compatibility in Gramine itself we published a tool to automatically translate old manifests to the new format? And mention the tool if parsing fails for someone. And this is in general, not just for this particular case.

Sounds nice for us (a conversion tool could be written in Python!), but much less convenient for the user...

Renaming uri to host_path (and maybe getting rid of URI prefixes altogether?).

Yes. Nobody understands what URIs are and why Gramine needs them, including myself. I cannot imagine that Gramine will at some point implement a tcp: URI (to fetch a file from some web site).

I kind of like the fact that host paths have the file: prefix, it makes it easier to distinguish host paths from Gramine paths. Without it, it's hard to say which paths are host and which are in Gramine, since the variable name doesn't say that (loader.{argv,env}_src_file, fs.start_dir, sgx.*_files, loader.preload... all expect host paths).

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Feb 3, 2022

Another thing (thanks @woju): this will require converting fs.mount to be a list, not a dictionary (same as we did for trusted_files etc.) We need the mounts to be ordered, and also big dictionaries in TOML cause performance problems.

@mkow
Copy link
Member

mkow commented Feb 3, 2022

We even have an issue for changing it to a list: #23

@mkow
Copy link
Member

mkow commented Feb 3, 2022

I don't think boolean flags are a good idea, they obscure the fact that we have 3 distinct mechanisms, each with its own quirks (protected files use a key which you can specify, trusted files are read-only). I like the idea of mode = "passthrough|measured|encrypted" parameter for a mount. Or shielding if you want.

+1, when we discussed it more I changed my mind and don't like the bools idea anymore. It would be viable if we could make all three mount modes have the same interface, but there are too many differences the user needs to know right now, as Paweł mentions.

I kind of like the fact that host paths have the file: prefix, it makes it easier to distinguish host paths from Gramine paths. Without it, it's hard to say which paths are host and which are in Gramine, since the variable name doesn't say that (loader.{argv,env}_src_file, fs.start_dir, sgx.*_files, loader.preload... all expect host paths).

I'd prefer if we had this in key names (e.g. fs.mount.dir1.host_path = "asdf"), not values. When this is in values then it gives an impression that the user chooses it, while only file: is valid in basically all places. And also, file: doesn't really suggests that it comes from the host - it only does this for us, because we already know Gramine quirks.

@dimakuv
Copy link

dimakuv commented Feb 7, 2022

I like Borys's suggestion:

allowed -> passthrough
trusted -> hashed (I don't like measured -- this word is associated with SGX measurements)
protected -> encrypted

I agree with Michal that the file: prefix is misleading and very alien to new users. But maybe we just rename it to host: and keep the prefix-for-host-files idea?

I agree with the like-Linux renaming rule.

I don't care about the keywords shielding or mode -- whatever people prefer. Both sound good for me.

Regarding compatibility, I don't have a good suggestion. I dislike the idea of a Python tool to convert old manifests to new manifests -- every time I used such conversion tools, I ended up converting my files manually. And I think it will the same experience of Gramine users -- they will try to use the tool, the tool will not cover some corner cases or will emit some hard-to-decipher warnings, and the proficient users will simply do the conversion themselves and the less skilled users will create GitHub issues.

I would definitely want to keep compatibility inside Gramine itself, i.e., Gramine manifest loader parses deprecated sgx.{...}_files lists and reconstructs the FS mount points (even if in an idiotic inefficient way).

@dimakuv
Copy link

dimakuv commented Feb 9, 2022

How do we go about auto-generated sgx.trusted_files.sha256 items? Where do we put these hash values? Like this?

[fs.mount.root]
path = "/"
uri = "file:./"
shielding= "trusted"
files = [
    { uri = "file:/tr_script", sha256 = "..."},
    { uri = "file:/...", sha256 = "..."},
]

@mkow
Copy link
Member

mkow commented Feb 12, 2022

I agree with Michal that the file: prefix is misleading and very alien to new users. But maybe we just rename it to host: and keep the prefix-for-host-files idea?

But shouldn't this prefix be in the key names? Here it suggests that there could be something different in its place, i.e. now it suggests that the user chooses to serve this file from the host.

How do we go about auto-generated sgx.trusted_files.sha256 items? Where do we put these hash values? Like this?
[...]

Yeah, probably something like that.

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Feb 14, 2022

How do we go about auto-generated sgx.trusted_files.sha256 items? Where do we put these hash values? Like this?

I'm not sure if I like it, the nesting is pretty deep. It's a table, nested in an array, nested in a table, nested in an array (since the mounts will be an array-of-tables). Notice that it's no longer even possible to write in the section format.

So I think I'd prefer keeping trusted files as a global array.

[[fs.mount]]
path = "/"
uri = "file:./"
shielding = "trusted"

[fs]
trusted_files = [
    { uri = "file:/tr_script", sha256 = "..."},
    { uri = "file:/...", sha256 = "..."},
]

@svenkata9
Copy link
Contributor

This sidesteps problems with renaming (it's impossible to rename a file "out of protected files", because it would mean moving it out of a mount).

@pwmarcz Just to make sure I understand this and we are on the same page, does it mean renaming is just not permitted?

@mkow
Copy link
Member

mkow commented Feb 14, 2022

@pwmarcz Just to make sure I understand this and we are on the same page, does it mean renaming is just not permitted?

No, it is permitted, but only inside the same mount (same as it works on Linux).

@svenkata9
Copy link
Contributor

Okay. At least for the failure we encountered with the workload, it should work I guess.

@svenkata9
Copy link
Contributor

svenkata9 commented Feb 14, 2022

trusted -> hashed (I don't like measured -- this word is associated with SGX measurements)

verified might be better than hashed.

@fnerdman
Copy link
Contributor

Defining allowed, trusted and protected files via mount points solves #359 so I am closing it.
For naming I would like to throw in: unprotected, integrity_protected and confidentiality_protected

Another question I have is, how do we handle symbolic links - within the same mount and crossing mount borders?

@boryspoplawski
Copy link
Contributor

@lead4good Symlinks are handled purely on LibOS layer and they merely point to another location, so they do not affect protected/trusted files handling

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Mar 3, 2022

Going a bit deeper in terms of implementation: I noticed that currently only PAL does crypto (e.g. is linked against mbedtls), and generally PAL performs most of the validation, so that results returned from PAL can be trusted. Implementing protected files entirely in LibOS seems to go against that.

I see several choices:

  1. Everything in LibOS. That means LibOS opens a host file through PAL handle, and then encrypts and decrypts it using the protected-files library.

    This means linking LibOS against mbedtls, and generally having more security-sensitive code in LibOS. It would work, but seems like going against the overall design so far.

  2. Separate PAL handle type. Implement protected files as a special handle type in PAL (PAL_HANDLE). LibOS would open and close such handles as necessary. LibOS would manage the keys and ensure there is only one handle for each file.

    But fitting this within the current framework (DkStreamOpen etc.) is difficult. To follow the current design, this would be another URI scheme, and probably the key would also be part of that scheme (e.g. enc:foo.txt:123456789abcdef), or creating a separate function specifically for opening a protected file (to be fair, we're already planning to go in that direction).

    Also, PAL seems to assume that handles are implemented per host, not in the "common" PAL code. Data structures are defined per host, sending handles to another process (i.e. sending a file descriptor) is also per host.

    Basically, while it sounds tempting to implement a protected file handle as a a handle that wraps another handle, PAL doesn't look flexible enough for such tricks.

  3. Separate set of functions. Implement protected files as a separate set of functions that operate on a PAL handle. LibOS calls these functions as in described in option 2 above.

    struct pal_encrypted_file;
    
    int DkEncryptedFileOpen(PAL_HANDLE handle, struct pal_encrypted_file** out_ef);
    int DkEncryptedFileClose(struct pal_encrypted_file* ef);
    int DkEncryptedFileRead(struct pal_encrypted_file* ef, const void* buf, size_t count, size_t offset);
    int DkEncryptedFileWrite(struct pal_encrypted_file* ef, void* buf, size_t count, size_t offset);
    int DkEncryptedFileTruncate(struct pal_encrypted_file* ef, size_t size);

    The struct pal_encrypted_file is a data type that wraps a PAL handle. It won't be possible to send struct pal_encrypted_file to another process directly, but it will be possible to send the PAL_HANDLE inside, and call DkEncryptedFileOpen on the far end.

    The downside is that there are a few extra Dk* functions specific to this file type, instead of using DkStream*.

So far the solution number 3 seems to be the most straightforward to me. It looks like a big improvement in terms of code separation:

  • on LibOS side, there probably will be a different filesystem,
  • on PAL side, handling protected files and other files is not mixed together,
  • protected files are not SGX-specific, but can be used in all hosts,
  • LibOS can count on PAL doing the necessary validation,
  • it's possible to write unit tests for the PAL functions, without involving LibOS.

@dimakuv
Copy link

dimakuv commented Mar 4, 2022

I dislike solution number 3. What would we do with Trusted Files? Have a separate set of functions for them too?

And how do we implement DkStreamChangeName() then? Do we have several versions of this function?

Also, with solution number 3 we grow the PAL API and its complexity. I think this goes against Gramine design: PAL is supposed to be simple and stateless, and LibOS is supposed to implement all host-agnostic generic functionality. So to me it makes more sense to lift all PF logic to the LibOS layer. When we add a new PAL, e.g. Windows-SGX, having PFs in LibOS will be a much cleaner approach than re-implementing the Linux-SGX logic in the new Windows-SGX PAL.

So I'm actually voting for solution number 1.

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Mar 4, 2022

I dislike solution number 3. What would we do with Trusted Files? Have a separate set of functions for them too?

I think so, although trusted files, being read-only, have a different set of operations.

And how do we implement DkStreamChangeName() then? Do we have several versions of this function?

Renaming a protected file means (1) renaming the host file and (2) modifying its header. So it would also be a separate operation that invokes DkStreamChangeName under the hood.

So to me it makes more sense to lift all PF logic to the LibOS layer. When we add a new PAL, e.g. Windows-SGX, having PFs in LibOS will be a much cleaner approach than re-implementing the Linux-SGX logic in the new Windows-SGX PAL.

My idea was to keep that in the "common" part of PAL, not the host-specific part. But you're right, it does look like going against the design.

I guess what I'm proposing is a set of functions that wrap PAL handle operations. These functions might be a part of PAL API, but they might also be in LibOS, the difference is not big.

So I'm actually voting for solution number 1.

I'm almost convinced, but what worries me is that solution number 1 is breaking the precedent that "only PAL does crypto". Is that meaningful? Do you have any comment on that?

Also, do you propose the same for trusted files? I think the situation is very similar.

@dimakuv
Copy link

dimakuv commented Mar 4, 2022

I'm almost convinced, but what worries me is that solution number 1 is breaking the precedent that "only PAL does crypto". Is that meaningful? Do you have any comment on that?

I see no reason to limit crypto operations to PAL only. Realistically speaking -- Gramine will be 99% used for TEEs, and limiting the main component of Gramine to not use crypto makes no sense. I would actually push for moving more and more crypto into the LibOS layer, so that writing new backends (vendor-specific TEE PALs) would be a pretty trivial effort. In the worst case, we can hide all this crypto under a compile-time switch (-DDISABLE_CRYPTO), if someone wants to have minimal crypto-less LibOS binary.

These functions might be a part of PAL API, but they might also be in LibOS, the difference is not big.

I'm not sure what exactly you mean by "the difference is not big". Expanding the PAL API makes a huge difference -- I'm currently in the process of writing a new PAL, and it's already a pain in the neck with our APIs.

@dimakuv
Copy link

dimakuv commented Mar 4, 2022

Also, do you propose the same for trusted files? I think the situation is very similar.

Yes, I propose to move all allowed/trusted/protected files logic into the LibOS layer.

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Mar 4, 2022

I'm not sure what exactly you mean by "the difference is not big". Expanding the PAL API makes a huge difference -- I'm currently in the process of writing a new PAL, and it's already a pain in the neck with our APIs.

By "writing a new PAL", do you mean not reusing any code from Pal/src?

@dimakuv
Copy link

dimakuv commented Mar 4, 2022

By "writing a new PAL", do you mean not reusing any code from Pal/src?

I mean writing a new set of headers/sources under Pal/src/host/MyAwesomeBackend, by copying Skeleton and adding actual implementations to it.

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Mar 4, 2022

OK, then what I had in mind was not adding anything to Pal/src/host/... for every backend, but adding a set of functions implemented entirely in the common part of PAL (Pal/src/...). It should not impose any additional work if you implement PAL for another host.

I'll try it out with a LibOS implementation, though.

@dimakuv
Copy link

dimakuv commented Mar 4, 2022

OK, then what I had in mind was not adding anything to Pal/src/host/... for every backend, but adding a set of functions implemented entirely in the common part of PAL (Pal/src/...). It should not impose any additional work if you implement PAL for another host.

Ah, I see. This could be done, yes, and could help. But now we unnecessarily bloat and complicate the PAL component. To me, it sounds more reasonable to expand the LibOS codebase.

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Mar 15, 2022

I implemented most of the functionality, allowing me to explore the design better. Here is a detailed proposal for the feature. Please comment, I'm close to submitting initial PRs.

Keep in mind that this is an incremental change. I'm not planning to do anything with trusted files, remove URI prefixes, etc. The scope is mostly limited to the current features of protected files (except for bugs/features mentioned in the original post).


Mounting encrypted files

Protected files are renamed to encrypted. To mount a file or directory as encrypted, use type = "encrypted":

fs.mounts = [
  { type = "encrypted", path = "/mnt/pf", uri = "file:pf" },
]

Why? It makes the most sense to implement "encrypted" as a separate filesystem in LibOS, since many of the operations are going to be completely different. Later, we can introduce type = "hashed" for read-only hashed (trusted) files, and rename "chroot" to type = "passthrough" for files accessed directly.

Keys

Encrypted files have a wrap key, identified by name. The default key is called "default". You can use a different key:

fs.mounts = [
  { type = "encrypted", path = "/mnt/pf/default", uri = "file:pf/default" },
  { type = "encrypted", path = "/mnt/pf/custom", uri = "file:pf/custom", key = "custom" },
]

The keys can be specified directly in manifest:

fs.keys.default = "00112233445566778899aabbccddeeff"
fs.keys.custom  = "ffeeddccbbaa99887766554433221100"

They can also be provided from inside Gramine, by writing the key to /dev/attestation/keys/<key>. If a key is not provided in manifest, Gramine will start, but it will not be possible to access mounts using this key (file operations will fail with EACCES) until the key is provided.

Special keys (SGX MRENCLAVE and MRSIGNER)

You can use special keys provided by the PAL host. Currently, Linux-SGX provides two such keys:

  • key = "_sgx_mrenclave"
  • key = "_sgx_mrsigner"

Special keys have names beginning with underscore. If we are running under a host that does not implement a given key, Gramine will start, but it will not be possible to access mounts using this key (file operations will fail with EACCES).

Why? Otherwise, gramine-direct will refuse to even start manifests that use MRENCLAVE and MRSIGNER keys. Also, this matches the behavior for keys provided through /dev/attestation.

Transition

When the feature is launched, Gramine will translate sgx.protected_files* to a list of type = "encrypted" mounts. We will keep this for two versions (for the deprecation period).

However, this is a big feature. I want to merge a non-feature-complete version earlier, without such compatibility layer, and without mention in documentation (or documented as "experimental").

Implementation

Here's how the implementation looks so far:

  • The protected_files module is moved to common.
  • There's a new chroot_encrypted filesystem. It uses some operations from normal chroot (e.g. unlink, chmod), and implements other operations in terms of encrypted files (e.g. open, read).
  • The next layer is a wrapper over protected_files module and PAL handles. It lives in LibOS, and provides an "encrypted file" structure. It has a use count, and keeps an open PAL handle if the count is non-zero.
  • The "encrypted file" object is kept in an inode. The use count is a number of open user handles. This is because we want to keep the file open as long as it's needed, but not multiple copies of it.
  • PAL is used to provide the special keys, using a function (DkGetSpecialKey) that takes the name ("_sgx_mrenclave", "_sgx_mrsigner").

@dimakuv
Copy link

dimakuv commented Mar 16, 2022

I like the proposal a lot. However, some open questions:

However, this is a big feature. I want to merge a non-feature-complete version earlier, without such compatibility layer, and without mention in documentation (or documented as "experimental").

Do I understand correctly that you want to have at least several commits in Gramine's master branch that will break user manifests? I'm against this. We have many users who track the latest commit of Gramine, and they will spam us with "nothing works" messages.

Even if this compatibility layer is too big, we still need it in the same PR as the main PF changes. Alternatively, if you want the PRs to be more easy-to-review, then you could submit two PRs (second PR with a base of the first PR) and we'll review them together and merge them together.

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Mar 16, 2022

That's not what I want to do.

First, I want to add type = "encrypted" mounts, completely separately of protected files. These mounts will not support all the use cases of protected files (e.g. it will not be possible to mmap them, or use MRENCLAVE/MRSIGNER keys), and will not be advertised to users yet. No current manifests will break, and protected files will keep working in the same way.

Then, once type = "encrypted" mounts have all the necessary features, I will remove the old protected files code and add a compatibility layer, so that sgx.protected_files uses type = "encrypted" mounts under the hood.

@pwmarcz
Copy link
Contributor Author

pwmarcz commented Apr 21, 2022

Checklist for the transition to encrypted mounts:

Most likely this will be two PRs: one for mmap, the other one for everything else.

EDIT: Updated after PRs: #550, #566

@g302ge
Copy link

g302ge commented Apr 25, 2022

if the rename syscall is still error when we using in gramine ? Maybe if there some plan about when to fix this syscall, which makes me can not introduce the rocksdb into gramine env

@dimakuv
Copy link

dimakuv commented Apr 26, 2022

@g302ge Renaming of Protected (= Encrypted) Files is supported now in Gramine, but it is currently not yet advertised, because @pwmarcz is still finalizing this redesign. This redesign should be finished in a couple weeks hopefully, and then all new features will be advertised to users.

That said, you can try the latest Gramine with this new redesign. Note that Protected Files are renamed to Encrypted Files. Note also that rename() works only on the same FS mount point (in the example below, you can only rename files under /tmp_enc/ directory). The new way of specifying them is through FS mount points, smth like this in your manifest:

fs.mounts = [
  ...,
  { type = "encrypted", path = "/tmp_enc", uri = "file:tmp_enc", key_name = "my_custom_key" },
]

# in reality, `my_custom_key` is sent during Secret Provisioning, not specified in the manifest
fs.insecure__keys.my_custom_key = "00112233445566778899aabbccddeeff"

Some examples of how Encrypted Files work now:

  • { type = "encrypted", path = "/tmp_enc", uri = "file:tmp_enc", key_name = "my_custom_key" },
  • fs.insecure__keys.default = "ffeeddccbbaa99887766554433221100"
    fs.insecure__keys.my_custom_key = "00112233445566778899aabbccddeeff"
  • def test_035_rename_unlink_enc(self):
    os.makedirs('tmp_enc', exist_ok=True)
    file1 = 'tmp_enc/file1'
    file2 = 'tmp_enc/file2'
    # Delete the files: the test overwrites them anyway, but it may fail if they are malformed.
    for path in [file1, file2]:
    if os.path.exists(path):
    os.unlink(path)
    try:
    stdout, _ = self.run_binary(['rename_unlink', file1, file2])
    finally:
    for path in [file1, file2]:
    if os.path.exists(path):
    os.unlink(path)
    self.assertIn('TEST OK', stdout)

@g302ge
Copy link

g302ge commented Apr 29, 2022

@dimakuv thx for your job working on protected file system. But the protected_mrenclave_file is still not work on rename. Moreever, if I use the encrypted file system. How can I dynamically set the key when the binary is running, Over the pesudo file system endpoints to write and read ?

@dimakuv
Copy link

dimakuv commented May 2, 2022

@g302ge The rework of the Protected File System is not yet finished (and I am not the one who does this job, it is @pwmarcz). The features requested by you are still not fully implemented and not documented. Please wait another week or two for them to appear in Gramine.

@Villain88
Copy link

Hi, I tried to use encrypted fs. When I try to get a list of files in a directory (even an empty one), I get an error, see the log below:

[P1:T1:busybox] trace: ---- shim_stat("/tmp/testdir", 0x13cbdb78) = 0x0
[P1:T1:busybox] trace: ---- shim_openat(AT_FDCWD, "/tmp/testdir", O_RDONLY|0x90800, 0000) = 0x3
[P1:T1:busybox] trace: ---- shim_newfstatat(3, "", 0x13cbdb20, 4096) = 0x0
[P1:T1:busybox] trace: ---- shim_getdents64(3, 0x15366380, 0x8000) = 0x60
[P1:T1:busybox] trace: ---- shim_lstat("/tmp/testdir/1", 0x13cbdb28) = 0x0
[P1:T1:busybox] trace: ---- shim_getdents64(3, 0x15366380, 0x8000) = 0x0
[P1:T1:busybox] error: Internal memory fault at 0x00000000 (0x1f0b9184, VMID = 1, TID = 1)

I used the following manifest:

fs.mounts = [
{ path = "/lib", uri = "file:{{ gramine.runtimedir() }}" },
{ path = "{{ arch_libdir }}", uri = "file:{{ arch_libdir }}" },
{ path = "/etc", uri = "file:/etc" },
{ path = "/tmp/testdir", uri = "file:/tmp/testdir", type = "encrypted" }
]
fs.insecure__keys.default = "ffeeddccbbaa99887766554433221100"

The same situation with the python os.walk, etc

How can I fix this problem?

@dimakuv
Copy link

dimakuv commented May 4, 2022

@pwmarcz Could you please look into this? Doesn't look good:

[P1:T1:busybox] error: Internal memory fault at 0x00000000 (0x1f0b9184, VMID = 1, TID = 1)

UPDATE: This was fixed in #564. @Villain88 Try your stuff again.

@Villain88
Copy link

UPDATE: This was fixed in #564. @Villain88 Try your stuff again.

Thanks, this case is working now

@dimakuv
Copy link

dimakuv commented Jan 31, 2023

I believe we implemented everything that was mentioned in this RFC. Closing.

@dimakuv dimakuv closed this as completed Jan 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants