Fill in RTD site getting started, usage, configuration, and plugin page TODOs (#82)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ryan Mast <[email protected]>
3 people authored Nov 27, 2023
1 parent 999eca7 commit dc2f90f
Showing 4 changed files with 303 additions and 14 deletions.
203 changes: 199 additions & 4 deletions docs/config.md
@@ -2,18 +2,213 @@

## Build configuration file

TODO: add details about making a config file
A configuration file describes the sample that surfactant should gather information from. Example JSON configuration files can be found in the examples folder of this repository.

**extractPaths**: (required) the absolute paths to the sample folders, or paths relative to the current working directory that `surfactant` is run from. Each entry must be a folder, not a file. (Note that even on Windows, Unix style `/` directory separators should be used in paths.)\
**archive**: (optional) the full path, including file name, of the zip, exe installer, or other archive file that the folders in **extractPaths** were extracted from. This is used to collect metadata about the overall sample and will be added as a "Contains" relationship to all software entries found in the various **extractPaths**.\
**installPrefix**: (optional) where the files in **extractPaths** would be located if installed correctly on an actual system, e.g. "C:/", "C:/Program Files/", etc. (Note that even on Windows, Unix style `/` directory separators should be used in the path.) If not given, the **extractPaths** will be used as the install paths.
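
The field rules above can be checked before running surfactant. The following is a minimal sketch, not part of surfactant itself: `validate_config` is a hypothetical helper name, and it only checks the rules stated in this section (required `extractPaths`, string values, forward-slash separators).

```python
import json


def validate_config(entries):
    """Return a list of problems found in a parsed configuration (empty if none).

    Hypothetical helper: it only encodes the rules described in this document.
    """
    if not isinstance(entries, list):
        return ["top level must be a JSON array of entries"]
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: must be a JSON object")
            continue
        paths = entry.get("extractPaths")
        if not isinstance(paths, list) or not paths:
            problems.append(f"entry {i}: 'extractPaths' is required and must be a non-empty list")
            paths = []
        # Gather every path-like value so the separator check covers all three fields.
        values = list(paths)
        for key in ("archive", "installPrefix"):
            if key in entry:
                values.append(entry[key])
        for value in values:
            if not isinstance(value, str):
                problems.append(f"entry {i}: path {value!r} must be a string")
            elif "\\" in value:
                problems.append(f"entry {i}: path {value!r} contains a backslash; use '/' separators")
    return problems


config = json.loads('[{"extractPaths": ["/home/samples/helics/bin"]}]')
print(validate_config(config))
```

Whether each path is an existing folder is deliberately not checked here, since that depends on the machine the config is used on.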

## Example configuration files

Let's say you have a .tar.gz file that you want to run surfactant on. For this example, we will use the HELICS release .tar.gz file. In this scenario, the absolute path for this file is `/home/samples/helics.tar.gz`. Extracting this file produces a helics folder with 4 sub-folders: bin, include, lib64, and share.

### Example 1: Simple Configuration File

TODO: Add details
If we want to include only the folders that contain binary files to analyze, our most basic configuration would be:

```json
[
  {
    "extractPaths": ["/home/samples/helics/bin", "/home/samples/helics/lib64"]
  }
]
```

The resulting SBOM would be structured like this:

```json
{
  "software": [
    {
      "UUID": "abc1",
      "fileName": ["helics_binary"],
      "installPath": ["/home/samples/helics/bin/helics_binary"],
      "containerPath": null
    },
    {
      "UUID": "abc2",
      "fileName": ["lib1.so"],
      "installPath": ["/home/samples/helics/lib64/lib1.so"],
      "containerPath": null
    }
  ],
  "relationships": [
    {
      "xUUID": "abc1",
      "yUUID": "abc2",
      "relationship": "Uses"
    }
  ]
}
```

### Example 2: Detailed Configuration File

TODO: Add details
A more detailed configuration file might look like the example below. The resulting SBOM would have a software entry for helics.tar.gz with a "Contains" relationship to all binaries found in the extractPaths. Providing an installPrefix of `/` and an extractPaths entry of `/home/samples/helics` allows surfactant to correctly assign the install paths in the SBOM for binaries in the subfolders as `/bin` and `/lib64`.

```json
[
  {
    "archive": "/home/samples/helics.tar.gz",
    "extractPaths": ["/home/samples/helics"],
    "installPrefix": "/"
  }
]
```

The resulting SBOM would be structured like this:

```json
{
  "software": [
    {
      "UUID": "abc0",
      "fileName": ["helics.tar.gz"],
      "installPath": null,
      "containerPath": null
    },
    {
      "UUID": "abc1",
      "fileName": ["helics_binary"],
      "installPath": ["/bin/helics_binary"],
      "containerPath": ["abc0/bin/helics_binary"]
    },
    {
      "UUID": "abc2",
      "fileName": ["lib1.so"],
      "installPath": ["/lib64/lib1.so"],
      "containerPath": ["abc0/lib64/lib1.so"]
    }
  ],
  "relationships": [
    {
      "xUUID": "abc0",
      "yUUID": "abc1",
      "relationship": "Contains"
    },
    {
      "xUUID": "abc0",
      "yUUID": "abc2",
      "relationship": "Contains"
    },
    {
      "xUUID": "abc1",
      "yUUID": "abc2",
      "relationship": "Uses"
    }
  ]
}
```
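
The install path mapping in this example can be illustrated with a short sketch. This is not surfactant's actual implementation; `to_install_path` is a hypothetical helper that only shows how an `extractPaths` entry and an `installPrefix` could combine to give the `installPath` values shown above.

```python
import posixpath


def to_install_path(file_path, extract_path, install_prefix=None):
    """Map a file found under extract_path to where it would live once installed.

    Hypothetical helper; paths use '/' separators as the config format requires.
    """
    if install_prefix is None:
        # Without an install prefix, the extract path doubles as the install path.
        return file_path
    relative = posixpath.relpath(file_path, start=extract_path)
    if relative == ".." or relative.startswith("../"):
        raise ValueError(f"{file_path!r} is not inside {extract_path!r}")
    return posixpath.join(install_prefix, relative)


print(to_install_path("/home/samples/helics/bin/helics_binary", "/home/samples/helics", "/"))
# prints /bin/helics_binary
```

Leaving out `install_prefix` returns the original path unchanged, which matches the behavior seen in Example 1.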

### Example 3: Adding Related Binaries

TODO: Add details
If our sample helics tar.gz file came with a related tar.gz file that installs a plugin extension module (extracted into a helics_plugin folder that contains bin and lib64 subfolders), we could add that to the configuration file as well:

```json
[
  {
    "archive": "/home/samples/helics.tar.gz",
    "extractPaths": ["/home/samples/helics"],
    "installPrefix": "/"
  },
  {
    "archive": "/home/samples/helics_plugin.tar.gz",
    "extractPaths": ["/home/samples/helics_plugin"],
    "installPrefix": "/"
  }
]
```

The resulting SBOM would be structured like this:

```json
{
  "software": [
    {
      "UUID": "abc0",
      "fileName": ["helics.tar.gz"],
      "installPath": null,
      "containerPath": null
    },
    {
      "UUID": "abc1",
      "fileName": ["helics_binary"],
      "installPath": ["/bin/helics_binary"],
      "containerPath": ["abc0/bin/helics_binary"]
    },
    {
      "UUID": "abc2",
      "fileName": ["lib1.so"],
      "installPath": ["/lib64/lib1.so"],
      "containerPath": ["abc0/lib64/lib1.so"]
    },
    {
      "UUID": "abc3",
      "fileName": ["helics_plugin.tar.gz"],
      "installPath": null,
      "containerPath": null
    },
    {
      "UUID": "abc4",
      "fileName": ["helics_plugin"],
      "installPath": ["/bin/helics_plugin"],
      "containerPath": ["abc3/bin/helics_plugin"]
    },
    {
      "UUID": "abc5",
      "fileName": ["lib_plugin.so"],
      "installPath": ["/lib64/lib_plugin.so"],
      "containerPath": ["abc3/lib64/lib_plugin.so"]
    }
  ],
  "relationships": [
    {
      "xUUID": "abc1",
      "yUUID": "abc2",
      "relationship": "Uses"
    },
    {
      "xUUID": "abc4",
      "yUUID": "abc5",
      "relationship": "Uses"
    },
    {
      "xUUID": "abc5",
      "yUUID": "abc2",
      "relationship": "Uses"
    },
    {
      "xUUID": "abc0",
      "yUUID": "abc1",
      "relationship": "Contains"
    },
    {
      "xUUID": "abc0",
      "yUUID": "abc2",
      "relationship": "Contains"
    },
    {
      "xUUID": "abc3",
      "yUUID": "abc4",
      "relationship": "Contains"
    },
    {
      "xUUID": "abc3",
      "yUUID": "abc5",
      "relationship": "Contains"
    }
  ]
}
```

NOTE: These examples have been simplified to show differences in output based on configuration.
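
To read an SBOM like the ones above, it can help to resolve the UUIDs in each relationship back to file names. The sketch below assumes only the simplified fields used in these examples (`UUID`, `fileName`, `xUUID`, `yUUID`, `relationship`); `describe_relationships` is a hypothetical helper, and a real SBOM contains more fields than shown here.

```python
from collections import defaultdict


def describe_relationships(sbom):
    """Group relationships by type, as (x file name, y file name) pairs."""
    # Fall back to the UUID when an entry has no file name recorded.
    names = {
        sw["UUID"]: (sw["fileName"][0] if sw.get("fileName") else sw["UUID"])
        for sw in sbom["software"]
    }
    grouped = defaultdict(list)
    for rel in sbom["relationships"]:
        grouped[rel["relationship"]].append((names[rel["xUUID"]], names[rel["yUUID"]]))
    return dict(grouped)


# Trimmed copy of the Example 2 output.
example_sbom = {
    "software": [
        {"UUID": "abc0", "fileName": ["helics.tar.gz"]},
        {"UUID": "abc1", "fileName": ["helics_binary"]},
        {"UUID": "abc2", "fileName": ["lib1.so"]},
    ],
    "relationships": [
        {"xUUID": "abc0", "yUUID": "abc1", "relationship": "Contains"},
        {"xUUID": "abc0", "yUUID": "abc2", "relationship": "Contains"},
        {"xUUID": "abc1", "yUUID": "abc2", "relationship": "Uses"},
    ],
}

for kind, pairs in describe_relationships(example_sbom).items():
    for x_name, y_name in pairs:
        print(f"{x_name} {kind} {y_name}")
```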
48 changes: 45 additions & 3 deletions docs/getstarted.md
@@ -2,18 +2,60 @@

## Installation

TODO: Installation steps
### For Users:

1. Create a virtual environment with python >= 3.8 [Optional, but recommended]

```bash
python -m venv cytrics_venv
source cytrics_venv/bin/activate
```

2. Install Surfactant with pip

```bash
pip install surfactant
```

### For Developers:

1. Create a virtual environment with python >= 3.8 [Optional, but recommended]

```bash
python -m venv cytrics_venv
source cytrics_venv/bin/activate
```

2. Clone sbom-surfactant

```bash
git clone git@github.com:LLNL/Surfactant.git
```

3. Create an editable surfactant install (changes to code will take effect immediately):

```bash
pip install -e .
```

To install optional dependencies required for running pytest and pre-commit:

```bash
pip install -e ".[test,dev]"
```

## Understanding the SBOM Output

### Software

TODO: Section information
This section contains a list of entries, one for each piece of software found in the sample. Metadata including file size, vendor, version, etc. is included in this section, along with a UUID that uniquely identifies the software entry.

### Relationships

TODO: Section information
This section contains information on how each of the software entries in the previous section are linked.

**Uses**: this relationship type means that x software uses y software, e.g. y is a helper module to x\
**Contains**: this relationship type means that x software contains y software (often x software is an installer or archive such as a zip file)

### Observations

TODO: Section information
42 changes: 38 additions & 4 deletions docs/plugins.md
@@ -1,17 +1,51 @@
# Plugins

TODO: About the plugin system
The surfactant plugin system uses the [pluggy](https://pluggy.readthedocs.io/en/stable) module. This module is used by projects such as pytest and tox for their plugin systems; installing and writing plugins for surfactant is similar to using plugins for those projects. Most of the core surfactant functionality is also implemented as plugins (see [surfactant/output](https://github.com/LLNL/Surfactant/tree/main/surfactant/output), [surfactant/infoextractors](https://github.com/LLNL/Surfactant/tree/main/surfactant/infoextractors), [surfactant/filetypeid](https://github.com/LLNL/Surfactant/tree/main/surfactant/filetypeid), and [surfactant/relationships](https://github.com/LLNL/Surfactant/tree/main/surfactant/relationships)).

## Creating a Plugin

### Step 1: Write Plugin

TODO: Function implementation instructions
To create a plugin, you will need to write your implementation for one or more of the functions in the [hookspecs.py](https://github.com/LLNL/Surfactant/tree/main/surfactant/plugin/hookspecs.py) file. Which functions you implement will depend on the goals of your plugin.

#### Brief overview of functions
[identify_file_type](https://github.com/LLNL/Surfactant/tree/main/surfactant/plugin/hookspecs.py#L15)
- Returns a string representation of the type of file passed in

[extract_file_info](https://github.com/LLNL/Surfactant/tree/main/surfactant/plugin/hookspecs.py#L29)
- Determines how file info is extracted

[establish_relationships](https://github.com/LLNL/Surfactant/tree/main/surfactant/plugin/hookspecs.py#L47)
- Determines how to establish relationships between the software/metadata that has been passed to it

[write_sbom](https://github.com/LLNL/Surfactant/tree/main/surfactant/plugin/hookspecs.py#L70)
- Determines the format in which the SBOM is written to file

[read_sbom](https://github.com/LLNL/Surfactant/tree/main/surfactant/plugin/hookspecs.py#L80)
- If reading from input SBOMs, specifies the format of the input SBOMs
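
As an illustration of the kind of logic an `identify_file_type` implementation might contain, here is a minimal sketch that inspects a file's leading bytes. The single `filepath` argument, the `Optional[str]` return type, and the `"ELF"`/`"PE"` labels are assumptions made for this example; check hookspecs.py for the real signature and expected values. The `@surfactant.plugin.hookimpl` decorator is left off so the sketch runs without surfactant installed.

```python
from typing import Optional

# Leading bytes for two common executable formats (labels are illustrative).
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF",  # Linux/Unix executables and shared libraries
    b"MZ": "PE",        # DOS/Windows executables and DLLs
}


def identify_file_type(filepath: str) -> Optional[str]:
    """Return a short type label for the file, or None if it is not recognized."""
    try:
        with open(filepath, "rb") as f:
            header = f.read(4)
    except OSError:
        # Unreadable or missing files are simply left unidentified.
        return None
    for magic, label in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return label
    return None
```

Returning `None` for unknown files lets other plugins have a chance to identify them, which is the usual pattern for pluggy hooks that allow multiple implementations.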

### Step 2: Write .toml File

TODO: Plugin metadata details
Once you have written your plugin, you will need to write a pyproject.toml file. Include any relevant project metadata/dependencies for your plugin, as well as an entry-point specification (example below) to make the plugin discoverable by surfactant. Once you have written your .toml file, you can `pip install .` your plugin.
More information on entry points can be found in the [setuptools entry points documentation](https://setuptools.pypa.io/en/latest/userguide/entry_point.html#entry-points-syntax).

#### Example

TODO: Example .toml files
#### sampleplugin.py
```python
import surfactant.plugin
from surfactant.sbomtypes import SBOM


@surfactant.plugin.hookimpl
def write_sbom(sbom: SBOM, outfile) -> None:
    # Serialize the SBOM as JSON and write it to the already-open output file.
    outfile.write(sbom.to_json(indent=10))
```
#### pyproject.toml
```toml
# ... generic pyproject info ...
[project.entry-points."surfactant"]
sampleplugin = "sampleplugin"
```
From the same folder as your sampleplugin files, run `pip install .` to install your plugin and surfactant will automatically load and use the plugin.
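
To confirm that a plugin's entry point was registered after `pip install .`, the installed entry points can be listed with the standard library. This is a sketch using `importlib.metadata`, not a surfactant feature; `list_plugins` is a hypothetical helper, and the group name `"surfactant"` is taken from the pyproject.toml example above.

```python
from importlib.metadata import entry_points


def list_plugins(group="surfactant"):
    """Return sorted (name, target) pairs for entry points registered under group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns a selectable collection.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9 return a dict keyed by group name.
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


for name, target in list_plugins():
    print(f"{name} -> {target}")
```

If the sample plugin is installed in the active environment, its entry should appear in this list; an empty result suggests the entry-point table in pyproject.toml was not picked up.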

Another example can be found in the [surfactantplugin-checksec.py](https://github.com/LLNL/Surfactant/tree/main/surfactantplugin-checksec.py) folder. There you can see the [pyproject.toml](https://github.com/LLNL/Surfactant/tree/main/surfactantplugin-checksec.py/pyproject.toml) file with the `[project.entry-points."surfactant"]` entry. In the [surfactantplugin_checksec.py](https://github.com/LLNL/Surfactant/tree/main/surfactantplugin-checksec.py/surfactantplugin_checksec.py) file, you can identify the hook implementations by the `@surfactant.plugin.hookimpl` decorator.
24 changes: 21 additions & 3 deletions docs/usage.md
@@ -2,12 +2,30 @@

## Identify Sample File

TODO: Information about downloadable files to test on
In order to test out surfactant, you will need a sample file/folder. If you don't have one on hand, you can download and use the portable .zip file from <https://github.com/ShareX/ShareX/releases> or the Linux .tar.gz file from <https://github.com/GMLC-TDC/HELICS/releases>.

## Running Surfactant

TODO: List options and commands
```bash
$ surfactant generate [OPTIONS] CONFIG_FILE SBOM_OUTFILE [INPUT_SBOM]
```

**CONFIG_FILE**: (required) the config file created earlier that contains the information on the sample\
**SBOM_OUTFILE**: (required) the desired name of the output file\
**INPUT_SBOM**: (optional) a base SBOM; use with care, as relationships may be incorrect when files are installed on different systems\
**--skip_gather**: (optional) skips gathering information on files and adding software entries\
**--skip_relationships**: (optional) skips adding relationships based on metadata\
**--skip_install_path**: (optional) skips including an install path for the files discovered. This may cause "Uses" relationships to also not be generated\
**--recorded_institution**: (optional) the name of the institution collecting the SBOM data (default: LLNL)\
**--output_format**: (optional) changes the output format for the SBOM (given as full module name of a surfactant plugin implementing the `write_sbom` hook)\
**--input_format**: (optional) specifies the format of the input SBOM if one is being used (default: cytrics) (given as full module name of a surfactant plugin implementing the `read_sbom` hook)\
**--help**: (optional) show the help message and exit
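
When surfactant is driven from a script, the command line can be assembled from the arguments listed above. The following is a sketch only: `build_generate_command` is a hypothetical helper, it uses just the options documented in this section, and it assumes that options taking a value accept it as the following argument.

```python
def build_generate_command(config_file, sbom_outfile, input_sbom=None, *,
                           skip_gather=False, skip_relationships=False,
                           skip_install_path=False, recorded_institution=None,
                           output_format=None, input_format=None):
    """Build the argument list for `surfactant generate` (does not run it)."""
    cmd = ["surfactant", "generate"]
    # Boolean switches are added only when requested.
    if skip_gather:
        cmd.append("--skip_gather")
    if skip_relationships:
        cmd.append("--skip_relationships")
    if skip_install_path:
        cmd.append("--skip_install_path")
    # Options with values; None means "use surfactant's default".
    for flag, value in (("--recorded_institution", recorded_institution),
                        ("--output_format", output_format),
                        ("--input_format", input_format)):
        if value is not None:
            cmd += [flag, value]
    # Positional arguments come last, in the documented order.
    cmd += [config_file, sbom_outfile]
    if input_sbom is not None:
        cmd.append(input_sbom)
    return cmd


print(build_generate_command("config.json", "sbom.json", skip_relationships=True))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.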


## Merging SBOMs

TODO: Instructions on how to merge
A folder containing multiple separate SBOM JSON files can be combined using merge_sbom.py with a command such as the one below, which gets a list of files using ls and then uses xargs to pass the resulting list of files to merge_sbom.py as arguments.

`ls -d ~/Folder_With_SBOMs/Surfactant-* | xargs -d '\n' python3.8 merge_sbom.py --config_file=merge_config.json --sbom_outfile combined_sbom.json`

If the config file option is given, a top-level system entry will be created that all other software entries are tied to (directly or indirectly based on other relationships). Specifying an empty UUID causes a random UUID to be generated for the new system entry; otherwise, the provided UUID is used.
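
On systems without `ls` and `xargs` (for example Windows), the same file list can be gathered in Python. This sketch builds the equivalent argument list using only the options shown in the command above; `build_merge_command` is a hypothetical helper, and it assumes merge_sbom.py is in the current directory.

```python
import glob
import os
import sys


def build_merge_command(sbom_folder, sbom_outfile, config_file=None, pattern="Surfactant-*"):
    """Build the argument list for merge_sbom.py from the SBOM files in a folder."""
    folder = os.path.expanduser(sbom_folder)
    # Sorting keeps the merge order stable between runs.
    sbom_files = sorted(glob.glob(os.path.join(folder, pattern)))
    if not sbom_files:
        raise FileNotFoundError(f"no files matching {pattern!r} in {folder!r}")
    cmd = [sys.executable, "merge_sbom.py"]
    if config_file is not None:
        cmd.append(f"--config_file={config_file}")
    cmd += ["--sbom_outfile", sbom_outfile]
    return cmd + sbom_files
```

For example, `subprocess.run(build_merge_command("~/Folder_With_SBOMs", "combined_sbom.json", "merge_config.json"), check=True)` would run the merge with the current Python interpreter.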
