Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configure trivy to scan local images instead of downloading them again #8140

Merged
merged 1 commit into from
Jan 31, 2025

Conversation

jandubois
Copy link
Member

@jandubois jandubois commented Jan 24, 2025

Run as root so trivy has access to the containerd socket.

Don't output the trivy JSON to the browser; this is extremely slow when the output is several megabytes.

Also don't output the trivy JSON to images.log, as it makes the file unwieldy.


On the first scan trivy needs to download trivy.db, which is almost 600MB. Subsequent scans are then almost instantaneous. Even results for python:latest show up in a second, which used to take almost a minute when the JSON was output to the browser.

Fixes #539

@jandubois jandubois marked this pull request as draft January 24, 2025 17:04
@jandubois jandubois marked this pull request as ready for review January 24, 2025 17:30
@jandubois jandubois force-pushed the trivy branch 2 times, most recently from 4f695e2 to 9db372d Compare January 24, 2025 17:42
@mook-as mook-as self-requested a review January 24, 2025 17:58
Copy link
Contributor

@mook-as mook-as left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general most of my complaints stem from setting the environment inline instead of passing it in as a mapping.

Comment on lines 157 to 158
* @param taggedImageName
* @param namespace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please actually document the arguments, or just drop the JSDoc bits if you don't want to do that. (It's generally better to add docs, of course…)

* Run trivy with the given arguments; the first argument is generally a
* subcommand to execute.
* Run trivy inside the VM with the given arguments; the 'trivy' command must be included
* in the args, so it can be called like `CONTAINERD_NAMESPACE=k8s.io trivy image ...`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the only reason this exists (instead of having the concrete class calling this.executor.spawn directly) is to call processChildOutput?

It's probably better to take env as a second argument (or first), instead of setting it as part of the arg… or even just take SpawnOptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this.executor.spawn supports env options. And even if it did, they would not survive the sudo call without extra effort.

This will require a ton of changes and not solve the underlying issue that processChildOutput is not designed to be called with anything but docker or nerdctl.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't support changing the environment, but that's why we have PRs… How about we at least call this something other than runTrivyCommand (say, runTrivyImageCommand) so it's not pretending to be more generic than it is? If we specialize it just for that it's less confusing.

Or just drop this wrapper and have the caller do things manually. But I think just making this function specific to trivy image is fine.

Comment on lines 259 to 260
// don't dump megabytes of trivy JSON output into the browser
if (sendNotifications && commandName !== 'trivy') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the caller set sendNotifications correctly instead of this weird hack?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the logic behind sendNotifications, but it is later used to process the output. If you set it to false, then the output is discarded.

@@ -255,20 +249,23 @@ export abstract class ImageProcessor extends EventEmitter {
*/
async processChildOutput(child: ChildProcess, subcommandName: string, sendNotifications: boolean, args?: string[]): Promise<childResultType> {
const result = { stdout: '', stderr: '' };
const commandName = args ? 'trivy' : this.processorName;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be so confusing when we start trying to use args for one of the non-trivy commands…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but you won't. This is already a hack. args for other commands are not passed in at all, except for subcommandName, which is needed for another hack to treat nerdctl images differently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, subcommandName (per the method docs) is for displaying to the user; the fact that it's used for nerdctl images is the hack, but we still need to preserve it for its primary purpose.

A better way to do this would be to just pass it in:

type ProcessChildOutputOptions = {
  commandName?: string;
  subcommandName: string;
  sendNotifications: boolean;
}
async processChildOutput(child: ChildProcess, options: ProcessChildOutputOptions, args?: string[]): Promise<childResultType> {
    const commandName = options.commandName ?? this.processorName;

Copy link
Member Author

@jandubois jandubois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can update the comments, but the rest is not reasonable without a complete refactoring of this package.

* Run trivy with the given arguments; the first argument is generally a
* subcommand to execute.
* Run trivy inside the VM with the given arguments; the 'trivy' command must be included
* in the args, so it can be called like `CONTAINERD_NAMESPACE=k8s.io trivy image ...`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this.executor.spawn supports env options. And even if it did, they would not survive the sudo call without extra effort.

This will require a ton of changes and not solve the underlying issue that processChildOutput is not designed to be called with anything but docker or nerdctl.

@@ -255,20 +249,23 @@ export abstract class ImageProcessor extends EventEmitter {
*/
async processChildOutput(child: ChildProcess, subcommandName: string, sendNotifications: boolean, args?: string[]): Promise<childResultType> {
const result = { stdout: '', stderr: '' };
const commandName = args ? 'trivy' : this.processorName;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but you won't. This is already a hack. args for other commands are not passed in at all, except for subcommandName, which is needed for another hack to treat nerdctl images differently.

Comment on lines 259 to 260
// don't dump megabytes of trivy JSON output into the browser
if (sendNotifications && commandName !== 'trivy') {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the logic behind sendNotifications, but it is later used to process the output. If you set it to false, then the output is discarded.

Copy link
Contributor

@mook-as mook-as left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is piling on too much short-term hacks, which would make things unmaintainable.

If the decision is that we will not maintain this, then we should just not take this PR…

* Run trivy with the given arguments; the first argument is generally a
* subcommand to execute.
* Run trivy inside the VM with the given arguments; the 'trivy' command must be included
* in the args, so it can be called like `CONTAINERD_NAMESPACE=k8s.io trivy image ...`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't support changing the environment, but that's why we have PRs… How about we at least call this something other than runTrivyCommand (say, runTrivyImageCommand) so it's not pretending to be more generic than it is? If we specialize it just for that it's less confusing.

Or just drop this wrapper and have the caller do things manually. But I think just making this function specific to trivy image is fine.

@@ -255,20 +249,23 @@ export abstract class ImageProcessor extends EventEmitter {
*/
async processChildOutput(child: ChildProcess, subcommandName: string, sendNotifications: boolean, args?: string[]): Promise<childResultType> {
const result = { stdout: '', stderr: '' };
const commandName = args ? 'trivy' : this.processorName;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, subcommandName (per the method docs) is for displaying to the user; the fact that it's used for nerdctl images is the hack, but we still need to preserve it for its primary purpose.

A better way to do this would be to just pass it in:

type ProcessChildOutputOptions = {
  commandName?: string;
  subcommandName: string;
  sendNotifications: boolean;
}
async processChildOutput(child: ChildProcess, options: ProcessChildOutputOptions, args?: string[]): Promise<childResultType> {
    const commandName = options.commandName ?? this.processorName;


return await this.processChildOutput(child, subcommandName, sendNotifications, args);
return await this.processChildOutput(child, subcommandName, true, args);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return await this.processChildOutput(child, subcommandName, true, args);
return await this.processChildOutput(child, subcommandName, false, args);

Why the heck are we passing true here for sendNotifications only to add a hack to disable its effect when processing the argument‽

If you're doing this so we get stderr but not stdout, then we need separate flags for those.

return await this.runTrivyCommand(
[
'CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been tested on Windows? You're not going through sudo (nor env) there.

mook-as
mook-as previously approved these changes Jan 28, 2025
Copy link
Contributor

@mook-as mook-as left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've force-pushed the code into what I'd like to see.
@jandubois please check if you're okay with that, and if yes merge it (/set auto-merge)?

Run as root so trivy has access to the containerd socket.

Don't output the trivy JSON to the browser; this is extremely slow when
the output is several megabytes.

Also don't output the trivy JSON to images.log, as it makes the file
unwieldy.

Co-authored-by: Mark Yen <[email protected]>
Signed-off-by: Jan Dubois <[email protected]>
Signed-off-by: Mark Yen <[email protected]>
@jandubois jandubois merged commit d5260b6 into rancher-sandbox:main Jan 31, 2025
20 checks passed
@jandubois jandubois deleted the trivy branch February 3, 2025 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Trivy: Image scanning doesn't appear to use local images
2 participants