Tasks contain property that tells the boefjerunner what network it is supposed to run on. #3299

Draft
wants to merge 49 commits into
base: main

Conversation

Souf149
Contributor

@Souf149 Souf149 commented Jul 25, 2024

Changes

Newly created tasks (created from an IPAddress OOI) now contain an attribute that specifies which network the task should be run on.

Issue link

Closes #3222

Demo

image
The task is not run because it can only be picked up by a boefje runner that checks for the network "internal-cynalytics", and no such runner is currently active.

remote-boefje-demonstration-compressed.mp4

A video showing a task with a specific network not being picked up until another boefje runner has been started.

QA notes

(Note that a new environment variable called "network_scopes" has been added.)

Create a new IPAddress with network "internet" and observe that its task runs successfully. Then create another IPAddress with a new, self-made network and observe that its task is not run.

When you run the boefje container with your new network name included in the environment variable, the task should be run.


Code Checklist

  • All the commits in this PR are properly PGP-signed and verified.
  • This PR only contains functionality relevant to the issue.
  • I have written unit tests for the changes or fixes I made.
  • I have checked the documentation and made changes where necessary.
  • I have performed a self-review of my code and refactored it to the best of my abilities.
  • Tickets have been created for newly discovered issues.
  • For any non-trivial functionality, I have added integration and/or end-to-end tests.
  • I have informed others of any required .env changes files if required and changed the .env-dist accordingly.
  • I have included comments in the code to elaborate on what is not self-evident from the code itself, including references to issues and discussions online, or implicit behavior of an interface.

Checklist for code reviewers:

  • The code does not violate Model-View-Template and our other architectural principles.
  • The code prioritizes readability over performance where appropriate.
  • The code does not bypass authentication or security mechanisms.
  • The code does not introduce any dependency on a library that has not been properly vetted.
  • The code contains docstrings, comments, and documentation where needed.

Checklist for QA:

  • I have checked out this branch, and successfully ran a fresh make reset.
  • I confirmed that there are no unintended functional regressions in this branch:
    • I have managed to pass the onboarding flow
    • Objects and Findings are created properly
    • Tasks are created and completed properly
  • I confirmed that the PR's advertised feature or hotfix works as intended.
  • I checked the logs for errors and/or warnings and made issues where necessary

What works:

  • bullet point + screenshot (if useful) per tested functionality

What doesn't work:

  • bullet point + screenshot (if useful) per tested functionality

Bug or feature?:

  • bullet point + screenshot (if useful) if it is unclear whether something is a bug or an intended feature.


# TODO: see what types have a network attribute. How?
network_scope = "internet"
if ooi.object_type in ["IPAddressV4", "IPAddressV6"]:
Contributor Author

We could check whether the OOI has an attribute called network, but this relies on every OOI type naming its network attribute network. Is there a better solution?
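A minimal sketch of the attribute check discussed in this comment. The stub classes and the function name network_scope_for are hypothetical and not part of the PR; the sketch also assumes the attribute holds the network name as a plain string, whereas the real OOI may hold a reference that needs unpacking first.

```python
from dataclasses import dataclass

DEFAULT_NETWORK_SCOPE = "internet"  # same fallback as the snippet under review


@dataclass
class IPAddressStub:
    """Hypothetical stand-in for an OOI that has a network attribute."""

    object_type: str
    network: str


@dataclass
class HostnameStub:
    """Hypothetical stand-in for an OOI without a network attribute."""

    object_type: str = "Hostname"


def network_scope_for(ooi: object) -> str:
    # Duck-typed lookup: only works if the attribute is literally named "network".
    network = getattr(ooi, "network", None)
    return DEFAULT_NETWORK_SCOPE if network is None else str(network)
```

This avoids hardcoding the list of object types, at the cost of depending on the attribute name.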

@underdarknl
Contributor

The filtering logic in the Boefje runner in this PR looks like what I expect.
Did you also test with a runner that has access to more than 1 network, eg "internet" and "dentist"?
The logic of finding the associated network for each Job by querying the OOI itself can probably be improved on. @jpbruinsslot maybe you can chime in on that?

Furthermore, for now Scopes are 'networks', but scopes might in the future also become Hosts or even containers in which a dedicated runner can scan.

@Souf149
Contributor Author

Souf149 commented Jul 31, 2024

Yes, boefje runners configured with more than one network scope are able to query for tasks on each of those networks.

Furthermore, for now Scopes are 'networks', but scopes might in the future also become Hosts or even containers in which a dedicated runner can scan.

Could you elaborate on what you mean by this?
I think network_scopes should continue to contain only scopes describing which networks that runner can reach.

Another field could be introduced that describes other limitations of the machine (such as not being able to handle IPv6 requests).
Alternatively, we could add a new filter that makes the boefje runner reject tasks that contain an IPv6 OOI. I would love to have a discussion about this.

@jpbruinsslot
Contributor

The filtering logic in the Boefje runner in this PR looks like what I expect. Did you also test with a runner that has access to more than 1 network, eg "internet" and "dentist"? The logic of finding the associated network for each Job by querying the OOI itself can probably be improved on. @jpbruinsslot maybe you can chime in on that?

Furthermore, for now Scopes are 'networks', but scopes might in the future also become Hosts or even containers in which a dedicated runner can scan.

I'd recommend extending BoefjeTask (internal to the scheduler; BoefjeMeta elsewhere) with a network_scopes field. This field can then be filtered on when popping.
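A rough sketch of what this suggestion could look like. BoefjeTaskSketch is a simplified, hypothetical stand-in for the scheduler's real BoefjeTask model, and pop_for_runner is an illustrative helper rather than the scheduler's pop endpoint. It assumes a task may only be popped when every one of its scopes is reachable by the runner.

```python
from dataclasses import dataclass, field


@dataclass
class BoefjeTaskSketch:
    """Hypothetical, simplified stand-in for the scheduler's BoefjeTask."""

    id: str
    input_ooi: str
    # Assumed default: tasks without an explicit scope run on the internet.
    network_scopes: list = field(default_factory=lambda: ["internet"])


def pop_for_runner(queue: list, runner_scopes: list):
    """Remove and return the first task the runner can reach, or None."""
    reachable = set(runner_scopes)
    for index, task in enumerate(queue):
        if set(task.network_scopes) <= reachable:
            return queue.pop(index)
    return None
```

Tasks that no active runner can reach simply stay in the queue, which matches the behaviour shown in the demo.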

@Souf149
Contributor Author

Souf149 commented Jul 31, 2024


I'd recommend extending BoefjeTask (internal to the scheduler; BoefjeMeta elsewhere) with a network_scopes field. This field can then be filtered on when popping.

I think @underdarknl was talking about this piece of code: #3299 (comment)

Currently this only works for tasks created from IPAddressV4 and IPAddressV6 OOIs, since they have a network attribute that can be used to determine the network_scope (I know there are more types than these two, but with the current approach they would have to be hardcoded). Is it safe to assume that the network attribute of an OOI, when it has one, is always called network? Or is there a better solution?

@underdarknl
Contributor

related: #273

@Souf149
Contributor Author

Souf149 commented Oct 7, 2024

I have run into an issue. I am not able to make the query I want through the mula API without restructuring the existing models in the scheduler's database.

What I want to happen:

scheduler

When a task is created, set a new field on the task called requirements, based on the OOI the task is created from. This field is an array of strings that can contain the following:

  • "ipv4": To announce that the boefje runner should be able to execute IPv4 requests
  • "ipv6": The same as the previous, but with IPv6 instead
  • A string starting with "network/": To announce that the boefje runner must be able to reach the network named after this prefix. Some examples include: "network/internet", "network/dentist" and "network/remote-location".

This information gets saved inside the tasks together with the OOI inside the data field.
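A minimal sketch of how such a requirements list could be derived, assuming the scheduler knows the OOI's object type and, where available, the name of its network. The function build_requirements and its parameters are hypothetical; only the object type names and the requirement strings come from this thread.

```python
from typing import Optional

# Maps the OOI object types named in this PR to an IP-version requirement.
IP_VERSION_BY_TYPE = {"IPAddressV4": "ipv4", "IPAddressV6": "ipv6"}


def build_requirements(object_type: str, network: Optional[str]) -> list:
    """Return the requirement strings for a task created from one OOI."""
    requirements = []
    ip_version = IP_VERSION_BY_TYPE.get(object_type)
    if ip_version is not None:
        requirements.append(ip_version)
    if network is not None:
        requirements.append(f"network/{network}")
    return requirements
```

For example, an IPAddressV4 on the network "internet" would yield ["ipv4", "network/internet"], which is the requirements list of the first task in the example database.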

boefje runner

There is a new environment variable called BOEFJES_TASK_CAPABILITIES that contains an array of all the things this boefje runner can reach / do. This array can contain all the strings that the aforementioned requirements can contain.
When a new task gets popped, the boefje runner can specify what kind of tasks it wants to get. Imagine the following (simplified) example.

Example

Database:

[
    {
        "id":"abc42",
        "status":"queued",
        "data":{
            "requirements":[
                "ipv4",
                "network/internet"
            ]
        }
    },
    {
        "id":"zyh24",
        "status":"queued",
        "data":{
            "requirements":[
                "ipv6",
                "network/dentist",
                "network/internet"
            ]
        }
    }
]

If a boefje runner with the capabilities ["ipv4", "network/dentist"] requests a task: ❌ no task is found.

If a boefje runner with the capabilities ["ipv4", "network/internet"] requests a task: ✅ the first task is found.

If a boefje runner with the capabilities ["ipv4", "network/dentist", "network/internet"] requests a task: ✅ the first task is found.

If a boefje runner with the capabilities ["ipv6", "network/dentist"] requests a task: ❌ no task is found.

If a boefje runner with the capabilities ["ipv4", "ipv6", "network/dentist", "network/internet"] requests a task: ✅✅ both tasks are found.
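The five cases above are consistent with a subset check: a task matches when every one of its requirements appears in the runner's capabilities. A minimal in-memory sketch follows; matching_tasks is a hypothetical helper, and the real query would have to run against the scheduler's database instead of a Python list.

```python
tasks = [
    {
        "id": "abc42",
        "status": "queued",
        "data": {"requirements": ["ipv4", "network/internet"]},
    },
    {
        "id": "zyh24",
        "status": "queued",
        "data": {"requirements": ["ipv6", "network/dentist", "network/internet"]},
    },
]


def matching_tasks(tasks: list, capabilities: list) -> list:
    """Return the ids of queued tasks whose requirements are all satisfied."""
    available = set(capabilities)
    return [
        task["id"]
        for task in tasks
        if task["status"] == "queued"
        and set(task["data"]["requirements"]) <= available
    ]
```

Note that under this rule the second task requires both networks, so a runner that reaches only one of them does not match.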

I would love to hear/discuss feedback on how this could be improved.

@Souf149
Contributor Author

Souf149 commented Oct 7, 2024

A simplified version of what the requests should look like.

image

@underdarknl
Contributor

I think it would be wise to keep 'scopes' (where does this OOI live) and capabilities (what capabilities does the boefje runner need to have) separated.

A boefje could 'add/specify' certain capabilities, like needing IPv6 support, whereas an OOI 'lives' somewhere, and that location depends on the OOI itself.
Boefjes that do IPv6 checks only make sense if the runner has an IPv6 address, regardless of which OOI they are asked to ingest, whereas multiple OOIs can live in different parts of the graph on different networks, regardless of the boefjes that are running on them.

There's a bunch of caveats there, obviously.

  • For local IPs (bound to network|dentist) it's still necessary to run the Findings hydration boefjes on the internet. (I think internet access should be the capability our runner presents.) These OOIs (FindingTypes) should perhaps always bind to Network|Internet?
  • Running Shodan on local IPs, on the other hand, never makes sense, so we should never produce jobs for that in the first place.
  • Nmap, regardless of where it is being run, should check (based on the OOITYPE) whether it needs IPv4 or IPv6 capabilities.
  • Website scanning should be run in the network scope that's associated with the OOI itself.
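One way to picture the separation proposed in this comment is a check with two independent inputs: the scope comes from the OOI and the capabilities come from the boefje. This is only an illustrative sketch; RunnerSketch and can_run are hypothetical names and nothing like them exists in the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunnerSketch:
    """Hypothetical description of what a boefje runner offers."""

    scopes: frozenset        # networks the runner can reach
    capabilities: frozenset  # e.g. "ipv4", "ipv6"


def can_run(runner: RunnerSketch, ooi_scope: str, boefje_capabilities: list) -> bool:
    # The scope is decided by where the OOI lives.
    in_scope = ooi_scope in runner.scopes
    # The capabilities are decided by what the boefje needs.
    capable = frozenset(boefje_capabilities) <= runner.capabilities
    return in_scope and capable
```

Keeping the two inputs apart means a boefje's needs can change without touching how an OOI's network is determined, and vice versa.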

@jpbruinsslot jpbruinsslot mentioned this pull request Oct 8, 2024
9 tasks
@Souf149
Contributor Author

Souf149 commented Oct 16, 2024

I think it would be wise to keep 'scopes' (where does this OOI live) and capabilities (what capabilities does the boefje runner need to have) separated.

Do you think this is necessary? Since the capabilities and the network requirements are now both based on the OOI of the task, they would always be created together. And querying for tasks only happens in one place now. As I see it, splitting the two up would only make the tasks more bloated.

For local IPs (bound to network|dentist) it's still necessary to run the Findings hydration boefjes on the internet. (I think internet access should be the capability our runner presents.) These OOIs (FindingTypes) should perhaps always bind to Network|Internet?

Good realisation! Inside Rocky (since we have access to octopoes already there) we can specifically check if the OOI type is of FindingTypeType and then specify that those tasks should always have access to the internet.

One question I have is whether it is okay for the task to be created by Rocky. Should Rocky be allowed to decide the requirements of the task? This would make things easier because Rocky has full access to the OOI. (A month ago the scheduler also did, but this was removed recently.)

Running Shodan on local IPs, on the other hand, never makes sense, so we should never produce jobs for that in the first place.

Hmm, interesting situation. Do you think boefjes that work like Shodan should contain a property that notifies the KAT-alogus that they should always be run on the internet network? Or can we hardcode this at task creation?

Nmap, regardless of where it is being run, should check (based on the OOITYPE) whether it needs IPv4 or IPv6 capabilities.

This is already planned. The created task will be marked as IPv4 or IPv6 based on the OOI, and the Nmap boefje already checks whether the given IPAddress is of type IPv4 or IPv6.

Website scanning should be run in the network scope that's associated with the OOI itself.

Currently websites do not have a network attribute, so it would be hard to find out which network they are on.
Would it be an idea to add a network attribute to the base OOI class? This could also fix the issue of FindingTypeType always having to be on the internet network.

This is however something that can be looked at after the base functionality of this PR has been reached.

@Souf149
Contributor Author

Souf149 commented Oct 25, 2024

The current version works satisfactorily in my opinion. Tasks created from OOIs that contain a network attribute now only get popped by boefje runners that have that network in their env.

I think it is fine to hardcode tasks based on FindingTypes to the internet since they are a special kind of OOI. This has yet to be implemented.

Instead of adding a new attribute to a boefje's resources that tells the scheduler it should only be run on certain OOIs, we could also make boefjes check their OOI beforehand to see whether they should run (for example, a Shodan boefje would not push any raw data if the OOI it received is on a network other than the internet).
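A hedged sketch of such a self-check inside a boefje. The shape of boefje_meta used here (an "arguments" dict with an "input" OOI that carries "address" and a "network" with a "name") is an assumption for illustration, and the lookup parameter is a hypothetical placeholder for the actual external query.

```python
def is_internet_ooi(boefje_meta: dict) -> bool:
    """Return True when the input OOI is on the network named "internet"."""
    # Assumed layout of boefje_meta; adjust to the real structure.
    ooi = boefje_meta.get("arguments", {}).get("input", {})
    network = ooi.get("network") or {}
    return network.get("name") == "internet"


def run(boefje_meta: dict, lookup=lambda address: b"") -> list:
    # Skip the external lookup entirely for OOIs outside the internet,
    # so no raw data is pushed for them.
    if not is_internet_ooi(boefje_meta):
        return []
    address = boefje_meta["arguments"]["input"]["address"]
    return [(set(), lookup(address))]
```

The drawback compared with filtering at task creation is that the task is still scheduled and picked up before it turns out to be a no-op.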

Websites/URLs are a bit odd since they do not have a network attribute yet. One solution I can think of is to have every OOI check whether its existing references have a network attribute, so that a network can be assigned to websites as well.
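A sketch of that idea, assuming OOIs are available as plain dicts keyed by an identifier, each with an optional "network" value and a list of "references" to other identifiers. That data layout and the function resolve_network are hypothetical; the real implementation would query Octopoes instead.

```python
def resolve_network(ooi_id: str, oois: dict, seen=None):
    """Follow references depth-first until an OOI with a network is found."""
    seen = set() if seen is None else seen
    if ooi_id in seen or ooi_id not in oois:
        return None  # unknown OOI, or already visited (guards against cycles)
    seen.add(ooi_id)

    ooi = oois[ooi_id]
    if ooi.get("network"):
        return ooi["network"]

    for reference in ooi.get("references", []):
        network = resolve_network(reference, oois, seen)
        if network is not None:
            return network
    return None
```

If an OOI references objects on several networks, this returns the first one found, so a real implementation would need a rule for that case.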

I am currently implementing:

I think it would be wise to keep 'scopes' (where does this OOI live) and capabilities (what capabilities does the boefje runner need to have) separated.

Labels
None yet
Projects
Status: Backlog / To do
Development

Successfully merging this pull request may close these issues.

There is currently no way of having a boefje gather information outside the OpenKAT instance.
3 participants