Fix Garbage collection and disappearing ports issue #3214

Donnype · 2024-07-09T19:27:27Z

Changes

This, ideally, fixes the garbage collection issue with nmap and nmap-udp.

It also adds a rather involved migration script that should still be thoroughly tested somehow. There we take all origins that should be updated and add data from Bytes to the new source_method field. (The name of that field is very much up for discussion, by the way.) Note that the script has been added to the boefjes module because that module already has connections to both Bytes and Octopoes, and I guess I'm more comfortable there. I wouldn't mind moving it if there's a solid argument for using Rocky instead.

Issue link

Closes #2875

QA notes

Follow the instructions in #2875 and hopefully the issue is now gone.

However, there is a much more intricate part to QA here: the migration script. We need to update the existing origins using Octopoes and Bytes at the same time. This means: getting a realistic working setup, building, and triggering the migration using Docker:

git checkout fix/disappearing-ports
git pull
make kat
docker compose exec boefje python -m tools.upgrade_v1_16_0

Now the issue should not be present anymore.

The logs ideally show no exceptions and end with total_failures=0. If not, the script can be rerun multiple times in case a random network error occurred. I say this because we could potentially hit the complete database, and hence the script has a high chance of failing some API calls on production systems. If exceptions occur consistently, however, there is probably a bug.

UPDATE:

Benchmark without new filter:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    7.903    7.903 /app/boefjes/tests/integration/test_bench.py:19(test_migration)
        1    0.000    0.000    4.364    4.364 /app/boefjes/tools/upgrade_v1_16_0.py:39(upgrade)
        1    0.003    0.003    3.953    3.953 /app/boefjes/tools/upgrade_v1_16_0.py:71(migrate_org)
       30    0.001    0.000    3.193    0.106 /app/boefjes/tools/upgrade_v1_16_0.py:133(update_origin)
        1    0.000    0.000    2.362    2.362 /app/boefjes/tests/conftest.py:1(<module>)
        2    0.000    0.000    1.362    0.681 /app/boefjes/tests/conftest.py:173(octopoes_api_connector)
        1    0.000    0.000    1.314    1.314 /app/boefjes/boefjes/app.py:1(<module>)
      121    0.000    0.000    0.879    0.007 /app/boefjes/boefjes/clients/bytes_client.py:20(wrapper)
        2    0.000    0.000    0.773    0.386 /app/boefjes/boefjes/clients/bytes_client.py:48(login)
        2    0.000    0.000    0.773    0.386 /app/boefjes/boefjes/clients/bytes_client.py:62(_get_authentication_headers)

Benchmark with new filter:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    5.428    5.428 /app/boefjes/tests/integration/test_bench.py:19(test_migration)
        1    0.000    0.000    2.399    2.399 /app/boefjes/tests/conftest.py:1(<module>)
        1    0.000    0.000    1.427    1.427 /app/boefjes/tools/upgrade_v1_16_0.py:40(upgrade)
        2    0.000    0.000    1.410    0.705 /app/boefjes/tests/conftest.py:173(octopoes_api_connector)
        1    0.000    0.000    1.349    1.349 /app/boefjes/boefjes/app.py:1(<module>)
        1    0.000    0.000    1.012    1.012 /app/boefjes/tools/upgrade_v1_16_0.py:72(migrate_organisation)
        1    0.001    0.001    0.823    0.823 /app/boefjes/tools/upgrade_v1_16_0.py:147(collect_boefjes_per_normalizer)
        1    0.000    0.000    0.821    0.821 /app/boefjes/boefjes/dependencies/plugins.py:55(_get_all_without_enabled)
        1    0.000    0.000    0.806    0.806 /app/boefjes/boefjes/local_repository.py:28(get_all)
      100    0.000    0.000    0.802    0.008 /app/boefjes/boefjes/clients/bytes_client.py:20(wrapper)

Note that the migrate_org method first takes almost 4 seconds, and then this drops to about 1 second, of which roughly 800 ms comes from collect_boefjes_per_normalizer, which we only call once. Actually calling the new bulk endpoint takes only 28 ms for 30 origins. Still, 30 origins per 200 ms at a 30% rate of to-be-updated origins (e.g. nmap) means we could do 150 origins per second in my setup.
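For reference, the throughput estimate can be reproduced from the profile figures. This is only a small sketch of the arithmetic: the constants are copied from the "with new filter" table, and the helper name origins_per_second is made up for illustration.

```python
# Cumulative times in seconds, copied from the "with new filter" profile.
MIGRATE_ORGANISATION_S = 1.012
COLLECT_BOEFJES_PER_NORMALIZER_S = 0.823  # one-off cost, called only once
ORIGINS_UPDATED = 30


def origins_per_second(origins: int, seconds: float) -> float:
    """Return the throughput in origins per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return origins / seconds


# Time left once the one-off plugin collection is subtracted (about 0.19 s).
per_batch_s = MIGRATE_ORGANISATION_S - COLLECT_BOEFJES_PER_NORMALIZER_S

# Throughput from the exact remainder, and from the rounded 200 ms figure.
measured = origins_per_second(ORIGINS_UPDATED, per_batch_s)
rounded = origins_per_second(ORIGINS_UPDATED, 0.2)

print(f"{per_batch_s:.3f} s per batch, about {rounded:.0f} origins per second")
```

The exact remainder gives a slightly higher figure than the rounded 200 ms, so 150 origins per second is the conservative reading of this benchmark.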

Code Checklist

All the commits in this PR are properly PGP-signed and verified.
This PR only contains functionality relevant to the issue.
I have written unit tests for the changes or fixes I made.
I have checked the documentation and made changes where necessary.
I have performed a self-review of my code and refactored it to the best of my abilities.

For any non-trivial functionality, I have added integration and/or end-to-end tests.
I have included comments in the code to elaborate on what is not self-evident from the code itself, including references to issues and discussions online, or implicit behavior of an interface.

Checklist for code reviewers:

Copy-paste the checklist from the docs/source/templates folder into your comment.

Checklist for QA:

Copy-paste the checklist from the docs/source/templates folder into your comment.

stephanie0x00 · 2024-07-10T13:00:42Z

Checklist for QA:

I have checked out this branch, and successfully ran a fresh make reset.
I confirmed that there are no unintended functional regressions in this branch:
- I have managed to pass the onboarding flow
- Objects and Findings are created properly
- Tasks are created and completed properly
I confirmed that the PR's advertised feature or hotfix works as intended.
I checked the logs for errors and/or warnings and made issues where necessary

What works:

Works like a champ! I've tried to get the finding to disappear again by enabling nmap TCP, nmap Ports and nmap UDP together, but with all those boefjes the finding remains as it is. Logs look clear and the findings for the IPs are also identified by the corresponding boefjes.

Can be merged when Review is done.

What doesn't work:

n/a

Bug or feature?:

n/a

underdarknl · 2024-07-11T14:14:41Z

Looks like the test trips over a missing Finding, namely the one for port 3306 being a database port. Did we seed the database with a correct Ports config OOI? If not, the port will be present, but the BIT will not create a finding.

Donnype · 2024-07-11T14:20:27Z

@underdarknl I don't think the config will have an impact because the defaults should work. I think it's some race condition, because it works locally but not in the CI.

boefjes/tools/upgrade_v1_16_0.py

ammar92

Looks good to me. I haven't gone into detail with the script, since it's likely a temporary tool and won't be needed once everyone has migrated. However, here are a few notes about the script to keep in mind:

- Since it's more of a script than a CLI, using click here is redundant.
- We've recently introduced structlog, which should be preferred over Python's builtin logging.
- Instead of using literal numbers for HTTP status codes, you should use the constants provided in httpx.codes.

dekkers · 2024-07-12T10:07:07Z

> @underdarknl I don't think the config will have an impact because the defaults should work. I think it's some race condition, because it works locally but not in the CI.

I think the race condition is in the test. recalculate_bits uses the current datetime for the valid time: https://github.com/minvws/nl-kat-coordination/blob/main/octopoes/octopoes/core/service.py#L606

So I think we also need to use a newer valid time in the test after calling recalculate_bits.
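To make the suspected race concrete, here is a minimal sketch with a toy store. ToyStore and its methods are hypothetical stand-ins for illustration only, not the Octopoes API; the only behaviour taken from the comment above is that the recalculation writes its results at the current time.

```python
from datetime import datetime, timedelta, timezone


class ToyStore:
    """Hypothetical stand-in for a valid-time store; not the Octopoes API."""

    def __init__(self) -> None:
        self._findings: list[tuple[datetime, str]] = []

    def recalculate_bits(self, now: datetime) -> None:
        # As described above: results are written with "now" as valid time.
        self._findings.append((now, "open-port-3306"))

    def list_findings(self, valid_time: datetime) -> list[str]:
        # Only records written at or before the queried valid time are visible.
        return [name for written_at, name in self._findings if written_at <= valid_time]


store = ToyStore()
test_start = datetime(2024, 7, 12, 10, 0, 0, tzinfo=timezone.utc)
recalculated_at = test_start + timedelta(seconds=2)

store.recalculate_bits(now=recalculated_at)

# Querying with the valid time captured at the start of the test misses the finding.
stale = store.list_findings(valid_time=test_start)
# Querying with a valid time taken after the recalculation sees it.
fresh = store.list_findings(valid_time=recalculated_at)

print(stale, fresh)
```

If the real test reuses a valid time captured before recalculate_bits runs, it would behave like the stale query here, which would explain a failure that only shows up under CI timing.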

dekkers

I am a bit worried that it will take a long time for bigger databases. Do we know how many origins per second it can do?

A more efficient way would be to do more in a single database query. For example get all the source methods in a single query (or batches of 1000 or something), then update the corresponding origins in XTDB with a single transaction. But we might not have the infrastructure / architecture that we can do that easily...
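The batching idea could look roughly like the sketch below. The callables fetch_source_methods and bulk_update are placeholders for whatever Bytes lookup and XTDB transaction would be used; nothing here reflects the actual script or an existing endpoint.

```python
from collections.abc import Callable, Iterable, Iterator
from itertools import islice


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def migrate_in_batches(
    origin_ids: Iterable[str],
    fetch_source_methods: Callable[[list[str]], dict[str, str]],  # hypothetical lookup
    bulk_update: Callable[[dict[str, str]], None],  # hypothetical single transaction
    batch_size: int = 1000,
) -> int:
    """Do one lookup and one write per batch instead of one per origin."""
    updated = 0
    for batch in batched(origin_ids, batch_size):
        source_methods = fetch_source_methods(batch)
        bulk_update(source_methods)
        updated += len(source_methods)
    return updated


# Usage with in-memory fakes: 2500 origins become three bulk writes.
write_sizes: list[int] = []


def fake_fetch(batch: list[str]) -> dict[str, str]:
    return {origin_id: "nmap" for origin_id in batch}


def fake_bulk_update(mapping: dict[str, str]) -> None:
    write_sizes.append(len(mapping))


total = migrate_in_batches((f"origin-{i}" for i in range(2500)), fake_fetch, fake_bulk_update)
print(total, write_sizes)
```

The number of round trips then scales with the number of batches rather than the number of origins, which is the point of the suggestion above.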

ammar92 · 2024-07-16T14:32:37Z

Checklist for QA:

I have checked out this branch, and successfully ran a fresh make reset.
I confirmed that there are no unintended functional regressions in this branch:
- I have managed to pass the onboarding flow
- Objects and Findings are created properly
- Tasks are created and completed properly
I confirmed that the PR's advertised feature or hotfix works as intended.
I checked the logs for errors and/or warnings and made issues where necessary

What works:

Migration tool works; bug fixed; verified and tested together with @Donnype

dekkers

Some small things, but it looks good in general 👍

dekkers · 2024-08-05T14:32:03Z

boefjes/pyproject.toml

+    "D:SCHEDULER_API=http://placeholder:8002",
+    "D:BYTES_API=http://placeholder:8003",
+    "D:BYTES_USERNAME=placeholder",
+    "D:BYTES_PASSWORD=placeholder",


If you run the tests in the containers, those variables will already be set, but the tests should use the variables as defined here. This means we shouldn't use D: here.

That makes it hard to perform integration tests where we want a different set of environment variables, though. Isn't a containerized unit test a scenario for which we would want a separate CI container with its own environment as well? Otherwise we will have to figure out how to split the sets of environment variables for these two scenarios (perhaps by overriding them in the integration test setup and passing settings objects instead of accessing the global ones, although I'd say that makes more sense for the unit tests).
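The difference under discussion can be illustrated with a small emulation. It assumes the entries above are read by the pytest-env plugin, where a D: prefix marks a value as a default that is applied only when the variable is not already set. The function apply_env_entries and the container value http://bytes:8000 are made up for the example.

```python
def apply_env_entries(entries: list[str], environ: dict[str, str]) -> dict[str, str]:
    """Emulate the assumed semantics: "D:KEY=value" only fills in missing keys."""
    result = dict(environ)
    for entry in entries:
        is_default = entry.startswith("D:")
        key, _, value = entry.removeprefix("D:").partition("=")
        if is_default:
            result.setdefault(key, value)  # keep a value that is already set
        else:
            result[key] = value  # always override
    return result


# Hypothetical environment inside a container where BYTES_API is already set.
container_env = {"BYTES_API": "http://bytes:8000"}

with_default = apply_env_entries(["D:BYTES_API=http://placeholder:8003"], container_env)
forced = apply_env_entries(["BYTES_API=http://placeholder:8003"], container_env)

print(with_default["BYTES_API"])
print(forced["BYTES_API"])
```

With the prefix, tests running in the container keep talking to the real service address; without it, they always see the placeholder, which suits unit tests but conflicts with integration tests that rely on the container's own values.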

stephanie0x00 · 2024-08-06T10:38:26Z

Checklist for QA:

I have checked out this branch, and successfully ran a fresh make reset.
I confirmed that there are no unintended functional regressions in this branch:
- I have managed to pass the onboarding flow
- Objects and Findings are created properly
- Tasks are created and completed properly
I confirmed that the PR's advertised feature or hotfix works as intended.
I checked the logs for errors and/or warnings and made issues where necessary

What works:

Migration scenario works! We tested from version 1.15.0 and then followed the steps mentioned in the QA notes. Testing was done against mispo.es. After performing nmap TCP + UDP scans on 1.15.0, the Open Port findings for ports 3306 and 22 disappear. After rescheduling both TCP + UDP scans on this branch, the Open Port findings for 3306 and 22 appear in the findings overview.

What doesn't work:

Nothing found.

Bug or feature?:

Nothing found.

Donnype added 6 commits June 28, 2024 16:21

Add integration test confirming the deletion bug

2aaebb7

Signed-off-by: Donny Peeters <[email protected]>

Add source_method field with optional boefje id

966e842

Assert that the fix works in the integration test

7ec36e0

Enhance origin API

43d1d58

Add an origin migration script

Add release notes for upgrading

2f9e0d1

Remove 1.16 from release notes for now again

421c6f2

Donnype requested a review from a team as a code owner July 9, 2024 19:27

Merge branch 'main' into fix/disappearing-ports

f25363a

Donnype self-assigned this Jul 9, 2024

Donnype and others added 4 commits July 11, 2024 10:38

Fix Rocky integration test

921d2e4

Signed-off-by: Donny Peeters <[email protected]>

Merge branch 'main' into fix/disappearing-ports

4b09bf2

Fix style

c1419c3

Signed-off-by: Donny Peeters <[email protected]>

Recalculate bits twice for some delay to fix the CI pipeline

032a24e

Signed-off-by: Donny Peeters <[email protected]>

underdarknl and others added 2 commits July 11, 2024 16:14

Merge branch 'main' into fix/disappearing-ports

106aa01

Fix nmap-udp image

5d85d65

Signed-off-by: Donny Peeters <[email protected]>

underdarknl reviewed Jul 12, 2024

ammar92 reviewed Jul 12, 2024

dekkers reviewed Jul 12, 2024

Bugfix

6760670

Signed-off-by: Donny Peeters <[email protected]>

Donnype marked this pull request as draft July 16, 2024 14:28

Donnype changed the title ~~Fix Garbage collection and disappearing ports issue~~ [Awaiting Performance fixes] Fix Garbage collection and disappearing ports issue Jul 16, 2024

Donnype changed the title ~~[Awaiting Performance fixes] Fix Garbage collection and disappearing ports issue~~ [Awaiting Performance Updates] Fix Garbage collection and disappearing ports issue Jul 17, 2024

Donnype added 3 commits July 18, 2024 17:40

initialize benchmark setup in boefjes CI

9daba3c

Seed a lot more of a smaller set of types

d324309

Create octopoes dump without source_method for the migration test using the `io` endpoints.

c3d63bf

Donnype changed the title ~~[Awaiting Performance Updates] Fix Garbage collection and disappearing ports issue~~ Fix Garbage collection and disappearing ports issue Jul 30, 2024

Remove duplicate recalculate_bits call

9276d5f

Signed-off-by: Donny Peeters <[email protected]>

Donnype marked this pull request as ready for review July 30, 2024 13:20

Donnype and others added 16 commits July 30, 2024 15:20

Merge branch 'main' into fix/disappearing-ports

d251d2c

Get new findings list with current datetime

e3582fb

Signed-off-by: Donny Peeters <[email protected]>

Style

534ee68

Signed-off-by: Donny Peeters <[email protected]>

Fix valid time management in disappearing ports integration test

9e745bc

Signed-off-by: Donny Peeters <[email protected]>

Change valid time variable names

1835ba7

Signed-off-by: Donny Peeters <[email protected]>

Test CI timing

a5c62d1

Signed-off-by: Donny Peeters <[email protected]>

Remove several sleeps

3490dc4

Signed-off-by: Donny Peeters <[email protected]>

remove more sleeps

9429b73

Signed-off-by: Donny Peeters <[email protected]>

move around sleeps

a8a85a2

Signed-off-by: Donny Peeters <[email protected]>

Remove one sleep

86ed67a

Signed-off-by: Donny Peeters <[email protected]>

Add some sleeps again around the recalculate bits method

e49153c

Signed-off-by: Donny Peeters <[email protected]>

Merge branch 'main' into fix/disappearing-ports

9c076a5

Merge branch 'main' into fix/disappearing-ports

e15cb9b

Merge branch 'main' into fix/disappearing-ports

a6fb676

Update boefjes/tools/upgrade_v1_16_0.py

a53eda2

Co-authored-by: Jan Klopper <[email protected]>

Merge branch 'main' into fix/disappearing-ports

3dabb52

dekkers reviewed Aug 5, 2024

Part of PR comments

37cb2d3

Donnype added 2 commits August 6, 2024 13:40

Merge branch 'main' into fix/disappearing-ports

0863afa

Merge branch 'main' into fix/disappearing-ports

7bdf3de

Donnype mentioned this pull request Aug 8, 2024

Fix integration test vs. unit test environment variables for the boefjes #3333

Open

underdarknl approved these changes Aug 8, 2024

underdarknl merged commit 3623dd1 into main Aug 8, 2024
28 checks passed

underdarknl deleted the fix/disappearing-ports branch August 8, 2024 08:31
