Store job files in the database instead of ursadb iterators #381

Open
msm-cert opened this issue Feb 15, 2024 · 1 comment · May be fixed by #420
Labels: zone:backend (Backend oriented tasks)

Comments

@msm-cert
Member

msm-cert commented Feb 15, 2024

That way we can stop (ab)using iterators (and maybe even deprecate them in ursadb - they're a bit problematic in case of failed jobs).

And with postgres it won't be a problem.

@msm-cert msm-cert added the zone:backend Backend oriented tasks label Sep 16, 2024
@msm-cert
Member Author

msm-cert commented Sep 16, 2024

During a team meeting I mentioned that some metadata is still in Redis instead of Postgres. Looks like I was wrong: the last of it was fixed in Feb.

But there are still some things that are not in Postgres but should be. This includes the list of matched files.

In the query_ursadb function, we first select files into a ursadb iterator (by using the into iterator {query} statement in the query), and then in the run_yara_batch function we "pop" files from the iterator and run yara on them.

Instead, we should run a normal query, save all prefiltered files into the database, and then read unprocessed files from the database instead of from ursadb.

This should be a separate table (not Match) with just the job ID and the file path. It should work a bit like a task queue: files should be removed once they have been processed.
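To make the idea concrete, here is a minimal sketch of such a queue table. Everything in it is an assumption for illustration: the table name jobfile, the column names, and the helper functions add_job_files and pop_job_files are hypothetical, and sqlite3 stands in for Postgres only so the example runs on its own.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions, not the project's.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobfile (
    id INTEGER PRIMARY KEY,
    job_id TEXT NOT NULL,
    path TEXT NOT NULL
)
"""


def add_job_files(conn, job_id, paths):
    """Save all prefiltered files for a job (the result of the ursadb query)."""
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT INTO jobfile (job_id, path) VALUES (?, ?)",
            [(job_id, path) for path in paths],
        )


def pop_job_files(conn, job_id, limit):
    """Take up to `limit` unprocessed files for a job and remove them from the queue."""
    with conn:  # the select and the delete are committed together
        rows = conn.execute(
            "SELECT id, path FROM jobfile WHERE job_id = ? ORDER BY id LIMIT ?",
            (job_id, limit),
        ).fetchall()
        conn.executemany(
            "DELETE FROM jobfile WHERE id = ?", [(row[0],) for row in rows]
        )
    return [row[1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
add_job_files(conn, "job1", ["/mnt/samples/a", "/mnt/samples/b", "/mnt/samples/c"])
first_batch = pop_job_files(conn, "job1", 2)
second_batch = pop_job_files(conn, "job1", 2)
third_batch = pop_job_files(conn, "job1", 2)
```

Note that this sketch deletes rows when a batch is claimed, which mirrors how the iterator pop behaves today. Deleting only after yara has finished would be safer for failed jobs, but then rows need some "in progress" marker. With several workers on Postgres, the select would probably also need row locking (for example SELECT ... FOR UPDATE SKIP LOCKED) so that two workers do not claim the same files.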

In short, the roadmap:

  • remove into iterator {query}; from the query in ursadb
  • the result is then a list of files instead of an iterator; it must be saved into a new table in the database
  • prepare a migration that creates this table
  • finally, rework run_yara_batch so that it gets files to process from the database instead of using ursadb pop
  • remove dead code (like ursadb.pop) afterwards.
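The control flow after the rework could look roughly like the sketch below. It is not the actual mquery code: store_query_result and run_batch are hypothetical stand-ins for the reworked query_ursadb and run_yara_batch, an in-memory deque stands in for the new database table, and the matches callable stands in for running yara on a file.

```python
from collections import deque
from typing import Callable, Deque, List


def store_query_result(prefiltered: List[str], queue: Deque[str]) -> int:
    """Stand-in for the reworked query step: save the full file list, return its size."""
    queue.extend(prefiltered)
    return len(prefiltered)


def run_batch(
    queue: Deque[str], batch_size: int, matches: Callable[[str], bool]
) -> List[str]:
    """Stand-in for the reworked batch step: take a batch from the queue and match it."""
    batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
    return [path for path in batch if matches(path)]


queue: Deque[str] = deque()
total_files = store_query_result(["a.exe", "b.txt", "c.exe"], queue)

matched: List[str] = []
while queue:  # the job is done once no unprocessed files are left
    matched += run_batch(queue, 2, lambda path: path.endswith(".exe"))
```

The point of the sketch is that the batch step no longer talks to ursadb at all: ursadb is queried once, and everything after that reads from the stored list.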

This was referenced Sep 16, 2024
@msm-cert msm-cert added this to the v1.5.0 milestone Sep 29, 2024
@msm-cert msm-cert modified the milestones: v1.5.0, Sprint 1 Oct 17, 2024
@mickol34 mickol34 linked a pull request Oct 17, 2024 that will close this issue
4 tasks
@msm-cert msm-cert modified the milestones: v1.5.0, v1.6.0 Nov 18, 2024
@msm-cert msm-cert modified the milestones: v1.6.0, v1.7.0 Dec 19, 2024