Excavate enhancement handle RAW_TEXT events #1634

Closed
domwhewell-sage opened this issue Aug 5, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@domwhewell-sage
Contributor

domwhewell-sage commented Aug 5, 2024

Description
Now that RAW_TEXT events are being raised from parsed FILESYSTEM events (and the scope distance of such events has been corrected and YARA rules have been implemented), the next step should be to change the internal excavate.py module so that it consumes RAW_TEXT events and extracts useful tidbits (URLs, DNS_NAMEs, etc.).

Because the data of a RAW_TEXT event is a string rather than an object (as in HTTP_RESPONSE), excavate's handle_event() might require changes.

This is the error I'm currently getting:

2024-08-05 15:35:15,958 [ERROR] bbot.scanner scanner.py:1195 Error in excavate.handle_event(RAW_TEXT("Example http://schemas.example.com/Example/2", module=unstructured, tags={'distance-1'})): /root/lib/python3.12/site-packages/bbot/modules/internal/excavate.py:931:handle_event(): 'str' object has no attribute 'get'
2024-08-05 15:35:15,958 [TRACE] bbot.scanner logger.py:132 Traceback (most recent call last):
  File "/root/lib/python3.12/site-packages/bbot/scanner/scanner.py", line 1172, in _acatch
    yield
  File "/root/lib/python3.12/site-packages/bbot/modules/base.py", line 637, in _worker
    await self.handle_event(event)
  File "/root/lib/python3.12/site-packages/bbot/modules/internal/excavate.py", line 931, in handle_event
    body = event.data.get("body", "")
           ^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'get'
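
The traceback shows `event.data.get("body", "")` being called on a plain string. A minimal sketch of one way to guard against that is below. It is only an illustration of the idea, not the change that was actually made in excavate: `extract_text` and `find_urls` are hypothetical helpers, and the URL regex is a deliberately simple stand-in for excavate's real extraction rules.

```python
import re

# Deliberately simple URL pattern, for illustration only (an assumption,
# not the pattern excavate uses).
URL_REGEX = re.compile(r"https?://[^\s\"'<>]+")


def extract_text(data):
    """Return the text to scan, whatever shape the event data has.

    Hypothetical helper: RAW_TEXT data is a str, while HTTP_RESPONSE
    data is a dict that carries the text under the "body" key.
    """
    if isinstance(data, str):
        return data
    if isinstance(data, dict):
        return data.get("body", "")
    return ""


def find_urls(data):
    """Return every URL-like substring found in the event data."""
    return URL_REGEX.findall(extract_text(data))


# The same call works for both shapes of event data.
raw_text_data = "Example http://schemas.example.com/Example/2"
http_response_data = {"body": "see https://example.com/docs for details"}

print(find_urls(raw_text_data))
print(find_urls(http_response_data))
```

Normalising the data to a string at the top of the handler keeps the rest of the extraction logic independent of which event type supplied the text.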

I have some time, so I will look into this this week.

@domwhewell-sage domwhewell-sage added the enhancement New feature or request label Aug 5, 2024
@TheTechromancer
Collaborator

This is a super exciting feature. Feeding the contents of binary files like PDFs and Word docs into excavate is a pretty unique capability that will only get more insane as we keep adding filetypes. I'm imagining being able to download an app from the app store, decompile it, and feed its entire contents into excavate and trufflehog 🙌

@domwhewell-sage
Contributor Author

Some work is being done on this in #1636

@domwhewell-sage
Contributor Author

Closing as merged into dev


2 participants