add base64 unpacker #25

Open
0xricksanchez opened this issue Aug 14, 2019 · 1 comment

@0xricksanchez
Contributor

Currently there is no unpacker for base64-encoded data streams.

@0xricksanchez 0xricksanchez added the enhancement New feature or request label Aug 14, 2019
@0xricksanchez 0xricksanchez self-assigned this Aug 14, 2019
@0xricksanchez
Contributor Author

Base64 has no special delimiter symbol. Also, there are multiple valid B64 encodings, as listed in: https://en.wikipedia.org/wiki/Base64#Variants_summary_table

Unpacking b64 is trivial. However, there is no reliable way to determine whether a string sequence is really b64 encoded. The following must always hold for a potential b64 match:

  • Character set common to all variants: A-Za-z0-9, plus = as padding
    • Two further characters that depend on the b64 specification, drawn from: +/_:.,~-
      • Default for most: +/
      • Default for URL-safe encodings: -_
  • string length % 4 == 0 (for padded encodings)
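
The two conditions above can be sketched as a cheap pre-filter. This is only an illustration: the function name `could_be_base64` and the `extra_chars` parameter are made up for this example and are not part of any existing unpacker.

```python
import string

# Characters shared by every variant listed above; "=" is the padding symbol.
COMMON_CHARS = set(string.ascii_letters + string.digits + "=")


def could_be_base64(candidate: str, extra_chars: str = "+/") -> bool:
    """Return True if `candidate` passes both necessary conditions.

    `extra_chars` (hypothetical parameter) holds the two variant-specific
    symbols: "+/" for the common default, "-_" for URL-safe encodings.
    """
    allowed = COMMON_CHARS | set(extra_chars)
    return len(candidate) % 4 == 0 and all(ch in allowed for ch in candidate)
```

These are necessary conditions only. The check does not look at where "=" appears, so it still accepts strings with misplaced padding.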

A regex to determine whether a string may be b64 encoded could look like this:

^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$

This checks whether the character sequence consists of zero or more valid b64 blocks of size 4 (zero blocks are allowed because the empty string encodes to the empty string). These may be followed by one final padded block, which ends in either one or two "=".

Important Note: This still matches things like "aaaa", as they are valid b64.
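
A minimal sketch of applying the regex above in Python. The names `B64_PATTERN` and `may_be_base64` are chosen for this example only, and the pattern covers just the default `+/` alphabet.

```python
import re

# The regex from above, unchanged.
B64_PATTERN = re.compile(
    r"^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$"
)


def may_be_base64(candidate: str) -> bool:
    """Return True if `candidate` is syntactically valid padded b64."""
    # fullmatch is used because "$" alone also matches just before a
    # trailing newline, which would let "aaaa\n" through.
    return B64_PATTERN.fullmatch(candidate) is not None
```

As noted, `may_be_base64("aaaa")` is true, and so is `may_be_base64("")`, so a match only means the string could be b64.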

In the context of firmware unpacking, decoded strings could be either valid ASCII sequences or arbitrary byte patterns, hence there is no trivial way to validate decoded sequences.
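
To illustrate the validation problem, here is a rough sketch using Python's standard `base64` module. The helper names `try_decode` and `is_printable_ascii` are hypothetical, and the printable-ASCII test is just one possible heuristic, not a proposed solution.

```python
import base64
from typing import Optional


def try_decode(candidate: str) -> Optional[bytes]:
    """Decode strictly; return None if the input is not well-formed b64."""
    try:
        # validate=True rejects characters outside the standard alphabet.
        return base64.b64decode(candidate, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None


def is_printable_ascii(data: bytes) -> bool:
    """Heuristic: every byte is printable ASCII, tab, LF, or CR."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in data)


print(try_decode("aGVsbG8="))  # b'hello'
print(try_decode("aaaa"))      # b'i\xa6\x9a'
```

Both inputs decode without error. Only the first yields printable ASCII, but the second could just as well be legitimate binary data, so the heuristic cannot tell a false positive from a real payload.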

m-1-k-3 pushed a commit to m-1-k-3/fact_extractor that referenced this issue Aug 31, 2021