Improve the obfuscation process to retain the original patterns in logs. #59

Open
lquerel opened this issue Oct 10, 2023 · 0 comments

The current obfuscation process preserves the length of the original text but doesn't maintain the patterns commonly found in logs. Consider the following logs as an example:

12:30:45 server 123 received http post request
12:31:23 server stopped with error "internal error"
12:31:24 server started
12:31:24 server 123 received http get request
12:31:24 server 456 received http get request

A log entry such as "12:30:45 server 123 received http post request" most likely comes from a printf-style call with the format string "%s server %d received http %s request" and three parameters: timestamp, server id, and HTTP method. The last two log entries follow the same pattern and differ only in their parameter values.
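
To make the pattern concrete, here is a minimal sketch of how such entries could be produced. The format string is the one quoted above; the function name `format_request_log` and the surrounding code are hypothetical and only illustrate the idea, not the actual logging code of any application.

```python
# Format string taken from the issue text; everything else is illustrative.
LOG_FORMAT = "%s server %d received http %s request"


def format_request_log(timestamp: str, server_id: int, method: str) -> str:
    """Render one log entry from its three variable parts."""
    return LOG_FORMAT % (timestamp, server_id, method)


# The three request entries from the example share one format string
# and differ only in the parameter values.
for args in [
    ("12:30:45", 123, "post"),
    ("12:31:24", 123, "get"),
    ("12:31:24", 456, "get"),
]:
    print(format_request_log(*args))
```

The constant words ("server", "received", "http", "request") are what a pattern-preserving obfuscation would need to keep recognizable as repeated tokens.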

We seek an obfuscation method that retains these patterns while adhering to privacy and security constraints.

A potential approach is to split the log entry based on separator characters (e.g., spaces, commas, colons, dots), then obfuscate individual words, and finally reassemble them with the separators. So, instead of completely obfuscated logs like:

12:30:45 server 123 received http post request --> 34DF32dfgre0943tlkfgj0934tjlkjg09u34ldfklg
12:31:24 server 123 received http get request  --> 6u7kdjfhwnsd09wrjklsdmmw35-fd023;lks-56

We might get:

sd:4d:4l 34ft8o 785 qw4532 8ghj ywe4 lyt764
7l:d3:0k 34ft8o 456 qw4532 8ghj 6hy lyt764
  -  -   ------     -----------     -------       <-- preserved patterns
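
The split, obfuscate, and reassemble idea could be sketched as follows. This is a rough illustration under several assumptions that are not part of the proposal: each word is replaced by a token derived from a keyed hash (HMAC-SHA256), so that identical words always yield identical tokens; tokens keep the length of the original word; and all-digit words map to all-digit tokens. The names `obfuscate_word` and `obfuscate_line` are hypothetical. It is not a vetted privacy scheme (short words remain open to dictionary attacks by anyone holding the key, for instance).

```python
import hashlib
import hmac
import re

# Separator characters named in the proposal: spaces, commas, colons, dots.
# The capturing group makes re.split keep the separators in its result.
SEPARATORS = re.compile(r"([ ,:.]+)")

DIGITS = "0123456789"
ALNUM = "abcdefghijklmnopqrstuvwxyz0123456789"


def obfuscate_word(word: str, key: bytes) -> str:
    """Map a word to a same-length token; equal words give equal tokens."""
    # Assumption: keep numbers looking like numbers.
    alphabet = DIGITS if word.isdigit() else ALNUM
    out = []
    counter = 0
    # Chain HMAC blocks until there are enough characters for long words.
    while len(out) < len(word):
        msg = word.encode("utf-8") + counter.to_bytes(4, "big")
        digest = hmac.new(key, msg, hashlib.sha256).digest()
        out.extend(alphabet[b % len(alphabet)] for b in digest)
        counter += 1
    return "".join(out[: len(word)])


def obfuscate_line(line: str, key: bytes) -> str:
    """Obfuscate every word of a log line, leaving separators untouched."""
    parts = SEPARATORS.split(line)
    # With a capturing group, odd indices hold separators, even indices words.
    return "".join(
        part if i % 2 == 1 else obfuscate_word(part, key)
        for i, part in enumerate(parts)
    )


if __name__ == "__main__":
    key = b"example-secret-key"  # placeholder key, for illustration only
    for entry in [
        "12:31:24 server 123 received http get request",
        "12:31:24 server 456 received http get request",
    ]:
        print(obfuscate_line(entry, key))
```

Because the mapping is deterministic per key, the constant words of a format string come out as the same tokens on every line, which is what lets the pattern remain visible after obfuscation.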