Commit
Removed ML from well known pattern (#448)
* removed extra keys

* removed ml for well-known prefixes patterns

* tests fixed

* benchmark scores fix

* benchmark scores fix 2
babenek authored Nov 1, 2023
1 parent 3467170 commit f76d16a
Showing 7 changed files with 293 additions and 172 deletions.
8 changes: 4 additions & 4 deletions cicd/benchmark.txt
@@ -10,16 +10,16 @@ Predefined Pattern 326 2 40
 Private Key 1001 1 3
 Seed, Salt, Nonce 40 4 4
 TOTAL: 5307 63688 5644
-Detected Credentials: 5993
-credsweeper result_cnt : 5337, lost_cnt : 0, true_cnt : 4439, false_cnt : 898
+Detected Credentials: 5997
+credsweeper result_cnt : 5339, lost_cnt : 0, true_cnt : 4441, false_cnt : 898
 Category                      TP    FP        TN    FN        FPR        FNR       ACC       PRC       RCL        F1
 --------------------------  ----  ----  --------  ----  ---------  ---------  --------  --------  --------  --------
 Authentication Key & Token    54     4        28    16      0.125   0.228571  0.803922  0.931034  0.771429   0.84375
 Generic Secret               973     3       215    83  0.0137615  0.0785985  0.932496  0.996926  0.921402  0.957677
-Generic Token                287     7       596    46  0.0116086   0.138138  0.943376   0.97619  0.861862   0.91547
+Generic Token                289     7       596    44  0.0116086   0.132132  0.945513  0.976351  0.867868  0.918919
 Other                        818   750     63395   258  0.0116923   0.239777  0.984545  0.521684  0.760223  0.618759
 Password                     995   130      4150   410  0.0303738   0.291815  0.905013  0.884444  0.708185  0.786561
 Predefined Pattern           309     2        40    17  0.0476191  0.0521472   0.94837  0.993569  0.947853  0.970173
 Private Key                  967     0         4    34              0.033966  0.966169         1  0.966034  0.982724
 Seed, Salt, Nonce             36     2         6     4       0.25        0.1     0.875  0.947368       0.9  0.923077
-                            4439   898  19428253   868  4.622e-05   0.163558  0.999909  0.831741  0.836442  0.834085
+                            4441   898  19428253   866  4.622e-05   0.163181  0.999909  0.831804  0.836819  0.834304
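The derived columns in the benchmark table follow from the four counts TP, FP, TN and FN. A minimal sketch, assuming the standard confusion-matrix definitions (which reproduce the Generic Token row before and after this commit); the `Confusion` class is a hypothetical helper for illustration and is not part of CredSweeper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one benchmark category (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int

    def metrics(self) -> dict:
        # Standard definitions; no guard against zero denominators in this sketch.
        precision = self.tp / (self.tp + self.fp)
        recall = self.tp / (self.tp + self.fn)
        total = self.tp + self.fp + self.tn + self.fn
        return {
            "FPR": self.fp / (self.fp + self.tn),
            "FNR": self.fn / (self.fn + self.tp),
            "ACC": (self.tp + self.tn) / total,
            "PRC": precision,
            "RCL": recall,
            "F1": 2 * precision * recall / (precision + recall),
        }


# Counts taken from the Generic Token row of benchmark.txt, old and new.
before = Confusion(tp=287, fp=7, tn=596, fn=46).metrics()
after = Confusion(tp=289, fp=7, tn=596, fn=44).metrics()

for name in before:
    print(f"{name}: {before[name]:.6f} -> {after[name]:.6f}")
```

Two samples moving from FN to TP is the whole change in that row: FP and TN are untouched, so FPR stays the same while recall, precision, accuracy and F1 all rise slightly.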
23 changes: 0 additions & 23 deletions credsweeper/rules/config.yaml
@@ -199,7 +199,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>(ABIA|ACCA|AGPA|AIDA|AIPA|AKIA|ANPA|ANVA|AROA|APKA|ASCA|ASIA)[0-9A-Z]{16,17})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- A
min_line_len: 20
@@ -212,7 +211,6 @@
- (^|[^.0-9A-Za-z_/+-])(?P<value>(AKIA|ASIA)[0-9A-Z]{16,17})([^=0-9A-Za-z_/+-]|$)
- (?P<value>[0-9a-zA-Z/+]{40})
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- AKIA
- ASIA
@@ -224,7 +222,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>amzn\.mws\.[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- amzn
min_line_len: 30
@@ -247,7 +244,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>dt0[a-zA-Z]{1}[0-9]{2}\.[A-Z0-9]{24}\.[A-Z0-9]{64})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- dt0
min_line_len: 90
@@ -258,7 +254,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>EAAC[0-9A-Za-z]{27,})
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- EAAC
min_line_len: 31
@@ -282,7 +277,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>AIza[0-9A-Za-z_-]{35})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: false
validations:
- GoogleApiKeyValidation
required_substrings:
@@ -296,7 +290,6 @@
- (?P<value>[0-9]+\-[0-9A-Za-z_]{32}\.apps\.googleusercontent\.com)
- (?<![0-9a-zA-Z_-])(?P<value>[0-9a-zA-Z_-]{24})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: false
validations:
- GoogleMultiValidation
required_substrings:
@@ -309,7 +302,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>ya29\.[0-9A-Za-z_-]{22,})
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- ya29.
min_line_len: 27
@@ -320,7 +312,6 @@
values:
- (?i)(?P<value>heroku(.{0,20})?[0-9a-f]{8}(-[0-9a-f]{4})+-[0-9a-f]{12})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- heroku
min_line_len: 24
@@ -331,7 +322,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>IGQVJ[\w]{100,})
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- IGQVJ
min_line_len: 105
@@ -353,7 +343,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>[0-9a-zA-Z]{32}-us[0-9]{1,2})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: false
validations:
- MailChimpKeyValidation
required_substrings:
@@ -366,7 +355,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>key-[0-9a-zA-Z]{32})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- key-
min_line_len: 36
@@ -390,7 +378,6 @@
values:
- (?P<value>access_token\$production\$[0-9a-z]{16}\$[0-9a-z]{32})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: false
required_substrings:
- access_token$production$
min_line_len: 72
@@ -410,7 +397,6 @@
values:
- (?P<value>sk_live_[0-9a-z]{32})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: false
required_substrings:
- sk_live_
min_line_len: 40
@@ -433,7 +419,6 @@
values:
- (?P<value>SG\.[\w_]{16,32}\.[\w_]{16,64})
filter_type: GeneralPattern
-use_ml: false
required_substrings:
- SG.
min_line_len: 34
@@ -454,7 +439,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>xox[a|b|p|r|o|s]\-[-a-zA-Z0-9]{10,250})
filter_type: GeneralPattern
-use_ml: true
validations:
- SlackTokenValidation
required_substrings:
@@ -467,7 +451,6 @@
values:
- (?P<value>hooks\.slack\.com/services/T\w{8}/B\w{8}/\w{24})
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- hooks.slack.com/services/T
min_line_len: 61
@@ -478,7 +461,6 @@
values:
- (?P<value>sk_live_[0-9a-zA-Z]{24})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
validations:
- StripeApiKeyValidation
required_substrings:
@@ -491,7 +473,6 @@
values:
- (?P<value>rk_live_[0-9a-zA-Z]{24})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- rk_live_
min_line_len: 32
@@ -502,7 +483,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>EAAA[0-9A-Za-z_-]{60})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
validations:
- SquareAccessTokenValidation
required_substrings:
@@ -515,7 +495,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>sq0[a-z]{3}-[0-9A-Za-z_-]{22})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
validations:
- SquareClientIdValidation
required_substrings:
@@ -528,7 +507,6 @@
values:
- (?P<value>sq0csp-[0-9A-Za-z_-]{43})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: false
required_substrings:
- sq0csp
min_line_len: 50
@@ -551,7 +529,6 @@
values:
- (^|[^.0-9A-Za-z_/+-])(?P<value>SK[0-9a-fA-F]{32})([^=0-9A-Za-z_/+-]|$)
filter_type: GeneralPattern
-use_ml: true
required_substrings:
- SK
min_line_len: 34
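Each rule in this diff pairs a regex under `values` with `required_substrings` and `min_line_len`. A minimal sketch of how one such rule might be applied to a single line, using the `rk_live_` rule's values as they appear in the diff. It assumes, from the field names only, that the two extra fields act as cheap prefilters before the regex runs; `find_value` is a hypothetical helper and does not reflect CredSweeper's actual scanner API:

```python
import re
from typing import Optional

# Values copied from the "rk_live_" rule shown in the diff.
PATTERN = re.compile(r"(?P<value>rk_live_[0-9a-zA-Z]{24})([^=0-9A-Za-z_/+-]|$)")
REQUIRED_SUBSTRINGS = ("rk_live_",)
MIN_LINE_LEN = 32


def find_value(line: str) -> Optional[str]:
    """Return the captured `value` group, or None when the line does not match."""
    # Assumed semantics: reject lines that are too short to hold the token.
    if len(line) < MIN_LINE_LEN:
        return None
    # Assumed semantics: reject lines without any required substring.
    if not any(sub in line for sub in REQUIRED_SUBSTRINGS):
        return None
    match = PATTERN.search(line)
    return match.group("value") if match else None


# A made-up token with the right shape: prefix plus 24 alphanumerics.
fake_key = "rk_live_" + "a1B2" * 6

print(find_value(f'key = "{fake_key}"'))   # token followed by a quote
print(find_value("rk_live_short"))         # shorter than MIN_LINE_LEN
print(find_value(f"token={fake_key}Z"))    # 25th alphanumeric breaks the boundary
```

The trailing group `([^=0-9A-Za-z_/+-]|$)` is what rejects the last case: the token must be followed by a non-token character or the end of the line.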
2 changes: 1 addition & 1 deletion tests/__init__.py
@@ -8,7 +8,7 @@
 SAMPLES_CRED_LINE_COUNT: int = 402

 # credentials count after post-processing
-SAMPLES_POST_CRED_COUNT: int = 293
+SAMPLES_POST_CRED_COUNT: int = 296

 # with option --doc
 SAMPLES_IN_DOC = 422