Commit

retrain
babenek committed Jul 26, 2024
1 parent d5ff926 commit 8b3a71b
Showing 10 changed files with 808 additions and 1,104 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/check.yml
@@ -58,7 +58,7 @@ jobs:
 - name: Check ml_model.onnx integrity
   if: ${{ always() && steps.code_checkout.conclusion == 'success' }}
   run: |
-    md5sum --binary credsweeper/ml_model/ml_model.onnx | grep 62d92ab2f91a18e861d846a7b8a0c3a7
+    md5sum --binary credsweeper/ml_model/ml_model.onnx | grep 70a864232576f9b88a08296a5e628208
# # # Python setup
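
The workflow step above checks the model file by piping `md5sum` into `grep`. The same check can be reproduced locally; the following is a minimal sketch, assuming only Python's standard `hashlib` module. The helper names `file_md5` and `matches_expected` are hypothetical and are not part of CredSweeper.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in binary chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large model files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file digest with an expected hex string (case-insensitive)."""
    return file_md5(path) == expected_hex.strip().lower()
```

For example, `matches_expected("credsweeper/ml_model/ml_model.onnx", "70a864232576f9b88a08296a5e628208")` would be expected to return `True` on a checkout of this commit. Unlike `grep`, which accepts any line containing the digest, this sketch compares the whole digest string.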

28 changes: 14 additions & 14 deletions cicd/benchmark.txt
@@ -223,21 +223,21 @@ FileType FileNumber ValidLines Positives Negatives Templat
.zsh 6 872 12
.zsh-theme 1 97 1
TOTAL: 10332 16987703 11500 60361 5198
-credsweeper result_cnt : 10839, lost_cnt : 0, true_cnt : 10344, false_cnt : 495
+credsweeper result_cnt : 10944, lost_cnt : 2, true_cnt : 10434, false_cnt : 508
Rules Positives Negatives Templates Reported TP FP TN FN FPR FNR ACC PRC RCL F1
------------------------------ ----------- ----------- ----------- ---------- ----- ---- ----- ---- -------- -------- -------- -------- -------- --------
-API 130 3165 185 118 116 2 3348 14 0.000597 0.107692 0.995402 0.983051 0.892308 0.935484
+API 130 3165 185 124 122 2 3348 8 0.000597 0.061538 0.997126 0.983871 0.938462 0.960630
AWS Client ID 167 18 0 160 160 0 18 7 0.000000 0.041916 0.962162 1.000000 0.958084 0.978593
AWS Multi 75 14 0 87 75 11 3 0 0.785714 0.000000 0.876404 0.872093 1.000000 0.931677
AWS S3 Bucket 66 24 0 92 66 24 0 0 1.000000 0.000000 0.733333 0.733333 1.000000 0.846154
Atlassian Old PAT token 27 212 3 12 3 8 207 24 0.037209 0.888889 0.867769 0.272727 0.111111 0.157895
-Auth 410 2724 76 361 357 4 2796 53 0.001429 0.129268 0.982243 0.988920 0.870732 0.926070
+Auth 410 2724 76 370 364 6 2794 46 0.002143 0.112195 0.983801 0.983784 0.887805 0.933333
Azure Access Token 19 0 0 0 0 0 19 1.000000 0.000000 0.000000
BASE64 Private Key 7 2 0 7 7 0 2 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
BASE64 encoded PEM Private Key 7 0 0 5 5 0 0 2 0.285714 0.714286 1.000000 0.714286 0.833333
Bitbucket Client ID 142 1813 9 46 27 18 1804 115 0.009879 0.809859 0.932281 0.600000 0.190141 0.288770
Bitbucket Client Secret 230 535 10 44 33 11 534 197 0.020183 0.856522 0.731613 0.750000 0.143478 0.240876
-Certificate 25 461 1 20 20 0 462 5 0.000000 0.200000 0.989733 1.000000 0.800000 0.888889
+Certificate 25 461 1 27 20 7 455 5 0.015152 0.200000 0.975359 0.740741 0.800000 0.769231
Credential 94 154 74 90 90 0 228 4 0.000000 0.042553 0.987578 1.000000 0.957447 0.978261
Docker Swarm Token 2 0 0 2 2 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
Dropbox App secret 62 114 0 46 36 9 105 26 0.078947 0.419355 0.801136 0.800000 0.580645 0.672897
@@ -252,19 +252,19 @@ Google OAuth Access Token 3 0 0
Grafana Provisioned API Key 22 1 0 1 1 0 1 21 0.000000 0.954545 0.086957 1.000000 0.045455 0.086957
IPv4 729 405 0 1205 728 342 63 1 0.844444 0.001372 0.697531 0.680374 0.998628 0.809339
IPv6 33 131 0 33 33 0 131 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
-JSON Web Token 285 9 2 275 272 3 8 13 0.272727 0.045614 0.945946 0.989091 0.954386 0.971429
+JSON Web Token 285 9 2 274 273 1 10 12 0.090909 0.042105 0.956081 0.996350 0.957895 0.976744
Jira / Confluence PAT token 0 4 0 0 0 4 0 0.000000 1.000000
Jira 2FA 14 6 0 10 10 0 6 4 0.000000 0.285714 0.800000 1.000000 0.714286 0.833333
-Key 508 8473 464 460 457 3 8934 51 0.000336 0.100394 0.994283 0.993478 0.899606 0.944215
-Nonce 84 53 0 85 80 5 48 4 0.094340 0.047619 0.934307 0.941176 0.952381 0.946746
+Key 508 8473 464 483 477 6 8931 31 0.000671 0.061024 0.996083 0.987578 0.938976 0.962664
+Nonce 84 53 0 88 81 7 46 3 0.132075 0.035714 0.927007 0.920455 0.964286 0.941860
PEM Private Key 1019 1483 0 1023 1019 4 1479 0 0.002697 0.000000 0.998401 0.996090 1.000000 0.998041
-Password 1827 7479 2734 1653 1619 32 10181 208 0.003133 0.113848 0.980066 0.980618 0.886152 0.930995
-Salt 42 76 2 36 36 0 78 6 0.000000 0.142857 0.950000 1.000000 0.857143 0.923077
-Secret 1361 28458 869 1232 1231 1 29326 130 0.000034 0.095518 0.995731 0.999188 0.904482 0.949479
+Password 1827 7479 2734 1689 1652 28 10185 175 0.002742 0.095785 0.983140 0.983333 0.904215 0.942116
+Salt 42 76 2 38 38 0 78 4 0.000000 0.095238 0.966667 1.000000 0.904762 0.950000
+Secret 1361 28458 869 1249 1244 4 29323 117 0.000136 0.085966 0.996057 0.996795 0.914034 0.953622
Seed 1 6 0 0 0 6 1 0.000000 1.000000 0.857143 0.000000
Slack Token 4 1 0 4 4 0 1 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
-Token 612 3951 438 549 548 1 4388 64 0.000228 0.104575 0.987003 0.998179 0.895425 0.944014
+Token 612 3951 438 556 553 3 4386 59 0.000684 0.096405 0.987602 0.994604 0.903595 0.946918
Twilio API Key 0 5 2 0 0 7 0 0.000000 1.000000
-URL Credentials 208 125 242 203 203 0 367 5 0.000000 0.024038 0.991304 1.000000 0.975962 0.987835
-UUID 3031 1 0 3009 3008 1 0 23 1.000000 0.007588 0.992084 0.999668 0.992412 0.996026
-11500 60361 5198 10983 10344 495 59866 1156 0.008201 0.100522 0.977025 0.954332 0.899478 0.926093
+URL Credentials 208 125 242 206 206 0 367 2 0.000000 0.009615 0.996522 1.000000 0.990385 0.995169
+UUID 3031 1 0 3008 3007 1 0 24 1.000000 0.007918 0.991755 0.999668 0.992082 0.995860
+11500 60361 5198 11094 10434 508 59853 1066 0.008416 0.092696 0.978097 0.953573 0.907304 0.929864
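
The derived columns (FPR, FNR, ACC, PRC, RCL, F1) appear to follow the usual confusion-matrix definitions. As a minimal sketch under that assumption, the snippet below recomputes them from the TP/FP/TN/FN counts of the updated totals row (10434, 508, 59853, 1066). The function name `confusion_metrics` is hypothetical and is not taken from the CredSweeper benchmark code.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute rates from confusion-matrix counts (standard definitions assumed)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "FPR": fp / (fp + tn),                   # false positive rate
        "FNR": fn / (fn + tp),                   # false negative rate
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # accuracy
        "PRC": precision,
        "RCL": recall,
        "F1": 2 * precision * recall / (precision + recall),
    }


# Counts copied from the updated totals row of the benchmark table.
total = confusion_metrics(tp=10434, fp=508, tn=59853, fn=1066)
for name, value in total.items():
    print(f"{name} {value:.6f}")
```

Rounded to six decimals, the results agree with the totals row: 0.008416, 0.092696, 0.978097, 0.953573, 0.907304, 0.929864. The sketch does not guard against zero denominators, which may be why some rows in the table leave cells blank.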
Binary file modified credsweeper/ml_model/ml_model.onnx
Binary file not shown.
10 changes: 5 additions & 5 deletions tests/__init__.py
@@ -7,18 +7,18 @@
NEGLIGIBLE_ML_THRESHOLD = 0.0001

# credentials count after scan
-SAMPLES_CRED_COUNT: int = 430
-SAMPLES_CRED_LINE_COUNT: int = 447
+SAMPLES_CRED_COUNT: int = 429
+SAMPLES_CRED_LINE_COUNT: int = 446

# credentials count after post-processing
-SAMPLES_POST_CRED_COUNT: int = 407
+SAMPLES_POST_CRED_COUNT: int = 401

# with option --doc
SAMPLES_IN_DOC = 411

# archived credentials that are not found without --depth
-SAMPLES_IN_DEEP_1 = SAMPLES_POST_CRED_COUNT + 22
-SAMPLES_IN_DEEP_2 = SAMPLES_IN_DEEP_1 + 19
+SAMPLES_IN_DEEP_1 = SAMPLES_POST_CRED_COUNT + 24
+SAMPLES_IN_DEEP_2 = SAMPLES_IN_DEEP_1 + 18
SAMPLES_IN_DEEP_3 = SAMPLES_IN_DEEP_2 + 1

# well known string with all latin letters
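
Because the `SAMPLES_IN_DEEP_*` constants are chained, a change to the base count shifts all of them. A small sketch of the arithmetic, using the values on the new side of this hunk (`SAMPLES_POST_CRED_COUNT` = 401, offsets 24, 18 and 1), shows the resulting totals; it only restates the constants and does not run any scan.

```python
# Values taken from the updated tests/__init__.py in this commit.
SAMPLES_POST_CRED_COUNT = 401

# Each depth level adds the credentials found only at that archive depth.
SAMPLES_IN_DEEP_1 = SAMPLES_POST_CRED_COUNT + 24
SAMPLES_IN_DEEP_2 = SAMPLES_IN_DEEP_1 + 18
SAMPLES_IN_DEEP_3 = SAMPLES_IN_DEEP_2 + 1

print(SAMPLES_IN_DEEP_1, SAMPLES_IN_DEEP_2, SAMPLES_IN_DEEP_3)  # 425 443 444
```

With the previous values (407, then +22, +19, +1) the same chain gave 429, 448 and 449, so every depth total drops by four or five after the retrain.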