https://owasp.org/www-project-top-ten/
https://owasp.org/www-community/Source_Code_Analysis_Tools
This project is about SAST-based application security using source code testing and machine learning. All of the ML models used to automatically learn security threat signatures can be found in the `models` folder.
```
$ cd models
```

For example:

```
$ python3 Ensemble_model.py -t scraper
$ python3 LSTM_model.py -t <path_to_code_to_test>
```
Static application security testing (SAST) is a set of technologies designed to analyze application source code, bytecode, and binaries for coding and design conditions that are indicative of security vulnerabilities. SAST solutions analyze an application from the “inside out” in a non-running state. Our goals for this project are:
- Use open-source/proprietary fuzzers to analyze source code for vulnerabilities in a non-running state.
- Work with the features presented by the fuzzers to eliminate false positives.
- Use a learning algorithm (such as ensemble learning) for the above.
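As a rough illustration of the last two goals, the sketch below shows how the warnings a fuzzer emits could be encoded as feature vectors before any learning model is applied. It is not the project's actual pipeline: the warning fields (rule, sink, taint-path length, severity) are hypothetical stand-ins for whatever a real fuzzer reports, and scikit-learn is assumed to be available.

```python
# Minimal sketch: encode fuzzer warnings as feature vectors for a
# true/false-positive classifier. The warning fields are hypothetical.
from sklearn.feature_extraction import DictVectorizer

raw_warnings = [
    {"rule": "sql-injection", "sink": "db.query", "taint_path_len": 4, "severity": 8},
    {"rule": "xss-reflected", "sink": "innerHTML", "taint_path_len": 2, "severity": 6},
]

vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform(raw_warnings)   # one numeric row per warning
print(vectorizer.get_feature_names_out())    # string fields become one-hot columns
print(X)
```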
We have shortlisted seven information-flow vulnerabilities featured in the OWASP Top 10 Application Security Risks - 2017 list:
- Injection: SQL, NoSQL, OS, and LDAP injection (a SQL injection example follows this list).
  - Error-based SQL injection
  - Boolean-based SQL injection
  - Time-based SQL injection
  - Out-of-band SQL injection
- Broken Authentication: Incorrectly implemented functions related to authentication and session management.
- XML External Entities (XXE): External entities can be used to disclose internal files.
- Broken Access Control: Attackers can access unauthorized functionality and/or data.
- Cross-Site Scripting (XSS): Application includes untrusted data in a new web page without proper validation.
- Using Components with Known Vulnerabilities: Libraries, frameworks, and other software modules with known vulnerabilities are in use.
- Insecure Deserialization: Insecure deserialization often leads to remote code execution.
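To make the Injection category above concrete, the snippet below (illustrative only, not code from this repository) contrasts a vulnerable string-concatenated SQL query with a parameterized one, using Python's built-in sqlite3 module.

```python
# Classic SQL injection: the concatenated query lets the input rewrite the
# WHERE clause, while the parameterized query treats it as plain data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cr3t')")

user_input = "' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text and leaks every row.
rows = conn.execute(
    "SELECT * FROM users WHERE name = '" + user_input + "'"
).fetchall()
print("vulnerable query returned:", rows)

# Safe: the placeholder keeps the input as data, so nothing is returned.
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()
print("parameterized query returned:", rows)
```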
Simply defining a large set of arbitrary features may lead the learning algorithm to discover false correlations. For this reason, we would have to make a careful selection of features, which can only be done once we have data to test the fuzzers on and experiment with the various features they present.
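Once that data exists, the selection step could look roughly like the sketch below, which scores candidate features against the true/false-positive label using mutual information on synthetic data (the feature count and labels are placeholders, not real fuzzer output).

```python
# Feature-selection sketch on synthetic data: keep only the features that
# carry information about the true/false-positive label.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.random((200, 10))                            # 200 warnings, 10 candidate features
y = (X[:, 0] + 0.5 * X[:, 3] > 0.9).astype(int)      # label driven by features 0 and 3 only

selector = SelectKBest(mutual_info_classif, k=3).fit(X, y)
print("selected feature indices:", selector.get_support(indices=True))
```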
The models most likely to work best for eliminating false positives are:
1. Ensemble Learning - Instead of selecting the single highest-ranked learning model, create an ensemble of the chosen learning models. The technique used to build the ensemble could be either bootstrap aggregation (bagging) or boosting (AdaBoost, GBMs).
2. RNN with LSTM Units - Correlate the generated vulnerability warnings (as opposed to treating them as independent events) to detect a pattern. This could be achieved using an LSTM model.
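Hedged sketches of both candidates follow. Neither is the code in Ensemble_model.py or LSTM_model.py; the feature vectors and labels are synthetic, and scikit-learn/TensorFlow are assumed to be available.

```python
# Sketch for option 1: an AdaBoost ensemble of decision stumps that labels
# fuzzer warnings as real (1) or false (0) positives. Synthetic vectors
# stand in for the real warning features.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 8))                    # 500 warnings, 8 features each
y = (X[:, 0] > X[:, 1]).astype(int)         # synthetic "real vulnerability" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```

```python
# Sketch for option 2: an LSTM that reads an ordered sequence of warning
# feature vectors (e.g. warnings along one execution path) and predicts
# whether the pattern indicates a real vulnerability. Shapes are placeholders.
import numpy as np
import tensorflow as tf

seq_len, n_features = 20, 8
X = np.random.random((256, seq_len, n_features)).astype("float32")
y = np.random.randint(0, 2, size=(256,))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(seq_len, n_features)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```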
We will also be able to test the open-source fuzzers once we have data, so while the following seem promising, we cannot confirm their suitability until we test them on actual code.
- w3af - Open-source web application security scanner
- Vega - Open-source fuzzer by Subgraph
- js-fuzz - American Fuzzy Lop-inspired fuzz tester for JavaScript code
- Fun-fuzz - JavaScript-based fuzzer by the Mozilla Organization
- OWASP Xenotix XSS Exploit Framework - XSS detection and exploitation
Quoting the AI security research paper ALETHEIA: Improving the Usability of Static Security Analysis, the experimental dataset they used is “an extensive set of 3,758 security warnings output by a commercial JavaScript security checker when applied to 1,706 HTML pages taken from 675 top-popular Web sites. These Web sites include all Fortune 500 companies, the top 100 Websites”.
https://arxiv.org/pdf/2210.07465
https://arxiv.org/pdf/2109.13916
There is also a plan to use GenAI-based techniques with an LLM ReAct-agent design pattern to perform application security testing and analysis with SAST. Ref: https://arxiv.org/html/2401.17459v1
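As a very rough sketch of that design pattern (not an implementation of the referenced paper), the loop below alternates an LLM "reasoning" step with a tool "action" step such as running a SAST scan. Both `call_llm` and `run_sast_scan` are hypothetical placeholders rather than real APIs.

```python
# Hypothetical ReAct-style loop for SAST-assisted security review.
# call_llm() would wrap whichever LLM API is chosen; run_sast_scan() would
# invoke a real scanner/fuzzer and return its findings as text.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def run_sast_scan(path: str) -> str:
    raise NotImplementedError("plug in a SAST/fuzzer invocation here")

def react_security_review(target_path: str, max_steps: int = 5) -> str:
    scratchpad = f"Task: review {target_path} for the OWASP Top 10 risks listed above.\n"
    for _ in range(max_steps):
        thought = call_llm(scratchpad + "Thought:")   # reasoning step
        scratchpad += f"Thought: {thought}\n"
        if "FINAL ANSWER" in thought.upper():
            return thought
        observation = run_sast_scan(target_path)      # action step
        scratchpad += f"Observation: {observation}\n"
    return scratchpad
```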