VulDeePecker/Comparative_Study

A Comparative Study of Deep Learning-Based Vulnerability Detection System

We collect 368 open-source programs (corresponding to 368 CVEs) from the National Vulnerability Database (NVD) and 14,000 programs from the Software Assurance Reference Dataset (SARD). Together, these programs contain 126 types of vulnerabilities, each uniquely identified by a Common Weakness Enumeration identifier (CWE ID). The CWE IDs are listed below.

CWE-015, CWE-020, CWE-022, CWE-023, CWE-036, CWE-078, CWE-080, CWE-088, CWE-089, CWE-090, CWE-114, CWE-119, CWE-120, CWE-121, CWE-122, CWE-123, CWE-124, CWE-126, CWE-127, CWE-129, CWE-134, CWE-170, CWE-176, CWE-188, CWE-190, CWE-191, CWE-194, CWE-195, CWE-196, CWE-197, CWE-222, CWE-223, CWE-242, CWE-244, CWE-252, CWE-253, CWE-256, CWE-259, CWE-272, CWE-284, CWE-319, CWE-321, CWE-325, CWE-327, CWE-338, CWE-345, CWE-362, CWE-363, CWE-364, CWE-366, CWE-367, CWE-369, CWE-377, CWE-398, CWE-400, CWE-401, CWE-404, CWE-412, CWE-414, CWE-415, CWE-416, CWE-426, CWE-427, CWE-457, CWE-459, CWE-464, CWE-467, CWE-468, CWE-469, CWE-475, CWE-476, CWE-479, CWE-489, CWE-506, CWE-510, CWE-526, CWE-534, CWE-535, CWE-543, CWE-562, CWE-571, CWE-587, CWE-588, CWE-590, CWE-591, CWE-605, CWE-606, CWE-609, CWE-617, CWE-620, CWE-663, CWE-665, CWE-666, CWE-672, CWE-674, CWE-675, CWE-680, CWE-681, CWE-682, CWE-685, CWE-688, CWE-690, CWE-704, CWE-758, CWE-761, CWE-762, CWE-765, CWE-771, CWE-773, CWE-774, CWE-775, CWE-780, CWE-785, CWE-789, CWE-805, CWE-806, CWE-821, CWE-822, CWE-824, CWE-828, CWE-831, CWE-833, CWE-834, CWE-835, CWE-839, CWE-843.
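The IDs above are written with zero padding (for example `CWE-015`), whereas other sources often write the same ID without it (`CWE-15`). A minimal sketch of normalizing the list to integers for lookups or comparisons is shown here; the function name `parse_cwe_ids` and the short sample string are illustrative assumptions, not part of this repository.

```python
import re
from typing import List

# Matches identifiers such as "CWE-015" and captures the numeric part.
CWE_PATTERN = re.compile(r"CWE-(\d+)")


def parse_cwe_ids(text: str) -> List[int]:
    """Return the numeric CWE IDs found in text, in order of appearance."""
    return [int(number) for number in CWE_PATTERN.findall(text)]


# A short excerpt of the list above; paste the full list to process all IDs.
sample = "CWE-015, CWE-020, CWE-119, CWE-843."
ids = parse_cwe_ids(sample)
print(ids)
```

Converting to integers drops the padding, so `CWE-015` and `CWE-15` compare equal after parsing.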

We derive two datasets of code gadgets from these programs, where a code gadget is a number of statements that are semantically related to each other. The first dataset (DDCD dataset for short) contains 68,353 code gadgets with data dependency and control dependency: 55,334 are generated from training programs and 13,019 from target programs. The second dataset (DD dataset for short) contains 98,262 code gadgets with data dependency: 78,558 are generated from training programs and 19,704 from target programs.
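The counts above can be sanity-checked with a few lines of Python. This is only a sketch: the dictionary layout and the helper name `split_summary` are assumptions made for illustration, and the numbers are copied from the description of the DDCD and DD datasets.

```python
# Code gadget counts copied from the dataset description above.
DATASETS = {
    "DDCD": {"training": 55_334, "target": 13_019, "total": 68_353},
    "DD": {"training": 78_558, "target": 19_704, "total": 98_262},
}


def split_summary(counts):
    """Check that the two parts add up to the total and return their shares."""
    parts = counts["training"] + counts["target"]
    if parts != counts["total"]:
        raise ValueError(f"parts sum to {parts}, expected {counts['total']}")
    return {
        "training_share": counts["training"] / parts,
        "target_share": counts["target"] / parts,
    }


summaries = {name: split_summary(counts) for name, counts in DATASETS.items()}
for name, summary in summaries.items():
    print(
        f"{name}: training {summary['training_share']:.1%}, "
        f"target {summary['target_share']:.1%}"
    )
```

For both datasets the parts add up to the stated totals, and the division between training and target programs works out to roughly 80/20.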
