SandboxLLM

SandboxLLM combines a Large Language Model (LLM) with sandbox technology, using supervised fine-tuning (SFT) and Retrieval-Augmented Generation (RAG), to deliver software analysis reasoning capabilities along with precise results for Natural Language Processing (NLP) queries.

Demo: see the images below or the demo code (built with langchain, milvus, and bge-m3): sandboxllm_rag_workflow_demo

Supported Intel hardware and software

  1. OpenVINO

image

  2. Hardware: AI PC

  3. Code example for building an API on Intel OpenVINO

Architecture

alt tag

  1. LLM supervised fine-tuning process
  • Construct a security domain knowledge dataset (e.g., APT knowledge) and a malware analysis pattern dataset (e.g., conclusions and evidential reasoning derived from manual sample analysis by security researchers). An open-source LLM is selected as the base model for SFT because it has good baseline capability and supports long token sequences in English and Chinese. SFT then enables the LLM to acquire knowledge and analytical methods specific to the security domain.
  2. Sandbox output report data processing
  • First, the user uploads a sample to the sandbox; the sandbox analyzes the sample and outputs the result in JSON format.
  • Second, the JSON-formatted data is converted to natural-language text, which the LLM can understand more easily.
  • Finally, the long text is cut with overlapping windows, vectorized, and inserted into the vector database.
  3. User query inference process
  • Querying is RAG-based.
  • The user query is vectorized and used to retrieve text from the vector database; the retrieved text is then merged with the user query in a prompt to get a response from the LLM.
  • If the sample is a script, its source code is appended to the prompt so the LLM can read the source code for better analysis.
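
The report processing and query steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: every function name, the window sizes, and the sample report are hypothetical, and a plain bag-of-words cosine similarity stands in for the bge-m3 embeddings and Milvus vector search used in the demo.

```python
import json
import math
from collections import Counter


def report_to_text(node, prefix=""):
    """Flatten a nested JSON report into 'key path: value' lines (hypothetical format)."""
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            lines.extend(report_to_text(value, f"{prefix}{key} "))
    elif isinstance(node, list):
        for item in node:
            lines.extend(report_to_text(item, prefix))
    else:
        lines.append(f"{prefix.strip()}: {node}")
    return lines


def split_with_overlap(text, size=200, overlap=50):
    """Cut text into windows of `size` characters that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Toy stand-in for a vector database search: rank chunks by similarity."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context_chunks, script_source=None):
    """Merge retrieved text, optional script source, and the user query."""
    parts = ["Sandbox report excerpts:"] + list(context_chunks)
    if script_source is not None:
        parts += ["Sample source code:", script_source]
    parts.append("Question: " + query)
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical report; a real sandbox report is far larger.
    report = json.loads(
        '{"network": {"dns": [{"request": "example.test"}]},'
        ' "behavior": {"files_written": ["a.tmp"]}}'
    )
    text = "\n".join(report_to_text(report))
    chunks = split_with_overlap(text, size=40, overlap=10)
    query = "which dns request was made"
    print(build_prompt(query, retrieve(query, chunks)))
```

The overlap keeps text that straddles a window boundary retrievable from at least one chunk; in a real deployment the window size would be chosen to fit the embedding model's input limit.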

Sample Analysis Examples

A. SandboxLLM UI demo alt tag

B. Sample risk behavior analysis by LLM alt tag

C. Sample behavior analysis and security knowledge Q&A alt tag

D. Sample network behavior analysis by LLM alt tag

E. Malicious code analysis by LLM alt tag

Steps to run

  1. Deploy the Cuckoo sandbox by following the steps here
  2. Deploy and run the LLM by following the steps here

Evaluation

  1. Q&A accuracy with LLM RAG: 95.08%
  • See here for more details about the evaluation code and results
  2. LLM prediction accuracy
    • chatglm3-6b: 82%
    • Qwen1.5-110B-INT4: 97%

License

This repository is licensed under the MIT License.

Please follow the licenses of the other models and projects when using the corresponding model weights and other open-source projects: ChatGLM3, Qwen, LLaMA-Factory, FlagEmbedding.