layout | type | altfooter | level | auto-migrated | document | year | order | title | lang | tags | exploitability | detectability | technical |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
col-sidebar | documentation | true | 4 | 0 | OWASP Machine Learning Security Top Ten 2023 | 2023 | 1 | ML01:2023 Input Manipulation Attack | en | | 5 | 3 | 5 |
"Input manipulation attack" is an umbrella term that includes adversarial attacks, in which an attacker deliberately alters input data to mislead the model.
Adversarial training: One approach to defending against input manipulation attacks is to train the model on adversarial examples. This can make the model more robust to attacks and less susceptible to being misled.
Robust models: Another approach is to use models designed to be robust against manipulated inputs, for example models hardened through adversarial training or models that incorporate other defense mechanisms.
Input validation: Input validation is another important defense that can detect and prevent input manipulation attacks. It involves checking the input data for anomalies, such as unexpected values or patterns, and rejecting inputs that are likely to be malicious.
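As a rough illustration of the input validation idea, the sketch below rejects inputs whose features fall outside an allowed range or sit far from the values seen in trusted training data. The names `fit_stats` and `validate`, the `[0, 1]` feature range, and the `max_z` threshold are assumptions made for this example, not part of any particular library or of the original guidance.

```python
import statistics

# Assumption: every feature is expected to lie in this range (e.g. normalized pixels).
LOW, HIGH = 0.0, 1.0


def fit_stats(training_rows):
    """Return (mean, standard deviation) for each feature column of trusted training data."""
    return [(statistics.fmean(col), statistics.pstdev(col)) for col in zip(*training_rows)]


def validate(row, stats, max_z=4.0):
    """Return a list of rejection reasons; an empty list means the input passed."""
    if len(row) != len(stats):
        return ["unexpected number of features"]
    problems = []
    for i, (value, (mean, stdev)) in enumerate(zip(row, stats)):
        if not LOW <= value <= HIGH:
            # Hard range check: the value is not a legal feature value at all.
            problems.append(f"feature {i} is outside [{LOW}, {HIGH}]")
        elif stdev > 0 and abs(value - mean) / stdev > max_z:
            # Statistical check: legal, but far from anything seen in training.
            problems.append(f"feature {i} is more than {max_z} standard deviations from the training mean")
    return problems


# Hypothetical training data with two features per row.
training = [[0.20, 0.50], [0.30, 0.40], [0.25, 0.45], [0.35, 0.55]]
stats = fit_stats(training)

print(validate([0.30, 0.50], stats))  # typical input
print(validate([0.90, 0.50], stats))  # in range but statistically unusual
print(validate([1.50, 0.50], stats))  # outside the allowed range
```

A check like this only catches inputs that look unusual. Small perturbations that stay within the normal range of each feature would pass it, which is one reason to combine validation with the other defenses listed above.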
Threat Agents/Attack Vectors | Security Weakness | Impact |
---|---|---|
Exploitability: 5 (Easy). ML Application Specific: 4. ML Operations Specific: 3. | Detectability: 3 (Moderate). The manipulated image may not be noticeable to the naked eye, making it difficult to detect the attack. | Technical: 5 (Difficult). The attack requires technical knowledge of deep learning and image processing techniques. |
Threat Agent: Attacker with knowledge of deep learning and image processing techniques. Attack Vector: Deliberately crafted manipulated image that is similar to a legitimate image. | Vulnerability in the deep learning model's ability to classify images accurately. | Misclassification of the image, leading to security bypass or harm to the system. |
Note that this chart is only a sample based on the scenario below. The actual risk assessment will depend on the specific circumstances of each machine learning system.
A deep learning model is trained to classify images into categories such as dogs and cats. An attacker crafts an image that looks very similar to a legitimate image of a cat but contains small, carefully chosen perturbations that cause the model to misclassify it as a dog. When the model is deployed in a real-world setting, the attacker can use the manipulated image to bypass security measures or cause harm to the system.
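The scenario does not say how the perturbations are crafted. One well-known technique is the fast gradient sign method (FGSM), which moves each input value by a small step `epsilon` in the direction that increases the model's loss. The sketch below applies it to a toy logistic classifier over four "pixels" rather than a real deep learning model; the weights, bias, sample image, and `epsilon` are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical weights and bias of a toy cat/dog classifier over 4 "pixels" in [0, 1].
WEIGHTS = [2.0, -3.0, 1.5, -1.0]
BIAS = 0.3


def prob_dog(x):
    """Logistic model: probability that the input is a dog."""
    z = sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def predict(x):
    return "dog" if prob_dog(x) >= 0.5 else "cat"


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, true_label, epsilon):
    """Perturb x by at most epsilon per pixel to increase the loss.

    true_label is 0 for cat and 1 for dog. For a logistic model with
    cross-entropy loss, the gradient of the loss with respect to the
    input is (p - true_label) * weights.
    """
    p = prob_dog(x)
    grad = [(p - true_label) * w for w in WEIGHTS]
    # Step in the sign of the gradient, then clip back to the valid pixel range.
    return [min(1.0, max(0.0, v + epsilon * sign(g))) for v, g in zip(x, grad)]


cat_image = [0.4, 0.5, 0.3, 0.45]
adversarial = fgsm(cat_image, true_label=0, epsilon=0.1)

print(predict(cat_image), round(prob_dog(cat_image), 3))
print(predict(adversarial), round(prob_dog(adversarial), 3))
```

In this toy setup no pixel changes by more than `epsilon`, yet the predicted label flips from cat to dog. This version assumes the attacker can read the model's weights (a white-box setting); the scenario itself does not specify how much the attacker knows.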
A deep learning model is trained to detect intrusions in a network. An attacker manipulates network traffic by carefully crafting packets so that they evade the intrusion detection model. The attacker can alter features of the traffic, such as the source IP address, destination IP address, or payload, so that the intrusion detection system does not flag it. For example, the attacker may hide their source IP address behind a proxy server or encrypt the payload of their network traffic. This type of attack can have serious consequences, as it can lead to data theft, system compromise, or other forms of damage.