Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update: 7.支持向量机 #57

Merged
merged 2 commits into from
Dec 31, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 60 additions & 0 deletions 人工智能基础与应用/7.支持向量机.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
# 支持向量机

## 线性 SVM

### 最大间隔

模型是 $wx+b=0$ , 区别在于 x 在特征空间学习的,而不是像感知机那种在输入空间

本质是最大化几何间隔

$$
\max_{w,b}\frac{y_{i}(wx_{i}+b)}{\Vert w \Vert}
$$

chai-mi marked this conversation as resolved.
Show resolved Hide resolved
而对于一个数据集的集合间隔,是所有样本点的几何间隔最小的:

$$
\max_{x}\frac{y_{i}(wx_{i}+b)}{\Vert w \Vert}
$$

令最小函数间隔大小为 1,则模型就变成:

$$
\min_{w,b}\frac{1}{2}\Vert w \Vert ^{2}\\\\
s.t.\ \ y_i(wx_{i}+b)\ge1
$$

- 在线性可分情况下,训练数据集的样本点中与分离超平面距离最近的样本点的实例称为支持向量 (support vector)
- 支持向量是使约束条件式等号成立的点,即 $y_i (wx_{i}+b)=1$

### 对偶问题

构造拉格朗日函数:

$$
L(w,b,\alpha)=\frac{1}{2}\Vert w \Vert ^{2}-\sum_{i=1}^{N}\alpha_{i}[y_{i}(wx_{i}+b)-1]
$$

求解对偶问题:

$$
\begin{align*}
&\alpha_{i}\ge0 \\
&\frac{\partial L}{\partial w}=0 \Rightarrow w=\sum_{i=1}^{N}\alpha_{i}y_{i}x_{i} \\
&\frac{\partial L}{\partial b}=0 \Rightarrow \sum_{i=1}^{N}\alpha_{i}y_{i}=0 \\
&i=1,2,...,N
\end{align*}
$$

代入拉格朗日函数,得到对偶问题:

$$
\max W(\alpha) = \sum_{i=1}^{N}\alpha_{i}-\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_{i}\alpha_{j}y_{i}y_{j}(x_{i} \cdot x_{j})
$$

## 非线性 SVM

对于非线性问题,可以通过核函数将输入空间映射到高维特征空间,从而解决线性不可分问题

具体方法:使用核函数
Loading