Part 1: Support Vector Machine Exercises
Support Vector Machines
1.1 Data Visualization
from scipy.io import loadmat
data = loadmat('ex6data1.mat')
plt.scatter(X_0['x1'], X_0['x2'], c='y')
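The loading and splitting step can be sketched roughly as follows. The dictionary is a hypothetical stand-in for what `loadmat` returns, so the snippet runs without the `.mat` file; the keys `'X'` and `'y'` and the column names `x1`, `x2`, `y` are assumptions, since the original cells are only partly shown.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for loadmat('ex6data1.mat'): the real file is assumed
# to hold a 2-column feature matrix 'X' and a column vector of labels 'y'.
rng = np.random.default_rng(0)
data = {
    "X": rng.normal(size=(6, 2)),
    "y": np.array([[0], [1], [0], [1], [1], [0]]),
}

# Stack features and labels side by side into one table.
df = pd.DataFrame(
    np.concatenate((data["X"], data["y"]), axis=1),
    columns=["x1", "x2", "y"],
)

# Split by class so that each class can be drawn with its own colour.
X_0 = df[df["y"] == 0]
X_1 = df[df["y"] == 1]

# scikit-learn expects a 1-D label vector, so flatten the (m, 1) column.
y_flat = data["y"].ravel()

print(len(X_0), len(X_1), y_flat.shape)
```

Each subset can then be passed to `plt.scatter` with a different `c` value, as in the call above.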
1.2 Trying \(C=1\)
import sklearn.svm
0.9803921568627451
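As a rough illustration of what the score above measures, here is a minimal sketch that fits an SVM with \(C=1\) and, for comparison, a larger \(C\). It assumes a linear kernel and uses synthetic clusters rather than `ex6data1.mat`, so it will not reproduce the number above.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two well-separated synthetic clusters (not the exercise data).
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[4.0, 4.0], scale=0.5, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)

scores = {}
for C in (1, 100):
    # Larger C penalises misclassified training points more heavily.
    clf = SVC(C=C, kernel="linear")
    clf.fit(X, y)
    scores[C] = clf.score(X, y)  # mean accuracy on the training set

print(scores)
```

`score` returns the fraction of correctly classified samples, which is the kind of value reported above.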
Plotting the decision boundary
plt.scatter(X_0['x1'], X_0['x2'], c='y')
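For a linear kernel, one way to draw the boundary is to solve \(w_1 x_1 + w_2 x_2 + b = 0\) for \(x_2\). The sketch below assumes a linear `SVC` and synthetic data; it is not necessarily how the original notebook draws the line.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic two-class data (not the exercise data).
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.6, size=(40, 2)),
    rng.normal(loc=[3.0, 3.0], scale=0.6, size=(40, 2)),
])
y = np.array([0] * 40 + [1] * 40)

clf = SVC(C=1, kernel="linear").fit(X, y)

# For a linear kernel the model exposes the weight vector and the bias.
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]

# Points on the boundary satisfy w1*x1 + w2*x2 + b = 0.
x1 = np.linspace(X[:, 0].min(), X[:, 0].max(), 100)
x2 = -(w1 * x1 + b) / w2

# Sanity check: the decision function vanishes (up to rounding) on that line.
boundary_values = clf.decision_function(np.column_stack([x1, x2]))
print(np.abs(boundary_values).max())

# With matplotlib available, plt.plot(x1, x2) overlays the line on the scatter.
```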
SVM with a Gaussian Kernel
### 2.1 Data Visualization
from scipy.io import loadmat
data = loadmat('ex6data2.mat')
plt.scatter(X_0['x1'], X_0['x2'], c='y')
2.2 Predicting Classes
import sklearn.svm
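It may help to see how the Gaussian kernel relates to scikit-learn's `gamma` parameter. The sketch below writes the kernel as \(\exp(-\lVert a - b\rVert^2 / (2\sigma^2))\) and checks it against `rbf_kernel` with \(\gamma = 1/(2\sigma^2)\). The helper `gaussian_kernel` and the sample vectors are illustrative and do not come from the original notebook.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel


def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) similarity exp(-||a - b||^2 / (2 * sigma^2))."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2)))


sigma = 2.0
gamma = 1.0 / (2.0 * sigma ** 2)  # scikit-learn's parametrisation of the same kernel

a = [1.0, 2.0, 1.0]
b = [0.0, 4.0, -1.0]

ours = gaussian_kernel(a, b, sigma)
theirs = rbf_kernel([a], [b], gamma=gamma)[0, 0]

print(round(ours, 6))  # 0.324652
```

So a small \(\sigma\) corresponds to a large `gamma`, which gives a narrower kernel and a more flexible boundary.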
x1 = np.linspace(0, 1.1, 500)
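A non-linear boundary has no closed-form line, so a common approach is to evaluate the classifier on a dense grid and contour the result. The following is a minimal sketch under assumed settings: synthetic data, `C=100`, `gamma=10`, and a 200-point grid per axis to keep it quick (the line above uses 500).

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic, non-linearly separable data: label 1 inside a circle, 0 outside.
X = rng.uniform(0.0, 1.0, size=(300, 2))
y = (np.sum((X - 0.5) ** 2, axis=1) < 0.1).astype(int)

# C and gamma are illustrative values, not the ones from the notebook.
clf = SVC(C=100, kernel="rbf", gamma=10).fit(X, y)

# Dense grid covering the plotting window.
x1 = np.linspace(0, 1.1, 200)
x2 = np.linspace(0, 1.1, 200)
xx1, xx2 = np.meshgrid(x1, x2)

# Predict a label for every grid point, then restore the grid shape.
grid = np.column_stack([xx1.ravel(), xx2.ravel()])
zz = clf.predict(grid).reshape(xx1.shape)

print(zz.shape, clf.score(X, y))

# With matplotlib available, the boundary is the contour between labels 0 and 1:
# plt.contour(xx1, xx2, zz, levels=[0.5])
```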
Finding the Optimal Parameters
3.1 Data Visualization
from scipy.io import loadmat
data_val = pd.DataFrame(np.concatenate((data['Xval'], data['yval']), axis=1), columns=['xval1','xval2','yval'])
plt.scatter(X_0['x1'], X_0['x2'], c='y')
3.2 Finding the Best Parameters
candidate = [0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100]
res = []
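The search itself is probably a double loop over `candidate` for `C` and `gamma`, scoring each model on the validation set. Here is a minimal sketch of that idea on synthetic data; `make_split` is a hypothetical helper, and the `(score, (C, gamma))` layout of `res` is an assumption made to match the result printed in this section.

```python
import itertools

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)


def make_split(n):
    """Synthetic stand-in for the exercise data: label 1 inside a circle."""
    X = rng.uniform(-1.0, 1.0, size=(n, 2))
    y = (np.sum(X ** 2, axis=1) < 0.5).astype(int)
    return X, y


X_train, y_train = make_split(200)
X_val, y_val = make_split(100)

candidate = [0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100]

res = []
for C, gamma in itertools.product(candidate, repeat=2):
    clf = SVC(C=C, kernel="rbf", gamma=gamma)
    clf.fit(X_train, y_train)
    # Score on the held-out validation split, not on the training data.
    res.append((clf.score(X_val, y_val), (C, gamma)))

# Tuples compare by score first; ties fall to the larger (C, gamma) pair.
best_score, best_param = max(res)
print(best_score, best_param)
```

Selecting on the validation split rather than the training split keeps the choice of `C` and `gamma` from simply rewarding overfitting.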
(0.965, (0.3, 100))
svc = sklearn.svm.SVC(C=best_param[0], kernel='rbf', gamma=best_param[1], probability=True)
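Passing `probability=True` makes `predict_proba` available after fitting. The sketch below plugs in the `(C, gamma)` pair reported above but trains on synthetic clusters, so the probabilities are purely illustrative.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic two-class data (not the exercise data).
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[2.0, 2.0], scale=0.5, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)

best_param = (0.3, 100)  # (C, gamma) as reported in the text

svc = SVC(C=best_param[0], kernel="rbf", gamma=best_param[1], probability=True)
svc.fit(X, y)

# One row per sample, one column per class; each row sums to 1.
proba = svc.predict_proba(X[:5])
print(proba.shape)
```

Enabling probabilities adds an internal cross-validated calibration step, so fitting is slower than with the default setting.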
Spam Classification
4.1 Loading the Data
from scipy.io import loadmat
data = loadmat('spamTrain.mat')
4.2 Predicting Classes
import sklearn.svm as svm
0.987
4.3 Trying Logistic Regression as Well
from sklearn.linear_model import LogisticRegression
0.994
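To make the comparison concrete, here is a minimal sketch that trains a linear SVM and a logistic regression on the same split and scores both on held-out data. The binary features are synthetic and only imitate "word present / absent" indicators, and the linear kernel and `max_iter=1000` are assumptions, so the scores will differ from those reported above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic binary features; NOT the spam data set.
n_samples, n_features = 600, 50
X = rng.integers(0, 2, size=(n_samples, n_features))

# The label depends only on the first five features; the rest are noise.
y = (X[:, :5].sum(axis=1) >= 3).astype(int)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

models = {
    "linear SVM": SVC(kernel="linear"),
    "logistic regression": LogisticRegression(max_iter=1000),
}

# Fit each model on the training split and score it on the test split.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
print(scores)
```

Both models learn a linear decision function, so similar accuracy on this kind of data is not surprising.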