
Part 1: Linear Regression Exercises

1. A Simple Function

Create a \(5\times 5\) identity matrix.

import numpy as np
A = np.eye(5)
A

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

2. Univariate Linear Regression

2.1 Plot the Data

import pandas as pd
data = pd.read_csv('ex1data1.txt', sep=',', header=None)
data.columns = ['Population', 'Profit']
# Scatter plot of Profit against Population
data.plot(x='Population', y='Profit', c='r', kind='scatter', marker='x')

2.2 Gradient Descent

2.2.1 Data Preparation

# Add a column of ones; its weight acts as the bias (intercept) term
data.insert(0, 'Ones', np.ones(len(data)))
# Inspect the first few rows
data.head()

2.2.2 Extract X and y

X = data.iloc[:, :2].values                       # the 'Ones' and 'Population' columns
y = data.iloc[:, 2].values.reshape(len(data), 1)  # 'Profit' as an m x 1 column vector

2.2.3 Define the Cost Function

\(J(\theta) = \frac{1}{2m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})^2\), where \(h_\theta(x) = \theta^Tx\) and \(m\) is the number of training examples.
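Equivalently, in vectorized form, \(J(\theta) = \frac{1}{2m}(X\theta - y)^T(X\theta - y)\); the multivariate implementation in section 3.2 uses this identity directly.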

def compute_cost(X, y, theta):
    # Squared residuals of the hypothesis X @ theta.T against y, averaged and halved
    t = np.power((X @ theta.T) - y, 2)
    return np.sum(t) / (2 * len(X))

2.2.4 Gradient Descent Method
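Each iteration takes a step against the gradient of \(J\): \(\theta := \theta - \frac{\alpha}{m}X^T(X\theta - y)\). Since the code stores \(\theta\) as a \(1\times 2\) row vector, the same update appears transposed as ((X @ theta.T - y).T @ X) / len(X).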

# Batch gradient descent: repeatedly step opposite the gradient of J
def gradient_descent(X, y, theta, alpha=0.003, epoch=1000):
    cost = np.zeros(epoch)          # cost history, one entry per iteration
    for i in range(epoch):
        # theta is stored as a row vector, hence the transposes
        theta = theta - alpha * ((X @ theta.T - y).T @ X) / len(X)
        cost[i] = compute_cost(X, y, theta)
    return theta, cost

2.2.5 Plot the Fit

import matplotlib.pyplot as plt
grad_theta, cost = gradient_descent(X, y, np.zeros((1, 2)), epoch=10000, alpha=0.003)
x = data.iloc[:, 1].values            # the raw x values, i.e. Population
plt.plot(x, X @ grad_theta.T)         # fitted regression line
plt.scatter(x, y, c='r', marker='x')  # original data points

2.2.6 Plot the Cost History

plt.plot(np.arange(len(cost)), cost, 'r')
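A well-chosen \(\alpha\) makes this curve fall steadily; too large and it diverges. A minimal sketch for comparing a few learning rates (the candidate values are illustrative, not from the exercise):

for a in (0.001, 0.003, 0.01):
    # Re-run from the same zero initialization for each candidate rate
    _, c = gradient_descent(X, y, np.zeros((1, 2)), alpha=a, epoch=2000)
    plt.plot(np.arange(len(c)), c, label=f'alpha={a}')
plt.legend()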

2.3 Normal Equation Solution
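Setting the gradient of \(J(\theta)\) to zero gives the closed-form solution \(\theta = (X^TX)^{-1}X^Ty\), with no learning rate or iteration count to tune.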

# Closed-form least-squares solution (np.matrix is deprecated, so plain arrays are used)
theta_normal = np.linalg.inv(X.T @ X) @ X.T @ y
theta_normal

array([[-3.89578088],
       [ 1.19303364]])
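If \(X^TX\) is singular or nearly so (redundant features, or more features than samples), the Moore-Penrose pseudo-inverse is a safer drop-in; a minimal sketch:

theta_pinv = np.linalg.pinv(X) @ y   # least-squares solution via the pseudo-inverse
theta_pinv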

3. Multivariate Linear Regression

3.1 Feature Scaling
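Square footage is on the order of thousands while bedroom counts are single digits, so without scaling gradient descent converges slowly. Each column is standardized as \(x_j := \frac{x_j - \mu_j}{\sigma_j}\), where \(\mu_j\) and \(\sigma_j\) are the column mean and standard deviation.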

import pandas as pd
import numpy as np
data = pd.read_csv('ex1data2.txt', names=['square', 'bedrooms', 'price'])
data.head()

# Standardize each column (including the target) to zero mean and unit variance
data = data.apply(lambda column: (column - column.mean()) / column.std())
data.head()
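Note that this overwrites the raw values in place. To predict on unscaled inputs later, keep the statistics around first; a minimal sketch (the names raw, mu, sigma are illustrative):

raw = pd.read_csv('ex1data2.txt', names=['square', 'bedrooms', 'price'])
mu, sigma = raw.mean(), raw.std()   # per-column statistics used for scaling
data = (raw - mu) / sigma           # equivalent to the apply() above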

3.2 Multivariate Gradient Descent

data.insert(0, 'Ones', np.ones(len(data)))
X = data.iloc[:, :3].values                    # 'Ones', 'square', 'bedrooms'
y = data.iloc[:, 3].values.reshape(len(X), 1)  # 'price' as an m x 1 column vector
def compute_cost(X, y, theta):
    err = X @ theta.T - y    # shapes: (m,3) @ (3,1) - (m,1) -> (m,1)
    return ((err.T @ err) / (2 * X.shape[0])).item()  # .item() extracts the scalar
def gradient_descent(X, y, theta, epoch=10000, alpha=0.003):
    cost = np.zeros(epoch)
    for i in range(epoch):
        grad = (X @ theta.T - y).T @ X / X.shape[0]  # gradient of J as a row vector
        theta = theta - alpha * grad
        cost[i] = compute_cost(X, y, theta)
    return theta, cost
import matplotlib.pyplot as plt
theta_grad, cost = gradient_descent(X, y, np.zeros((1, 3)))
plt.plot(np.arange(len(cost)), cost)
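Assuming the mu and sigma statistics kept in the scaling sketch above, a prediction can be mapped back to the original price units. A hedged sketch for a hypothetical 1650 sq-ft, 3-bedroom house (all query values illustrative):

# Standardize the query with the same statistics as the training data
x_q = (np.array([1650.0, 3.0]) - mu[['square', 'bedrooms']].values) / sigma[['square', 'bedrooms']].values
pred_scaled = np.r_[1.0, x_q] @ theta_grad.T               # prepend the bias feature
price = pred_scaled.item() * sigma['price'] + mu['price']  # undo the target scaling
print(price)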

3.3 Normal Equation Solution

theta = np.linalg.inv(X.T@X)@X.T@y
theta

array([[-1.11022302e-16],
       [ 8.84765988e-01],
       [-5.31788197e-02]])
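As a sanity check, the gradient-descent estimate and the closed-form solution should agree closely; a quick comparison (the tolerance is arbitrary):

print(theta_grad.ravel())  # gradient-descent estimate
print(theta.ravel())       # normal-equation solution
print(np.allclose(theta_grad.ravel(), theta.ravel(), atol=1e-3))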