Getting gradient descent to work in Octave (Andrew Ng's machine learning course, exercise 1)

I am trying to implement/solve the first programming exercise from Andrew Ng's machine learning course on Coursera. I can't get linear gradient descent (for one variable) working in Octave. I don't get the same parameter values as in the solution, although my parameters do move in the same direction (at least I think so). So there is probably a bug somewhere in my code. Can someone more experienced than me enlighten me?

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by 
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

theta1 = theta(1);
theta2 = theta(2);

temp0 = 0;
temp1 = 0;

h = X * theta;
for iter = 1:(num_iters)

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta. 
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    temp0 = 0;
    temp1 = 0;
    for i=1:m
        error = (h(i) - y(i));
        temp0 = temp0 + (error * X(i, 1));
        temp1 = temp1 + (error * X(i, 2));
    end
    theta1 = theta1 - ((alpha/m) * temp0);
    theta2 = theta2 - ((alpha/m) * temp1);
    theta = [theta1;theta2];

    % ============================================================

    % Save the cost J in every iteration    
    J_history(iter) = computeCost(X, y, theta);

end
end

My expected result for exercise 1, with theta initialized to [0;0], should be theta1: -3.6303 and theta2: 1.1664.

But instead my output is theta1 = 0.095420 and theta2 = 0.51890.
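For context, this is roughly how the function gets called once the exercise data X and y are loaded (I believe the exercise uses alpha = 0.01 and 1500 iterations, but treat those values as my assumption):

theta = zeros(2, 1);                                    % initialize theta to [0;0]
[theta, J_history] = gradientDescent(X, y, theta, 0.01, 1500);
disp(theta);                                            % expected to print roughly [-3.6303; 1.1664]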

This is the formula I am using for linear gradient descent.
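Written out, the update rule I am trying to implement is the standard batch gradient descent step, which is what the two loops in my code are meant to compute (with $x_1^{(i)} = 1$ for the intercept column):

$$\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)}, \qquad h_\theta(x^{(i)}) = \theta^{T} x^{(i)}$$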

EDIT1: Edited the code. Now I get theta1 = 87.587 and theta2 = 979.93.

I now know what my problem was. I'll describe it quickly for anyone who might be interested: I accidentally computed the variable h outside of my loop, so on every iteration it was calculated with the same (initial) theta instead of being recomputed from the updated parameters.

The fixed code is below:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by 
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

theta1 = theta(1);
theta2 = theta(2);

temp0 = 0;
temp1 = 0;
error = 0;

for iter = 1:(num_iters)
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta. 
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    h = X * theta; % here's the variable I moved into the loop: h must be recomputed from the updated theta on every iteration

    temp0 = 0;
    temp1 = 0;
    for i=1:m
        error = (h(i) - y(i));
        temp0 = temp0 + (error * X(i, 1));
        temp1 = temp1 + (error * X(i, 2));
        %disp(error);
    end
    theta1 = theta1 - ((alpha/m) * temp0);
    theta2 = theta2 - ((alpha/m) * temp1);
    theta = [theta1;theta2];

    % ============================================================

    % Save the cost J in every iteration    
    J_history(iter) = computeCost(X, y, theta);

end
end
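For anyone curious, the inner loop over the training examples can also be replaced by a fully vectorized update. This is just a sketch of the same computation in matrix form (the name gradientDescentVectorized is mine, and computeCost is the function provided by the exercise), not part of the graded solution:

function [theta, J_history] = gradientDescentVectorized(X, y, theta, alpha, num_iters)
% Same update as above, written with matrix operations instead of the inner loop.
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    h = X * theta;                  % predictions for the current theta (m x 1)
    grad = (X' * (h - y)) / m;      % gradient of the cost, one entry per parameter
    theta = theta - alpha * grad;   % simultaneous update of all parameters
    J_history(iter) = computeCost(X, y, theta);
end
end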