MATLAB Answers

GradientDescent algorithm returning error: left and right side different number of elements

Jeani de Jongh on 11 Apr 2019
Commented: Brendan Hamm on 11 Apr 2019
I am currently working on an online assignment on gradient descent. I keep getting an error saying that the line which saves the cost J (code that was given to us) cannot be run because the left and right sides have a different number of elements. I have not edited that line, so the problem must be in my own code; I just can't tell what is wrong.
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
% theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1); %
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (computeCost) and gradient here.
%
hypothesis= X * theta;
error = (hypothesis - y).^2;
theta = theta - alpha * 1/2*m * sum(error * X(:,2));
% ===========================================================
% Save the cost J in every iteration
J_history(iter) = computeCost(X, y, theta);
I keep getting the error that the two sides of
J_history(iter) = computeCost(X, y, theta);
have a different number of elements.
EDIT: Here's the computeCost:
function J = computeCost(X,y,theta)
%COMPUTECOST Compute cost for linear regression
% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
data = load('ex1data1.txt'); % read comma separated data
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
hypothesis = (X * theta);
error = ((hypothesis - y).^2);
J = 1/(2*m)*sum(error);
% =========================================================================
end


Answers (1)

Brendan Hamm on 11 Apr 2019
I would put a break point on the line:
error = ((hypothesis - y).^2);
I would assume that the sizes of hypothesis and y are different (one a row vector, the other a column vector), leading to implicit expansion. That is, a row vector minus a column vector (or vice versa) returns a matrix, and the sum of a matrix returns a row vector.
If this is the case, you should only need to transpose one of the variables:
error = ((hypothesis.' - y).^2);
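A minimal sketch of the implicit expansion described above, using made-up 3-element vectors rather than the assignment data. The names h, yRow, d and err are illustrative only, and implicit expansion for subtraction assumes MATLAB R2016b or later:

```matlab
% Hypothetical small vectors, not the assignment data
h = [1; 2; 3];           % 3x1 column, standing in for hypothesis
yRow = [1 2 3];          % 1x3 row, standing in for a mis-oriented y

d = h - yRow;            % implicit expansion: the result is a 3x3 matrix
sz = size(d);

J = sum(d.^2);           % sum works down each column, giving a 1x3 row vector

J_history = zeros(5, 1);
try
    J_history(1) = J;    % three elements cannot go into one slot
catch err
    disp(err.message)    % size-mismatch message (wording varies by release)
end
```

Note that the sketch names the caught exception err rather than error, because a variable called error shadows MATLAB's built-in error function for as long as it is in scope.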

  3 Comments

Jeani de Jongh on 11 Apr 2019
X is a 97x2 matrix. (First column zeros, second population)
theta is a 2x1 vector.
That means hypothesis should give a 97x1 matrix, right?
y is 97x1 (profits of company according to population)
So I'm assuming the matrices and vectors have the right dimensions?
Brendan Hamm on 11 Apr 2019
If these were indeed the sizes of the arrays, then you would get an error on this line:
theta = theta - alpha * 1/2*m * sum(error * X(:,2));
That is, error is 97x1 and X(:,2) is 97x1, so the matrix multiplication (error * X(:,2)) is not defined: the inner dimensions (1 and 97) do not agree.
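A small sketch of that dimension rule, assuming two made-up 97x1 columns (e and x2 are hypothetical stand-ins for error and X(:,2)). It only illustrates how the operators behave on these shapes; it is not a statement of which form the assignment's update should use:

```matlab
% Hypothetical 97x1 columns, not the assignment data
e = ones(97, 1);
x2 = (1:97)';

try
    p = e * x2;          % matrix product of 97x1 and 97x1: inner dimensions disagree
catch err
    disp(err.message)    % MATLAB reports the dimension mismatch
end

q = e .* x2;             % element-wise product: 97x1
r = e' * x2;             % 1x97 times 97x1: a 1x1 scalar, equal to sum(q)
```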
So, use breakpoints (possibly conditional breakpoints) to determine what the actual sizes of these arrays are at the time of the error, and test it out at the command line in debug mode.
The error message should tell you exactly the line on which the error occurs, possibly in multiple files, but the one you should be concerned with is the line in your code (in gradientDescent, computeCost, or both). Put the breakpoint on that line and inspect your workspace; this should give clues as to where your assumptions differ from what is actually coded.
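One possible way to follow this advice is sketched below. The dbstop and dbquit commands are shown as comments because they are typed interactively; the runnable part is a temporary size assertion. The data is made up and only mirrors the 97x2, 97x1 and 2x1 shapes reported in this thread:

```matlab
% At the command line, before running the assignment script:
%   dbstop if error      % pause in the debugger where the error is thrown
% Then, at the K>> prompt, inspect the workspace:
%   size(X), size(y), size(theta), size(hypothesis)
%   dbquit               % leave debug mode when done

% The same check written as a temporary assertion (made-up data)
X = [ones(97, 1), (1:97)'];
y = (1:97)';
theta = zeros(2, 1);

hypothesis = X * theta;
assert(isequal(size(hypothesis), size(y)), ...
    'hypothesis is %s but y is %s', ...
    mat2str(size(hypothesis)), mat2str(size(y)));
```

If the assertion fails inside the real function, the message prints both sizes, which shows directly whether one of the arrays is oriented differently than assumed.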
Brendan Hamm on 11 Apr 2019
Actually, looking at this in more detail, you are not calculating the gradient of the cost function correctly. I would suggest looking at the formula in more detail and making sure you understand it. I hesitate to give the answer as this would destroy your learning experience ... but hopefully this leads you on the right track.
Specifically, this is the part that is not right:
1/2*m * sum(error * X(:,2));
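One detail of that expression can be checked without working out the formula: how MATLAB parses 1/2*m. The sketch below assumes m = 97, the number of training examples reported in this thread, and compares it with the parenthesised form used in the computeCost code above. It shows operator precedence only; the comment above does not say which part of the expression is at fault, so this may not be the whole story:

```matlab
m = 97;                  % number of training examples reported in the thread

a = 1/2*m;               % / and * have equal precedence, evaluated left to right: (1/2)*m
b = 1/(2*m);             % parentheses force division by 2*m

fprintf('1/2*m   = %g\n', a);
fprintf('1/(2*m) = %g\n', b);
```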
