Best practice: should a function return a row vector or a column vector
Perhaps this has been asked before but if it has I can't find the answer.
If I write a function or method which returns a vector, should it return a row vector or a column vector? I find myself choosing different answers every time and it would be nice to have some consistency.
For example, if my function returns x-y coordinates, it seems logical (to me) that the coordinates should be column vectors:
function [x,y] = getCoords() % Return column vectors?
end
Many MATLAB functions do this too, for example:
P = [0 0; 0 2; 2 2; 2 0; NaN NaN; 0.5 0.5; 0.5 1.5; 1.5 1.5; 1.5 0.5; NaN NaN; 3 0.5; 3.5 1.5; 4 0.5];
polyin = polyshape(P);
polyin.area([1:3]) % Returns a column vector
On the other hand, there are many MATLAB functions which return row vectors:
linspace(1,10,5) % Returns a row vector
What should be considered when deciding whether to return a row or column vector?
I am aware that for loops can accept a row vector as the argument:
for ii = linspace(1,10,5)
    disp(num2str(ii))
end
So perhaps if the function gives the kind of output which might be routinely used to iterate a for loop then a row vector should be preferred. Is there anything else that should be considered?
3 Comments
David Goodmanson
on 27 Oct 2023
It pays to remember, though, that if x is a matrix with complex entries, ' produces the complex conjugate transpose (Hermitian conjugate), and .' gives the plain transpose with no complex conjugation. So x(:)' produces a row containing the complex conjugates of the elements of x, whereas x(:).' gives a row with the elements of x unchanged.
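A minimal sketch of the difference between the two operators. The vector z and its values are made up purely for illustration:

```matlab
% z is a hypothetical complex column vector, chosen only for illustration
z = [1+2i; 3-4i];

r1 = z';     % complex conjugate transpose: row with conjugated entries
r2 = z.';    % plain transpose: row with the entries unchanged

% The two results differ only by conjugation
sameUpToConj = isequal(r1, conj(r2));

% For real data the two operators give identical results
v = [1; 2; 3];
sameForReal = isequal(v', v.');
```

Here r1 is [1-2i, 3+4i] and r2 is [1+2i, 3-4i], so the choice of operator only matters once the data can be complex.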
Jon
on 27 Oct 2023
@David Goodmanson Good point. Yes, that is sometimes an issue. I thought about including that detail, but decided not to delve into it just to make the point. Most of my applications, other than spectral analysis, use only real values, so it doesn't come into play. I always thought the .' syntax looked a little messy, and that it somehow suggested an element-by-element operation, as with .*, which of course isn't the case.
Answers (3)
John D'Errico
on 27 Oct 2023
Edited: John D'Errico
on 27 Oct 2023
Let me disagree with a basic assumption of your question: that there must be a single best practice here. Instead, I'll claim that a function should return whatever makes the most sense given how you will typically use the result. Much of the time, whether a vector is a row or a column is irrelevant to what you do with it. Furthermore, the .' transpose of a vector is extremely cheap, since it changes only the dimensions stored with the vector (1xn versus nx1) and leaves the data in memory alone.
I often tend to write code that returns columns when I am doing linear algebraic operations with the result, since I may build up matrices in terms of columns. Something like a Vandermonde matrix, for example. And if you are solving a problem of the form A*x==b, then x will typically be a column vector. But that is just me, since I do much work with linear algebra.
Conversely, if the main thing you do with a result is dump it to the command window to look at it, then a row vector makes much more sense. It is easier to view the results when you can see more of them on screen. Having to scroll the command window just to see an entire vector is just silly, IF that is what you will commonly be doing.
As long as you know the code you wrote, all that matters is that you know what it returns. And DOCUMENT your code. This is one best practice you should always follow. Documentation is hugely important, not only for you next year when you have forgotten what this code does, but for your successor when you get run down by the crosstown bus. ALWAYS include help that describes the result of your code. It should say what the variables mean, what shape they will be, and of course what the input arguments should be, etc.
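As a rough illustration of that kind of help text, here is one possible way to document the getCoords function from the question. The function body and the coordinate values are placeholders invented for this sketch, not anything from the original post:

```matlab
function [x, y] = getCoords()
%getCoords  Return example x-y coordinates (hypothetical illustration).
%   [x, y] = getCoords() returns
%     x - N-by-1 column vector of x-coordinates
%     y - N-by-1 column vector of y-coordinates, same size as x
%   The function takes no input arguments.

    % Placeholder data, assumed only for this sketch
    x = [0; 1; 2];
    y = [0; 1; 4];
end
```

Saved as getCoords.m, typing help getCoords at the command line displays the comment block, so the shape of each output is stated where a caller will look for it.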
4 Comments
James Tursa
on 27 Oct 2023
Edited: James Tursa
on 27 Oct 2023
@Dyuman Joshi Because the dimensions are physically stored in the variable mxArray header. This has been how MATLAB stores dimensions since its inception. The reshape( ) function for full vectors does the same thing as transpose( ) for full vectors does ... changes the dimensions without changing the data memory ordering via a shallow copy. Although I haven't looked at the mxArray header in detail in a few years, I would be surprised if this has changed and they have added a transpose flag. Maybe I will look at that over the weekend to confirm, but I doubt there could be any changes in that regard because that would completely mess up some of the mex API function calls such as mxGetDimensions and mex assumptions about data ordering in memory.
I will caveat my answer by noting that the MATLAB parser will sometimes avoid physical transposes altogether when doing a matrix multiply by simply passing transpose flags onto the BLAS routines it calls in the background. E.g.,
A = rand(4,3); % some full matrix
B = rand(4,5); % some full matrix
C = A' * B;
In the above, A' is not physically generated. Instead, A and B data pointers are passed into the appropriate BLAS matrix multiply routine along with a transpose flag for A. The BLAS routine virtually transposes A in the algorithms it uses in the background.
James Tursa
on 28 Oct 2023
Edited: James Tursa
on 28 Oct 2023
Follow-up: After thinking about it a bit I realize there is no need to check this over the weekend. MATLAB can't implement transposes as a flag in their mxArray headers because doing so would break every single mex routine that has ever been written which assumes the reported variable dimensions match the column-major data ordering in memory. Implementing transposes as a flag is akin to changing the data ordering to row-major without telling the mex routine anything about it.
Paul
on 27 Oct 2023
I just ran into a case where the output vector from a function had to have the same dimensions as a vector input argument. That is, the operation of the function was agnostic to the dimensions of the input, but it made sense for the dimensions of the output to match those of the input.
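One way this could be done, as a sketch only: work on a column internally, then reshape the result to the size of the input. The function name scaleLikeInput and the doubling operation are hypothetical stand-ins for whatever the real function computes:

```matlab
% Script section: call the function with both orientations
rowOut = scaleLikeInput([1 2 3]);     % 1-by-3 in, 1-by-3 out
colOut = scaleLikeInput([1; 2; 3]);   % 3-by-1 in, 3-by-1 out

% Local functions in a script file require R2016b or later
function y = scaleLikeInput(x)
%scaleLikeInput  Hypothetical example: double x, keeping the orientation of x.
    work = x(:);                  % force a column for the internal computation
    work = 2 * work;              % stand-in for the real operation
    y = reshape(work, size(x));   % restore the orientation of the input
end
```

Normalizing with x(:) keeps the internal code simple, and the final reshape means callers get back whatever orientation they passed in.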
Cris LaPierre
on 27 Oct 2023
Edited: Cris LaPierre
on 27 Oct 2023
Just my opinion here. I think it is entirely dependent on how you intend to use the output.
For your (x,y) example, I would use columns because the plot function treats each column as a data series.
If the output were going to be used as the index of a for loop, I would choose row vector, since a for loop iterates across columns, not down rows.
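A short sketch of what that means in practice, reusing the linspace call from the question. The counter variables are introduced here only for illustration:

```matlab
v = linspace(1, 10, 5);    % 1-by-5 row vector

% Looping over a row vector: one iteration per element
nRow = 0;
for ii = v
    nRow = nRow + 1;       % ii is a scalar on each pass
end

% Looping over a column vector: the whole column is a single iteration
nCol = 0;
for ii = v(:)
    nCol = nCol + 1;       % ii is the entire 5-by-1 vector
end

% Forcing a row makes the loop independent of the input orientation
nSafe = 0;
for ii = v(:).'
    nSafe = nSafe + 1;
end
```

After this runs, nRow and nSafe are 5 while nCol is 1, which is why a function whose output is meant to drive a for loop is easier to use if it returns a row.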
One consideration is that MATLAB uses column-major order, which means that in memory the columns of a matrix are stored one after another. This makes it more efficient to index out entire columns in MATLAB, since those values are stored next to each other (contiguous memory). Of course, for a vector, this doesn't really matter.
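The storage order can be seen with linear indexing. This is a small sketch with a made-up matrix A:

```matlab
A = [1 2 3; 4 5 6];    % 2-by-3 matrix, hypothetical values

flat = A(:).';         % elements in storage order: column by column
third = A(3);          % linear index 3 is row 1, column 2

col2 = A(:, 2);        % a whole column: neighbouring elements in memory
row1 = A(1, :);        % a whole row: elements separated by a stride of 2
```

Here flat is [1 4 2 5 3 6] and third is 2, which shows that each column occupies a contiguous run of memory while the elements of a row are spread out.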
These are just some quick ideas. Others may have different opinions.