## Matlab Coding Style

First Edition: 07/29/2008

## Introduction

This is the matlab coding style used in my own toolbox, the Matlab Computer Vision and Pattern Recognition toolbox.

Creating a coding convention is a good way to work in a group such as a research laboratory or an open source project. This coding style can be reused as a draft for your own project.

## Help Documentation

The help documentation of the built-in matlab functions is not easy to read. For example, try

>> doc polyval

and find the meaning of the S variable. It is probably not easy. The main reason is that the help text does not list the arguments and their meanings.

I looked at several ways of writing documentation, not only in matlab but also in other programming languages, and I finally settled on a style similar to UNIX command documentation (man).

### Syntax

% <Function Name> - <Abstract>
%
% Synopsis
%   [<Output Arguments>] = <Function Name>(<Input Arguments>)
%
% Description
%   <Description>
%
% Inputs ([]s are optional)
%   (<Type>) <Variable Name> <Explanation>
%   ....
%
% Outputs ([]s are optional)
%   (<Type>) <Variable Name> <Explanation>
%   ....
%
% Examples
%   <Example Code>
%
% See also
%   <Related Function Names>
%
% Requirements
%   <Function Name> (<Toolbox Name>)

% References
%   <Reference List like Papers>
%   ...
%
% Authors
%   <Author Name> <Contact Information>
%
% License
%   <License Statement>
%
% Changes
%   <Date>  <Statement>
%   ...

### Example

function [Gmm, P] = cvGmmTrain(X, K, maxIter, verbose)
% cvGmmTrain - Train Gaussian Mixture Models (GMM)
%
% Synopsis
%   [Gmm, [P]] = cvGmmTrain(X, K, [maxIter], [verbose])
%
% Description
%   cvGmmTrain trains Gaussian Mixture Models (GMM) using the Expectation
%   Maximization (EM) algorithm.
%
% Inputs ([]s are optional)
%   (matrix) X        D x N matrix representing feature vectors by
%                     columns where D is the number of dimensions and N
%                     is the number of vectors.
%   (scalar) K        The number of mixtures (clusters).
%   (scalar) [maxIter = 10]
%                     The maximum number of iterations of EM algorithm.
%   (bool)   [verbose = 0]
%                     Show progress or not.
%
% Outputs ([]s are optional)
%   (struct) Gmm      The Gaussian Mixture Model (GMM)
%   - (cell)   Mu     K cell arrays where each cell contains a D x 1
%                     vector representing a mean vector in the mixture.
%   - (cell)   Sigma  K cell arrays where each cell contains a D x D
%                     matrix representing a covariance matrix in the
%                     mixture.
%   (vector) [P]      K x 1 vector representing the estimated prior
%                     probabilities for mixtures.
%
% Examples
%   X = [1 0 -1
%         1 0 -1];
%   Gmm = cvGmmTrain(X, 3);
%
% See also
%   cvGaussPdf
%
% Requirements
%   kmeans (Statistics Toolbox)

% References
%   [1] Xuedong Huang, et al. "Spoken Language Processing,"
%   Prentice Hall PTR, 2001.
%
% Authors
%   Naotoshi Seo <sonots(at)sonots.com>
%
% License
%   The program is free for non-commercial academic use. Please
%   contact the authors if you are interested in using the software
%   for commercial purposes. The software must not be modified or
%   re-distributed without prior permission of the authors.
%
% Changes
%   04/01/2006  First Edition

### Explanation

#### THE 1ST LINE

• Write the function name and abstract on the 1st line.

The 1st lines of comments in all files under a directory are displayed when users type

>> help <directoryname>

when Contents.m does not exist (MATLAB Reference - help).

Furthermore, start the documentation comments directly under the function declaration. This rule is necessary for subfunctions; otherwise, the help command does not show the documentation for a subfunction when users type

>> help primaryfunction>subfunction

#### Synopsis

• Write the format (or interface or definition) of the function.
• Surround optional arguments with []. ([] is a common convention for expressing arguments that can be omitted.)

#### Description

• Write the detailed description.

#### Inputs and Outputs

• Write variable types, names, and explanations.

<Type>

• (scalar) scalar including real, imaginary, int
• (string) string
• (vector) a vector
• (matrix) 2 or more dimensional array
• (cell) cell array
• (struct) structure
• (func) function handle

optional

• (int) integer
• (bool) boolean 1 or 0

<Variable Name>

• Start at the 14th column of the line.
• Surround it with [] if it is optional (an argument that can be omitted).
• Write the default value as [varname = default].

<Explanation>

• Start at the 22nd column of the line and continue up to the 80th column (matlab default).
• If a variable name extends past the 22nd column, start the explanation on the next line.

#### Examples

• Write example code.

Do not write >> or anything else that represents a prompt, because it prevents users from running the example code by copy and paste.

#### See also

• List related functions.

Matlab 'help' or 'doc' generates links for matlab functions written in the 'See also' section. The heading must be written as 'See also', not 'See', 'See Also', or 'SEE ALSO', and matlab function names must be written as 'func', not 'func.m' (i.e., without the .m extension), for the links to be generated.

#### Requirements

• List required functions and the toolboxes they belong to.

#### BLANK LINE

'help' or 'doc' shows comment lines until it meets a blank line or an executable code line (non-comment line).

#### References

• Write references such as papers.

#### Authors

• List authors and their contact information.

#### License

• Write license statements. Below are examples.

% The program is free for non-commercial academic use. Please
% contact the authors if you are interested in using the software
% for commercial purposes. The software must not be modified or
% re-distributed without prior permission of the authors.

% <http://www.gnu.org/licenses/old-licenses/gpl-2.0.html> GPL v2

% <http://www.gnu.org/licenses/gpl.html> GPL v3

% The program is free to use for non-commercial academic purposes,
% but for course works, you must understand what is going inside to use.
% The program can be used, modified, or re-distributed for any purposes
% if you or one of your group understand codes (the one must come to
% court if court cases occur.) Please contact the authors if you are
% interested in using the program without meeting the above conditions.

#### Changes

• List changes and dates.

## Naming Conventions

Let me define some terms.

lowerCamel

• Start with a lowercase letter and join words without spaces or underscores, capitalizing the first letter of each following word, as in "lowerCamel".

UpperCamel

• Start with an uppercase letter and join words without spaces or underscores, capitalizing the first letter of each following word, as in "UpperCamel".

lower_case

• Use all lowercase and join words with _ (underscore), as in "lower_case".

### Variables

#### One or Greek Letter

• Use one-letter or Greek-letter variable names to express math equations.

One way of naming variables is to use the names that appear in math equations. The table below shows a convention for converting math variable names into matlab variable names. (Math conventions differ among fields of study. The table below is for signal or image processing, for example.)

| Math | Matlab | Math Meaning | Conversion Rule |
|---|---|---|---|
| $\mathbf{U}, \mathbf{V}, \mathbf{W}, \mathbf{Y}, \mathbf{X}, \mathbf{S}$ | U,V,W,Y,X,S | Matrix | As is for one letter names |
| $\mathbf{u}, \mathbf{v}, \mathbf{w}, \mathbf{y}, \mathbf{x}, \mathbf{s}$ | u,v,w,y,x,s | Vector | As is for one letter names |
| $L, M, K, N, T$ | L,M,K,N,T | Size | As is for one letter names |
| $l, m, k, n, t$ | l,m,k,n,t | Iterator | As is for one letter names |
| $\boldsymbol{\mu}, \boldsymbol{\Sigma}$ | Mu,Sigma | Vector or Matrix | UpperCamel for bold greeks |
| $\mu, \sigma$ | mu,sigma | Scalar | lowerCamel for non-bold (scalar) greeks |
| $\mathbf{x}'$ | x_p | Something | Postfix _p for prime |
| $\hat{\mathbf{x}}$ | x_h | Estimated Value | Postfix _h for hat |
| $\tilde{\mathbf{x}}$ | x_t | Something | Postfix _t for tilde |
| $\mathbf{x}_k^i$ | x(k,i) | Index | As is if a subscript or a superscript means an index |
| $\mathbf{x}_s, \mathbf{X}_s$ | xs,Xs | Another related variable | Append the subscript if it denotes another variable |

If you want to explicitly differentiate the bold character $\mathbf{x}$ from the non-bold character $x$, you may use the rule "postfix _b for bold characters". However, once the bold characters $\mathbf{x}, \mathbf{y}$ have appeared in a context, the non-bold characters $x, y$ usually never appear in the same context. Therefore, I believe that we do not need to differentiate $\mathbf{x}$ and $x$.

You should add comments that explain the meaning of each variable, or at least give the equation numbers of the references, because the variable names themselves do not carry meaning.

Of course, not all variables are defined in math equations. For the others, use the programming convention in the next section.

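The conversion rules above can be tried on a small computation. The following is a minimal sketch that estimates a mean vector and a covariance matrix. The data and the equations quoted in the comments are assumptions chosen for illustration and are not taken from the toolbox.

```matlab
% Hypothetical data: D x N matrix whose columns are feature vectors x_n.
X = [1 2 3 4
     2 4 6 8];
[D, N] = size(X);                 % D, N: sizes (upper case, as is)

% Mu: bold mu, the mean vector.      mu = (1/N) * sum_n x_n
Mu = sum(X, 2) / N;

% Sigma: bold Sigma, the covariance matrix.
%   Sigma = (1/N) * sum_n (x_n - mu) * (x_n - mu)^T
Sigma = zeros(D, D);
for n = 1:N                       % n: iterator (lower case, as is)
    x = X(:,n);                   % x: bold x_n, the subscript is an index
    Sigma = Sigma + (x - Mu) * (x - Mu).';
end
Sigma = Sigma / N;

% sigma: non-bold sigma, a scalar (standard deviation of the 1st dimension).
sigma = sqrt(Sigma(1,1));
```

Because matlab variable names are case sensitive, Sigma (matrix) and sigma (scalar) can live side by side, which is what makes the UpperCamel/lowerCamel distinction for Greek letters workable.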
#### Other (Local) Variables

• Word variable names are used to express the meanings of variables, and they must follow the programming convention below.

Rule)

• Use "UpperCamel" for vectors, matrices, cell arrays, and structs.
• Use "lowerCamel" for others such as scalars and strings.
• Do not use plurals to express matrices and vectors.
• Use the prefix i, j, k for iterators (e.g., iClass) and n for the number of objects (e.g., nClass).
• Capitalize only the first character of a word even if it is an abbreviation, as in Pdf (Probability Density Function).

Example)

• (matrix) GaussPdf
• (scalar) gaussPdf
• (iterator) iClass
• (number) nClass

Reason: Using "lower_case" for variable names is also a popular standard. However, I believe that "lower_case" is an old fashion in the programming world. See the GNU C coding standards; they use _ (underscore). See Java, prototype.js (a famous JavaScript framework), the PHP Coding Standards, the OpenCV Coding Style Guide (computer vision C library), and OpenGL (C graphics library); they use CamelCase. Furthermore, there was an official matlab example using lowerCamel for variable names, such as persistent - Matlab Reference.

One reason why recent programming styles use CamelCase is that it reduces the number of characters. Matlab has a convention of using up to 80 characters per line, so CamelCase is suitable. See wikipedia (2.2.2, 3.1) for more advantages of CamelCase.

I wanted to differentiate scalars and matrices by using upper case. With the "lower_case" rule, variable names for matrices would become "Upper_Case". However, "Upper_Case" looks ugly because the _ (underscore) is unnecessary ("UpperCase" already tells us where the words are separated). This is another reason to use CamelCase.

By the way, LaTeX is also a reason to avoid _. When we paste code into LaTeX, we have to escape _ as \_ (no problem in the verbatim environment, though).

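As a small illustration of these rules, the sketch below finds the class with the largest prior probability. All names and values here are hypothetical and exist only to show the naming pattern.

```matlab
% Hypothetical example data; the names follow the rules above.
nClass = 3;                       % number of objects: prefix n
ClassPrior = [0.5; 0.3; 0.2];     % vector: UpperCamel, singular (not ClassPriors)
maxPrior = 0;                     % scalar: lowerCamel
bestClass = 0;                    % scalar: lowerCamel
for iClass = 1:nClass             % iterator: prefix i
    if ClassPrior(iClass) > maxPrior
        maxPrior = ClassPrior(iClass);
        bestClass = iClass;
    end
end
```

At a glance, a reader can tell that ClassPrior is an array and that maxPrior and bestClass are scalars, without looking up their definitions.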
#### Persistent (Static) variables

• Use the same rules as for local variables.

#### Constant Variables

There is no syntax in MATLAB for constant definition. Therefore, only naming conventions can tell which variables are constant.

• Use UPPER_CASE.

Example)

• NUMBER_OF_CLASS

However, I mostly do not use constant variables in matlab at all.

#### Global Variables

• Use lower_case and add the package (toolbox) name as a prefix.

Example)

• cv_number_of_class

Of course, you should avoid global variables as much as possible, especially when you create a library function. Do not make other people worry about name conflicts.

### Functions

#### M-Functions

The naming convention of the built-in matlab functions is not easy to read: they mostly use lowercase without _. We should use another convention. File names are used as function names in matlab, so we have to think about the filesystem at the same time.

(I came up with two conventions and I am still wondering which is better.)

One Rule

• Use lower_case and put the package (toolbox) name as a prefix.

Example)

• cv_knn

Reason: Function names in matlab are also used as file names. Because the Windows filesystem does not differentiate lowercase and uppercase, we should use only lowercase or only uppercase. This prevents us from using lowerCamel or UpperCamel. "lower_case" should be better than "UPPER_CASE".

Another Rule

• Use lowerCamel and put the package (toolbox) name as a prefix.

Example)

• cvKnn

Reason: The "lower_case" rule is an old fashion, as I stated in the Variables section, and I chose "lowerCamel" for variable names. Selecting the same rule for function names is reasonable. The disadvantage of lowerCamel is that the Windows filesystem does not differentiate lowercase and uppercase. However, I believe nobody tries to name a function Pca when a function pca already exists. Therefore, I may not need to care about it.

There was an official example using lowerCamel for function names, such as Nested Function - Matlab Reference.

#### Nested or Sub Functions

• Use the prefix i, which means inner.

Reason: There is a programming convention of putting the prefix _ on private functions, such as _create(). But matlab does not allow the prefix _ in function names. I chose 'i', which means 'inner', instead.

#### Test Functions

Test functions are created by developers to verify the behavior of functions for debugging.

• Add a tests prefix or create a tests directory.

Example)

• testsCvKmeans.m

Reason: There is a programming convention that test functions have a 'tests' prefix so that they can be easily bundled in one place. See examples such as Python UnitTest and PHP UnitTest. Use 'tests' instead of 'test' because the word test is often used for simple experiments rather than for verification tests.

#### Demo Functions

Demo functions are created to demonstrate functions to users.

• Add a Demo postfix or create a demo directory.

Example)

• cvKmeansDemoClassifi
• cvKmeansDemoVq

Reason: There might be two or three demos, so a demo prefix would produce file names such as demoCvKmeansClassifi and demoCvKmeansVq. However, the word 'demo' and its purpose should stay together, and demoCvKmeansVq looks like a demo function for cvKmeansVq.m, which is bad.

#### Run Functions

I define Run functions as user interfaces to be used from the command line. For example, the function cvGaussFilter2Run(infile, outfile, sigma) is a Run function that calls O = cvGaussFilter2(I, sigma).

This allows users to run the code as

cvGaussFilter2Run('test.jpg', 'test.out.jpg', 0.2);

instead of

I = imread('test.jpg');
O = cvGaussFilter2(I, 0.2);
imwrite(O, 'test.out.jpg');

This especially helps to run matlab scripts from the UNIX command line as

matlab -nosplash -nodesktop -r "cvGaussFilter2Run('test.jpg', 'test.out.jpg', 0.2);"

In this example the Run interface does not do much, but in general it can do more. Creating Run functions is a good convention.

The matlab 'varargin' trick may help you write a Run interface.

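function cvGaussFilter2Run(imfile, outfile, varargin)
% Read the input image, forward all remaining options to the main
% function, and write the result.
I = imread(imfile);
O = cvGaussFilter2(I, varargin{:});
imwrite(O, outfile);
end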

With this trick, you do not need to modify the Run interface even if the number of arguments (options) of the main function increases.

## Layout

### Indenting

• Use the matlab editor indentation (4-spaces).

### Control Structure

These include if, for, while, switch, etc. Here is an example if statement, since it is the most complicated of them:

if attendance >= 0.90
    pass = 1;
elseif ((condition2) || (condition4))
    pass = 1;
else
    fail = 1;
end;

Control statements should have one space between the control keyword and the condition.

### Function Calls

Functions should be called with no spaces between the function name, the opening parenthesis, and the first parameter; spaces between commas and each parameter, and no space between the last parameter, the closing parenthesis, and the semicolon. Here's an example:

[Cluster, Codebook] = kmeans(X.', K);

### Colon

Use colon operators without space

i = 1:M

### Matrix Arguments

Matrix arguments are specified without spaces

A(:,k)

This makes them easy to differentiate from function calls.
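The layout rules in this section can be seen together in one short script. This is only a sketch: the computation itself is an arbitrary assumption chosen so that each rule appears at least once.

```matlab
X = magic(4);                        % function call: no space before "("
[nRow, nCol] = size(X);              % one space after each comma
ColStat = zeros(1, nCol);
for k = 1:nCol                       % colon operator without spaces
    if k <= 2                        % one space after the control keyword
        ColStat(k) = sum(X(:,k));    % matrix arguments without spaces
    else
        ColStat(k) = max(X(:,k));
    end
end
```

Written this way, X(:,k) reads as indexing and sum(X(:,k)) reads as a call, even though matlab uses parentheses for both.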

## Practical Issues

### Vector

• Construct a vector as a column vector.
 Reason: This is the convention in the math world, so following it makes it easier to follow the math equations in papers. Basic built-in matlab functions have a convention that a vector is a row vector: for example, cov(matrix) assumes that the matrix is a set of row vectors, and constructing a vector as A=[];A(1)=1;A(2)=1; gives a row vector. But forget it. Following the math equations is the best practice, and in fact many matlab toolboxes (even official ones) take a vector as a column vector.
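The row-vector default mentioned above, and two simple ways of obtaining a column vector instead, can be checked with the following sketch (the variable names are only illustrative).

```matlab
% Element-by-element construction grows along the 2nd dimension (row vector).
A = [];
A(1) = 1;
A(2) = 1;
SizeRow = size(A);        % 1 x 2

% Preallocating fixes the orientation as a column vector.
B = zeros(2, 1);
B(1) = 1;
B(2) = 1;
SizeCol = size(B);        % 2 x 1

% (:) reshapes any vector into a column vector.
C = A(:);
```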

### Matrix

• Compose a 2-dimensional matrix as a set of column vectors.
 Reason A vector is a column vector in the math convention.
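• If a 3-dimensional matrix is a set of 2-dimensional matrices, construct it as mat(:,:,i).
• If an N-dimensional matrix is a set of (N-1)-dimensional matrices, construct it as mat(:,:,:...,:,i).
 Reason: mat(:,:,i) has the same shape as a 2-dimensional matrix mat(:,:), mat(:,:,:,i) has the same shape as mat(:,:,:), and so on. In contrast, mat(i,:,:) does not have the shape of mat(:,:), because a leading dimension of size 1 remains.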

FYI: reshape and permute would be useful to take care of shapes of matrices.
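The shape argument can be verified directly. In this sketch, Mat and Alt are hypothetical arrays holding the same four 2 x 3 matrices, indexed by the last and by the first dimension respectively.

```matlab
% Hypothetical stack of four 2 x 3 matrices, indexed by the last dimension.
Mat = zeros(2, 3, 4);
SizeLast = size(Mat(:,:,1));               % [2 3]: already a 2-D matrix

% The same data indexed by the first dimension instead.
Alt = permute(Mat, [3 1 2]);               % 4 x 2 x 3, accessed as Alt(i,:,:)
SizeFirst = size(Alt(1,:,:));              % [1 2 3]: a singleton dimension remains
SizeSqueezed = size(squeeze(Alt(1,:,:)));  % [2 3] only after squeeze
```

With the last-dimension layout, every slice can be passed to functions that expect a 2-dimensional matrix without an extra squeeze or reshape call.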

### Transpose

• Use .' instead of ' in most cases.
 Reason: ' is not the transpose but the conjugate transpose. The transpose is, of course, faster than the conjugate transpose.
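The difference only shows up with complex data, which is why it is easy to overlook. A minimal sketch with made-up values:

```matlab
z = [1+2i; 3-4i];        % complex column vector
T = z.';                 % transpose:           [1+2i, 3-4i]
H = z';                  % conjugate transpose: [1-2i, 3+4i]

X = [1 2; 3 4];          % for real data both operators agree
sameForReal = isequal(X.', X');
```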

### Integer i, j

• You may want to avoid using i and j as integer variables.
 Reason: i and j are reserved for the imaginary unit in matlab. However, I use i and j for loops when I am sure that I do not use imaginary numbers, because using i, j, k for loops is a common convention in many programming languages. Even in that case, be sure to keep i and j in local scopes, such as inside functions.
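The risk can be reproduced in a few lines. The sketch below shows how a leftover loop variable changes the meaning of an expression, and that a numeric literal with an i suffix is not affected.

```matlab
total = 0;
for i = 1:3              % i now shadows the imaginary unit
    total = total + i;
end

z1 = 1 + 2*i;            % i is the loop variable (3), so z1 is the real number 7
z2 = 1 + 2i;             % the literal 2i is always imaginary
```

If you do use i or j as loop variables, writing complex constants in the literal form (2i, 1i) keeps them independent of whatever value i currently holds.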

### Processing input arguments of a function

Default values of optional arguments of a function can be set as

function Gmm = cvGmmTrain(X, K, maxIter, verbose)
error(nargchk(2, 4, nargin));
if ~exist('maxIter', 'var') || isempty(maxIter)
    maxIter = 10;
end
if ~exist('verbose', 'var') || isempty(verbose)
    verbose = false;
end

By allowing an empty [], users can set the 'verbose' option without setting the 'maxIter' option.

cvGmmTrain(X, K, [], 1)

This way is better than using nargin as

function Gmm = cvGmmTrain(X, K, maxIter, verbose)
if nargin < 4, verbose = false; end
if nargin < 3, maxIter = 10; end

because the nargin way requires developers to modify the code whenever they change the order of the arguments, and it does not allow [].
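The exist/isempty check can be tried at the command line without writing a function. The sketch below simulates the three ways a caller might treat maxIter; the result variable names (omittedValue and so on) are hypothetical and used only to record the outcomes.

```matlab
% Case 1: the argument was omitted, so the variable does not exist.
% The short-circuit || means isempty is never evaluated here.
clear maxIter
if ~exist('maxIter', 'var') || isempty(maxIter)
    maxIter = 10;
end
omittedValue = maxIter;

% Case 2: the caller passed [] to skip the argument.
maxIter = [];
if ~exist('maxIter', 'var') || isempty(maxIter)
    maxIter = 10;
end
emptyValue = maxIter;

% Case 3: the caller passed an explicit value, which is kept.
maxIter = 50;
if ~exist('maxIter', 'var') || isempty(maxIter)
    maxIter = 10;
end
explicitValue = maxIter;
```

Note that newer MATLAB releases provide narginchk(2, 4) as the replacement for error(nargchk(2, 4, nargin)); whether to switch depends on the releases you need to support.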

### Verbose or Plot option

Your function may want to plot or print intermediate states or results. However, you may not want to plot or print anything for batch processing purposes. In such cases, your function should have a verbose or plot option. I recommend one convention for this case.

• Set a verbose or plot option as the last argument.
 Reason: A verbose or plot option is unimportant in terms of numerical computation, so put it toward the end. But it is also a commonly used option, so put it at the very end, which is a clear and meaningful position.
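Inside the function, such an option typically just gates the progress output and leaves the computation untouched. The following is a minimal sketch of that pattern as a script; the Newton iteration for sqrt(2) is an arbitrary stand-in for a real algorithm.

```matlab
maxIter = 10;
verbose = false;                 % set to true to print progress
x = 1;                           % initial guess
for iIter = 1:maxIter
    x = (x + 2 / x) / 2;         % Newton update for sqrt(2)
    if verbose
        fprintf('iter %d: x = %.10f\n', iIter, x);
    end
end
```

The result x is the same whether verbose is true or false, which is the property that makes the option safe to leave off in batch runs.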

### repmat or * ones(1,N)

• Use repmat, but * ones(1, N) is also allowed.

Situation)

We sometimes want to expand a vector to matrix as

A = [1
     2
     3];
% into
M = [1 1 1
     2 2 2
     3 3 3];

This can be realized by using repmat or * ones(1,N)

M = repmat(A, 1, 3);
M = A * ones(1, 3);
 Reason: repmat is faster than * ones(1,N), which constructs a matrix of 1s and then multiplies. But if you are following a paper, * ones(1,N) is allowed because math equations are written this way; repmat does not exist in the math world.
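A typical place where this expansion is needed is subtracting a mean vector from every column of a data matrix. The sketch below, with made-up data, shows that both forms give the same result.

```matlab
% Hypothetical data: D x N matrix of column vectors.
X = [1 2 3
     4 5 6];
N = size(X, 2);
Mu = mean(X, 2);                  % D x 1 mean vector

Xc1 = X - repmat(Mu, 1, N);       % repmat form
Xc2 = X - Mu * ones(1, N);        % form that mirrors the math equation
```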

### 1 for rows, 2 for columns

Matlab has a convention of "1 for rows, 2 for columns", such as

ones(10, 5) => 10 by 5 matrix
[nRow, nCol] = size(A);
repmat(A, 3, 4) => (nRow*3) by (nCol*4) matrix

The 1st argument is the number of rows, and the 2nd is the number of columns.

sum(A, 1) => 1 by nCol
sum(A, 2) => nRow by 1

1 means summation along the rows, which makes the number of rows 1. 2 means summation along the columns, which makes the number of columns 1.
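With concrete numbers, the convention looks as follows (a small sketch; the matrix is arbitrary).

```matlab
A = [1 2 3
     4 5 6];
[nRow, nCol] = size(A);           % nRow = 2, nCol = 3

ColSum = sum(A, 1);               % 1 x nCol: [5 7 9]
RowSum = sum(A, 2);               % nRow x 1: [6; 15]
SizeRep = size(repmat(A, 3, 4));  % [nRow*3, nCol*4] = [6 12]
```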

PS. Thinking about it again, it is weird that matlab creates a 1 by nCol vector from A=[];A(1)=1;A(2)=1;. Why does it proceed in the 2nd direction...

### Avoid a structure or create a constructor

A structure enforces the names of its properties, which is a disadvantage. Look at the following two function definitions.

function func1(a, b, c)
function func2(struct)
a = struct.a;
b = struct.b;
c = struct.c;

Users can call the former function with any variable names, but the latter forces users to construct a structure with fixed property names beforehand. Avoid structures.

But when you need a structure anyway, be sure to create a constructor function. Then users can call it as

func2(structConstruct(d, e, f))

without caring about the property names of the structure, and you can document the structure in the constructor function.
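A constructor of this kind can be very small. The following sketch uses anonymous functions so that it runs as a script; gaussConstruct and gaussDim are hypothetical names, and in a real toolbox the constructor would be an ordinary documented function file.

```matlab
% Hypothetical constructor: the field names are written in one place only.
gaussConstruct = @(Mu, Sigma) struct('Mu', Mu, 'Sigma', Sigma);

% Hypothetical function that consumes the structure.
gaussDim = @(Gauss) numel(Gauss.Mu);

% The caller uses its own variable names and never types a field name.
d = [0; 0];
E = eye(2);
nDim = gaussDim(gaussConstruct(d, E));
```

If a field is renamed later, only the constructor and the functions that read the structure change, while the calling code stays the same.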