You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
 

279 lines
12 KiB

% mtimesx does a matrix multiply of two inputs (single, double, or sparse)
%******************************************************************************
%
% MATLAB (R) is a trademark of The Mathworks (R) Corporation
%
% Function: mtimesx
% Filename: mtimesx.m
% Programmer: James Tursa
% Version: 1.10
% Date: December 08, 2009
% Copyright: (c) 2009 by James Tursa, All Rights Reserved
%
% This code uses the BSD License:
%
% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions are
% met:
%
% * Redistributions of source code must retain the above copyright
% notice, this list of conditions and the following disclaimer.
% * Redistributions in binary form must reproduce the above copyright
% notice, this list of conditions and the following disclaimer in
% the documentation and/or other materials provided with the distribution
%
% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
% AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
% ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
% LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
% CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
% SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
% INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
% CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
% ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
% POSSIBILITY OF SUCH DAMAGE.
%
%--
%
% mtimesx is a fast general purpose matrix and scalar multiply routine that utilizes
% BLAS calls and custom code to perform the calculations. mtimesx also has extended
% support for n-Dimensional (nD, n > 2) arrays, treating these as arrays of 2D matrices
% for the purposes of matrix operations.
%
% "Doesn't MATLAB already do this?" For 2D matrices, yes, it does. However, MATLAB does
% not always implement the most efficent algorithms for memory access, and MATLAB does not
% always take full advantage of symmetric cases. The mtimesx 'SPEED' mode attempts to do
% both of these to the fullest extent possible. For nD matrices, MATLAB does not have
% direct support for this. One is forced to write loops to accomplish the same thing that
% mtimesx can do faster.
%
% The usage is as follows (arguments in brackets [ ] are optional):
%
% Syntax
%
% M = mtimesx( [mode] )
% C = mtimesx(A [,transa] ,B [,transb] [,mode])
%
% Description
%
% mtimesx performs the matrix calculation op(A) * op(B), where:
% A = A single or double or sparse scalar, matrix, or array.
% B = A single or double or sparse scalar, matrix, or array.
% transa = A character indicating a pre-operation on A:
% transb = A character indicating a pre-operation on B:
% The pre-operation can be any of:
% 'N' or 'n' = No operation (the default if trans_ is missing)
% 'T' or 't' = Transpose
% 'C' or 'c' = Conjugate Transpose
% 'G' or 'g' = Conjugate (no transpose)
% mode = 'MATLAB' or 'SPEED' (sets mode for current and future calculations,
% case insensitive, optional)
% M is a string indicating the current calculation mode, before setting the new one.
% C is the result of the matrix multiply operation.
%
% Examples:
%
% C = mtimesx(A,B) % performs the calculation C = A * B
% C = mtimesx(A,'T',B) % performs the calculation C = A.' * B
% C = mtimesx(A,B,'G') % performs the calculation C = A * conj(B)
% C = mtimesx(A,'C',B,'C') % performs the calculation C = A' * B'
%
% mtimesx has two modes:
%
% 'MATLAB' mode: This mode attempts to reproduce the MATLAB intrinsic function mtimes
% results exactly. When there was a choice between faster code that did not match the
% MATLAB intrinsic mtimes function results exactly vs slower code that did match the
% MATLAB intrinsic mtimes function results exactly, the choice was made to use the
% slower code. Speed improvements were only made in cases that did not cause a mismatch.
% Caveat: I have only tested on a PC with later versions of MATLAB. But MATLAB may use
% different algorithms for mtimes in earlier versions or on other machines that I was
% unable to test, so even this mode may not match the MATLAB intrinsic mtimes function
% exactly in some cases. This is the default mode when mtimesx is first loaded and
% executed (i.e., the first time you use mtimesx in your MATLAB session and the first
% time you use mtimesx after clearing it). You can set this mode for all future
% calculations with the command mtimesx('MATLAB') (case insensitive).
%
% 'SPEED' mode: This mode attempts to reproduce the MATLAB intrinsic function mtimes
% results closely, but not necessarily exactly. When there was a choice between faster
% code that did not exactly match the MATLAB intrinsic mtimes function vs slower code
% that did match the MATLAB intrinsic mtimes function, the choice was made to use the
% faster code. Speed improvements were made in all cases that I could identify, even
% if they caused a slight mismatch with the MATLAB intrinsic mtimes results.
% NOTE: The mismatches are the results of doing calculations in a different order and
% are not indicative of being less accurate. You can set this mode for all future
% calculations with the command mtimesx('SPEED') (case insensitive).
%
% Note: You cannot combine double sparse and single inputs, since MATLAB does not
% support a single sparse result. You also cannot combine sparse inputs with full
% nD (n > 2) inputs, since MATLAB does not support a sparse nD result. The only
% exception is a sparse scalar times an nD full array. In that special case,
% mtimesx will treat the sparse scalar as a full scalar and return a full nD result.
%
% Note: The �N�, �T�, and �C� have the same meanings as the direct inputs to the BLAS
% routines. The �G� input has no direct BLAS counterpart, but was relatively easy to
% implement in mtimesx and saves time (as opposed to computing conj(A) or conj(B)
% explicitly before calling mtimesx).
%
% mtimesx supports nD inputs. For these cases, the first two dimensions specify the
% matrix multiply involved. The remaining dimensions are duplicated and specify the
% number of individual matrix multiplies to perform for the result. i.e., mtimesx
% treats these cases as arrays of 2D matrices and performs the operation on the
% associated parings. For example:
%
% If A is (2,3,4,5) and B is (3,6,4,5), then
% mtimesx(A,B) would result in C(2,6,4,5)
% where C(:,:,i,j) = A(:,:,i,j) * B(:,:,i,j), i=1:4, j=1:5
%
% which would be equivalent to the MATLAB m-code:
% C = zeros(2,6,4,5);
% for m=1:4
% for n=1:5
% C(:,:,m,n) = A(:,:,m,n) * B(:,:,m,n);
% end
% end
%
% The first two dimensions must conform using the standard matrix multiply rules
% taking the transa and transb pre-operations into account, and dimensions 3:end
% must match exactly or be singleton (equal to 1). If a dimension is singleton
% then it is virtually expanded to the required size (i.e., equivalent to a
% repmat operation to get it to a conforming size but without the actual data
% copy). For example:
%
% If A is (2,3,4,5) and B is (3,6,1,5), then
% mtimesx(A,B) would result in C(2,6,4,5)
% where C(:,:,i,j) = A(:,:,i,j) * B(:,:,1,j), i=1:4, j=1:5
%
% which would be equivalent to the MATLAB m-code:
% C = zeros(2,6,4,5);
% for m=1:4
% for n=1:5
% C(:,:,m,n) = A(:,:,m,n) * B(:,:,1,n);
% end
% end
%
% When a transpose (or conjugate transpose) is involved, the first two dimensions
% are transposed in the multiply as you would expect. For example:
%
% If A is (3,2,4,5) and B is (3,6,4,5), then
% mtimesx(A,'C',B,'G') would result in C(2,6,4,5)
% where C(:,:,i,j) = A(:,:,i,j)' * conj( B(:,:,i,j) ), i=1:4, j=1:5
%
% which would be equivalent to the MATLAB m-code:
% C = zeros(2,6,4,5);
% for m=1:4
% for n=1:5
% C(:,:,m,n) = A(:,:,m,n)' * conj( B(:,:,m,n) );
% end
% end
%
% If A is a scalar (1,1) and B is (3,6,4,5), then
% mtimesx(A,'G',B,'C') would result in C(6,3,4,5)
% where C(:,:,i,j) = conj(A) * B(:,:,i,j)', i=1:4, j=1:5
%
% which would be equivalent to the MATLAB m-code:
% C = zeros(6,3,4,5);
% for m=1:4
% for n=1:5
% C(:,:,m,n) = conj(A) * B(:,:,m,n)';
% end
% end
%
% ---------------------------------------------------------------------------------------------------------------------------------
%
% The BLAS routines used are DDOT, DGEMV, DGEMM, DSYRK, and DSYR2K for double
% variables, and SDOT, SGEMV, SGEMM, SSYRK, and SSYR2K for single variables.
% These routines are (description taken from www.netlib.org):
%
% DDOT and SDOT:
%
% * forms the dot product of two vectors.
%
% DGEMV and SGEMV:
%
% * DGEMV performs one of the matrix-vector operations
% *
% * y := alpha*A*x + beta*y, or y := alpha*A'*x + beta*y,
% *
% * where alpha and beta are scalars, x and y are vectors and A is an
% * m by n matrix.
%
% DGEMM and SGEMM:
%
% * DGEMM performs one of the matrix-matrix operations
% *
% * C := alpha*op( A )*op( B ) + beta*C,
% *
% * where op( X ) is one of
% *
% * op( X ) = X or op( X ) = X',
% *
% * alpha and beta are scalars, and A, B and C are matrices, with op( A )
% * an m by k matrix, op( B ) a k by n matrix and C an m by n matrix.
%
% DSYRK and SSYRK:
%
% * DSYRK performs one of the symmetric rank k operations
% *
% * C := alpha*A*A' + beta*C,
% *
% * or
% *
% * C := alpha*A'*A + beta*C,
% *
% * where alpha and beta are scalars, C is an n by n symmetric matrix
% * and A is an n by k matrix in the first case and a k by n matrix
% * in the second case.
%
% DSYR2K and SSYR2K:
%
% * DSYR2K performs one of the symmetric rank 2k operations
% *
% * C := alpha*A*B' + alpha*B*A' + beta*C,
% *
% * or
% *
% * C := alpha*A'*B + alpha*B'*A + beta*C,
% *
% * where alpha and beta are scalars, C is an n by n symmetric matrix
% * and A and B are n by k matrices in the first case and k by n
% * matrices in the second case.
%
% Double sparse matrix operations are supported, but not always directly.
% For (matrix) * (scalar) operations, custom code is used to produce a result
% that minimizes memory access times. All other operations, such as
% (matrix) * (vector) or (matrix) * (matrix), or any operation involving a transpose
% or conjugate transpose, are obtained with calls back to the MATLAB intrinsic
% mtimes function. Thus for most non-scalar sparse operations, mtimesx is
% simply a thin wrapper around the intrinsic MATLAB function and you will see
% no speed improvement.
%
% ---------------------------------------------------------------------------------------------------------------------------------
%
% Examples:
%
% C = mtimesx(A,B) % performs the calculation C = A * B
% C = mtimesx(A,'T',B,'speed') % performs the calculation C = A.' * B
% % using a fast algorithm that may not
% % match MATLAB results exactly
% mtimesx('matlab') % sets calculation mode to match MATLAB
% C = mtimesx(A,B,'g') % performs the calculation C = A * conj(B)
% C = mtimesx(A,'c',B,'C') % performs the calculation C = A' * B'
%
% ---------------------------------------------------------------------------------------------------------------------------------
function varargout = mtimesx(varargin)
%\
% If you got here then mtimesx is not compiled yet, so go compile it first.
%/
mtimesx_build;
%\
% Call the mex routine mtimesx.
%/
[varargout{1:nargout}] = mtimesx(varargin{:});
end