You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
 

1481 lines
53 KiB

/*************************************************************************************
*
* MATLAB (R) is a trademark of The Mathworks (R) Corporation
*
* Function: mtimesx
* Filename: mtimesx.c
* Programmer: James Tursa
* Version: 1.40
* Date: October 4, 2010
* Copyright: (c) 2009, 2010 by James Tursa, All Rights Reserved
*
* This code uses the BSD License:
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
* met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the distribution
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
* Requires: mtimesx_RealTimesReal.c
*
* mtimesx is a multiply function that utilizes BLAS calls and custom code for
* matrix-matrix, matrix-vector, vector-vector, scalar-vector, scalar-matrix,
* and scalar-array calculations. Full support is given for transpose, conjugate
* transpose, and conjugate operations. Operands can be single, double, or
* sparse double. Sparse double matrices have direct support for scalar-matrix
* operations ... all other computations involving sparse matrices are done
* with calls to the MATLAB intrinsic functions.
*
* Building:
*
* mtimesx is typically self building. That is, the first time you call mtimesx,
* the mtimesx.m file recognizes that the mex routine needs to be compiled and
* then the compilation and linkage to the BLAS library will happen automatically.
* If the automatic compilation does not work, then you can do it manually as
* follows:
* - Put all of the files in one directory that is on the MATLAB path (but don't
* put them in a toolbox directory)
* - Make that directory the current directory.
* - Create a string variable with the pathname and filename of the BLAS library
* file to use. e.g. for microsoft it may be something like the following:
*
* - Issue the following command at the MATLAB prompt.
* >> mex('mtimesx.c',lib_blas)
*
* And if you are on a linux machine then you may need to add the string
* '-DEFINEUNIX' to the mex command argument list.
*
* The usage is as follows (arguments in brackets [ ] are optional):
*
* Syntax
*
* M = mtimesx( [mode] )
* C = mtimesx(A [,transa] ,B [,transb] [,mode])
*
* Description
*
* mtimesx performs the matrix calculation C = op(A) * op(B), where:
* A = A single or double or sparse scalar, matrix, or array.
* B = A single or double or sparse scalar, matrix, or array.
* transa = A character indicating a pre-operation on A (optional)
* transb = A character indicating a pre-operation on B (optional)
* The pre-operation can be any of:
* 'N' or 'n' = No operation (the default if trans_ is missing)
* 'T' or 't' = Transpose
* 'C' or 'c' = Conjugate Transpose
* 'G' or 'g' = Conjugate (no transpose)
* mode = 'MATLAB' or 'SPEED' (sets mode for current and future calculations,
* case insensitive, optional)
* M is a string indicating the current calculation mode, before setting the new one.
* C is the result of the matrix multiply operation.
*
* mtimesx uses a combination of BLAS library routine calls and custom code
* to optimize the matrix or scalar multiplies with respect to speed
* without sacrificing accuracy. As a result, some calculations are done in
* a different way than the equivalent MATLAB calculation, resulting in
* slightly different answers. Although slightly different, the mtimesx
* results are just as accurate as the MATLAB results. The general matrix
* multiply calculation uses the same BLAS calls as MATLAB, so there is no
* difference in speed or results for this case. For the scalar * matrix,
* scalar * vector, scalar * array, and matrix * vector cases there
* generally *will* be a difference between mtimesx and the MATLAB built in
* mtimes function. Usually, mtimesx will be faster that the equivalent
* MATLAB function, but some of these cases are slightly slower.
*
* M = mtimesx returns a string with the current calculation mode. The string
* will either be 'MATLAB' or 'SPEED' or one of the OMP modes.
*
* M = mtimesx(mode) sets the calculation mode to mode. The mode variable
* must be either the string 'MATLAB' or the string 'SPEED'. The return
* variable M is the previous mode setting prior to setting the new mode.
* The mode is case insensitive (lower or upper case is OK). You can also
* set one of the OMP modes if you have compiled with an OpenMP compiler.
*
* Mode:
*
* The 'MATLAB' mode uses the same sequence of BLAS calls that MATLAB does,
* and uses scalar multiply code that generates the same results as MATLAB.
* The purpose of this mode is to reproduce the exact same results as the
* same MATLAB operation. As such, there will often be no speed improvement
* by using mtimesx vs using the MATLAB intrinsic mtimes.
*
* The 'SPEED' mode uses different sequences of BLAS calls than MATLAB does
* whenever a speed improvement might be realized. Custom code for dot product
* calculations is used, and certain matrix-matrix and matrix-vector operations
* involving conjugates and/or transposes uses a series of dot product type
* calculations instead of calling BLAS routines if a speed improvement can
* be realized. This mode produces results that will sometimes differ slightly
* from the same MATLAB operation, but the results will still be accurate.
*
* The 'OMP' modes are basically the same as 'SPEED' mode except that some
* of the calculations are multi-threaded using OpenMP directives.
*
* Examples:
*
* C = mtimesx(A,B) % performs the calculation C = A * B
* C = mtimesx(A,'T',B) % performs the calculation C = A.' * B
* C = mtimesx(A,B,'G') % performs the calculation C = A * conj(B)
* C = mtimesx(A,'C',B,'C') % performs the calculation C = A' * B'
* mtimesx % returns the current calculation mode
* mtimesx('MATLAB') % sets calculation mode to match MATLAB
*
* Note: You cannot combine double sparse and single inputs, since MATLAB does not
* support a single sparse result. You also cannot combine sparse inputs with full
* nD (n > 2) inputs, since MATLAB does not support a sparse nD result. The only
* exception is a sparse scalar times an nD full array. In that special case,
* mtimesx will treat the sparse scalar as a full scalar and return a full nD result.
*
* Note: The �N�, �T�, and �C� have the same meanings as the direct inputs to the BLAS
* routines. The �G� input has no direct BLAS counterpart, but was relatively easy to
* implement in mtimesx and saves time (as opposed to computing conj(A) or conj(B)
* explicitly before calling mtimesx).
*
* mtimesx supports nD inputs. For these cases, the first two dimensions specify the
* matrix multiply involved. The remaining dimensions are duplicated and specify the
* number of individual matrix multiplies to perform for the result. i.e., mtimesx
* treats these cases as arrays of 2D matrices and performs the operation on the
* associated parings. For example:
*
* If A is (2,3,4,5) and B is (3,6,4,5), then
* mtimesx(A,B) would result in C(2,6,4,5)
* where C(:,:,i,j) = A(:,:,i,j) * B(:,:,i,j), i=1:4, j=1:5
*
* which would be equivalent to the MATLAB m-code:
* C = zeros(2,6,4,5);
* for m=1:4
* for n=1:5
* C(:,:,m,n) = A(:,:,m,n) * B(:,:,m,n);
* end
* end
*
* The first two dimensions must conform using the standard matrix multiply rules
* taking the transa and transb pre-operations into account, and dimensions 3:end
* must match exactly or be singleton (equal to 1). If a dimension is singleton
* then it is virtually expanded to the required size (i.e., equivalent to a
* repmat operation to get it to a conforming size but without the actual data
* copy). For example:
*
* If A is (2,3,4,5) and B is (3,6,1,5), then
* mtimesx(A,B) would result in C(2,6,4,5)
* where C(:,:,i,j) = A(:,:,i,j) * B(:,:,1,j), i=1:4, j=1:5
*
* which would be equivalent to the MATLAB m-code:
* C = zeros(2,6,4,5);
* for m=1:4
* for n=1:5
* C(:,:,m,n) = A(:,:,m,n) * B(:,:,1,n);
* end
* end
*
* When a transpose (or conjugate transpose) is involved, the first two dimensions
* are transposed in the multiply as you would expect. For example:
*
* If A is (3,2,4,5) and B is (3,6,4,5), then
* mtimesx(A,'C',B,'G') would result in C(2,6,4,5)
* where C(:,:,i,j) = A(:,:,i,j)' * conj( B(:,:,i,j) ), i=1:4, j=1:5
*
* which would be equivalent to the MATLAB m-code:
* C = zeros(2,6,4,5);
* for m=1:4
* for n=1:5
* C(:,:,m,n) = A(:,:,m,n)' * conj( B(:,:,m,n) );
* end
* end
*
* If A is a scalar (1,1) and B is (3,6,4,5), then
* mtimesx(A,'G',B,'C') would result in C(6,3,4,5)
* where C(:,:,i,j) = conj(A) * B(:,:,i,j)', i=1:4, j=1:5
*
* which would be equivalent to the MATLAB m-code:
* C = zeros(6,3,4,5);
* for m=1:4
* for n=1:5
* C(:,:,m,n) = conj(A) * B(:,:,m,n)';
* end
* end
*
* Change Log:
* 2009/Sep/27 --> 1.00, Initial Release
* 2009/Dec/10 --> 1.11, Fixed bug for empty transa & transb inputs
* 2010/Feb/23 --> 1.20, Fixed bug for dgemv and sgemv calls
* 2010/Aug/02 --> 1.30, Added (nD scalar) * (nD array) capability
* Replaced buggy mxRealloc with custom routine
* 2010/Oct/04 --> 1.40, Added OpenMP support for custom code
* Expanded sparse * single and sparse * nD support
* Fixed (nD complex scalar)C * (nD array) bug
*
****************************************************************************/
/* OpenMP ------------------------------------------------------------- */
#ifdef _OPENMP
#include <omp.h>
#endif
/* Includes ----------------------------------------------------------- */
#include <string.h>
#include <stddef.h>
#include <ctype.h>
#include <math.h>
#include <time.h>
#include "mex.h"
/* Macros ------------------------------------------------------------- */
#ifndef MWSIZE_MAX
#define mwIndex int
#define mwSignedIndex int
#define mwSize int
#define mwSize_t size_t
#else
#define mwSize_t mwSize
#endif
#define METHOD_BLAS 0
#define METHOD_LOOPS 1
#define METHOD_LOOPS_OMP 2
#define MTIMESX_NOT_SET 0
#define MTIMESX_BLAS 1
#define MTIMESX_LOOPS 2
#define MTIMESX_LOOPS_OMP 3
#define MTIMESX_MATLAB 4
#define MTIMESX_SPEED 5
#define MTIMESX_SPEED_OMP 6
#define STRINGIZE(name) #name
#define TOKENSTRING(name) STRINGIZE(name)
#define DIRECTIVE_MAX 1000
#ifndef COMPILER
#define COMPILER (unknown)
#endif
/* Prototypes --------------------------------------------------------- */
mxArray *DoubleTimesDouble(mxArray *, char, mxArray *, char);
mxArray *FloatTimesFloat(mxArray *, char, mxArray *, char);
mxArray *mxCreateSharedDataCopy(const mxArray *pr);
char mxArrayToTrans(const mxArray *mx);
void *myRealloc(void *vp, mwSize_t n);
void mtimesx_logo(void);
mxArray *modestring(int);
/* Global Variables --------------------------------------------------- */
int mtimesx_mode = MTIMESX_MATLAB;
int max_threads = 0;
int threads_used = 0;
int debug = 0;
int debug_message = 0;
/* Functions ect for OpenMP implementations ------- */
#ifdef _OPENMP
#define OPENMP_ENABLED 1.0
/* Functions etc for non OpenMP implementations ----------------- */
/* The omp_get_num_procs() function is courtesy of Dirk-Jan Kroon */
/* and is based on his FEX submission maxNumCompThreads --------- */
#else
#define OPENMP_ENABLED 0.0
#define omp_get_wtick() 0.0
#define omp_get_wtime() ((double)clock()/((double)CLOCKS_PER_SEC))
#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>
int omp_get_num_procs( ) {
SYSTEM_INFO sysinfo;
GetSystemInfo(&sysinfo);
return sysinfo.dwNumberOfProcessors;
}
#elif MACOS
#include <sys/param.h>
#include <sys/sysctl.h>
int omp_get_num_procs( ) {
int nm[2];
size_t len = 4;
uint32_t count;
nm[0] = CTL_HW; nm[1] = HW_AVAILCPU;
sysctl(nm, 2, &count, &len, NULL, 0);
if(count < 1) {
nm[1] = HW_NCPU;
sysctl(nm, 2, &count, &len, NULL, 0);
if(count < 1) { count = 1; }
}
return count;
}
#else
#include <unistd.h>
int omp_get_num_procs( ) {
return sysconf(_SC_NPROCESSORS_ONLN);
}
#endif
#endif
/*------------------------------------------------------------------------ */
/*------------------------------------------------------------------------ */
/*------------------------------------------------------------------------ */
/*------------------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[],int nrhs, const mxArray *prhs[])
{
mxArray *A, *B, *C, *Araw, *Braw;
mxArray *rhs[4];
char transa, transb;
char *directive, *cp;
char transstring[2] = "_";
int unsupported = 0;
int i, j, k, got_directive, nrhs_old = nrhs;
int mtimesx_mode_new, max_threads_new,
already_set_mode, already_set_debug, already_set_threads;
mxArray *ans;
double threads;
/*-------------------------------------------------------------------------
* Check for proper number of inputs and outputs
*------------------------------------------------------------------------- */
if( nlhs > 1 ) {
mexErrMsgTxt("Must have at most 1 output.");
}
/*-------------------------------------------------------------------------
* If no inputs, just return the current mode
*------------------------------------------------------------------------- */
if( nrhs == 0 ) {
plhs[0] = modestring(mtimesx_mode);
return;
}
/*-------------------------------------------------------------------------
* Find out if any input is a directive
*------------------------------------------------------------------------- */
i = 0;
mtimesx_mode_new = MTIMESX_NOT_SET;
max_threads_new = 0;
already_set_mode = 0;
already_set_debug = 0;
already_set_threads = 0;
while( i < nrhs ) {
if( mxIsChar(prhs[i]) ) {
if( mxGetNumberOfElements(prhs[i]) > DIRECTIVE_MAX ) {
mexErrMsgTxt("Unknown directive.");
} else if( cp = directive = mxArrayToString(prhs[i]) ) {
j = 0;
while( *cp ) {
if( *cp == '(' ) j++;
if( *cp == ')' ) j--;
if( j == 0 ) {
*cp = toupper( *cp );
}
cp++;
}
k = cp - directive;
got_directive = 1;
if( strcmp(directive,"MATLAB") == 0 ) {
if( already_set_mode ) {
mexErrMsgTxt("Cannot set mode twice in same call.");
}
already_set_mode = 1;
mtimesx_mode_new = MTIMESX_MATLAB;
} else if( strcmp(directive,"SPEED") == 0 ) {
if( already_set_mode ) {
mexErrMsgTxt("Cannot set mode twice in same call.");
}
already_set_mode = 1;
mtimesx_mode_new = MTIMESX_SPEED;
} else if( strcmp(directive,"BLAS") == 0 ) {
if( already_set_mode ) {
mexErrMsgTxt("Cannot set mode twice in same call.");
}
already_set_mode = 1;
mtimesx_mode_new = MTIMESX_BLAS;
} else if( strcmp(directive,"LOOPS") == 0 ) {
if( already_set_mode ) {
mexErrMsgTxt("Cannot set mode twice in same call.");
}
already_set_mode = 1;
mtimesx_mode_new = MTIMESX_LOOPS;
} else if( strcmp(directive,"LOGO") == 0 ) {
if( nrhs_old > 1 ) {
mexErrMsgTxt("Cannot combine LOGO directive with other arguments.");
}
mtimesx_logo();
return;
} else if( strcmp(directive,"HELP") == 0 ) {
if( nrhs_old > 1 ) {
mexErrMsgTxt("Cannot combine HELP directive with other arguments.");
}
mexEvalString("help mtimesx;");
return;
} else if( strcmp(directive,"DEBUG") == 0 ) {
if( already_set_debug ) {
mexErrMsgTxt("Cannot set DEBUG/NODEBUG twice in same call.");
}
already_set_debug = 1;
debug = 1;
} else if( strcmp(directive,"NODEBUG") == 0 ) {
if( already_set_debug ) {
mexErrMsgTxt("Cannot set DEBUG/NODEBUG twice in same call.");
}
already_set_debug = 1;
debug = 0;
} else if( strcmp(directive,"OMP_GET_NUM_PROCS") == 0 ||
strcmp(directive,"OMP_GET_NUM_PROCS()") == 0 ) {
if( nrhs > 1 ) {
mexErrMsgTxt("Cannot combine OMP_GET_NUM_PROCS directive with other arguments.");
}
plhs[0] = mxCreateDoubleScalar(omp_get_num_procs());
return;
} else if( strcmp(directive,"OMP_GET_MAX_THREADS") == 0 ||
strcmp(directive,"OMP_GET_MAX_THREADS()") == 0 ) {
if( nrhs > 1 ) {
mexErrMsgTxt("Cannot combine OMP_GET_MAX_THREADS directive with other arguments.");
}
plhs[0] = mxCreateDoubleScalar(max_threads);
return;
} else if( strcmp(directive,"OMP_GET_NUM_THREADS") == 0 ||
strcmp(directive,"OMP_GET_NUM_THREADS") == 0 ) {
if( nrhs > 1 ) {
mexErrMsgTxt("Cannot combine OMP_GET_NUM_THREADS directive with other arguments.");
}
plhs[0] = mxCreateDoubleScalar(threads_used);
return;
} else if( strcmp(directive,"COMPILER") == 0 ) {
if( nrhs > 1 ) {
mexErrMsgTxt("Cannot combine COMPILER directive with other arguments.");
}
plhs[0] = mxCreateString(TOKENSTRING(COMPILER));
return;
} else if( strcmp(directive,"OPENMP") == 0 ) {
if( nrhs > 1 ) {
mexErrMsgTxt("Cannot combine OPENMP directive with other arguments.");
}
plhs[0] = mxCreateDoubleScalar(OPENMP_ENABLED);
return;
} else if( strcmp(directive,"SPEEDOMP") == 0 ) {
if( already_set_mode ) {
mexErrMsgTxt("Cannot set mode twice in same call.");
}
already_set_mode = 1;
#ifdef _OPENMP
mtimesx_mode_new = MTIMESX_SPEED_OMP;
if( max_threads == 0 ) {
max_threads_new = omp_get_num_procs();
}
#else
mtimesx_mode_new = MTIMESX_SPEED;
#endif
} else if( strcmp(directive,"LOOPSOMP") == 0 ) {
if( already_set_mode ) {
mexErrMsgTxt("Cannot set mode twice in same call.");
}
already_set_mode = 1;
#ifdef _OPENMP
mtimesx_mode_new = MTIMESX_LOOPS_OMP;
if( max_threads == 0 ) {
max_threads_new = omp_get_num_procs();
}
#else
mtimesx_mode_new = MTIMESX_LOOPS;
#endif
} else if( strcmp(directive,"OMP_GET_WTICK") == 0 ||
strcmp(directive,"OMP_GET_WTICK()") == 0 ) {
if( nrhs > 1 ) {
mexErrMsgTxt("Cannot combine OMP_GET_WTICK directive with other arguments.");
}
plhs[0] = mxCreateDoubleScalar(omp_get_wtick());
return;
} else if( strcmp(directive,"OMP_GET_WTIME") == 0 ||
strcmp(directive,"OMP_GET_WTIME()") == 0 ) {
if( nrhs > 1 ) {
mexErrMsgTxt("Cannot combine OMP_GET_WTIME directive with other arguments.");
}
plhs[0] = mxCreateDoubleScalar(omp_get_wtime());
return;
} else if( strcmp(directive,"OMP") == 0 ) {
#ifdef _OPENMP
if( already_set_mode ) {
mexErrMsgTxt("Cannot set mode twice in same call.");
}
if( already_set_threads ) {
mexErrMsgTxt("Cannot set threads twice in same call.");
}
already_set_mode = 1;
already_set_threads = 1;
max_threads_new = omp_get_num_procs();
mtimesx_mode_new = MTIMESX_SPEED_OMP;
#else
if( already_set_mode ) {
mexErrMsgTxt("Cannot set mode twice in same call.");
}
already_set_mode = 1;
mtimesx_mode_new = MTIMESX_SPEED;
#endif
} else if( strcmp(directive,"OMP_SET_NUM_THREADS") == 0 ||
strcmp(directive,"OMP_SET_MAX_THREADS") == 0 ) {
if( already_set_threads ) {
mexErrMsgTxt("Cannot set threads twice in same call.");
}
already_set_threads = 1;
#ifdef _OPENMP
max_threads_new = omp_get_num_procs();
#else
max_threads_new = -1;
#endif
} else if( k > 20 && directive[19] == '(' ) {
directive[19] = '\0';
if( strcmp(directive,"OMP_SET_NUM_THREADS") == 0 ||
strcmp(directive,"OMP_SET_MAX_THREADS") == 0 ) {
if( already_set_threads ) {
mexErrMsgTxt("Cannot set threads twice in same call.");
}
already_set_threads = 1;
directive[ 8] = 'N';
directive[ 9] = 'U';
directive[10] = 'M';
directive[19] = '=';
cp = directive + 20;
j = 1;
while( *cp ) {
if( *cp == '(' ) {
j++;
} else if( *cp == ')' ) {
j--;
if( !j ) {
*cp = ';';
break;
}
}
cp++;
}
if( j ) {
mexErrMsgTxt("Expression or statement is incorrect--possibly unbalanced (");
}
if( mexEvalString(directive) ) {
mexErrMsgTxt("Unable to evaluate expression for number of threads");
}
ans = mexGetVariablePtr("caller","OMP_SET_NUM_THREADS");
if( ans == NULL ) {
mexErrMsgTxt("Unable to evaluate expression for number of threads");
}
j = threads = mxGetScalar(ans);
mexEvalString("clear OMP_SET_NUM_THREADS;");
if( j < 1 || j != threads ) {
mexErrMsgTxt("Number of threads must be a positive integer.");
}
if( j > omp_get_num_procs() ) j = omp_get_num_procs();
#ifdef _OPENMP
max_threads_new = j;
#else
max_threads_new = -1;
#endif
} else {
got_directive = 0;
}
} else {
got_directive = 0;
}
mxFree(directive);
} else {
mexErrMsgTxt("Error allocating memory for directive string");
}
if( got_directive ) {
for( j=i; j<nrhs-1; j++ ) {
prhs[j] = prhs[j+1];
}
nrhs--;
} else {
i++;
}
} else {
i++;
}
}
if( mtimesx_mode_new > 0 ) {
mtimesx_mode = mtimesx_mode_new;
}
if( max_threads_new > 0 ) {
max_threads = max_threads_new;
}
if( nrhs == 0 ) {
if( mtimesx_mode_new || !max_threads_new ) {
plhs[0] = modestring(mtimesx_mode);
} else {
plhs[0] = mxCreateDoubleScalar(max_threads);
}
return;
}
if( nrhs < 2 || nrhs > 4 ) {
mexErrMsgTxt("Must have 2 - 4 non-directive inputs for multiply function");
}
/*-------------------------------------------------------------------------
* Pick off the transpose character inputs. If they are missing or empty
* or blank, then use 'N' by default.
*------------------------------------------------------------------------- */
Araw = (mxArray *) prhs[0];
if( nrhs == 2 ) {
transa = 'N';
Braw = (mxArray *) prhs[1];
transb = 'N';
} else if( nrhs == 3 ) {
if( mxIsChar(prhs[1]) ) {
transa = mxArrayToTrans(prhs[1]);
Braw = (mxArray *) prhs[2];
transb = 'N';
} else if( mxIsChar(prhs[2]) ) {
transa = 'N';
Braw = (mxArray *) prhs[1];
transb = mxArrayToTrans(prhs[2]);
} else {
mexErrMsgTxt("2nd or 3rd input must be char.");
}
} else { /* nrhs == 4 */
if( mxIsChar(prhs[1]) && mxIsChar(prhs[3]) ) {
transa = mxArrayToTrans(prhs[1]);
Braw = (mxArray *) prhs[2];
transb = mxArrayToTrans(prhs[3]);
} else {
mexErrMsgTxt("2nd and 4th inputs must be char.");
}
}
/*-----------------------------------------------------------------------------
* Check for valid TRANS characters
*----------------------------------------------------------------------------- */
if( transa == 'n' ) transa = 'N';
if( transa == 't' ) transa = 'T';
if( transa == 'c' ) transa = 'C';
if( transa == 'g' ) transa = 'G';
if( transb == 'n' ) transb = 'N';
if( transb == 't' ) transb = 'T';
if( transb == 'c' ) transb = 'C';
if( transb == 'g' ) transb = 'G';
if( (transa != 'N' && transa != 'T' && transa != 'C' && transa != 'G') ||
(transb != 'N' && transb != 'T' && transb != 'C' && transb != 'G') ) {
mexErrMsgTxt("Invalid TRANS character. Expected N, T, C, or G.");
}
/*-----------------------------------------------------------------------------
* Check for proper input type and call the appropriate multiply routine.
* To be similar to MATLAB for mixed single-double operations, convert
* single inputs to double, do the calc, then convert back to single.
*----------------------------------------------------------------------------- */
/* Future Enhancement: Put in special code for scalar multiplies, do calc in double */
if( mxIsDouble(Araw) ) {
if( mxIsDouble(Braw) ) {
/*
if( mxIsSparse(Araw) && !mxIsSparse(Braw) &&
(mxGetNumberOfElements(Araw) != 1 || mxGetNumberOfDimensions(Braw) == 2) ) {
k = mexCallMATLAB(1, &B, 1, &Braw, "sparse");
plhs[0] = DoubleTimesDouble(Araw, transa, B, transb);
mxDestroyArray(B);
} else if( !mxIsSparse(Araw) && mxIsSparse(Braw) &&
(mxGetNumberOfElements(Braw) != 1 || mxGetNumberOfDimensions(Araw) == 2) ) {
k = mexCallMATLAB(1, &A, 1, &Araw, "sparse");
plhs[0] = DoubleTimesDouble(A, transa, Braw, transb);
mxDestroyArray(A);
} else {
plhs[0] = DoubleTimesDouble(Araw, transa, Braw, transb);
}
*/
if( mxIsSparse(Araw) && !mxIsSparse(Braw) &&
mxGetNumberOfElements(Araw) == 1 && mxGetNumberOfDimensions(Braw) == 2 ) {
k = mexCallMATLAB(1, &B, 1, &Braw, "sparse");
plhs[0] = DoubleTimesDouble(Araw, transa, B, transb);
mxDestroyArray(B);
} else if( !mxIsSparse(Araw) && mxIsSparse(Braw) &&
mxGetNumberOfElements(Braw) == 1 && mxGetNumberOfDimensions(Araw) == 2 ) {
k = mexCallMATLAB(1, &A, 1, &Araw, "sparse");
plhs[0] = DoubleTimesDouble(A, transa, Braw, transb);
mxDestroyArray(A);
} else {
plhs[0] = DoubleTimesDouble(Araw, transa, Braw, transb);
}
} else if( mxIsSingle(Braw) ) {
if( mxIsSparse(Araw) ) {
if( mxGetNumberOfElements(Araw) == 1 ) {
k = mexCallMATLAB(1, &C, 1, &Araw, "full");
k = mexCallMATLAB(1, &A, 1, &C, "single");
mxDestroyArray(C);
plhs[0] = FloatTimesFloat(A, transa, Braw, transb);
mxDestroyArray(A);
} else {
k = mexCallMATLAB(1, &B, 1, &Braw, "double");
plhs[0] = DoubleTimesDouble(Araw, transa, B, transb);
mxDestroyArray(B);
}
} else if( mxGetNumberOfElements(Araw) == 1 || mxGetNumberOfElements(Braw) == 1 ) {
k = mexCallMATLAB(1, &B, 1, &Braw, "double");
C = DoubleTimesDouble(Araw, transa, B, transb);
mxDestroyArray(B);
k = mexCallMATLAB(1, plhs, 1, &C, "single");
mxDestroyArray(C);
} else {
k = mexCallMATLAB(1, &A, 1, &Araw, "single");
plhs[0] = FloatTimesFloat(A, transa, Braw, transb);
mxDestroyArray(A);
}
} else {
unsupported = 1;
}
} else if( mxIsSingle(Araw) ) {
if( mxIsDouble(Braw) ) {
if( mxIsSparse(Braw) ) {
if( mxGetNumberOfElements(Braw) == 1 ) {
k = mexCallMATLAB(1, &C, 1, &Braw, "full");
k = mexCallMATLAB(1, &B, 1, &C, "single");
mxDestroyArray(C);
plhs[0] = FloatTimesFloat(Araw, transa, B, transb);
mxDestroyArray(B);
} else {
k = mexCallMATLAB(1, &A, 1, &Araw, "double");
plhs[0] = DoubleTimesDouble(A, transa, Braw, transb);
mxDestroyArray(A);
}
} else if( mxGetNumberOfElements(Araw) == 1 || mxGetNumberOfElements(Braw) == 1 ) {
k = mexCallMATLAB(1, &A, 1, &Araw, "double");
C = DoubleTimesDouble(A, transa, Braw, transb);
mxDestroyArray(A);
k = mexCallMATLAB(1, plhs, 1, &C, "single");
mxDestroyArray(C);
} else {
k = mexCallMATLAB(1, &B, 1, &Braw, "single");
plhs[0] = FloatTimesFloat(Araw, transa, B, transb);
mxDestroyArray(B);
}
} else if( mxIsSingle(Braw) ) {
plhs[0] = FloatTimesFloat(Araw, transa, Braw, transb);
} else {
unsupported = 1;
}
} else {
unsupported = 1;
}
if( unsupported ) {
if( debug ) {
mexPrintf("MTIMESX: Unsupported types ... calling MATLAB intrinsic function mtimes\n");
}
rhs[0] = Araw;
transstring[0] = transa;
rhs[1] = mxCreateString(transstring);
rhs[2] = Braw;
transstring[0] = transb;
rhs[3] = mxCreateString(transstring);
k = mexCallMATLAB(1, plhs, 4, rhs, "mtimesx_sparse");
mxDestroyArray(rhs[3]);
mxDestroyArray(rhs[1]);
threads_used = 0;
}
return;
}
/*-------------------------------------------------------------------------------
*
* Convert mxArray char input into trans character.
*
*------------------------------------------------------------------------------- */
char mxArrayToTrans(const mxArray *mx)
{
mwSize n;
char *cp;
char trans;
if( cp = mxArrayToString(mx) ) {
n = mxGetNumberOfElements(mx);
if( n == 0 ) {
trans = 'N'; /* Treat empty char strings the same as 'N' */
} else {
trans = *cp; /* Pick off only the first character */
}
mxFree(cp);
} else {
mexErrMsgTxt("Error allocating memory for trans string");
}
return trans;
}
/*-------------------------------------------------------------------------------
*
* Clean a real sparse matrix of zeros. The code is a bit ugly because it was
* highly optimized for speed.
*
*------------------------------------------------------------------------------- */
mwIndex spcleanreal(mxArray *mx)
{
mwSize n, nrow;
mwIndex *ir, *jc, *iz;
mwIndex j, x, y, diff;
double *pr, *pz;
n = mxGetN(mx);
pz = pr = mxGetData(mx);
ir = mxGetIr(mx);
jc = mxGetJc(mx);
diff = 0;
for( y=0; y<n; y++ ) {
nrow = jc[y+1] - jc[y];
for( x=0; x<nrow; x++ ) {
if( *pr == 0.0 ) {
iz = (ir += (pr - pz));
pz = pr;
j = y + 1;
goto copydata;
}
pr++;
}
}
return jc[n];
for( y=0; y<n; y++ ) {
j = y + 1;
nrow = (jc[j] - diff) - jc[y];
for( x=0; x<nrow; x++ ) {
if( *pr != 0.0 ) {
*pz++ = *pr;
*iz++ = *ir;
} else {
copydata:
diff++;
}
pr++;
ir++;
}
jc[j] -= diff;
}
return jc[n];
}
/*-------------------------------------------------------------------------------
*
* Clean a complex sparse matrix of zeros. The code is a bit ugly because it was
* highly optimized for speed.
*
*------------------------------------------------------------------------------- */
mwIndex spcleancomplex(mxArray *mx)
{
mwSize n, nrow;
mwIndex *ir, *jc, *iz;
mwIndex j, x, y, diff;
double *pr, *pz;
double *pi, *py;
n = mxGetN(mx);
pz = pr = mxGetData(mx);
py = pi = mxGetImagData(mx);
ir = mxGetIr(mx);
jc = mxGetJc(mx);
diff = 0;
for( y=0; y<n; y++ ) {
nrow = jc[y+1] - jc[y];
for( x=0; x<nrow; x++ ) {
if( *pr == 0.0 && *pi == 0.0 ) {
iz = (ir += (pr - pz));
pz = pr;
py = pi;
j = y + 1;
goto copydata;
}
pr++;
pi++;
}
}
return jc[n];
for( y=0; y<n; y++ ) {
j = y + 1;
nrow = (jc[j] - diff) - jc[y];
for( x=0; x<nrow; x++ ) {
if( *pr != 0.0 || *pi != 0.0 ) {
*pz++ = *pr;
*py++ = *pi;
*iz++ = *ir;
} else {
copydata:
diff++;
}
pr++;
pi++;
ir++;
}
jc[j] -= diff;
}
return jc[n];
}
/*------------------------------------------------------------------------------- */
mwIndex spclean(mxArray *mx)
{
if( mxIsComplex(mx) ) {
return spcleancomplex(mx);
} else {
return spcleanreal(mx);
}
}
/*-------------------------------------------------------------------------------
* myRealloc for use only in cases of decreasing a block size. Avoids using the
* buggy mxRealloc function for this case.
*------------------------------------------------------------------------------- */
void *myRealloc(void *vp, mwSize_t n)
{
void *xp;
if( xp = mxMalloc(n) ) { /* not really a necessary test for a mex routine */
memcpy(xp, vp, n);
mxFree(vp);
return xp;
} else {
return vp;
}
}
/*-------------------------------------------------------------------------------
* Logo.
*------------------------------------------------------------------------------- */
void mtimesx_logo(void)
{
mxArray *rhs[9];
double *x, *y, *z;
double r;
mwSize i, j;
/* fh = figure */
mexCallMATLAB(1, rhs, 0, NULL,"figure");
/* set(fh, 'Color', [1 1 1]) */
rhs[1] = mxCreateString("Color");
rhs[2] = mxCreateDoubleMatrix(1, 3, mxREAL);
x = mxGetPr(rhs[2]);
x[0] = x[1] = x[2] = 1.0;
mexCallMATLAB(0, NULL, 3, rhs, "set");
mxDestroyArray(rhs[0]);
mxDestroyArray(rhs[1]);
mxDestroyArray(rhs[2]);
/* [X Y] = meshgrid(-8:0.0625:8); */
/* R = sqrt(X.^2 + Y.^2) + eps; */
/* Z = sin(R)./R; */
rhs[0] = mxCreateDoubleMatrix(257, 257, mxREAL);
rhs[1] = mxCreateDoubleMatrix(257, 257, mxREAL);
rhs[2] = mxCreateDoubleMatrix(257, 257, mxREAL);
x = mxGetPr(rhs[0]);
y = mxGetPr(rhs[1]);
z = mxGetPr(rhs[2]);
for( i=0; i<257; i++ ) {
for( j=0; j<257; j++ ) {
*x = -8.0 + i * 0.0625;
*y = -8.0 + j * 0.0625;
r = sqrt((*x)*(*x) + (*y)*(*y)) + 2.22e-16;
*z++ = sin(r) / r;
x++;
y++;
}
}
/* surf(X,Y,Z,'FaceColor','interp',... */
/* 'EdgeColor','none',... */
/* 'FaceLighting','phong') */
rhs[3] = mxCreateString("FaceColor");
rhs[4] = mxCreateString("interp");
rhs[5] = mxCreateString("EdgeColor");
rhs[6] = mxCreateString("none");
rhs[7] = mxCreateString("FaceLighting");
rhs[8] = mxCreateString("phong");
mexCallMATLAB(0, NULL, 9, rhs, "surf");
for( i=0; i<9; i++ ) {
mxDestroyArray(rhs[i]);
}
/* daspect([5 5 1]) */
rhs[0] = mxCreateDoubleMatrix(1, 3, mxREAL);
x = mxGetPr(rhs[0]);
x[0] = 5.0;
x[1] = 5.0;
x[2] = 1.0;
mexCallMATLAB(0, NULL, 1, rhs, "daspect");
mxDestroyArray(rhs[0]);
/* axis('tight') */
rhs[0] = mxCreateString("tight");
mexCallMATLAB(0, NULL, 1, rhs, "axis");
mxDestroyArray(rhs[0]);
/* axis('off') */
rhs[0] = mxCreateString("off");
mexCallMATLAB(0, NULL, 1, rhs, "axis");
mxDestroyArray(rhs[0]);
/* view([-50 30]) */
rhs[0] = mxCreateDoubleMatrix(1, 2, mxREAL);
x = mxGetPr(rhs[0]);
x[0] = -50.0;
x[1] = 30.0;
mexCallMATLAB(0, NULL, 1, rhs, "view");
mxDestroyArray(rhs[0]);
/* camlight('left') */
rhs[0] = mxCreateString("left");
mexCallMATLAB(0, NULL, 1, rhs, "camlight");
mxDestroyArray(rhs[0]);
}
/*-------------------------------------------------------------------------------
* Construct mode string.
*------------------------------------------------------------------------------- */
mxArray *modestring(int m)
{
switch( m )
{
case MTIMESX_BLAS:
return mxCreateString("BLAS");
case MTIMESX_LOOPS:
return mxCreateString("LOOPS");
case MTIMESX_LOOPS_OMP:
return mxCreateString("LOOPSOMP");
case MTIMESX_MATLAB:
return mxCreateString("MATLAB");
case MTIMESX_SPEED:
return mxCreateString("SPEED");
case MTIMESX_SPEED_OMP:
return mxCreateString("SPEEDOMP");
}
return mxCreateString("");
}
/*-------------------------------------------------------------------------------
* Construct thread string.
*------------------------------------------------------------------------------- */
mxArray *threadstring(int t)
{
char ompstring[] = "OMP_SET_NUM_THREADS(___)";
int k;
if( t > 999 ) {
return mxCreateString("OMP_SET_NUM_THREADS");
} else {
k = 20;
if( max_threads > 99 ) {
ompstring[k++] = (t / 100) + '0';
t /= 10;
}
if( t > 9 ) {
ompstring[k++] = (t / 10) + '0';
t /= 10;
}
ompstring[k++] = t + '0';
ompstring[k++] = ')';
ompstring[k ] = '\0';
return mxCreateString(ompstring);
}
}
/*-------------------------------------------------------------------------
* Macros to define CAPITAL letters as the address of the lower case letter
* This will make the function calls look much cleaner
*------------------------------------------------------------------------- */
#define M &m
#define K &k
#define L &l
#define N &n
#define M1 &m1
#define N1 &n1
#define M2 &m2
#define N2 &n2
#define AR &ar
#define AI &ai
#define BI &bi
#define AIBI &aibi
#define LDA &lda
#define LDB &ldb
#define LDC &ldc
#define ZERO &Zero
#define ONE &One
#define MINUSONE &Minusone
#define TRANSA &transa
#define TRANSB &transb
#define TRANS &trans
#define PTRANSA &ptransa
#define PTRANSB &ptransb
#define INCX &inc
#define INCY &inc
#define ALPHA &alpha
#define UPLO &uplo
/*-------------------------------------------------------------------------
* Number of bytes that a sparse matrix result has to drop by before a
* realloc will be performed on the data area(s).
*------------------------------------------------------------------------- */
#define REALLOCTOL 1024
/*-------------------------------------------------------------------------
* Number of elements in a dot product must be at least this much before
* the calculation will be multi-threaded even when OMP is enabled.
*------------------------------------------------------------------------- */
#ifdef __LCC__
#define OMP_DOT_SMALL 182000
#else
#define OMP_DOT_SMALL 126000
#endif
/*-------------------------------------------------------------------------
* Number of elements in the temporary dot product storage array. Used to
* ensure multi-threaded results are consistent from run to run. Prefer
* using this as opposed to using atomic or reduction methods which would
* not guarantee consistent results from run to run.
*------------------------------------------------------------------------- */
#define OMP_DOT_ARRAY_SIZE 256
/*-------------------------------------------------------------------------
* Number of elements in a scalar product must be at least this much before
* the calculation will be multi-threaded even when OMP is enabled.
*------------------------------------------------------------------------- */
#define OMP_SCALAR_SMALL 100000
/*-------------------------------------------------------------------------
* Number of elements in an outer product must be at least this much before
* the calculation will be multi-threaded even when OMP is enabled.
*------------------------------------------------------------------------- */
#define OMP_OUTER_SMALL 100000
/*-------------------------------------------------------------------------
* Number of loops in special multiply cases must be at least this much
* before the calculation will be multi-threaded even when OMP is enabled.
*------------------------------------------------------------------------- */
#define OMP_SPECIAL_SMALL 1000
/*-------------------------------------------------------------------------
* Macros for the double * double function. All of the generic function
* names and variable types in the file mtimesx_RealTimesReal.c are to be
* replace with these names specific to the double type.
*------------------------------------------------------------------------- */
#define MatlabReturnType mxDOUBLE_CLASS
#define RealTimesReal DoubleTimesDouble
#define RealTimesScalar DoubleTimesScalar
#define RealTimesScalarX DoubleTimesScalarX
#define AllRealZero AllDoubleZero
#define RealKindEqP1P0TimesRealKindN DoubleImagEqP1P0TimesDoubleImagN
#define RealKindEqP1P0TimesRealKindG DoubleImagEqP1P0TimesDoubleImagG
#define RealKindEqP1P0TimesRealKindT DoubleImagEqP1P0TimesDoubleImagT
#define RealKindEqP1P0TimesRealKindC DoubleImagEqP1P0TimesDoubleImagC
#define RealKindEqP1P1TimesRealKindN DoubleImagEqP1P1TimesDoubleImagN
#define RealKindEqP1P1TimesRealKindG DoubleImagEqP1P1TimesDoubleImagG
#define RealKindEqP1P1TimesRealKindT DoubleImagEqP1P1TimesDoubleImagT
#define RealKindEqP1P1TimesRealKindC DoubleImagEqP1P1TimesDoubleImagC
#define RealKindEqP1M1TimesRealKindN DoubleImagEqP1M1TimesDoubleImagN
#define RealKindEqP1M1TimesRealKindG DoubleImagEqP1M1TimesDoubleImagG
#define RealKindEqP1M1TimesRealKindT DoubleImagEqP1M1TimesDoubleImagT
#define RealKindEqP1M1TimesRealKindC DoubleImagEqP1M1TimesDoubleImagC
#define RealKindEqP1PxTimesRealKindN DoubleImagEqP1PxTimesDoubleImagN
#define RealKindEqP1PxTimesRealKindG DoubleImagEqP1PxTimesDoubleImagG
#define RealKindEqP1PxTimesRealKindT DoubleImagEqP1PxTimesDoubleImagT
#define RealKindEqP1PxTimesRealKindC DoubleImagEqP1PxTimesDoubleImagC
#define RealKindEqM1P1TimesRealKindN DoubleImagEqM1P1TimesDoubleImagN
#define RealKindEqM1P1TimesRealKindG DoubleImagEqM1P1TimesDoubleImagG
#define RealKindEqM1P1TimesRealKindT DoubleImagEqM1P1TimesDoubleImagT
#define RealKindEqM1P1TimesRealKindC DoubleImagEqM1P1TimesDoubleImagC
#define RealKindEqM1M1TimesRealKindN DoubleImagEqM1M1TimesDoubleImagN
#define RealKindEqM1M1TimesRealKindG DoubleImagEqM1M1TimesDoubleImagG
#define RealKindEqM1M1TimesRealKindT DoubleImagEqM1M1TimesDoubleImagT
#define RealKindEqM1M1TimesRealKindC DoubleImagEqM1M1TimesDoubleImagC
#define RealKindEqM1PxTimesRealKindN DoubleImagEqM1PxTimesDoubleImagN
#define RealKindEqM1PxTimesRealKindG DoubleImagEqM1PxTimesDoubleImagG
#define RealKindEqM1PxTimesRealKindT DoubleImagEqM1PxTimesDoubleImagT
#define RealKindEqM1PxTimesRealKindC DoubleImagEqM1PxTimesDoubleImagC
#define RealKindEqM1P0TimesRealKindN DoubleImagEqM1P0TimesDoubleImagN
#define RealKindEqM1P0TimesRealKindG DoubleImagEqM1P0TimesDoubleImagG
#define RealKindEqM1P0TimesRealKindT DoubleImagEqM1P0TimesDoubleImagT
#define RealKindEqM1P0TimesRealKindC DoubleImagEqM1P0TimesDoubleImagC
#define RealKindEqPxP1TimesRealKindN DoubleImagEqPxP1TimesDoubleImagN
#define RealKindEqPxP1TimesRealKindG DoubleImagEqPxP1TimesDoubleImagG
#define RealKindEqPxP1TimesRealKindT DoubleImagEqPxP1TimesDoubleImagT
#define RealKindEqPxP1TimesRealKindC DoubleImagEqPxP1TimesDoubleImagC
#define RealKindEqPxM1TimesRealKindN DoubleImagEqPxM1TimesDoubleImagN
#define RealKindEqPxM1TimesRealKindG DoubleImagEqPxM1TimesDoubleImagG
#define RealKindEqPxM1TimesRealKindT DoubleImagEqPxM1TimesDoubleImagT
#define RealKindEqPxM1TimesRealKindC DoubleImagEqPxM1TimesDoubleImagC
#define RealKindEqPxP0TimesRealKindN DoubleImagEqPxP0TimesDoubleImagN
#define RealKindEqPxP0TimesRealKindG DoubleImagEqPxP0TimesDoubleImagG
#define RealKindEqPxP0TimesRealKindT DoubleImagEqPxP0TimesDoubleImagT
#define RealKindEqPxP0TimesRealKindC DoubleImagEqPxP0TimesDoubleImagC
#define RealKindEqPxPxTimesRealKindN DoubleImagEqPxPxTimesDoubleImagN
#define RealKindEqPxPxTimesRealKindG DoubleImagEqPxPxTimesDoubleImagG
#define RealKindEqPxPxTimesRealKindT DoubleImagEqPxPxTimesDoubleImagT
#define RealKindEqPxPxTimesRealKindC DoubleImagEqPxPxTimesDoubleImagC
#if defined(_WIN32) || defined(_WIN64)
#define xDOT ddot
#define xGEMV dgemv
#define xGEMM dgemm
#define xSYRK dsyrk
#define xSYR2K dsyr2k
#define xAXPY daxpy
#define xGER dger
#define xSYR dsyr
#define xSYR2 dsyr2
#else
#define xDOT ddot_
#define xGEMV dgemv_
#define xGEMM dgemm_
#define xSYRK dsyrk_
#define xSYR2K dsyr2k_
#define xAXPY daxpy_
#define xGER dger_
#define xSYR dsyr_
#define xSYR2 dsyr2_
#endif
#define xFILLPOS dfillpos
#define xFILLNEG dfillneg
#define zero 0.0
#define one 1.0
#define minusone -1.0
#define RealKind double
#define RealKindComplex doublecomplex
#define RealKindDotProduct doublecomplexdotproduct
#define RealKindDotProductX doublecomplexdotproductx
#define RealKindOuterProduct doublecomplexouterproduct
#define RealKindOuterProductX doublecomplexouterproductx
#define RealScalarTimesReal doublescalartimesdouble
#define RealKindZZ doubleZZ
#include "mtimesx_RealTimesReal.c"
/*----------------------------------------------------------------------
* Undefine all of the double specific macros
*---------------------------------------------------------------------- */
#undef MatlabReturnType
#undef RealTimesReal
#undef RealTimesScalar
#undef RealTimesScalarX
#undef AllRealZero
#undef RealKindEqP1P0TimesRealKindN
#undef RealKindEqP1P0TimesRealKindG
#undef RealKindEqP1P0TimesRealKindT
#undef RealKindEqP1P0TimesRealKindC
#undef RealKindEqP1P1TimesRealKindN
#undef RealKindEqP1P1TimesRealKindG
#undef RealKindEqP1P1TimesRealKindT
#undef RealKindEqP1P1TimesRealKindC
#undef RealKindEqP1M1TimesRealKindN
#undef RealKindEqP1M1TimesRealKindG
#undef RealKindEqP1M1TimesRealKindT
#undef RealKindEqP1M1TimesRealKindC
#undef RealKindEqP1PxTimesRealKindN
#undef RealKindEqP1PxTimesRealKindG
#undef RealKindEqP1PxTimesRealKindT
#undef RealKindEqP1PxTimesRealKindC
#undef RealKindEqM1P1TimesRealKindN
#undef RealKindEqM1P1TimesRealKindG
#undef RealKindEqM1P1TimesRealKindT
#undef RealKindEqM1P1TimesRealKindC
#undef RealKindEqM1M1TimesRealKindN
#undef RealKindEqM1M1TimesRealKindG
#undef RealKindEqM1M1TimesRealKindT
#undef RealKindEqM1M1TimesRealKindC
#undef RealKindEqM1P0TimesRealKindN
#undef RealKindEqM1P0TimesRealKindG
#undef RealKindEqM1P0TimesRealKindT
#undef RealKindEqM1P0TimesRealKindC
#undef RealKindEqM1PxTimesRealKindN
#undef RealKindEqM1PxTimesRealKindG
#undef RealKindEqM1PxTimesRealKindT
#undef RealKindEqM1PxTimesRealKindC
#undef RealKindEqPxP1TimesRealKindN
#undef RealKindEqPxP1TimesRealKindG
#undef RealKindEqPxP1TimesRealKindT
#undef RealKindEqPxP1TimesRealKindC
#undef RealKindEqPxM1TimesRealKindN
#undef RealKindEqPxM1TimesRealKindG
#undef RealKindEqPxM1TimesRealKindT
#undef RealKindEqPxM1TimesRealKindC
#undef RealKindEqPxP0TimesRealKindN
#undef RealKindEqPxP0TimesRealKindG
#undef RealKindEqPxP0TimesRealKindT
#undef RealKindEqPxP0TimesRealKindC
#undef RealKindEqPxPxTimesRealKindN
#undef RealKindEqPxPxTimesRealKindG
#undef RealKindEqPxPxTimesRealKindT
#undef RealKindEqPxPxTimesRealKindC
#undef xDOT
#undef xGEMV
#undef xGEMM
#undef xSYRK
#undef xSYR2K
#undef xAXPY
#undef xGER
#undef xSYR
#undef xSYR2
#undef xFILLPOS
#undef xFILLNEG
#undef zero
#undef one
#undef minusone
#undef RealKind
#undef RealKindComplex
#undef RealKindDotProduct
#undef RealKindDotProductX
#undef RealKindOuterProduct
#undef RealKindOuterProductX
#undef RealScalarTimesReal
#undef RealKindZZ
/*-------------------------------------------------------------------------
* Macros for the single * single function. All of the generic function
* names and variable types in the file mtimesx_RealTimesReal.c are to be
* replaced with these names specific to the single type.
*------------------------------------------------------------------------- */
#define MatlabReturnType mxSINGLE_CLASS
#define RealTimesReal FloatTimesFloat
#define RealTimesScalar FloatTimesScalar
#define RealTimesScalarX FloatTimesScalarX
#define AllRealZero AllFloatZero
#define RealKindEqP1P0TimesRealKindN FloatImagEqP1P0TimesFloatImagN
#define RealKindEqP1P0TimesRealKindG FloatImagEqP1P0TimesFloatImagG
#define RealKindEqP1P0TimesRealKindT FloatImagEqP1P0TimesFloatImagT
#define RealKindEqP1P0TimesRealKindC FloatImagEqP1P0TimesFloatImagC
#define RealKindEqP1P1TimesRealKindN FloatImagEqP1P1TimesFloatImagN
#define RealKindEqP1P1TimesRealKindG FloatImagEqP1P1TimesFloatImagG
#define RealKindEqP1P1TimesRealKindT FloatImagEqP1P1TimesFloatImagT
#define RealKindEqP1P1TimesRealKindC FloatImagEqP1P1TimesFloatImagC
#define RealKindEqP1M1TimesRealKindN FloatImagEqP1M1TimesFloatImagN
#define RealKindEqP1M1TimesRealKindG FloatImagEqP1M1TimesFloatImagG
#define RealKindEqP1M1TimesRealKindT FloatImagEqP1M1TimesFloatImagT
#define RealKindEqP1M1TimesRealKindC FloatImagEqP1M1TimesFloatImagC
#define RealKindEqP1PxTimesRealKindN FloatImagEqP1PxTimesFloatImagN
#define RealKindEqP1PxTimesRealKindG FloatImagEqP1PxTimesFloatImagG
#define RealKindEqP1PxTimesRealKindT FloatImagEqP1PxTimesFloatImagT
#define RealKindEqP1PxTimesRealKindC FloatImagEqP1PxTimesFloatImagC
#define RealKindEqM1P1TimesRealKindN FloatImagEqM1P1TimesFloatImagN
#define RealKindEqM1P1TimesRealKindG FloatImagEqM1P1TimesFloatImagG
#define RealKindEqM1P1TimesRealKindT FloatImagEqM1P1TimesFloatImagT
#define RealKindEqM1P1TimesRealKindC FloatImagEqM1P1TimesFloatImagC
#define RealKindEqM1M1TimesRealKindN FloatImagEqM1M1TimesFloatImagN
#define RealKindEqM1M1TimesRealKindG FloatImagEqM1M1TimesFloatImagG
#define RealKindEqM1M1TimesRealKindT FloatImagEqM1M1TimesFloatImagT
#define RealKindEqM1M1TimesRealKindC FloatImagEqM1M1TimesFloatImagC
#define RealKindEqM1P0TimesRealKindN FloatImagEqM1P0TimesFloatImagN
#define RealKindEqM1P0TimesRealKindG FloatImagEqM1P0TimesFloatImagG
#define RealKindEqM1P0TimesRealKindT FloatImagEqM1P0TimesFloatImagT
#define RealKindEqM1P0TimesRealKindC FloatImagEqM1P0TimesFloatImagC
#define RealKindEqM1PxTimesRealKindN FloatImagEqM1PxTimesFloatImagN
#define RealKindEqM1PxTimesRealKindG FloatImagEqM1PxTimesFloatImagG
#define RealKindEqM1PxTimesRealKindT FloatImagEqM1PxTimesFloatImagT
#define RealKindEqM1PxTimesRealKindC FloatImagEqM1PxTimesFloatImagC
#define RealKindEqPxP1TimesRealKindN FloatImagEqPxP1TimesFloatImagN
#define RealKindEqPxP1TimesRealKindG FloatImagEqPxP1TimesFloatImagG
#define RealKindEqPxP1TimesRealKindT FloatImagEqPxP1TimesFloatImagT
#define RealKindEqPxP1TimesRealKindC FloatImagEqPxP1TimesFloatImagC
#define RealKindEqPxM1TimesRealKindN FloatImagEqPxM1TimesFloatImagN
#define RealKindEqPxM1TimesRealKindG FloatImagEqPxM1TimesFloatImagG
#define RealKindEqPxM1TimesRealKindT FloatImagEqPxM1TimesFloatImagT
#define RealKindEqPxM1TimesRealKindC FloatImagEqPxM1TimesFloatImagC
#define RealKindEqPxP0TimesRealKindN FloatImagEqPxP0TimesFloatImagN
#define RealKindEqPxP0TimesRealKindG FloatImagEqPxP0TimesFloatImagG
#define RealKindEqPxP0TimesRealKindT FloatImagEqPxP0TimesFloatImagT
#define RealKindEqPxP0TimesRealKindC FloatImagEqPxP0TimesFloatImagC
#define RealKindEqPxPxTimesRealKindN FloatImagEqPxPxTimesFloatImagN
#define RealKindEqPxPxTimesRealKindG FloatImagEqPxPxTimesFloatImagG
#define RealKindEqPxPxTimesRealKindT FloatImagEqPxPxTimesFloatImagT
#define RealKindEqPxPxTimesRealKindC FloatImagEqPxPxTimesFloatImagC
#if defined(_WIN32) || defined(_WIN64)
#define xDOT sdot
#define xGEMV sgemv
#define xGEMM sgemm
#define xSYRK ssyrk
#define xSYR2K ssyr2k
#define xAXPY saxpy
#define xGER sger
#define xSYR ssyr
#define xSYR2 ssyr2
#else
#define xDOT sdot_
#define xGEMV sgemv_
#define xGEMM sgemm_
#define xSYRK ssyrk_
#define xSYR2K ssyr2k_
#define xAXPY saxpy_
#define xGER sger_
#define xSYR ssyr_
#define xSYR2 ssyr2_
#endif
#define xFILLPOS sfillpos
#define xFILLNEG sfillneg
#define zero 0.0f
#define one 1.0f
#define minusone -1.0f
#define RealKind float
#define RealKindComplex floatcomplex
#define RealKindDotProduct floatcomplexdotproduct
#define RealKindDotProductX floatcomplexdotproductx
#define RealKindOuterProduct floatcomplexouterproduct
#define RealKindOuterProductX floatcomplexouterproductx
#define RealScalarTimesReal floatscalartimesfloat
#define RealKindZZ floatZZ
#include "mtimesx_RealTimesReal.c"