Is C++ MEX API significantly slower than the C MEX API?
MathWorks recommends "Whenever possible, choose C++ over C applications." However, I cannot find a way to match the performance of the old C API, and since the main reason we use MEX is to make our code run faster, what is the point of the new C++ API? Is it actually slower, or am I using it completely wrong?
I ran some tests (all code attached), and in every case the new C++ API comes out significantly slower, even when there is no data transfer and the function itself does nothing. To rule out the C++ compiler being worse, I compiled the same C code with a C++ compiler, and it performs about the same as the C build. All MEX functions were compiled with the -O flag, and the ProdSum ones also with -R2018a.
nop() - functions that do absolutely nothing. In the table below, "nothing" is literally nothing: just tic/toc on 2 lines. nop() is a MATLAB function, and Cnop/Cppnop are C and C++ MEX functions. Cppnop2 is the C MEX function compiled with a C++ compiler.
nothing x100000 | 23.9487ms | 0.2395us/call | 1.000x
nop() x100000 | 31.8879ms | 0.3189us/call | 1.332x
Cnop() x100000 | 77.2739ms | 0.7727us/call | 3.227x - C API & C Compiler
Cppnop() x100000 | 607.6918ms | 6.0769us/call | 25.375x - C++ API & C++ Compiler
Cppnop2() x100000 | 85.2547ms | 0.8525us/call | 3.560x - C API & C++ Compiler
empty() - functions that return an empty result ([]). inline, empty() and @()[] are 3 different ways to achieve the same thing in MATLAB. CEmpty/CppEmpty are C and C++ MEX functions. CppEmpty2 is, again, the C MEX function compiled with a C++ compiler.
inline x100000 | 24.1816ms | 0.2418us/call | 1.000x
empty() x100000 | 29.1480ms | 0.2915us/call | 1.205x
@()[] x100000 | 33.1176ms | 0.3312us/call | 1.370x
CEmpty() x100000 | 116.2378ms | 1.1624us/call | 4.807x - C API & C Compiler
CppEmpty() x100000 | 784.0485ms | 7.8405us/call | 32.423x - C++ API & C++ Compiler
CppEmpty2() x100000 | 120.9537ms | 1.2095us/call | 5.002x - C API & C++ Compiler
The above functions are mainly there to evaluate the overhead of just calling MEX without any data transfer. The C++ MEX API version comes out 6-8 times slower than the C one, or roughly 5-7us extra per call (which is nothing really, but it can add up).
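For anyone who wants to check the arithmetic, here is a minimal sketch in plain C++ (no MEX headers) of how the slowdown factor and the extra time per call follow from the totals in the tables. The `Timing` struct and the helper names are made up for illustration and are not part of the attached test code.

```cpp
#include <cassert>

// One benchmark row: total wall time for `calls` invocations.
// (Hypothetical helper type, not from the attached code.)
struct Timing {
    double total_ms;
    long calls;
};

// Average time per call in microseconds.
double us_per_call(const Timing& t) {
    assert(t.calls > 0);
    return t.total_ms * 1000.0 / static_cast<double>(t.calls);
}

// How many times slower `a` is than `b`, per call.
double slowdown(const Timing& a, const Timing& b) {
    return us_per_call(a) / us_per_call(b);
}

// Extra microseconds that `a` costs per call compared with `b`.
double extra_us(const Timing& a, const Timing& b) {
    return us_per_call(a) - us_per_call(b);
}
```

Feeding in the table rows, `slowdown({607.6918, 100000}, {77.2739, 100000})` is about 7.9 with about 5.3us extra per call for the nop pair, and the CppEmpty/CEmpty pair gives about 6.7x and about 6.7us.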
ProdSum() - functions that compute the product of all values in each double array stored in a cell array (the arrays have different lengths) and then sum those products. ProdSum and cellfun are the MATLAB options, while CProdSum/CppProdSum are again C and C++ MEX functions. CppProdSum2 is a C MEX entry function combined with C++ classes (so C++ using the C MEX API) and is as fast as, if not slightly faster than, the C MEX function.
ProdSum() x100 | 9.0598ms | 90.5980us/call | 1.000x | val = 931.56
cellfun x100 | 197.1907ms | 1971.9070us/call | 21.765x | val = 931.56
CProdSum() x100 | 3.8130ms | 38.1300us/call | 0.421x | val = 931.56 - C API & C Compiler
CppProdSum() x100 | 147.2003ms | 1472.0030us/call | 16.248x | val = 931.56 - C++ API & C++ Compiler
CppProdSum2() x100 | 3.6594ms | 36.5940us/call | 0.404x | val = 931.56 - C API & C++ Compiler
In this case, where there is some data transfer, the C++ API MEX function comes out about 40x slower than the C one, and the time difference per call is a lot bigger than in the nop/empty cases. In the ProdSum test it is only slightly faster than the cellfun option, so it is pretty much worthless compared to the standard MATLAB functions and the old C MEX API when performance matters.
As it stands, it makes no sense to use the C++ API unless you have to, since you can still combine C++ with the C API and get much better performance. Is there a way to make MEX functions that use the C++ API as fast as the C API ones? Or is this a known limitation of the C++ API? If so, is it going to be addressed in future releases?
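To make the ProdSum computation concrete, here is a rough sketch of the kernel in plain C++, kept free of any MEX headers so it compiles on its own. It is not the attached code: `DoubleSpan`, `product` and `prod_sum` are hypothetical names, and I am assuming that in a C-API build each span would be filled from a cell's data pointer and element count (for example via `mxGetDoubles` and `mxGetNumberOfElements`).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-owning view of one cell's double data (hypothetical type).
// In a C-API MEX file the pointer and size would come from the mxArray.
struct DoubleSpan {
    const double* data;
    std::size_t size;
};

// Product of all values in one array; an empty array yields 1,
// matching what prod([]) returns in MATLAB.
double product(const DoubleSpan& s) {
    assert(s.data != nullptr || s.size == 0);
    double p = 1.0;
    for (std::size_t i = 0; i < s.size; ++i) {
        p *= s.data[i];
    }
    return p;
}

// Sum of the per-cell products.
double prod_sum(const std::vector<DoubleSpan>& cells) {
    double total = 0.0;
    for (const DoubleSpan& c : cells) {
        total += product(c);
    }
    return total;
}
```

The loop only touches raw pointers, so none of the work in the kernel itself depends on which MEX API is used; the difference between the builds would be in the entry point and in how the data is reached.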
8 Comments
William
on 18 Oct 2025 at 19:14
Edited: William
on 18 Oct 2025 at 19:16
Just found this thread after running into the issue myself. I have some older coordinate-transform code (written in "C++", but it could compile as C and uses the C Matrix API) that has been quite good to me. I needed to implement a new transform and decided to use the C++ API, as I have been needing to learn more about modern C++ for another project. At first it was two orders of magnitude slower than the older transform functions: from ~20ms for 500 calls over large matrices (really an array of column vectors) to 4s. I wouldn't necessarily expect 1:1 performance because the transform is different, but this is ridiculous.
I removed the explicit use of ColumnMajorIterator<T> (foolish me, thinking being explicit was good) and got a 10x speedup to about 270ms, but I haven't been able to do much better with the other optimizations I tried. I'm not aware of an easy way to hook into a MEX function and profile it directly, and since the problem appears to be related to accessing the MATLAB data structures, I can't just separate out the real math part of the transform and profile that. I agree this is a real shame, both for customers in general and because I've found I actually quite like modern C++ constructs (heresy, I know) and some of the helpful API sugar MathWorks added, but it looks like it's back to the Matrix API for me.
William
3 minutes ago
Update: I did end up writing a C Matrix API version of the function, and it does indeed run in about the same time as the old transforms. Since the actual algorithm didn't change at all, it does seem that the C++ API simply has a lot more overhead for data access or for function calls in the first place.
Answers (0)