In the past weeks, I explained how we can start asynchronous threads that run in parallel to the main Matlab processing, using Java and Dot-Net threads. Today I continue by examining C/C++ threads. This series will conclude next week, by discussing timer objects and process-spawning.
The alternatives that can be used to enable Matlab multithreading with C/C++ include standard POSIX threads, native OS threads, OpenMP, MPI (Message Passing Interface), TBB (Threading Building Blocks), Cilk, OpenACC, OpenCL or Boost. We can also use libraries targeting specific platforms/architectures: Intel MKL, C++ AMP, Bolt etc. Note that the Boost library is included in every relatively-modern Matlab release, so we can either use the built-in library (easier to deploy, consistent with Matlab), or download and install the latest version and use it separately. On Windows, we can also use .Net’s Thread class, as explained in last week’s article. This is a very wide range of alternatives, and it has already been covered extensively elsewhere from the C/C++ side.
Today I will only discuss the POSIX alternative. The benefit of POSIX is that it is more-or-less cross-platform, enabling the same code to work on all MATLAB platforms, as well as any other POSIX-supported platform.
POSIX threads (Pthreads) is a standard API for multi-threaded programming implemented natively on many Unix-like systems, and also supported on Windows. Pthreads includes functionality for creating and managing threads, and provides a set of synchronization primitives such as mutexes, condition variables, semaphores, read/write locks, and barriers. POSIX has extensive offline and online documentation.
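To make the thread-management and synchronization primitives concrete, here is a minimal standalone sketch (plain C, no Matlab involved) in which several threads increment a shared counter under a mutex. The names `shared_counter_t`, `increment_worker` and `run_counter_demo` are made up for this illustration; only the `pthread_*` calls belong to the Pthreads API.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS           4
#define INCREMENTS_PER_THREAD 10000

/* Shared state: a counter and the mutex that guards it (hypothetical type) */
typedef struct {
    pthread_mutex_t lock;
    long            counter;
} shared_counter_t;

/* Thread entry point: increment the shared counter, one locked step at a time */
static void *increment_worker(void *arg)
{
    shared_counter_t *shared = (shared_counter_t *) arg;
    int i;
    for (i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&shared->lock);
        shared->counter++;
        pthread_mutex_unlock(&shared->lock);
    }
    return NULL;
}

/* Start the workers, wait for all of them, and return the final count.
   Returns -1 if the mutex or any of the threads could not be created. */
long run_counter_demo(void)
{
    pthread_t        threads[NUM_THREADS];
    shared_counter_t shared;
    int              started = 0, failed = 0, i;

    shared.counter = 0;
    if (pthread_mutex_init(&shared.lock, NULL) != 0)
        return -1;

    for (i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, increment_worker, &shared) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* Join only the threads that were actually started */
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&shared.lock);
    return failed ? -1 : shared.counter;
}
```

Without the lock/unlock pair around `counter++`, the increments from different threads could interleave and some updates would be lost; with it, `run_counter_demo()` should return `NUM_THREADS * INCREMENTS_PER_THREAD`.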
Note that POSIX is natively supported on Macs & Linux, but requires a separate installation on Windows. Two of the leading alternatives are Pthreads_Win32 (also works on Win64, despite its name…), and winpthreads (part of the extensive MinGW open-source project).
When creating a C/C++-based function, we can either compile/link it into a dynamic/shared library (loadable into Matlab using the loadlibrary & calllib functions), or into a MEX file that can be called directly from M-code. The code looks the same, except that a MEX file has a gateway function named mexFunction that has a predefined interface. Today I’ll show the MEX variant using C; the adaptation to C++ is easy. To create a multi-threaded MEX file, all it takes is to connect the thread-enabled C/C++ code to our mexFunction() and provide the relevant threading library to the mex linker, and we’re done.
The example code below continues the I/O example used throughout this series, of asynchronously saving a vector of data to disk file. Place the following in a file called myPosixThread.c:
#include "mex.h"
#include "pthread.h"
#include "stdio.h"

char   *filename;
double *data;
size_t  numElementsExpected;
size_t  numElementsWritten;

/* thread compute function */
void *thread_run(void *p)
{
    /* Open the file for binary output */
    FILE *fp = fopen(filename, "wb");
    if (fp == NULL)
        mexErrMsgIdAndTxt("YMA:MexIO:errorOpeningFile", "Could not open file %s", filename);

    /* Write the data to file */
    numElementsWritten = fwrite(data, sizeof(double), numElementsExpected, fp);
    fclose(fp);

    /* Ensure that the data was correctly written */
    /* (size_t values are cast to unsigned long to match the %lu format) */
    if (numElementsWritten != numElementsExpected)
        mexErrMsgIdAndTxt("YMA:MexIO:errorWritingFile",
                          "Error writing data to %s: wrote %lu, expected %lu\n",
                          filename,
                          (unsigned long) numElementsWritten,
                          (unsigned long) numElementsExpected);

    /* Cleanup */
    pthread_exit(NULL);
}

/* The MEX gateway function */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    pthread_t thread;

    /* Check for proper number of input and output arguments */
    if (nrhs != 2)
        mexErrMsgIdAndTxt("YMA:MexIO:invalidNumInputs", "2 input args required: filename, data");
    if (nlhs > 0)
        mexErrMsgIdAndTxt("YMA:MexIO:maxlhs", "Too many output arguments");
    if (!mxIsChar(prhs[0]))
        mexErrMsgIdAndTxt("YMA:MexIO:invalidInput", "Input filename must be of type string");
    if (!mxIsDouble(prhs[1]))
        mexErrMsgIdAndTxt("YMA:MexIO:invalidInput", "Input data must be of type double");

    /* Get the inputs: filename & data */
    filename = mxArrayToString(prhs[0]);
    data = mxGetPr(prhs[1]);
    numElementsExpected = mxGetNumberOfElements(prhs[1]);

    /* Launch a new I/O thread using default attributes */
    if (pthread_create(&thread, NULL, thread_run, NULL))
        mexErrMsgIdAndTxt("YMA:MexIO:threadFailed", "Thread creation failed");
}
This source file can be compiled as follows on Macs/Linux:
mex myPosixThread.c -lpthread
Or on Windows, assuming we installed Pthreads-Win32, we first need to set up the environment:
% prepare the environment (we could also use the -I, -L flags)
pthreadsInstallFolder = 'C:\Program Files\Pthreads_Win32\';  % change this as needed
setenv('PATH',   [getenv('PATH')    ';' pthreadsInstallFolder 'dll\x64']);
setenv('LIB',    [getenv('LIB')     ';' pthreadsInstallFolder 'lib\x64']);
setenv('INCLUDE',[getenv('INCLUDE') ';' pthreadsInstallFolder 'include']);

% create a 64-bit MEX that uses the pthreads DLL
mex myPosixThread.c -lpthreadVC2

% copy the pthreadVC2.dll file to be accessible to the MEX file, otherwise it will not run
copyfile([pthreadsInstallFolder 'dll\x64\pthreadVC2.dll'], '.')
To run the MEX file from MATLAB, we use the following code snippet (note the similarity with our Java/.Net examples earlier in this series):
addpath('C:\Yair\Code\');            % location of our myPosixThread MEX file
data = rand(5e6,1);                  % pre-processing (5M elements, ~40MB)
myPosixThread('F:\test.data',data);  % start running in parallel
data2 = fft(data);                   % post-processing (pthread I/O runs in parallel)
Note that we cannot directly modify the data (data=fft(data)) while it is being accessed by the I/O thread. Doing so would cause the data to be reallocated elsewhere in memory, leaving the I/O thread to access invalid (stale) memory, which results in a segmentation violation that crashes Matlab. Read-only access (data2=fft(data)) is ok, as long as we take care not to update the data. This was not a problem with our earlier Java/.Net threads, since they received their data by value, but mexFunction() receives its data by reference (which is quicker and saves memory, but also has its limitations). Alternatively, we can memcpy() the Matlab data to a newly-allocated memory block within our thread and only use the memcpy‘ed data from then on. This will ensure that if any updates to the original data occur, the parallel thread will not be affected and no SEGV will occur.
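The memcpy() idea can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code of the MEX file above: `write_job_t`, `free_job`, `write_job_run` and `start_async_write` are hypothetical names, and the copy is made by the caller before the thread is created (rather than inside the thread), so that the thread never touches the caller's buffer at all.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Everything the I/O thread needs; owned by the thread once it starts */
typedef struct {
    char   *filename;      /* private copy of the file name */
    double *data;          /* private copy of the samples   */
    size_t  numElements;
} write_job_t;

/* Release a job and the buffers it owns (safe to call with NULL) */
static void free_job(write_job_t *job)
{
    if (job != NULL) {
        free(job->filename);
        free(job->data);
        free(job);
    }
}

/* Thread entry point: write the private copy to disk, then release it */
static void *write_job_run(void *p)
{
    write_job_t *job = (write_job_t *) p;
    FILE *fp = fopen(job->filename, "wb");
    if (fp != NULL) {
        fwrite(job->data, sizeof(double), job->numElements, fp);
        fclose(fp);
    }
    free_job(job);
    return NULL;
}

/* Copy the caller's buffer and start the writer thread.
   Returns 0 on success, nonzero on failure. The caller may modify or
   free its own buffer as soon as this function returns. */
int start_async_write(const char *filename, const double *data,
                      size_t numElements, pthread_t *thread)
{
    size_t numBytes = numElements * sizeof(double);
    write_job_t *job = (write_job_t *) calloc(1, sizeof(*job));
    if (job == NULL)
        return 1;

    job->filename = (char *) malloc(strlen(filename) + 1);
    job->data     = (double *) malloc(numBytes > 0 ? numBytes : 1);
    if (job->filename == NULL || job->data == NULL) {
        free_job(job);
        return 1;
    }
    strcpy(job->filename, filename);
    memcpy(job->data, data, numBytes);
    job->numElements = numElements;

    if (pthread_create(thread, NULL, write_job_run, job) != 0) {
        free_job(job);
        return 1;
    }
    return 0;
}
```

In a MEX file, a function like `start_async_write()` would be called from mexFunction() with the pointer returned by mxGetPr(). The price is an extra copy of the data (~40MB in the example above), in exchange for being free to update the Matlab variable immediately. The returned thread handle should eventually be passed to pthread_join() or pthread_detach().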
Also note that we call a few MEX functions from within the parallel portion of our code (the thread’s thread_run() function). This works without problems on recent Matlab releases since some MEX API functions have been made thread-safe, however it might not work in earlier (or future) Matlab versions. Therefore, to make our code portable, it is recommended to not interact with Matlab at all during parallel blocks, or to protect MEX API calls by critical sections. Alternatively, only use MEX API calls in the main thread (which is actually MT), defined as those parts of code that run in the same thread as mexFunction(). Synchronization with other threads can be done using POSIX mechanisms such as pthread_join().
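One way to keep the parallel portion free of MEX API calls is to have the thread record its outcome in a plain C struct, and let the calling thread inspect that outcome after pthread_join(). The following is a minimal sketch of that pattern in plain C; `io_status_t`, `io_task_t`, `io_task_run`, `io_task_start` and `io_task_wait` are hypothetical names used only for this illustration.

```c
#include <pthread.h>
#include <stdio.h>

/* Possible outcomes of the background write (hypothetical enum) */
typedef enum { IO_OK = 0, IO_OPEN_FAILED, IO_SHORT_WRITE } io_status_t;

/* Inputs and results of one background write */
typedef struct {
    const char   *filename;
    const double *data;
    size_t        numElementsExpected;
    size_t        numElementsWritten;   /* filled in by the thread */
    io_status_t   status;               /* filled in by the thread */
} io_task_t;

/* Thread entry point: no Matlab/MEX calls here, only standard C I/O */
static void *io_task_run(void *p)
{
    io_task_t *task = (io_task_t *) p;
    FILE *fp = fopen(task->filename, "wb");
    if (fp == NULL) {
        task->status = IO_OPEN_FAILED;
        return NULL;
    }
    task->numElementsWritten =
        fwrite(task->data, sizeof(double), task->numElementsExpected, fp);
    fclose(fp);
    task->status = (task->numElementsWritten == task->numElementsExpected)
                 ? IO_OK : IO_SHORT_WRITE;
    return NULL;
}

/* Start the background write; returns 0 on success (as pthread_create does) */
int io_task_start(pthread_t *thread, io_task_t *task)
{
    task->numElementsWritten = 0;
    task->status = IO_OK;
    return pthread_create(thread, NULL, io_task_run, task);
}

/* Called from the thread that started the task: block until the write
   is done, then return its status. Reading the struct after
   pthread_join() is safe, since the worker has finished by then. */
io_status_t io_task_wait(pthread_t thread, const io_task_t *task)
{
    pthread_join(thread, NULL);
    return task->status;
}
```

In a MEX file, the thread that runs mexFunction() would call something like `io_task_wait()` (for example on a later invocation) and only then translate a non-`IO_OK` status into a mexErrMsgIdAndTxt() call, so that all interaction with Matlab stays in that thread. Note that the task struct and the data it points to must remain valid until the join completes.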
To complete the picture, it is also possible to use native threads (rather than POSIX) on Windows: The MEX file should #include <process.h> and call _beginthread(). In fact, since Microsoft for some reason decided to reinvent the wheel with its own native threads and not to support POSIX, all the POSIX implementations on Windows are basically a wrapper for the native Windows threads. Using these native threads directly often proves to be the fastest alternative. Unfortunately, code using native threads is not portable to Macs/Linux, unlike POSIX-based code.
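A common way to get native threads on Windows while staying portable is a thin conditional-compilation layer. The sketch below is one possible shape for such a layer, not taken from any specific utility: `worker_handle_t`, `worker_start`, `worker_wait` and `example_worker` are hypothetical names. On Windows it uses `_beginthreadex()` rather than `_beginthread()`, because the handle returned by `_beginthreadex()` stays valid until we close it and can therefore be waited upon safely.

```c
#include <stdlib.h>

#ifdef _WIN32
  #include <windows.h>
  #include <process.h>
  typedef HANDLE worker_handle_t;       /* native Windows thread handle */
#else
  #include <pthread.h>
  typedef pthread_t worker_handle_t;    /* POSIX thread handle */
#endif

/* Platform-neutral signature for the user's thread function */
typedef void (*worker_fn_t)(void *arg);

/* Small heap-allocated record that carries the call into the new thread */
typedef struct {
    worker_fn_t fn;
    void       *arg;
} worker_call_t;

/* Adapter with the entry-point signature that each platform expects */
#ifdef _WIN32
static unsigned __stdcall worker_trampoline(void *p)
#else
static void *worker_trampoline(void *p)
#endif
{
    worker_call_t call = *(worker_call_t *) p;
    free(p);
    call.fn(call.arg);
    return 0;
}

/* Start fn(arg) in a new thread. Returns 0 on success, nonzero on failure. */
int worker_start(worker_handle_t *handle, worker_fn_t fn, void *arg)
{
    worker_call_t *call = (worker_call_t *) malloc(sizeof(*call));
    if (call == NULL)
        return 1;
    call->fn  = fn;
    call->arg = arg;
#ifdef _WIN32
    *handle = (HANDLE) _beginthreadex(NULL, 0, worker_trampoline, call, 0, NULL);
    if (*handle == NULL) { free(call); return 1; }
#else
    if (pthread_create(handle, NULL, worker_trampoline, call) != 0) {
        free(call);
        return 1;
    }
#endif
    return 0;
}

/* Block until the thread has finished, then release its handle */
void worker_wait(worker_handle_t handle)
{
#ifdef _WIN32
    WaitForSingleObject(handle, INFINITE);
    CloseHandle(handle);
#else
    pthread_join(handle, NULL);
#endif
}

/* Example thread function: sets the int pointed to by arg to 1 */
static void example_worker(void *arg)
{
    *(int *) arg = 1;
}
```

With this layer, the rest of the MEX file only calls `worker_start()` and `worker_wait()`, and no separate Pthreads installation is needed on Windows. A typical use is `worker_start(&h, example_worker, &flag)` followed by `worker_wait(h)`, after which `flag` holds 1.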
Yuval Tassa’s excellent mmx utility (which deserves a detailed review in its own right!) employs both Pthreads (Mac/Linux) and Windows threads in its MEX file. Readers are encouraged to review mmx’s code to see the specifics.
Another related utility on the Matlab File Exchange is Thomas Weibel’s MexThread, which uses C++11’s std::thread.