Quantcast
Channel: Undocumented Matlab
Viewing all articles
Browse latest Browse all 219

Parsing mlint (Code Analyzer) output

$
0
0

Mlint, Matlab’s static code-analysis parser, was written by Stephen Johnson (the original developer of the enormously successful lint parser for C/C++ back in 1977), when he was lured by MathWorks in 2002 to develop a similar tool for Matlab. Since its development (in R14 I believe), and especially since its incorporation in Matlab’s Editor in R2006a (Matlab 7.2), mlint has become a very important tool for reporting potential problems in m-files.

Unfortunately, to this day (R2013a), there is no documented manner of programmatically separating mlint warnings and errors, nor for accessing any of the multitude of features that are readily available in mlint. Naturally, there is (and has always been) an undocumented back door.

From its earliest beginnings, mlint has relied on C code (presumably modeled after lint). For many years mlint relied on a mex file (%matlabroot%/toolbox/matlab/codetools/mlintmex.mex*), which is basically just a wrapper for mlint.dll where the core algorithm resides. In recent releases, mlintmex, just like many other core mex files, was ported into a core Matlab library (libmwbuiltins.dll on Windows). However, the name and interface of the mlintmex function have remained unchanged over the years. Wrapping the core mlintmex function is the mlint m-function (%matlabroot%/toolbox/matlab/codetools/mlint.m) that calls mlintmex internally. In R2011b (Matlab 7.13) its official function name has changed to checkcode, although this was never documented in the release notes for some reason. However, using mlint still works even today. Wrapping all that is the mlintrpt function, which calls mlint/checkcode internally.

The core function mlintmex returns a long string with embedded newlines to separate the messages. For example:

>> str = mlintmex('perfTest.m')
str = 
L 3 (C 1): The value assigned to variable 'A' might be unused.
L 4 (C 1): The value assigned to variable 'B' might be unused.
L 5 (C 1-3): Variable 'ops', apparently a structure, is changed but the value seems to be unused.
L 12 (C 9): This statement (and possibly following ones) cannot be reached.
L 53 (C 19-25): The function 'subFunc' might be unused.
L 53 (C 27-35): Input argument 'iteration' might be unused. If this is OK, consider replacing it by ~.

We can parse this long string ourselves, but there is no need since mlint/checkcode do this for us, returning a struct array:

>> results = mlint('perfTest.m')
results = 
6x1 struct array with fields:
    message
    line
    column
    fix
>> results(5)
ans = 
    message: 'The function 'subFunc' might be unused.'
       line: 53
     column: [19 25]
        fix: 0

As can be seen, the message severity (warning/error) does not appear. This severity is obviously available since it is integrated in the Editor and the Code Analyzer report – orange for warnings, red for errors.

In one of my projects I needed to enable the user to dynamically create executable Matlab code that would then be run interactively. This enabled users to create dynamic data analyses functions without actually needing to know Matlab or to code all the nuts-and-bolts of a regular Matlab function. For this I needed to display warnings and errors-on-the-fly (the dynamic cell tooltips used a custom table cell-renderer). Here’s the end-result:


Analysis definition panel

Analysis definition panel


Dynamic analysis alert tooltips
Dynamic analysis alert tooltips

Dynamic analysis alert tooltips


My solution was to use mlintmex, as follows:

% Get the relevant message strings
errMsgs = mlintmex('-m2', srcFileName);
allMsgs = mlintmex('-m0', srcFileName);
 
% Parse the strings to find newline characters
numErrors = length(strfind(regexprep(errMsgs,'\*\*\*.*',''),char(10)));
numAllMsg = length(strfind(regexprep(allMsgs,'\*\*\*.*',''),char(10)));
numWarns = numAllMsg - numErrors;

(and from the messages themselves [errMsgs,allMsgs] I extracted the actual error/warning location)

Alternatively, I could have used mlint directly, as I have recently explained:

% Note that mlint returns struct arrays, so the following are all structs, not strings
errMsgs = mlint('-m2',srcFileNames); % m2 = errors only
m1Msgs  = mlint('-m1',srcFileNames); % m1 = errors and severe warnings only
allMsgs = mlint('-m0',srcFileNames); % m0 = all errors and warnings

The original information about mlintmex and the undocumented -m0/m1/m2 options came from Urs (us) Schwartz, whose contributions are an endless source of such gems. Urs also provided a list of other undocumented mlint options:

'-all'
'-allmsg'
'-amb'
'-body'
'-callops'
'-calls'
'-com'
'-cyc'
% '-db'       % == -set + -ud + -tab
'-dty'
'-edit'
'-en'         % messages in English
'-id'
'-ja'         % messages in Japanese
'-lex'
'-m0'         % + other opt
'-m1'         % + other opt
'-m2'         % + other opt
'-m3'         % + other opt
'-mess'
'-msg'
'-notok'
'-pf'
'-set'
'-spmd'
'-stmt'
'-tab'
'-tmtree'
'-tmw'        % not valid anymore
'-toks'
'-tree'
'-ty'
'-ud'
'-yacc'       % ONLY: !mlint FILE -yacc -...

to which were added in recent years ‘-eml’, ‘-codegen’ etc. – see the checkcode doc page. Also note that not all Matlab releases support all options. For example, ‘-tmw’ is ignored in R2013a, returning the same data as ‘-all’ plus a warning about the ignored option.

Urs prepared a short utility called doli that accepts an m-file name and returns a struct whose fields are the respective outputs of mlint for each of the corresponding options:

>> results = doli('perfTest.m')
MLINT >   C:\Yair\Books\MATLAB Performance Tuning\Code\perfTest.m
OPTION>   -all       6
OPTION>   -allmsg    501
OPTION>   -amb       17
OPTION>   -body      6
OPTION>   -callops   15
OPTION>   -calls     15
OPTION>   -com       6
OPTION>   -cyc       8
OPTION>   -dty       162
OPTION>   -edit      92
OPTION>   -en        7
...

Some of these options are used by Urs’ farg and fdep utilities. Their usage of mlint rather than direct m-code parsing, is part of the reason that these functions are so lightningly fast.

For example, we can use the ‘-calls’ options to parse an m-file and get the names, type, and code location of its contained functions (explanation):

>> mlint('-calls','perfTest.m')
M0 1 10 perfTest
E0 51 3 perfTest
U1 3 5 randi
U1 4 5 num2cell
U1 4 14 randn
U1 6 1 whos
U1 7 1 tic
U1 7 6 save
U1 7 45 toc
U1 9 6 savefast
S0 53 19 subFunc
E0 60 3 subFunc
U1 55 8 isempty
U1 56 20 load
U1 57 29 sin

With so many useful features, I really cannot understand why they were never exposed to the public in a documented manner. After all, they have remained pretty-much unchanged for many years and can provide enormous benefits for developers of unit-tests and interactive analysis frameworks (as I have shown above).

As a side-note, in R2010a (Matlab 7.10), mlint was renamed “Code Analyzer”, but this was really just a name change – its core functionality has changed little in the past decade. Some might argue that new checks were added and the Editor interface has improved by allowing auto-fixes and message suppression. But for a tool that is over a decade old (much more, if you count lint’s development), I contend that these are not much. Don’t get me wrong – I have the utmost respect for Steve. Serious unix C/C++ development relies on his lint and yacc tools on a regular basis. I think they show astonishing ingenuity and intelligence. It’s just that I had expected more after a decade of mlint development (I bet it’s not due to Steve suddenly losing the touch).

Addendum: A little birdie tells me that Steve left MathWorks a few years ago, which does explain things… I apologize to Steve for any misguided snide on my part. As I said above, I have nothing but the utmost respect for his work. The question of why MathWorks left his mlint work hanging without serious continuation remains open.

 
Related posts:
  1. Running VB code in Matlab Matlab does not natively enable running VB code, but a nice trick enables us to do just that...
  2. Undocumented scatter plot behavior The scatter plot function has an undocumented behavior when plotting more than 100 points: it returns a single unified patch object handle, rather than a patch handle for each specific...
  3. More undocumented timing features There are several undocumented ways in Matlab to get CPU and clock data...
  4. tic / toc – undocumented option Matlab's built-in tic/toc functions have an undocumented option enabling multiple nested clockings...
 

Viewing all articles
Browse latest Browse all 219

Trending Articles