The latest version of MATLAB, 2007a, has been released. While some changes to base MATLAB are of interest to data miners (multi-threading, in particular), owners of the Statistics Toolbox receive a number of new features in this major upgrade.
First, the Statistics Toolbox makes new data structures available for categorical data (categorical arrays) and mixed-type data (dataset arrays). Most MATLAB users performing statistical analysis or data mining tend to store their data in numerical matrices (my preference) or cell arrays. Using ordinary matrices requires the programmer/analyst to manage things like variable names. Cell arrays deal with the variable name issue, but preclude some of the nice things about using MATLAB matrices. Hopefully these new structures make statistical analysis in MATLAB more natural.
Second, the Statistics Toolbox updates the classify function, permitting it to output the discovered discriminant coefficients (at last!). I have been complaining about this for a long time. Why? Because classify provides quadratic discriminant analysis (QDA), an important non-linear modeling algorithm. Without the coefficients, though, it is impossible to deliver models to other (admittedly inferior to MATLAB) platforms.
Also of note: the Genetic Algorithm and Direct Search Toolbox now includes simulated annealing.
More information on 2007a is available at The Mathworks.