Friday, April 21, 2017

A New Dawn for Local Learning Methods?

The relentless improvement in speed of computers continues. While some technical barriers to this progress have begun to emerge, exploitation of parallelism has actually increased the rate of acceleration for many purposes, especially in applied mathematical fields such as data mining.

Interestingly, new, powerful hardware has been put to the task of running ever more baroque algorithms. Feedforward neural networks, once trained over several days, now train in minutes on affordable desktop hardware. Over time, ever fancier algorithms have been fed to these machines: boosting, support vector machines, random forests and, most recently, deep learning illustrate this trend.

Another class of learning algorithms may also benefit from developments in hardware: local learning methods. Typical of local methods are radial basis function (RBF) neural networks and k-nearest neighbors (k-NN). RBF neural networks were briefly popular in the heyday of neural networks (the 1990s) since they train much faster than the more popular feedforward neural networks. k-NN is often discussed in chapter 1 of machine learning books: it is conceptually simple, easy to implement and demonstrates the advantages and disadvantages of local techniques well.

Local learning techniques usually have a large number of components, each of which handles only a small fraction of the set of possible input cases. The nice thing about this approach is that these local components largely do not need to coordinate with each other: The complexity of the model comes from having a large number of such components to handle many different situations. Local learning techniques thus make training easy: In the case of k-NN, one simply stores the training data for future reference. Little, if any, "fitting" is done during learning. This gift comes with a price, though: Local learning systems train very quickly, but model execution is often rather slow. This is because local models will either fire all of those local components, or spend time figuring out which among them applies to any given situation.
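To make the store-then-search character of these methods concrete, below is a minimal k-nearest neighbors sketch in MATLAB. The function name (knnPredict) and the 0/1 class coding are my own illustrative choices, not code from this blog: "training" amounts to retaining X and Y, and all of the work is deferred to prediction time.

    function YNew = knnPredict(X,Y,XNew,k)
    % Minimal k-NN classifier for 0/1 class labels (illustrative sketch).
    % "Training" is just the retention of X (n-by-p inputs) and Y (n-by-1 labels);
    % all of the computation happens here, at prediction time.
    nNew = size(XNew,1);
    YNew = zeros(nNew,1);
    for i = 1:nNew
        % Squared Euclidean distance from the query case to every stored case
        D = sum(bsxfun(@minus,X,XNew(i,:)).^2,2);
        [~,Idx] = sort(D);                    % nearest stored cases first
        YNew(i) = mean(Y(Idx(1:k))) >= 0.5;   % majority vote among the k nearest
    end
    end

Note how the loop over stored cases is exactly the execution-time cost described above: every query must be compared against some or all of the retained training data.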

Local learning methods have largely fallen out of favor since: 1. they are slow to predict outcomes for new cases and, secondarily, 2. their implementation requires retention of some or all of the training data.

This author wonders whether contemporary computer hardware may not present an opportunity for a resurgence of local methods. Local methods often perform well statistically, and would help diversify model ensembles for users of more popular learning algorithms. Analysts looking for that last measure of improvement might be well served by investigating this class of solutions. Local algorithms are among the easiest to code from scratch. Interested readers are directed to "Lazy Learning", edited by D. Aha (ISBN-13: 978-0792345848) and "Nearest Neighbor Norms: NN Pattern Classification Techniques", edited by B. Dasarathy (ISBN-13: 978-0818689307).


Tuesday, February 16, 2010

Single Neuron Training: The Delta Rule

I have recently put together a routine, DeltaRule, to train a single artificial neuron using the delta rule. DeltaRule can be found at MATLAB Central.

This posting will not go into much detail, but this type of model is something like a logistic regression, where a linear model is calculated on the input variables, then passed through a squashing function (in this case the logistic curve). Such models are most often used to model binary outcomes, hence the dependent variable is normally composed of the values 0 and 1.
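In symbols, for an n-by-p input matrix X, a p-by-1 weight vector W, and a scalar bias B (these names are mine for illustration, not necessarily those used inside DeltaRule), the model output is the logistic curve applied to the linear combination:

    % Logistic squashing of a linear model (illustrative variable names)
    YHat = 1 ./ (1 + exp(-(X * W + B)));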

Single neurons with linear combination functions (squashed or not) are only capable of separating classes that can be divided by a line (plane, hyperplane), yet they are often useful, either by themselves or as building blocks for more complex models.

Use help DeltaRule for syntax and a simple example of its use.

Anyway, I thought readers might find this routine useful. It trains quickly and the code is straightforward (I think), making modification easy. Please write to let me know if you do anything interesting with it.

If you are already familiar with simple neural models like this one, here are the technical details:

Learning rule: incremental delta rule
Learning rate: constant
Transfer function: logistic
Exemplar presentation order: random, by training epoch
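For readers who want the mechanics spelled out, here is a sketch of one training epoch of the incremental delta rule with a logistic transfer function, written from the description above. It is not the DeltaRule code itself, and the variable names (W, B, LearningRate) are illustrative; whether the logistic derivative term appears in the update depends on the error criterion being minimized.

    % One epoch of the incremental delta rule (illustrative sketch, not DeltaRule itself)
    Order = randperm(size(X,1));                     % random exemplar presentation order
    for i = Order
        Net   = X(i,:) * W + B;                      % linear combination of the inputs
        YHat  = 1 ./ (1 + exp(-Net));                % logistic transfer function
        Delta = (Y(i) - YHat) * YHat * (1 - YHat);   % error times logistic derivative
        W     = W + LearningRate * Delta * X(i,:)';  % weight update
        B     = B + LearningRate * Delta;            % bias update
    end

Repeating this loop over many epochs, with a suitably small constant learning rate, drives the weights toward a linear decision boundary for the 0/1 target.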

See also the Mar-15-2009 posting, Logistic Regression and the Dec-11-2010 posting, Linear Discriminant Analysis (LDA).