Responsible for this page: webmaster, cvlwebmaster@isy.liu.se
Page last update: 2005-05-22

Learning

Over the years, the complexity of systems for information processing has increased dramatically. In many fields, such as vision, robotics, speech, and control, it has however proved increasingly difficult to specify sufficiently complex, parameter-controlled operations, where each operation may be valid only within a limited window of context. Such information could in principle be supplied through training.

There has been extensive research on mechanisms that would allow such an acquisition of information, e.g. under the heading of neural networks. In spite of the extensive literature on various aspects of the subject, success in powerful applications employing learning has been limited. The reason is the limited capacity of the available learning structures.

Learning has become an important line of research at CVL, in the conviction that sufficiently complex, steerable operations can only be generated using learning structures.

New Information Representations

A new architecture for learning systems has been developed. A number of particular design features, in combination, result in high performance and excellent robustness.

The architecture uses a monopolar channel information representation. The channel representation implies a partially overlapping mapping of signals into a higher-dimensional space, such that a flexible but continuous restructuring mapping can be made. The high-dimensional mapping introduces locality in the information representation, a property that is directly available in wavelet or filter outputs. One consequence is that single-level maps can produce closed decision regions, which eliminates the need for back-propagation. See [Granlund2000].
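As an illustration of the channel representation, the following Python sketch maps a scalar onto a set of partially overlapping channels. The cos² kernel, the unit spacing between channel centres, the kernel width of three spacings, and the name `channel_encode` are assumptions made for this example; the text above does not specify them.

```python
import math

def channel_encode(x, n_channels, width=3.0):
    """Encode scalar x into n_channels overlapping, non-negative channel values.

    Assumed kernel: cos^2 with channel centres at 0, 1, ..., n_channels - 1
    and a support of `width` channel spacings (illustrative choice only).
    """
    omega = math.pi / width
    values = []
    for k in range(n_channels):
        d = x - k  # signed distance from x to the centre of channel k
        if abs(d) < width / 2:
            values.append(math.cos(omega * d) ** 2)
        else:
            values.append(0.0)  # outside the kernel support: no response
    return values

channels = channel_encode(3.3, 8)
```

With these assumed settings, a value such as 3.3 activates only channels 2, 3 and 4, so the representation is local, and the channel values move continuously from one channel to the next as the input changes. For inputs in the interior of the channel range, the three active cos² responses always sum to 1.5.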

The monopolar property implies that data use only one polarity, say positive values, in addition to zero, which leaves zero free to represent no information. This makes it possible to attach confidence statements to data, leading to a low sensitivity to noise in features and allowing a more efficient sparse representation. See [Granlund2000].
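The role of zero as "no information" can be sketched in Python as follows. The example assumes a cos² channel kernel of width three and a confidence value in the range 0 to 1 that simply scales the channel values; the function names `encode_with_confidence` and `confidence_of` are hypothetical and not taken from [Granlund2000].

```python
import math

def encode_with_confidence(x, confidence, n_channels, width=3.0):
    """Return a sparse {channel_index: value} encoding of x.

    Values are non-negative (monopolar). Channels with value zero are
    not stored, so a zero-confidence statement is an empty dict.
    The cos^2 kernel and the scaling by `confidence` are assumptions.
    """
    omega = math.pi / width
    sparse = {}
    for k in range(n_channels):
        d = x - k
        if abs(d) < width / 2:
            value = confidence * math.cos(omega * d) ** 2
            if value > 0.0:
                sparse[k] = value
    return sparse

def confidence_of(sparse):
    """Recover the confidence: interior cos^2 channels sum to 1.5 * confidence."""
    return sum(sparse.values()) / 1.5

strong = encode_with_confidence(3.3, 1.0, 8)   # fully trusted measurement
weak = encode_with_confidence(3.3, 0.4, 8)     # same value, lower confidence
missing = encode_with_confidence(3.3, 0.0, 8)  # no information: stores nothing
```

Because only the few active channels are stored, an uncertain or missing measurement costs little or nothing to represent, and a downstream stage can weight it according to its magnitude.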

The processing mode of the architecture is association, where the mapping of feature inputs onto desired state outputs is learned from a representative training set. The sparse monopolar representation, together with locality and the use of individual learning rates, allows very fast optimization, as the system exhibits linear complexity.
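A minimal Python sketch of associative learning on sparse channel features is given below. It is not the optimization used in the architecture: it substitutes a plain normalized least-mean-squares update with a single global rate, whereas the text describes individual learning rates. The cos² kernel and the names `encode_sparse`, `train_associator` and `predict` are assumptions made for the example.

```python
import math

def encode_sparse(x, n_channels, width=3.0):
    """Hypothetical helper: cos^2 channel encoding stored as {index: value}."""
    omega = math.pi / width
    feats = {}
    for k in range(n_channels):
        d = x - k
        if abs(d) < width / 2:
            feats[k] = math.cos(omega * d) ** 2
    return feats

def predict(weights, feats):
    """Linear association: each output is a weighted sum of active features."""
    return [sum(row[i] * v for i, v in feats.items()) for row in weights]

def train_associator(samples, n_features, n_outputs, rate=0.5, epochs=50):
    """Learn a linear map from sparse features to outputs (normalized LMS).

    Only the weights of active features are read or updated, so the cost
    of one pass grows linearly with the number of non-zero feature values.
    """
    weights = [[0.0] * n_features for _ in range(n_outputs)]
    for _ in range(epochs):
        for feats, target in samples:
            norm = sum(v * v for v in feats.values()) or 1.0
            for j in range(n_outputs):
                pred = sum(weights[j][i] * v for i, v in feats.items())
                err = target[j] - pred
                for i, v in feats.items():
                    weights[j][i] += rate * err * v / norm
    return weights

# Toy training set: the output should respond near x = 4 and nowhere else.
training = [(1.0, 0.0), (4.0, 1.0), (7.0, 0.0)]
samples = [(encode_sparse(x, 9), [t]) for x, t in training]
weights = train_associator(samples, n_features=9, n_outputs=1)
```

In this toy case the single-layer linear map responds strongly around x = 4 and falls off on both sides, which is a one-dimensional example of a closed decision region obtained without back-propagation.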

The result is an architecture that allows systems with a complexity of some hundred thousand features, described by some hundred thousand samples, to be trained in typically less than an hour. The architecture has been tested experimentally on various problems, such as the design of hyper complex operations for view-centered object recognition in vision for robotics. A brief description of this work is given under the heading of robotics.