History Of Neural Networks

Introduction: The history of neural networks begins in the early 1940s, nearly simultaneously with the history of programmable electronic computers. The youth of this field of research, as with computer science itself, is easy to recognize from the fact that many of the people cited are still with us.

The beginning: As early as 1943, Warren McCulloch and Walter Pitts introduced models of neurological networks, recreated threshold switches based on neurons, and showed that even simple networks of this kind are able to calculate nearly any logic or arithmetic function. Further precursors ("electronic brains") were developed, supported among others by Konrad Zuse, who was tired of calculating ballistic trajectories by hand.
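
As a rough illustration of how a threshold switch can compute logic functions, here is a minimal sketch in Python. It uses the simplified modern formulation with weighted sums and a negative weight for inhibition, which is an assumption and not the exact 1943 model. The function names and the chosen weights and thresholds are hypothetical.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 (fire) when the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def logic_and(a, b):
    # Both inputs must be on to reach the threshold of 2.
    return threshold_unit([a, b], [1, 1], threshold=2)


def logic_or(a, b):
    # A single active input already reaches the threshold of 1.
    return threshold_unit([a, b], [1, 1], threshold=1)


def logic_not(a):
    # An inhibitory (negative) weight with threshold 0 inverts the input.
    return threshold_unit([a], [-1], threshold=0)


# Print the truth tables for AND and OR.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, logic_and(a, b), logic_or(a, b))
```

Since AND, OR and NOT can be combined into any Boolean function, networks of such units can in principle compute any logic function.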

1947: Walter Pitts and Warren McCulloch indicated a practical field of application (which was not mentioned in their work from 1943), namely the recognition of spatial patterns by neural networks.

1949: Donald O. Hebb formulated the classical Hebbian rule [Heb49], which in its more generalized form represents the basis of nearly all neural learning procedures. The rule implies that the connection between two neurons is strengthened when both neurons are active at the same time. This change in strength is proportional to the product of the two activities. Hebb could postulate this rule, but due to the absence of neurological research he was not able to verify it.
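
The proportionality described above can be sketched as a one-line weight update. This is only an illustration: the learning rate of 0.1, the function name and the activity values are assumptions chosen for the example and do not come from Hebb's work.

```python
def hebbian_update(weight, pre_activity, post_activity, learning_rate=0.1):
    """Return the weight after one Hebbian step: dw = rate * pre * post."""
    return weight + learning_rate * pre_activity * post_activity


w = 0.0
# Both neurons are active: the connection is strengthened.
w = hebbian_update(w, pre_activity=1.0, post_activity=1.0)
# Only one neuron is active: the product is zero, so the weight is unchanged.
w = hebbian_update(w, pre_activity=1.0, post_activity=0.0)
print(w)
```

Note that this plain form only ever strengthens connections between co-active neurons. Generalized variants add terms that also allow weights to decrease.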

Golden age

1951: For his dissertation, Marvin Minsky developed the neurocomputer Snark, which was already capable of adjusting its weights automatically. However, it was never practically implemented: it was capable of busily calculating, but nobody really knew what it calculated.

1956: Well-known scientists and ambitious students met at the Dartmouth Summer Research Project and discussed, to put it crudely, how to simulate a brain. Differences between top-down and bottom-up research developed.

1957-1958: At MIT, Frank Rosenblatt, Charles Wightman and their coworkers developed the first successful neurocomputer, the Mark I perceptron, which was capable of recognizing simple numerals by means of a 20 × 20 pixel image sensor and worked electromechanically with 512 motor-driven potentiometers, each potentiometer representing one variable weight.

1959: Frank Rosenblatt described different versions of the perceptron, and formulated and verified his perceptron convergence theorem. He described neuron layers mimicking the retina, threshold switches, and a learning rule adjusting the connecting weights.

1960: Bernard Widrow and Marcian E. Hoff introduced the ADALINE (ADAptive LInear NEuron), a fast and precise adaptive learning system that was the first widely commercially used neural network: it could be found in nearly every analog telephone for real-time adaptive echo filtering and was trained by means of the Widrow-Hoff rule, or delta rule. At that time Hoff, later co-founder of Intel Corporation, was a PhD student of Widrow. One advantage the delta rule had over the original perceptron learning algorithm was its adaptivity: the weight change is proportional to the remaining error, so the steps become smaller as the output approaches the target. The disadvantage: misapplication led to infinitesimally small steps close to the target.
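
A minimal sketch of the delta rule for a single linear unit may make this adaptivity concrete. The training data, learning rate, epoch count and function name below are assumptions made for illustration, not details of the original ADALINE hardware.

```python
def train_adaline(samples, learning_rate=0.1, epochs=500):
    """Fit the weights and bias of one linear unit with the delta rule."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # Linear output: no threshold is applied during training.
            output = bias + sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            # Delta rule: each step is proportional to the current error,
            # so the steps shrink as the output approaches the target.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training data generated from target = 2*x1 - x2 + 0.5.
samples = [
    ([0.0, 0.0], 0.5),
    ([1.0, 0.0], 2.5),
    ([0.0, 1.0], -0.5),
    ([1.0, 1.0], 1.5),
]
weights, bias = train_adaline(samples)
print(weights, bias)  # approaches [2.0, -1.0] and 0.5
```

Because the step size depends on the error, learning is fast far from the target and slows down close to it, which matches both the advantage and the disadvantage named above.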

1961: Karl Steinbuch introduced technical realizations of associative memory, which can be seen as predecessors of today’s neural associative memories. Additionally, he described concepts for neural techniques and analyzed their possibilities and limits.

1965: It was assumed that the basic principles of self-learning and therefore, generally speaking, "intelligent" systems had already been discovered. Today this assumption seems to be an exorbitant overestimation, but at that time it provided for high popularity and sufficient research funds.

1969: Marvin Minsky and Seymour Papert published a precise mathematical analysis of the perceptron [MP69] to show that the perceptron model was not capable of representing many important problems (keywords: XOR problem and linear separability), and so put an end to overestimation, popularity and research funds. The implication that more powerful models would show exactly the same problems, and the forecast that the entire field would be a research dead end, resulted in a nearly complete decline in research funds for the next 15 years, no matter how incorrect these forecasts were from today’s point of view.
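
The XOR problem can be illustrated with a small search. The sketch below tries every weight combination on a coarse grid for a single threshold unit and checks whether any of them reproduces a given truth table. The grid, the function names and the strict "greater than zero" firing condition are assumptions. A grid search is an illustration, not a proof.

```python
from itertools import product


def unit_output(x1, x2, w1, w2, bias):
    """Single threshold unit: fire when the weighted sum is positive."""
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0


def is_representable(truth_table, grid):
    """Check whether any (w1, w2, bias) on the grid reproduces the table."""
    for w1, w2, bias in product(grid, repeat=3):
        if all(unit_output(x1, x2, w1, w2, bias) == y
               for (x1, x2), y in truth_table.items()):
            return True
    return False


grid = [i / 2 for i in range(-8, 9)]  # -4.0 to 4.0 in steps of 0.5
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

print(is_representable(AND, grid))  # True
print(is_representable(XOR, grid))  # False
```

The negative result for XOR holds for all real weights, not only for the grid: the four conditions require bias ≤ 0, w1 + bias > 0, w2 + bias > 0 and w1 + w2 + bias ≤ 0, and adding the two middle inequalities contradicts the other two.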

1972: Teuvo Kohonen introduced a model of the linear associator, a model of an associative memory. In the same year, such a model was presented independently and from a neurophysiologist’s point of view by James A. Anderson.

1973: Christoph von der Malsburg used a neuron model that was nonlinear and biologically more motivated.

1974: For his dissertation at Harvard, Paul Werbos developed a learning procedure called backpropagation of error, but it was not until one decade later that this procedure reached today’s importance.

1976-1980 and thereafter: Stephen Grossberg presented many papers in which numerous neural models are analyzed mathematically. Furthermore, he dedicated himself to the problem of keeping a neural network capable of learning without destroying already learned associations. In cooperation with Gail Carpenter, this led to models of adaptive resonance theory (ART).

1982: Teuvo Kohonen described the self-organizing feature maps, also known as Kohonen maps. He was looking for the mechanisms involving self-organization in the brain. He knew that the information about the creation of a being is stored in the genome, which, however, does not have enough memory for a structure like the brain. As a consequence, the brain has to organize and create itself for the most part.

1983: Fukushima, Miyake and Ito introduced the neural model of the Neocognitron which could recognize handwritten characters and was an extension of the Cognitron network already developed in 1975.

1985: John Hopfield published an article describing a way of finding acceptable solutions for the Travelling Salesman problem by using Hopfield nets.

1986: The backpropagation of error learning procedure, as a generalization of the delta rule, was separately developed and widely published by the Parallel Distributed Processing Group: non-linearly-separable problems could be solved by multilayer perceptrons, and Marvin Minsky’s negative evaluations were disproven at a single blow. At the same time, a certain fatigue spread in the field of artificial intelligence, caused by a series of failures and unfulfilled hopes.
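
To see why an additional layer removes the XOR limitation, consider the following minimal sketch of a two-layer network of threshold units. It does not perform backpropagation. The weights are set by hand (an assumption made purely for illustration) to show that a hidden layer makes XOR representable, which is the kind of solution a learning procedure for multilayer networks can then search for.

```python
def step(value):
    """Threshold activation: fire when the input is positive."""
    return 1 if value > 0 else 0


def xor_network(x1, x2):
    # Hidden layer with two units and hand-chosen weights.
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is on
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are on
    # Output unit: "OR but not AND", which is exactly XOR.
    return step(h_or - h_and - 0.5)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_network(x1, x2))
```

Each hidden unit draws one straight line through the input space, and the output unit combines the two regions, something a single unit cannot do.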

From this time on, the development of the field of research has been almost explosive.