Components Of Neural Networks

Common activation functions: The simplest activation function is the binary threshold function (also referred to as the Heaviside function), which can only take on two values. If the input is above a certain threshold, the function changes from one value to the other, but otherwise remains constant. This implies that the function is not differentiable at the threshold, and everywhere else its derivative is 0. Because of this, backpropagation learning, which relies on the derivative, is impossible with it. Also very popular is the Fermi function, or logistic function,

f(x) = 1 / (1 + e^(−x)),

which maps to the range of values (0, 1), and the hyperbolic tangent (fig. 3.2), which maps to (−1, 1). Both functions are differentiable. The Fermi function can be extended by a temperature parameter T into the form

f(x) = 1 / (1 + e^(−x/T)).

The smaller this parameter, the more it compresses the function along the x axis. Thus, one can approximate the Heaviside function arbitrarily closely. Incidentally, there exist activation functions which are not explicitly defined but depend on the input according to a random distribution (stochastic activation functions).

An alternative to the hyperbolic tangent that is really worth mentioning was suggested by Anguita et al., who were tired of the slowness of the workstations back in 1993. Thinking about how to make neural network propagation faster, they quickly identified the approximation of the exponential function used in the hyperbolic tangent as one of the causes of slowness. Consequently, they "engineered" an approximation to the hyperbolic tangent using just two parabola pieces and two half-lines. At the price of delivering a slightly smaller range of values than the hyperbolic tangent ([−0.96016, 0.96016] instead of (−1, 1)), it can, depending on the CPU used, be calculated 200 times faster, because it needs just two multiplications and one addition.
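The activation functions discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the output values 0 and 1 of the threshold function and all function names are chosen here for the example. The piecewise approximation is a reconstruction, not necessarily the exact formula of Anguita et al.: it takes the saturation value 0.96016 from the text and assumes that the two parabola pieces meet at the origin with slope 1 (like the hyperbolic tangent) and join the half-lines with slope 0, which fixes the remaining constants.

```python
import math

def heaviside(x, threshold=0.0):
    """Binary threshold function; the two values 0 and 1 are an assumption."""
    return 1.0 if x > threshold else 0.0

def fermi(x, T=1.0):
    """Fermi (logistic) function with temperature parameter T > 0."""
    z = x / T
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form for negative z that avoids overflow in exp().
    e = math.exp(z)
    return e / (1.0 + e)

H = 0.96016          # saturation value quoted in the text
L = 2.0 * H          # assumed breakpoint: yields slope 1 at the origin
C = 1.0 / (4.0 * H)  # assumed coefficient: makes the parabolas pass through 0

def fast_tanh(x):
    """Piecewise-quadratic stand-in for tanh (reconstruction, see lead-in)."""
    if x >= L:
        return H                     # upper half-line
    if x <= -L:
        return -H                    # lower half-line
    if x >= 0:
        return H - C * (x - L) ** 2  # upper parabola piece
    return C * (x + L) ** 2 - H      # lower parabola piece

if __name__ == "__main__":
    # Smaller T compresses the Fermi function towards a step.
    for T in (1.0, 0.1, 0.01):
        print(T, [round(fermi(x, T), 4) for x in (-0.5, 0.0, 0.5)])
    # Compare the approximation with the library tanh.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(x, round(math.tanh(x), 5), round(fast_tanh(x), 5))
```

Between the breakpoints, each evaluation needs no exponential at all, which is the point of the approximation; the actual speed-up on a given machine is not something this sketch measures.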

An output function may be used to process the activation once again: The output function of a neuron j calculates the values which are transferred to the other neurons connected to j.

Definition 7 (Output function). Let j be a neuron. The output function

fout(a_j) = o_j

calculates the output value o_j of the neuron j from its activation state a_j. Generally, the output function is defined globally, too. Often this function is the identity, i.e. the activation a_j is directly output:

fout(a_j) = a_j, so o_j = a_j

Unless explicitly specified otherwise, we will use the identity as the output function within this text.
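As a small illustration of Definition 7, the following sketch applies one globally defined output function to the activations of several neurons. The names identity and outputs, the dictionary layout, and the example values are assumptions made for this example only; they do not come from the text.

```python
def identity(a):
    """Identity output function: the activation is passed on unchanged."""
    return a

def outputs(activations, f_out=identity):
    """Apply the same (global) output function to every activation a_j."""
    return {j: f_out(a_j) for j, a_j in activations.items()}

if __name__ == "__main__":
    # Hypothetical activation states a_j of two neurons.
    activations = {"j1": 0.25, "j2": -0.8}
    # With the identity, each o_j equals a_j.
    print(outputs(activations))
    # Any other function of the activation could be plugged in instead.
    print(outputs(activations, f_out=lambda a: -a))
```

Keeping the output function as a parameter with the identity as its default mirrors the convention stated above: it only has to be mentioned when something other than the identity is used.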

Definition 8 (General learning rule). The learning strategy is an algorithm that can be used to change and thereby train the neural network, so that the network produces a desired output for a given input.