Although this chapter is devoted to empirical techniques for modeling linear systems, artificial neural networks (ANNs), which can model both linear and nonlinear systems, are discussed here as well because of their popularity. Following a short historical perspective, the foundations of ANNs are summarized. Given the availability of numerous ANN software packages on different platforms, there is no need to construct ANN models from scratch unless a very specialized, custom application is intended.
Neural networks were inspired by the way the human brain works as an information-processing system: in a highly complex, nonlinear, and massively parallel fashion. In its most general form, the following definition of a neural network as an adaptive machine has been suggested:
A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
1. Knowledge is acquired by the network from its environment through a learning process.
2. Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
ANNs have a large number of highly interconnected processing elements, also called nodes or artificial neurons. The first computational model of a biological neuron, a binary threshold unit whose output was either 0 or 1 depending on whether its net input exceeded a given threshold, was proposed by McCulloch and Pitts in 1943. Their interpretation united the studies of neurophysiology and mathematical logic. The next major development came in 1949, when Hebb, in his book The Organization of Behavior, explicitly proposed that the connectivity of the brain is continually changing as an organism learns different functional tasks, and that "neural assemblies" are created by such changes. This suggested that a system of neurons, assembled in a finite state automaton, could compute any arbitrary function, given suitable values of the weights between the neurons. Fifteen years after these pioneering studies, a procedure for automatically finding suitable values for those weights was introduced by Rosenblatt in his work on the perceptron, a function that computes a linear combination of its input variables and returns the sign of the result. The perceptron convergence theorem states that this iterative learning procedure always converges to a set of weights that produces the desired function, as long as the desired function is computable by the network [521, 643]. Interest in neural networks was revived from about 1985, when Rumelhart et al. popularized a much faster learning procedure called back-propagation, which could train a multi-layer perceptron to compute any desired function.
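The classic perceptron described above can be sketched in a few lines. The following is a minimal illustrative implementation (the function and variable names are our own, not from the text): each output is sign(w·x + b), and a misclassified example nudges the weights toward it, the update to which the perceptron convergence theorem applies for linearly separable data.

```python
def predict(w, b, x):
    """Perceptron output: sign of the linear combination w.x + b."""
    s = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= 0 else -1

def train_perceptron(samples, lr=1.0, epochs=100):
    """Rosenblatt's learning rule. samples: list of (x, y), y in {-1, +1}."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in samples:
            if predict(w, b, x) != y:  # misclassified: move weights toward y
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:  # no mistakes in a full pass: converged
            break
    return w, b

# Logical AND is linearly separable, so convergence is guaranteed.
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = train_perceptron(data)
print(all(predict(w, b, x) == y for x, y in data))  # True
```

A single perceptron cannot represent functions that are not linearly separable (e.g. XOR); removing that limitation is precisely what back-propagation training of multi-layer perceptrons achieved.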
Other commonly used names for ANNs include parallel distributed processors, connectionist models (or networks), self-organizing systems, neurocomputing systems, and neuromorphic systems. ANNs can be seen as "black-box" models for which no prior knowledge about the process is needed. The goal is to develop a process model based only on the input-output data acquired from the process. There are benefits and limitations of using ANNs for empirical modeling [226, 483]:
• Adaptive Behavior. ANNs have the ability to adapt, or learn, in response to their environment through training. A neural network can easily be retrained to deal with minor changes in the operational and/or environmental conditions. Moreover, when it is operating in a nonstationary environment, it can be designed to adjust its synaptic weights in real time. This is an especially valuable asset in adaptive pattern classification and adaptive control.
• Nonlinearity. A neural network is made up of interconnections of neurons and is itself nonlinear. This special kind of nonlinearity is distributed throughout the network. The representation of nonlinear behavior by a nonlinear structure is a significant property, since most fermentation and biological processes are inherently highly nonlinear.
• Pattern Recognition Properties. ANNs perform multivariable pattern recognition tasks very well. They can learn from examples (training) by constructing an input-output mapping for the system of interest.
In the pattern classification case, an ANN can be designed to provide information about similar and unusual patterns. Training and pattern recognition must be performed using a closed set of patterns: all possible patterns to be recognized should be present in the data set.
• Fault Tolerance. A properly designed and implemented ANN is usually capable of robust computation. Its performance degrades gracefully under adverse operating conditions and when some of its connections are severed.
• Long Training Times. When structurally complex ANNs or inappropriate optimization algorithms are used, training may take unreasonably long times.
• Necessity of Large Amount of Training Data. If the size of input-output data is small, ANNs may not produce reliable results. ANNs provide more accurate models and classifiers when large amounts of historical data rich in variations are available.
• No Guarantee of Optimal Results. Training may cause the network to be accurate in some operating zones, but inaccurate in others. While trying to minimize the error, it may get trapped in local minima.
• No Guarantee of Complete Reliability. This general fact about all computational techniques is particularly true for ANNs. In fault diagnosis applications, for instance, ANNs may misdiagnose some faults 1% of the time while misdiagnosing other faults in the same domain 25% of the time. It is hard to determine a priori (when the backpropagation algorithm is used) which faults will be prone to higher levels of misdiagnosis.
• Operational Problems Associated with Implementation. There are practical problems related to training data set selection [302, 334].