NEURAL NETWORKS IN ENGINEERING APPLICATIONS
INTRODUCTION
Neural network methodologies offer a rich new paradigm of computational capabilities for the engineer faced with problems requiring some form of flexible functional mapping. Although some network schemes have been inspired by research into the neural structure of the brain, the best understood and most developed neural networks have generally evolved from well-known statistical methods such as stochastic approximation (see Robbins and Monro 1951) and can be viewed, in general, as recursive statistical estimation procedures (White 1989). Neural networks are distinguished by the ability to learn through various parallel distributed processing (PDP) schemes and should be considered by the engineer to be a tool for intelligence augmentation (Engelbart 1963).
A wide range of network models have been developed within the PDP paradigm (see Eckmiller and von der Malsburg 1988). In general, a PDP model consists of N units, i.e. neurons (also termed, non-biologically, nodes or elements), and an NxN weight matrix W = [w_ij], where W represents the knowledge or memory of the model and w_ij is the synaptic connection strength from neuron j to neuron i. PDP models perform seemingly intelligent functions such as pattern recognition in a manner similar to the brain, hence the biological connotations. To distinguish PDP models from biological networks, we refer to PDP models generally as artificial neural networks (ANNs).
Among the basic characteristics that distinguish these models are 1) learning algorithms and associated transfer functions, 2) error handling, 3) training methods, 4) the processing of input-output data, and 5) network topologies. This paper first reviews these characteristics for several better-known neural network models in order to illustrate potential uses in engineering applications. Finally, as a specific example of the ability of ANNs to augment the engineer's capabilities, an in-depth illustration of a back-propagation neural network (BPNN) trained and optimized to predict guyed tower cable pretensions is presented.
NETWORK LEARNING METHODS
There are a number of general classifications for network learning or updating rules that have been developed for various ANN models. These include coincidence, performance, competitive, filter, and spatiotemporal learning methodologies (Hecht-Nielsen 1989). All of these methods modify the connection weight matrix, allowing the network to adapt and learn. It is important to distinguish between learning and memorization, where learning implies the ability to generalize results for previously unseen or missing data.
Coincidence Learning
Changes to a network weight matrix that occur in response to events are essentially behavioral rewards. Hebb (1949) proposed a learning mechanism designed to simulate this type of response in ANNs. It can be seen from Hebb's updating rule for W (Eq. 1) that if no vector y of correct responses is presented to the network, no weight modifications occur.
w_ij(new) = w_ij(old) + α y_i x_j    (1)
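As a minimal sketch (our own illustration, not code from the paper), the Hebbian update of Eq. 1 can be written in a few lines; the function name and the value of the learning rate α are our own choices:

```python
import numpy as np

def hebb_update(W, x, y, alpha=0.1):
    """One Hebbian (coincidence) learning step: each weight w_ij is
    strengthened in proportion to the coactivity of input x_j and
    correct response y_i, as in Eq. 1."""
    return W + alpha * np.outer(y, x)

W = np.zeros((2, 3))
x = np.array([1.0, 0.0, 1.0])   # input pattern
y = np.array([1.0, 0.0])        # correct-response vector
W = hebb_update(W, x, y)
# With no correct-response vector (y = 0), the weights do not change,
# as noted in the text.
```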
Performance Learning
Gradient descent methods such as stochastic approximation and Least Mean Squares (LMS) are the precursors of performance learning methods. The network goal is to determine the optimal W which will minimize or maximize a specified performance measure or cost function F(W). The cost of incorrect decisions is proportional to the Euclidean distance between a network-output pattern y' and a correct target-output pattern y, and is known as the Least Mean Squared error goal, where:
F(W) = Σ ||y - y'||²    (2)
The weight matrix updating mechanism in Eq. 3, known as the LMS, Widrow-Hoff, or delta rule, uses gradient descent to search for the minimal F(W) by evaluating the change in the error and using a learning rate constant α to determine the proportion of weight change:
Δw_ij = α (y_i - y'_i) x_j    (3)
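A minimal sketch of the delta rule follows (our own illustration; symbols follow Eq. 3, and the linear mapping being learned is a hypothetical example):

```python
import numpy as np

def delta_rule_step(W, x, y, alpha=0.05):
    """One LMS (Widrow-Hoff / delta) step: move W down the gradient
    of the squared error between the network output y' = Wx and the
    target y, as in Eq. 3."""
    error = y - W @ x
    W = W + alpha * np.outer(error, x)
    return W, float(error @ error)

# Repeated presentations of input/target pairs drive the error down.
rng = np.random.default_rng(0)
W_true = rng.normal(size=(2, 3))   # hypothetical linear mapping to learn
W = np.zeros((2, 3))
errors = []
for _ in range(300):
    x = rng.normal(size=3)
    W, e = delta_rule_step(W, x, W_true @ x)
    errors.append(e)
```

Because the cost surface of Eq. 2 is quadratic in W for a linear unit, the error shrinks steadily for a sufficiently small learning rate.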
There are a number of variants of the delta rule. Networks that propagate error vectors backward use a modified learning mechanism called the general delta rule, discussed below.
Competitive Learning
Networks designed with this learning mechanism implement a competition process between processing elements before each learning episode, and only the winning elements of the competition are allowed to modify their weights. While there are a variety of competitive learning rules (Grossberg 1982; von der Malsburg 1973), perhaps the most well known is the Kohonen learning law (Kohonen 1984), which is similar to the statistical process of finding k-means. The k-means (Hecht-Nielsen 1989) for a set of data vectors {x1, x2, ..., xn}, chosen at random with respect to a fixed probability density function ρ, comprise a set of k vectors {w1, w2, ..., wk} that minimizes the sum
F = Σ_j [D(x_j, w(x_j))]²    (4)
where w(x_j) is the closest w vector to x_j, as measured using the distance measure D. Thus, much like Kohonen weight vectors, k-means are distributed in the same region as the x data vectors. However, the k-means are not equiprobable. The basic Kohonen learning law is essentially the k-means incremental adjustment law.
In the competitive process, each processing element calculates its input intensity I_i as
I_i = D(w_i, x)    (5)
where w_i = (w_i1, w_i2, ..., w_iL)^T and x = (x_1, x_2, ..., x_L)^T, and where D is a distance measurement function with either a spherical arc or Euclidean metric (Hecht-Nielsen 1989). The winning element has the minimum I_i and is updated according to
w_i(new) = w_i(old) + α (x - w_i(old))    (6)
and the losing elements remain unchanged by
w_i(new) = w_i(old)
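This winner-take-all update can be sketched as follows (our own illustration; a Euclidean metric and the learning rate value are assumed):

```python
import numpy as np

def kohonen_step(W, x, alpha=0.1):
    """One competitive learning step: each row of W is a weight
    vector w_i; only the row closest to x (the winner, per Eq. 5)
    moves toward x (Eq. 6); all other rows are left unchanged."""
    I = np.linalg.norm(W - x, axis=1)       # input intensities I_i = D(w_i, x)
    winner = int(np.argmin(I))
    W = W.copy()
    W[winner] += alpha * (x - W[winner])    # w_new = w_old + alpha (x - w_old)
    return W, winner

W0 = np.array([[0.0, 0.0], [1.0, 1.0]])
W1, win = kohonen_step(W0, np.array([0.9, 1.2]))
```

Repeated over data drawn from a fixed density, the rows of W settle near the k-means of the data, as the text notes.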
Filter Learning
Network connection weights subjected to learning rules in this category are determined by a filtering process which allows a particular input connection to learn the average activity of other input connections. When a processing element is stimulated by a non-zero signal from an input connection, the weight updating rule (also known as Grossberg learning, expressed in Eq. 7)
w_i(new) = w_i(old) + α U(z_i) [x - w_i(old)]    (7)
implements a step function where
U(z) = 1 if z > 0,  U(z) = 0 if z ≤ 0
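A sketch of this gated update (our own illustration; z stands for the stimulating signal, U for the step function, and the learning rate value is assumed):

```python
import numpy as np

def grossberg_step(w, x, z, alpha=0.1):
    """Filter (Grossberg) learning: when the gating signal z is
    positive, the weights drift toward the current input activity x
    (Eq. 7); otherwise the step function U(z) gates the update off."""
    U = 1.0 if z > 0 else 0.0
    return w + alpha * U * (x - w)

w = np.zeros(3)
w = grossberg_step(w, np.ones(3), z=1.0)    # stimulated: weights move toward x
w2 = grossberg_step(w, np.ones(3), z=0.0)   # not stimulated: no change
```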
Spatio-Temporal Learning
Independently developed by Bart Kosko (1988) and Harry Klopf (1988), spatiotemporal learning modifies weights in response to input signals and the time derivatives of selected processing-element input signals. Networks of this type implement asymmetrical intra-slab connections, with X and Y representing vector functions of time, and are primarily used to classify spatiotemporal patterns (such as speech) or as command sequence memories for control networks.
TRAINING
Supervised Training
The supervised network model is presented with pairs of training vectors: an external input vector X and a companion target-output vector Y. The input vector X is passed through the ANN model and a corresponding network-output vector Y' is obtained. The difference between the Y' vector and the Y vector is used to adjust the weights in W until Y' ≈ Y. The process of minimizing the difference E between the target and actual output vectors is similar to gradient descent techniques, with E being the error or cost function. After minimizing E, the ANN model is used to identify vectors that are similar to those it has been trained to recognize. Once trained, however, this type of network has no ability to adapt.
Unsupervised Training
Networks trained without presenting target-outputs in training patterns are unsupervised. Most often used for classification problems, the network determines a category structure from the input data.
Reinforcement (Or Graded) Training
This training approach is especially good for control and process‑optimization problems. The network only receives a score or grade over a sequence of input‑output trials which represents the value of some network performance measure or cost function rather than receiving the target‑output on each individual training trial as in supervised training.
NETWORK TOPOLOGIES
Engineering problem solving can be distinguished as either derivative (interpretation, classification, diagnosis) or formative (planning, design) (Garrett 1992). The desired network topology is largely determined by the nature of the problem and the characteristics of the data. Input data may be binary, bipolar, continuous, or statistical, and may require some form of preprocessing, such as normalization, prior to presentation to the network. Network outputs may be of the following general types: classifications, patterns, or analog outputs. Classified outputs are statistically mapped inputs to discrete output categories and have found uses in scene analysis, forecasting, and signal processing. Image processing, data fusion, and symbol identification problems might use a pattern output, which is developed in response to the level of input activation. Optimization is a special case of pattern outputs, which are interpreted as a set of decisions and are useful in operations research and task allocation problems. Analog outputs are typically mapped as real number metrics, with a wide variety of applications such as forecasting, functional mapping, process control, and robotics (Bailey 1990; Crooks 1992; Anderson 1990; Caudill 1988).
Network topology can generally be characterized in broad terms as associative or mapping as defined by the pattern of connections between the units, the propagation of signals, and the type of memory developed. These two primary categories of networks have provided the major thrust of ANN research and development for several decades. New developments in ANN methodology have also led to the additional categories of stochastic, spatiotemporal, and hierarchical networks which will not be discussed here.
Associative Networks
Networks of this type normally have a single layer (or slab) of nodes which is used to associate one set of vectors with another set of vectors. There are two primary classifications of associative networks, distinguishing 1) the relationship of input vectors to output vectors and 2) the relationship of the processing elements or nodes. The mechanisms for error handling in associative networks can also be distinguished as accretive or interpolative: an accretive network responds to a noisy input with the nearest stored example exactly, whereas an interpolative network responds with an interpolation among stored examples (Hecht-Nielsen 1989).
In the first classification, the relationships of input vectors are termed either autoassociative or heteroassociative. Autoassociative memories map X vectors to corresponding Y vectors which are assumed to be equal. Typical applications for autoassociative memory include character recognition and signal recognition; such memories can also serve as noise filters or content-addressable memory.
Heteroassociative memories map one set of patterns onto another, where Y and X are not assumed to be equal. Target classification, process monitoring, and financial trend analysis are among the many applications using this type of network structure. Among the newer approaches to network topology, "a more general approach to forming an associative memory is to avoid making a distinction between inputs and outputs. By concatenating the X vector and the Y vector into one longer measurement vector Z, a single probabilistic network can be used to find the global probability density function."
Associative networks are secondly classified as either feedforward or recurrent. In feedforward networks, the data flows from inputs X to outputs Y with no feedback connections, so that
y_i = f(Σ_j w_ij x_j)
In recurrent networks (Hopfield 1982), by contrast, the outputs are fed back as inputs, and the weight matrix W = (w_ij) is fixed, symmetrical, and has zero diagonal terms, i.e. w_ii = 0. The corresponding cost or energy function is E = -½ Σ_i Σ_j w_ij x_i x_j, with the network descending to a local minimum of E.
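The energy descent of such a recurrent network can be sketched as follows (our own illustration, assuming bipolar ±1 states and a Hebbian-style fixed W; not code from the paper):

```python
import numpy as np

def energy(W, s):
    """E = -1/2 s^T W s for a symmetric W with zero diagonal."""
    return -0.5 * s @ W @ s

def hopfield_step(W, s, rng):
    """Asynchronous update of one randomly chosen unit; any state
    flip this causes can only lower the energy, so the network
    descends toward a local minimum."""
    i = rng.integers(len(s))
    s = s.copy()
    s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    return s

rng = np.random.default_rng(0)
p = np.array([1.0, -1.0, 1.0, -1.0, 1.0])   # stored pattern
W = np.outer(p, p)
np.fill_diagonal(W, 0)                      # fixed, symmetric, w_ii = 0
s = p.copy()
s[0] = -s[0]                                # corrupt one bit
e_start = energy(W, s)
for _ in range(50):
    s = hopfield_step(W, s, rng)
```

The energy after the updates is no higher than at the start, and the corrupted bit is typically repaired, illustrating content-addressable recall.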
Stochastic networks (the Boltzmann machine) are a variant of discrete-time Hopfield networks which use simulated annealing procedures to solve the problem of spurious minima. The layers of Boltzmann machines are fully interconnected, and weight updating is random, asynchronous, and probabilistic (Ackley 1985; Bachmann 1987). Other recurrent networks of interest, though not discussed here, are the Brain State in a Box (BSB) (Anderson 1977) and the Bidirectional Associative Memory network (Kosko 1988).
Mapping Networks
Kolmogorov's (1957) Mapping Neural Network Existence Theorem showed that if a problem can be solved by a nonlinear mapping M of input f into output g, a neural network can provide the solution g = M(f) (Hecht-Nielsen 1987). Two basic categories of mapping networks are feature and prototype networks. Feature-based networks implement a functional input-output relationship and include back propagation and GMDH (Group Method of Data Handling) networks. Numerous variations of feature-based network models have been developed for engineering problems, such as the analysis of composite materials like reinforced concrete (Wu 1992) and the selection of formwork systems (Kamarthi 1992). The back propagation model, which will be used in an application modeling the nonlinear behavior of tower-guy interactions, is discussed at length below. Prototype networks, including counter-propagation and self-organizing networks, create a set of specific input-output examples that represent the mapping.
Feature-based Networks: Back propagation. Having evolved from stochastic approximation methodologies, back propagation neural networks (BPNNs) derive their power from organized iterative processing which produces associative learning. Associative learning is accomplished in BPNNs through a supervised gradient descent scheme developed by Rumelhart et al. (1986), as described below. In the feedforward process of the BPNN, independent variable pattern weights are modified as a signal is passed along connections from an input node layer through a hidden layer of transfer function nodes to an output layer of expected patterns. The term hidden simply means that these elements do not receive input directly from the outside world. The input layer normalizes and distributes the patterns to each of the nodes in the hidden layer, which acts as a collection of associative feature detectors, and the output layer generates an appropriate response. The state of each node is determined by signals sent to it from all nodes connected to it. These signals are biased by the value of the connection weights between nodes.
Independent variable patterns are normalized at the input layer, with each signal x_i given a value between 0 and 1 prior to presentation to the hidden node layer. Each connection between the input layer and a hidden node has an associated weight w_ij. The net input signal I_j to an individual hidden node is expressed as the sum of all connections between the input layer nodes and that particular hidden node, plus the connection weight w_bj from a bias node which serves as a threshold:

I_j = Σ_i w_ij x_i + w_bj
The net signal I_j at each hidden node is then processed with a sigmoid transfer function (see Figure 1), which again normalizes the values between 0 and 1 prior to being sent to the output layer. The general form of this function is expressed as:

f(I_j) = 1 / (1 + e^(-I_j))
The net signal I_k to an output node is the sum of all connections between the hidden layer nodes and that respective output node, expressed as:

I_k = Σ_j w_jk f(I_j) + w_bk
In this model, the net signal is processed through the sigmoid function to produce the final network-output value O_k:

O_k = 1 / (1 + e^(-I_k))
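The feedforward pass just described can be sketched in a few lines (our own illustration; the weight shapes and names are assumptions, not taken from NeuroShell):

```python
import numpy as np

def sigmoid(I):
    """Sigmoid transfer function: squashes any net signal into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-I))

def feedforward(x, W_ih, b_h, W_ho, b_o):
    """One forward pass: input layer -> hidden layer (sigmoid) ->
    output layer (sigmoid); b_h and b_o play the role of the bias
    (threshold) node weights."""
    h = sigmoid(W_ih @ x + b_h)   # net hidden signals I_j, then f(I_j)
    o = sigmoid(W_ho @ h + b_o)   # net output signals I_k, then O_k
    return h, o

rng = np.random.default_rng(0)
x = np.array([0.2, 0.8])                        # normalized inputs in [0, 1]
W_ih, b_h = rng.normal(size=(3, 2)), np.zeros(3)
W_ho, b_o = rng.normal(size=(1, 3)), np.zeros(1)
h, o = feedforward(x, W_ih, b_h, W_ho, b_o)
```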
The back propagation process of this network model implements a variant of the Widrow-Hoff learning rule known as the general delta rule, in which output layer error signals are propagated back through the network to perform the appropriate weight adjustments after each pattern presentation (see Hecht-Nielsen 1989; Widrow and Hoff 1960; Parker 1985). Rumelhart et al. (1986) describe the process of weight adjustment by the following equation:

Δw_jk(t+1) = η δ_pk f(I_j) + α Δw_jk(t)
where η is a learning coefficient and α is a momentum factor "which determines the effect of past weight changes on the current direction of movement in weight space" and proportions the amount of the last weight change to be added into the new weight change (also see Caudill 1988). The error signal back-propagated to the connection weights between the hidden and output layers is defined in terms of the difference between the target value T_pk for a particular input pattern and the network's feedforward output O_k as

δ_pk = O_k (1 - O_k)(T_pk - O_k)
The connection weights between the input and hidden layers are then changed using the error signal propagated back to the hidden layer,

δ_pj = f(I_j)(1 - f(I_j)) Σ_k δ_pk w_jk

Δw_ij(t+1) = η δ_pj x_i + α Δw_ij(t)
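Putting the feedforward pass and the weight-adjustment rules together, one training presentation of the general delta rule with momentum might be sketched as follows (our own illustration; a bias-free network, our own variable names, and the η and momentum values are assumptions):

```python
import numpy as np

def backprop_step(x, t, W_ih, W_ho, prev=(0.0, 0.0), eta=0.5, mom=0.1):
    """One pattern presentation of the general delta rule with a
    momentum term mom, for a one-hidden-layer sigmoid network
    (bias nodes omitted for brevity)."""
    sig = lambda I: 1.0 / (1.0 + np.exp(-I))
    h = sig(W_ih @ x)                      # hidden activations f(I_j)
    o = sig(W_ho @ h)                      # network outputs O_k
    d_o = (t - o) * o * (1 - o)            # output-layer error signals
    d_h = h * (1 - h) * (W_ho.T @ d_o)     # error propagated back to hidden layer
    dW_ho = eta * np.outer(d_o, h) + mom * prev[0]
    dW_ih = eta * np.outer(d_h, x) + mom * prev[1]
    return W_ih + dW_ih, W_ho + dW_ho, (dW_ho, dW_ih)

# Repeated presentations pull the output toward the target.
rng = np.random.default_rng(1)
W_ih = 0.5 * rng.normal(size=(3, 2))
W_ho = 0.5 * rng.normal(size=(1, 3))
x, t = np.array([1.0, 0.0]), np.array([0.9])
prev = (0.0, 0.0)
for _ in range(2000):
    W_ih, W_ho, prev = backprop_step(x, t, W_ih, W_ho, prev)
```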
Prototype-based Networks. Self-organizing 'Kohonen' networks use unsupervised competitive learning algorithms known as Learning Vector Quantizers (LVQ) to cluster input patterns into categories based on the proximity of the patterns to W (Kohonen 1984). The winning nodes, those with the minimum distance, and their nearest neighbors are adjusted. Modifications of this approach include using a probability-density-function matching criterion rather than the mean square error of the pattern-to-weight vector distances; networks of this type estimate probability density functions used to map input patterns to output patterns (Specht 1988).
Counter-propagation networks essentially combine a Kohonen layer with a Grossberg learning layer with no intra-slab connections. The x and y vectors are presented to the network from opposite ends and are propagated through the network in counterflow directions to produce approximations x' and y'. An improved derivation of this approach has been developed for structural damage detection (Zbigniew 1992), in which an additional distance metric is used to control the resolution of the network and redefine the clustering of sample space categories.
AN ENGINEERING APPLICATION USING BACK PROPAGATION
Guyed Tower Analysis
The modeling of guyed towers for analysis and design purposes is particularly complicated due to the importance of secondary deflection effects. While nonlinear effects are negligible in most structural systems, they significantly complicate the structural analysis of guyed towers. One aspect of the nonlinearity is the P-Δ effect due to applied loads. A second is the interaction between the cables and the tower. Both of these facets of tower analysis have received considerable attention in the literature (Issa and Avent 1991). The nonlinear cable-tower interaction, however, requires an iterative solution for the cable prestress tensions which can be very cumbersome and time consuming. A back-propagation neural network (BPNN) model has been developed that determines cable prestress tensions for nonlinear cable-tower interactions more efficiently than traditional methods.
The neural network model was developed using a commercially available back‑propagation network NeuroShell 4.1. The network is comprised of input nodes which correspond to the independent variables (length from tower to anchors and height of cable connection on the tower), output nodes which represent the optimal prestress tensions for each cable, and a layer of hidden nodes which serve as nonlinear feature detectors. Guyed towers of various heights, cable lengths and guy locations and configurations have been analyzed and the optimal prestress cable tensions have been determined to develop the BPNN model. As a result, the solution for the cable pretensions is narrowed down to at most a couple of iterations. The results obtained using the neural network are compared with those obtained using a standard, iterative guyed tower analysis.
While the analysis of a sagging cable itself is linear, the support movement at the cable-tower junction is nonlinear. Dean (1961) has presented a detailed analysis of sagging cables, including ice and dead weight as loading parameters. Dean's analysis forms the basis of the guy analysis approach used in this study. A typical guyed tower arrangement and the potential distortion due to the P-Δ effect are shown in Figure 2.
As the tower deflects laterally due to applied loads, the windward cables tighten and the leeward cables slacken. The resultant change in reactive force has both a horizontal and a vertical component. Since the guy cables are attached to the legs or to a starmount, a couple is produced during this movement. Thus, the guy cables at a given level can be approximated as equivalent translational and rotational spring supports. As the guy response is nonlinear, these spring constants are only valid at the specified reaction level. However it is assumed that within the vicinity of these reactions, the springs are approximately linear.
The guy cable analysis in this study is based on the assumption that: (1) The loads are applied symmetrically to the tower; and (2) the cables and the tower are attached to
the ground at the same elevation. Although the three dimensional aspects are included in both truss and cable analysis, the symmetry of the loadings and guy arrangements allow for the system analysis to be reduced to two dimensions. Thus, the torsional response of the tower is not addressed.
A comprehensive computer program (ANTENNA) has been developed by Issa and Avent (1991) using the discrete field analysis approach to develop a solution procedure for the analysis of guyed towers. The program allows for the analysis of three truss tower configurations in which the three tower legs form the apexes of an equilateral triangle, and the truss configurations on the faces of the tower are either: (1) X-braced; (2) Warren; or (3) Vierendeel. The modeling of the tower is exact in the sense that only the usual assumptions of linear elastic behavior are used. The guy behavior is modeled as previously discussed.
For the purpose of this study, only the X-braced truss tower configuration with two guy levels will be used. As illustrated in Fig. 3, the tower parameters H2 and L2 were kept at constant lengths of 96 ft (29.26 m) and 138 ft (42.06 m), respectively. The length of H1 was varied in a range from one-fourth to three-quarters of H2, and the length of L1 was set either equal to L2 or to half of L2. The cross-sectional parameter f was set at 3 ft (0.91 m).
As a rule of thumb, the number of hidden nodes should be approximately 2n+1, where n is the number of nodes in the input layer. This is based on Hecht-Nielsen's (1989) discussion of Kolmogorov's Mapping Neural Network Existence Theorem, which states that a network of this type can exactly implement any continuous function. It should be noted that networks with more than 2n+1 hidden nodes tend to memorize the training data rather than generalize the learned features. Furthermore, in practice we have found that networks should be tested over a range of hidden node counts; for this particular set of data, seven hidden nodes provided a sufficiently minimal error and greater generalization capability. White (1989) illustrated that to extract the most information out of stationary data, the learning rate should be declining. Therefore, a declining learning rate
was used, with a starting value of 1.0 and a final value of 0.1. The momentum value was 0.1. The network (see Fig. 4) was trained for 3.46 minutes (72,000 learning events). When minimum mean squared error values had been achieved, as shown in Fig. 5, R² was determined to be 0.99976 for the training data set and 0.99936 for the target data set. The BPNN model illustrated in Fig. 4 is comprised of four input nodes which correspond to the independent variables (lengths from tower to anchors, L1 and L2, and heights of cable connection on the tower, H1 and H2), two output nodes which represent the optimal prestress tensions for each cable (T1 and T2), a layer of hidden nodes, and two bias nodes.
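The shape of the declining schedule between its two endpoints is not stated; a linear interpolation is one simple possibility, sketched here as our own assumption:

```python
def declining_rate(event, total_events, eta_start=1.0, eta_final=0.1):
    """Linearly declining learning rate over a training run:
    eta_start at the first learning event, eta_final at the last."""
    frac = event / max(total_events - 1, 1)
    return eta_start + frac * (eta_final - eta_start)

# One rate per learning event, 1.0 at the start down to 0.1 at the end.
rates = [declining_rate(e, 72000) for e in range(72000)]
```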
The BPNN was trained using the training set shown in Table 1. The cable pretensions, T1 and T2, were determined using the software package ANTENNA (Issa and Avent 1991), with each set of values requiring several runs and each run containing at most 10 iterations.
Neural Network Results. After training the BPNN to a mean square error (MSE) of 5.178 x 10^-1, a range of values for H1, H2, L1, and L2 was presented to the BPNN model to determine corresponding values of T1 and T2 under different conditions. The values of T1 and T2 thus obtained were used as initial pretension values in ANTENNA. The number of iterations needed to balance the cable tensions was significantly reduced, from double digits to at most one or two. Hence, the time taken to perform a tower analysis was reduced from hours to minutes. The functions modeled by NeuroShell 4.1 are illustrated in Fig. 6. They describe expected cable pretensions for the heights of the guy-tower connections, H1 and H2, and the horizontal distances from tower base to guy anchors, L1 and L2. The plotted curves in Fig. 6 indicate that as the values of H1 approach the values of H2, the pretension T2 approaches a constant value. This is to be expected, since most of the lateral support formerly provided by the guy cables at H2 is shifted to the guy cables at H1. The cable pretension values T1 and T2 obtained from the BPNN were verified using ANTENNA. Each set of pretension parameters required just one program run of ANTENNA, which arrived at the appropriate solution after at most two iterations. Thus, the use of a BPNN to predict the cable pretensions allows for a more efficient guyed tower analysis and design. The efficiency of the BPNN approach derives from the fact that it allows the user to avoid the traditional iterative approach, which, depending on the choice of initial cable pretension, may require several program runs with multiple iterations.
SUMMARY
Back propagation learning algorithms are probably the best understood (and most developed) and have proven useful in a broad range of applications. Therefore, a BPNN was trained to predict the initial cable tensions in a guyed communication tower. To evaluate the back propagation neural network for this application, the results obtained using the network were compared with those obtained using a standard, iterative guyed tower analysis. Using the BPNN to predict pretension values improved the efficiency of the analysis of guyed towers by reducing the number of ANTENNA program runs from several to just one. The authors are in the process of extending this study to multi-level guyed towers with both constant and variable cross-sections between guy levels.
However, this review is primarily intended to 'whet the appetite' by providing an overview of the broad potential for implementation of neural network architectures. In addition to the networks specifically discussed here, there are many permutations and combinations of each. For example, by replacing the sigmoid activation function with an exponential function, a neural network can be developed which computes nonlinear decision boundaries that approach the Bayes optimal under certain conditions (Specht 1988). Hybrid systems that combine the strengths of neural networks and expert systems have been successfully implemented in financial and military realms. Multiple network systems such as the RCE network use modules to control multiple levels of networks to handle increasingly complex classification decisions: "the Controller can correlate the ambiguous answers of several RCE networks to produce an unambiguous response. The degree of decision-making is adjustable in the system through the definition of the Controller parameters. The 'minimum error' mode is used in applications where there is a high premium associated with each error. At the other extreme, in the 'maximum response' mode, the system is maximizing throughput and occasional errors can be tolerated or detected by some filtering system downstream" (Cooper 1988; Reilly 1982; Reilly 1987). All of these are examples of the potential for intelligence augmentation.
REFERENCES
Ackley, D. H., Hinton, G. E., and Sejnowski, T. J. (1985). "A Learning Algorithm for Boltzmann Machines." Cognitive Science, 9(1), 147-169.
Anderson, J.A., Silverstein, J.W., Ritz, S.A., and Jones, R.S. (1977). "Distinctive Features, Categorical Perception, and Probability Learning: Some Applications of a Neural Model." Psychol. Rev. 84(5), 413‑451.
Anderson, J.A. (1990). "Data Representation in Neural Networks." AI Expert, June, 30-37.
Bachmann, C. M., Cooper, L. N., Dembo, A., and Zeitouni, O. (1987). "A Relaxation Model for Memory with High Storage Density." Proc., National Academy of Sciences of the U.S.A., 84, 7529‑7531.
Bailey, D. and Thompson, D. (1990). "How to Develop Neural Networks." AI Expert, June, 38-47.
Caudill, M. (1988). "Neural Network Primer, Part 3." AI Expert, June, 53-59.
Cooper, L. N., Elbaum, C., and Reilly, D. L. (1988). Parallel, Multi-unit, Adaptive, Nonlinear Pattern Class Separator and Identifier. U.S. Patent No. 4,760,604.
Crooks, T. (1992). "Care and Feeding of Neural Networks." AI Expert, July, 37-41.
Dean, D.L. (1961). "Static and Dynamic Analysis of Guy Cables." J. Struct. Engrg., ASCE, 87(1), 1-21.
Edelman, G.M., and Reeke, G. N., Jr. (1982). "Selective Networks Capable of Representative Transformations, Limited Generalizations and Associative Memory." Proc., National Academy of Sciences of the U.S.A., 79, 2091-2095.
Eckmiller, R., and von der Malsburg, C. (1988). Neural Computers. Springer-Verlag, New York.
Engelbart, D. (1963). "A Conceptual Framework for Augmenting Man's Intellect." Vistas in Information Handling, 1, P.W. Howerton and D.C. Weeks, ed., Spartan Books, Washington, D.C., 1-29.
Furuta, H., Sugiura, K., Tonegawa, T., and Watanabe, E. (1991). "A Neural Network System for Aesthetic Design of Dam Structures." Proc., 2nd Int. Conf. on the Applicability of AI to Civil and Structural Eng., B.H.V. Topping, ed., 2, AI and Structural Eng., Civil-Comp Press, 273-278.
Garrett, J.H. (1992). "Neural Networks and Their Applicability within Civil Engineering." Proc., 8th Conf. on Comput. in Civil Engineering, B.J. Goodno and J. Wright, ed., ASCE, 1155-1162.
Grossberg, S. (1982). Studies of Mind and Brain. Dordrecht, The Netherlands: Reidel.
Hebb, D. O. (1949). The Organization of Behavior, Wiley:New York.
Hecht-Nielsen, R. (1987). "Kolmogorov's Mapping Neural Network Existence Theorem." Proc., Intl. Conf. on Neural Networks, III, 11-13, IEEE Press, New York.
Hecht-Nielsen, R. (1990). Neurocomputing. Addison-Wesley, Reading, MA.
Hopfield, J.J. (1982). "Neural Networks and Physical Systems with Emergent Collective Computational Abilities." Proc., National Academy of Sciences of the U.S.A., 79, 2554-2558.
Hopfield, J.J., Feinstein, D.I., and Palmer, R.G. (1983). "Unlearning has a Stabilizing Effect in Collective Memories." Nature, 304, 158-159.
Issa, R. R. A., and Avent, R. R. (1991). "Microcomputer Analysis of Guyed Towers as Lattices." J. Struct. Engrrg_,, ASCE 117(4), 1238‑1256.
Issa, R.R.A., Fletcher, D., and Cade, R.A. (1992). "Predicting Tower Guy Pretension Using a Neural Network." Proc., 8th Conf. on Comput. in Civil Engineering, B.J. Goodno and J. Wright, ed., ASCE, 1074-1081.
Kamarthi, S.V., Sanvido, V.E., and Kumara, S.R.T. (1992). "Neuroform-Neural Network System for Vertical Formwork Selection." J. Comput. Civ. Eng., ASCE, 6(6), 178-199.
Klopf, A.H. (1988). "A Neuronal Model of Classical Conditioning." Psychobiology, 16(2), 85-125.
Kohonen, T. (1984). Self‑Organization and Associative Memory. Springer‑Verlag:Berlin.
Kolmogorov, A. K. (1957). "On the Representation of Continuous Functions of Several Variables by Superposition of Continuous Functions of One Variable and Addition." Doklady Akademii Nauk USSR, 114, 369-373.
Kosko, B. (1988). "Bidirectional Associative Memories." IEEE Trans. Systems, Man & Cybernetics, 18(1), 49-60.
Lee, B.W. and Sheu, B.J. (1991). "Modified Hopfield Neural Networks for Retrieving the Optimal Solution." IEEE Trans. on Neural Networks, 2(1), 137-142.
Parker, D. B. (1985). Learning Logic (Report TR-47). Cambridge, MA: MIT Center for Computational Research in Economics and Management Science.
Pearlmutter, B.A. (1990). "Dynamic Recurrent Neural Networks." School of Computer Science, Carnegie Mellon Univ., CMU-CS-90-196, Pittsburgh, PA.
Robbins, H., and Monro, S. (1951). "A Stochastic Approximation Method." The Annals of Mathematical Statistics, 22, 400‑407.
Rosenblatt, F. (1958). The Perceptron: A Theory of Statistical Separability in Cognitive Systems (Report No. VG-1196-G-1). Cornell Aeronautical Laboratory, Ithaca, NY.
Rumelhart, D. E., and McClelland, J. L. (1986). Parallel Distributed Processing. Cambridge, MA: MIT Press.
Rumelhart, D.E., Hinton, G.E., and Williams, R.J. (1986). "Learning Internal Representations by Error Propagation." In Rumelhart, D. E., and McClelland, J. L. (Eds.), Parallel Distributed Processing. MIT Press: Cambridge, MA.
Reilly, D. L., Cooper, L. N., and Elbaum, C. (1982). "A Neural Model for Category Learning." Biological Cybernetics, 45, 35‑41.
Reilly, D. L., Scofield, C., Elbaum, C., and Cooper, L. N. (1987). "Learning System Architectures Composed of Multiple Learning Modules." Proc., IEEE Intl. Conf. on Neural Networks, lI, 495‑503.
Specht, D. F. (1988). "Probabilistic Neural Networks for Classification, Mapping, or Associative Memory." Proc., IEEE Intl. Conf. on Neural Networks, I, 525‑532.
von der Malsburg, C. (1973). "Self-Organization of Orientation Sensitive Cells in the Striate Cortex." Kybernetik, 14, 85-100.
White, H. (1989). "Some Asymptotic Results for Learning in Single Hidden-layer Feed-forward Network Models." J. Amer. Stat. Assoc., 84(408), 1003-1013.
Widrow, B., and Hoff, M.E. (1960). "Adaptive Switching Circuits." 1960 IRE WESCON Convention Record, New York, 96-104.
Williams, T.P., Khajuria, A., and Balaguru, P. (1992). "Neural Network for Predicting Concrete Strength." Proc., 8th Conf. on Comput. in Civil Engineering, B.J. Goodno and J. Wright, ed., ASCE, 1082-1088.
Wu, X. and Ghaboussi, J. (1992). "Neural Network-based Modeling of Composite Material with Emphasis on Reinforced Concrete." Proc., 8th Conf. on Comput. in Civil Engineering, B.J. Goodno and J. Wright, ed., ASCE, 1179-1186.
Zbigniew, P.S., and Hajela, P. (1992). "Neural Networks Based Damage Detection in Structures." Proc., 8th Conf. on Comput. in Civil Engineering, B.J. Goodno and J. Wright, ed., ASCE, 1163-1170.