Genetic Algorithms versus Traditional Methods
Genetic algorithms differ substantially from the more traditional search and optimization techniques. The five main differences are:
1. Genetic algorithms search a population of points in parallel, not from a single point.
2. Genetic algorithms do not require derivative information or other auxiliary knowledge; only the objective function and corresponding fitness levels influence the direction of the search.
3. Genetic algorithms use probabilistic transition rules, not deterministic rules.
4. Genetic algorithms work on an encoding of a parameter set, not the parameter set itself (except where real-valued individuals are used).
5. Genetic algorithms may provide a number of potential solutions to a given problem, and the choice of the final solution is left to the user.
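To make these differences concrete, here is a minimal sketch of a generic GA loop in Python. All names and parameter values (pop_size, mutation_rate, the bit-string encoding, and so on) are illustrative assumptions, not any particular GA library's API.

```python
import random

# Minimal GA skeleton: population-based, probabilistic, works on encodings
# (bit strings), and needs only a fitness function -- no derivatives.
def genetic_algorithm(fitness, n_bits=16, pop_size=30, generations=100,
                      crossover_rate=0.9, mutation_rate=0.01):
    # A population of points searched in parallel (difference 1).
    pop = [[random.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]  # only fitness guides search (2)

        def select():  # fitness-proportionate (probabilistic) selection (3)
            return random.choices(pop, weights=scores, k=1)[0]

        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            if random.random() < crossover_rate:    # single-point crossover
                point = random.randrange(1, n_bits)
                p1 = p1[:point] + p2[point:]
            # Probabilistic bit-flip mutation.
            child = [b ^ 1 if random.random() < mutation_rate else b for b in p1]
            children.append(child)
        pop = children
    return max(pop, key=fitness)  # many candidates exist; one is chosen (5)

# Example: maximize the number of 1-bits in the encoding (difference 4).
best = genetic_algorithm(fitness=lambda ind: sum(ind) + 1)
print(best)
```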
SOFT COMPUTING
Soft computing is a term applied to a field within computer science that is characterized by the use of inexact solutions to computationally hard tasks, such as the solution of NP-complete problems, for which an exact solution cannot be derived in polynomial time.

DELTA LEARNING RULE
A learning rule that adjusts synaptic weights according to the product of the presynaptic activity and a postsynaptic error signal, obtained by computing the difference between the actual output activity and a desired or required output activity.
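In symbols, a standard formulation of this rule (assuming a learning rate η, presynaptic input x_i, actual output y, and desired output d) is Δw_i = η (d − y) x_i: the weight change is proportional to both the presynaptic input and the output error.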
LEAST SQUARES
A method of determining the curve that best describes the relationship between expected and observed sets of data by minimizing the sums of the squares of deviation between observed and expected values.
A statistical method used to determine a line of best fit by minimizing the sum of squares created by a mathematical function. A "square" is determined by squaring the distance between a data point and the regression line. The least squares approach limits the distance between a function and the data points that the function is trying to explain. It is used in regression analysis, often in nonlinear regression modeling, in which a curve is fit to a set of data.
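As a brief sketch, ordinary least squares for a straight line y = a + bx has a well-known closed-form solution; the data values below are illustrative.

```python
# Ordinary least squares for a line y = a + b*x (a minimal sketch).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# The slope b minimizes the sum of squared vertical deviations.
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x  # intercept

print(f"best fit: y = {a:.3f} + {b:.3f}x")
```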
T-CONORM
In this section we use t-conorms and t-norms. They are binary operators that generalize addition and multiplication, and also max and min. A triangular conorm (t-conorm) ⊥ is a binary operation on [0, 1] fulfilling the conditions:
(T1) x ⊥ 0 = x.
(T2) x ⊥ y ≤ u ⊥ v whenever x ≤ u and y ≤ v.
(T3) x ⊥ y = y ⊥ x.
(T4) (x ⊥ y) ⊥ z = x ⊥ (y ⊥ z).
A t-conorm ⊥ is said to be strict if and only if it is continuous on [0, 1] and strictly increasing on [0, 1). A continuous t-conorm ⊥ is said to be Archimedean if and only if x ⊥ x > x for all x in (0, 1).
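As a sketch, a few standard t-conorms written as plain Python functions; the property check at the end is illustrative.

```python
# Standard t-conorms on [0, 1] (x "perp" y), written as plain functions.
def maximum(x, y):            # weakest t-conorm; not Archimedean (x ? x = x)
    return max(x, y)

def probabilistic_sum(x, y):  # strict and Archimedean
    return x + y - x * y

def bounded_sum(x, y):        # Archimedean but not strict
    return min(1.0, x + y)

# Quick illustration of the Archimedean condition x ? x > x on (0, 1).
x = 0.4
print(probabilistic_sum(x, x) > x)  # True
print(maximum(x, x) > x)            # False: max is not Archimedean
```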
AGGREGATION
Aggregation operations are operations that combine or aggregate two or more fuzzy sets. There are a number of different types of aggregation, including unions (sums), intersections (products), and means. Fuzzy Logic contains a wide collection of different operators, including many nonstandard operators that are not found in many other fuzzy packages. In addition, Fuzzy Logic provides a function for creating user-defined aggregators, making it easy for users to experiment with aggregators or add their own. The following are among the aggregators that can be used in Fuzzy Logic:
- For unions and intersections: min, max, Hamacher, Frank, Yager, Dubois-Prade, Dombi, Yu, and Weber
- For sums and products: drastic, bounded, algebraic, Einstein, and Hamacher
- For means: arithmetic, geometric, harmonic, and generalized
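For illustration, here is a minimal sketch of how a few such aggregators might be written, including a user-defined one. The function names and data are assumptions, not the actual Fuzzy Logic package API.

```python
# Membership degrees of two fuzzy sets over the same universe.
a = [0.2, 0.5, 0.9]
b = [0.6, 0.5, 0.3]

# A few standard aggregators applied pointwise (names are illustrative).
union_max        = [max(x, y) for x, y in zip(a, b)]       # union via max
intersection_min = [min(x, y) for x, y in zip(a, b)]       # intersection via min
algebraic_sum    = [x + y - x * y for x, y in zip(a, b)]   # algebraic sum
arithmetic_mean  = [(x + y) / 2 for x, y in zip(a, b)]     # mean aggregator

# A user-defined aggregator is just another pointwise function.
def einstein_sum(x, y):
    return (x + y) / (1 + x * y)

print([round(einstein_sum(x, y), 3) for x, y in zip(a, b)])
```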
CROSSOVER
Single-point crossover - one crossover point is selected; the binary string from the beginning of the chromosome to the crossover point is copied from one parent, and the rest is copied from the second parent:
11001011 + 11011111 = 11001111
Two-point crossover - two crossover points are selected; the binary string from the beginning of the chromosome to the first crossover point is copied from one parent, the part from the first to the second crossover point is copied from the second parent, and the rest is copied from the first parent:
11001011 + 11011111 = 11011111
Uniform crossover - bits are randomly copied from the first or from the second parent:
11001011 + 11011101 = 11011111
Arithmetic crossover - some arithmetic operation is performed to make a new offspring.
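A minimal sketch of the first three operators on bit strings, assuming chromosomes are Python lists of 0/1 integers; the function names are illustrative.

```python
import random

def single_point(p1, p2):
    # Copy up to a random point from the first parent, the rest from the second.
    point = random.randrange(1, len(p1))
    return p1[:point] + p2[point:]

def two_point(p1, p2):
    # Copy the middle segment from the second parent, the rest from the first.
    i, j = sorted(random.sample(range(1, len(p1)), 2))
    return p1[:i] + p2[i:j] + p1[j:]

def uniform(p1, p2):
    # Each bit is randomly copied from one of the two parents.
    return [random.choice(pair) for pair in zip(p1, p2)]

p1 = [1, 1, 0, 0, 1, 0, 1, 1]  # 11001011
p2 = [1, 1, 0, 1, 1, 1, 1, 1]  # 11011111
print(single_point(p1, p2))
```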
SINGLE-LAYER PERCEPTRON
The earliest kind of neural network is a single-layer perceptron network, which consists of a single layer of output nodes; the inputs are fed directly to the outputs via a series of weights. In this way it can be considered the simplest kind of feedforward network. The sum of the products of the weights and the inputs is calculated in each node, and if the value is above some threshold (typically 0) the neuron fires and takes the activated value (typically 1); otherwise it takes the deactivated value (typically -1). Neurons with this kind of activation function are also called artificial neurons or linear threshold units. A perceptron can be created using any values for the activated and deactivated states as long as the threshold value lies between the two. Most perceptrons have outputs of 1 or -1 with a threshold of 0, and there is some evidence that such networks can be trained more quickly than networks created from nodes with different activation and deactivation values.
Perceptrons can be trained by a simple learning algorithm that is usually called the delta rule. It calculates the errors between calculated output and sample output data, and uses this to create an adjustment to the weights, thus implementing a form of gradient descent. Single-unit perceptrons are only capable of learning linearly separable patterns; in 1969, in a famous monograph entitled Perceptrons, Marvin Minsky and Seymour Papert showed that it was impossible for a single-layer perceptron network to learn an XOR function. It is often believed that they also conjectured (incorrectly) that a similar result would hold for a multi-layer perceptron network. However, this is not true, as both Minsky and Papert already knew that multi-layer perceptrons were capable of producing an XOR function. (See the page on Perceptrons for more information.)
Although a single threshold unit is quite limited in its computational power, it has been shown that networks of parallel threshold units can approximate any continuous function from a compact interval of the real numbers into the interval [-1, 1].
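As a sketch of the delta rule described above, a single threshold unit can be trained as follows; the learning rate, epoch count, and training data are illustrative assumptions.

```python
# Train a single threshold unit with the delta rule (a minimal sketch).
# Output is 1 if the weighted sum exceeds the threshold 0, else -1.
def predict(weights, bias, x):
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else -1

# AND function in -1/1 encoding -- linearly separable, hence learnable.
data = [([-1, -1], -1), ([-1, 1], -1), ([1, -1], -1), ([1, 1], 1)]

weights, bias, eta = [0.0, 0.0], 0.0, 0.1  # eta: assumed learning rate
for _ in range(20):                         # a few passes over the data
    for x, target in data:
        error = target - predict(weights, bias, x)  # delta-rule error term
        weights = [w + eta * error * xi for w, xi in zip(weights, x)]
        bias += eta * error

print([predict(weights, bias, x) for x, _ in data])  # expect [-1, -1, -1, 1]
```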
A multi-layer neural network can compute a continuous output instead of a step function. A common choice is the so-called logistic function:

y = 1 / (1 + e^(-x))

(In general form, f(X) is in place of x, where f(X) is an analytic function in the set of x's.) With this choice, the single-layer network is identical to the logistic regression model, widely used in statistical modeling. The logistic function is also known as the sigmoid function. It has a continuous derivative, which allows it to be used in backpropagation. This function is also preferred because its derivative is easily calculated: y' = y(1 - y) (times df/dX, in general form, according to the chain rule).
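A quick check of this derivative identity (a sketch; the evaluation point and finite-difference step are illustrative choices):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Derivative via the identity y' = y * (1 - y) ...
x = 0.7
y = sigmoid(x)
analytic = y * (1.0 - y)

# ... agrees with a numerical finite-difference estimate.
h = 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
print(analytic, numeric)  # nearly identical values
```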
KOHONEN NETWORKS
The objective of a Kohonen network is to map input vectors (patterns) of arbitrary dimension N onto a discrete map with 1 or 2 dimensions. Patterns close to one another in the input space should be close to one another in the map: they should be topologically ordered. A Kohonen network is composed of a grid of output units and N input units. The input pattern is fed to each output unit. The input lines to each output unit are weighted. These weights are initialized to small random numbers.

Learning in Kohonen Networks
The learning process is roughly as follows:
- initialise the weights for each output unit
- loop until weight changes are negligible:
  - for each input pattern:
    - present the input pattern
    - find the winning output unit
    - find all units in the neighbourhood of the winner
    - update the weight vectors for all those units
The neighbourhood of a unit is defined in terms of the distance of that unit on the map (not in weight space). In the demonstration below, all the neighbourhoods are square. If the size of the neighbourhood is 1, then all units no more than 1 either horizontally or vertically from any unit fall within its neighbourhood. The weights of every unit in the neighbourhood of the winning unit (including the winning unit itself) are updated using

w_new = w_old + η (x − w_old)    (21)

where η is the learning rate and x is the input pattern. This will move each unit in the neighbourhood closer to the input pattern. As time progresses, the learning rate and the neighbourhood size are reduced. If the parameters are well chosen, the final network should capture the natural clusters in the input data.
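A compact sketch of this training loop, assuming a 2-D grid of units with square neighbourhoods and linearly decaying learning rate and radius; all constants and data are illustrative.

```python
import random

GRID = 5     # 5x5 map of output units
DIM = 3      # input dimension N
EPOCHS = 50

# Weights initialised to small random numbers, one vector per map unit.
w = [[[random.uniform(-0.1, 0.1) for _ in range(DIM)]
      for _ in range(GRID)] for _ in range(GRID)]

patterns = [[random.random() for _ in range(DIM)] for _ in range(20)]

for epoch in range(EPOCHS):
    eta = 0.5 * (1 - epoch / EPOCHS)                 # decaying learning rate
    radius = max(0, int(2 * (1 - epoch / EPOCHS)))   # shrinking square neighbourhood
    for x in patterns:
        # Winning unit: the closest weight vector (squared Euclidean distance).
        bi, bj = min(((i, j) for i in range(GRID) for j in range(GRID)),
                     key=lambda ij: sum((a - b) ** 2
                                        for a, b in zip(x, w[ij[0]][ij[1]])))
        # Update every unit within the square neighbourhood of the winner,
        # moving it toward the input pattern as in equation (21).
        for i in range(max(0, bi - radius), min(GRID, bi + radius + 1)):
            for j in range(max(0, bj - radius), min(GRID, bj + radius + 1)):
                w[i][j] = [wk + eta * (xk - wk) for wk, xk in zip(w[i][j], x)]
```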