Chapter 2: Neural Networks (2)

Contents: the perceptron, multilayer feedforward neural networks, the back-propagation algorithm (BP algorithm), and applications of neural networks. Key and difficult points: the perceptron and the back-propagation (BP) algorithm.

1 Introductory Example

In 1981 the biologists W. Grogan and W. Wirth discovered two classes of mosquitoes (more precisely, midges), Apf and Af. They measured the wing length and antenna length of each specimen, obtaining the following data:

Wing length   Antenna length   Class
1.78          1.14             Apf
1.96          1.18             Apf
1.86          1.20             Apf
1.72          1.24             Af
2.00          1.26             Apf
2.00          1.28             Apf
1.96          1.30             Apf
1.74          1.36             Af
1.64          1.38             Af
1.82          1.38             Af
1.90          1.38             Af
1.70          1.40             Af
1.82          1.48             Af
1.82          1.54             Af
2.08          1.56             Af

Question: three new midges are caught, with antenna length and wing length (1.24, 1.80), (1.28, 1.84), and (1.40, 2.04) respectively. To which class does each belong?

Solution: plot wing length on the vertical axis and antenna length on the horizontal axis, so that each midge's wing length and antenna length determine a point in the plane. The 6 midges of class Apf are marked with solid dots and the 9 midges of class Af with small circles; the result is shown in Figure 1.

Idea: draw a straight line that separates the two classes of midges. For example, take the points A(1.44, 2.10) and B(1.10, 1.60) and draw the line through A and B:

    y = 1.47x - 0.017

where x denotes antenna length and y denotes wing length.

Classification rule: for a midge with measurements (x, y),
if y >= 1.47x - 0.017, classify the midge as Apf;
if y < 1.47x - 0.017, classify the midge as Af.

Classification result: (1.24, 1.80) and (1.28, 1.84) belong to Af; (1.40, 2.04) belongs to Apf. The separating line is shown in the figure.

In situations like the following, however, a single separating line can no longer do the job. This suggests a new approach: view the problem as a system that takes the midge measurements as input and produces the midge class as output, and study the relationship between input and output.
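A minimal Python sketch of this rule (the function name classify_midge is mine, not from the lecture):

def classify_midge(antenna, wing):
    """Apply the separating line y = 1.47x - 0.017 (x: antenna, y: wing)."""
    boundary = 1.47 * antenna - 0.017
    return "Apf" if wing >= boundary else "Af"

# The three newly caught midges, as (antenna length, wing length):
for antenna, wing in [(1.24, 1.80), (1.28, 1.84), (1.40, 2.04)]:
    print((antenna, wing), "->", classify_midge(antenna, wing))

Note that (1.40, 2.04) lies almost exactly on the line (the boundary value at x = 1.40 is about 2.041), so its label is sensitive to rounding in the coefficients: with the coefficients as printed the code assigns it Af, while the lecture assigns it Apf.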
2 Perceptron Model

[Figure: a perceptron drawn as a "black box": input nodes X1, X2, X3 are connected by weighted links to an output node that produces Y, together with an example truth table (X1, X2, X3) -> Y; in the concrete example every link has weight 0.3 and the output node has threshold t = 0.4.]

The example network computes

    Y = I(0.3 X1 + 0.3 X2 + 0.3 X3 - 0.4 > 0),  where I(z) = 1 if z is true and 0 otherwise,

and the general perceptron model is

    Y = sign( Σ_i w_i X_i - t ).

The model is an assembly of inter-connected nodes and weighted links. The output node sums up each of its input values according to the weights of its links, and the sum is compared against some threshold t.

The perceptron was the first neural network with the ability to learn. It is made up of only input neurons and output neurons. Input neurons typically have two states, ON and OFF, and output neurons use a simple threshold activation function.
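As a concrete illustration, here is a small Python sketch of the three-input perceptron above, using the weights 0.3, 0.3, 0.3 and threshold t = 0.4 given in the figure (the function name is mine):

def perceptron_output(x, weights, t):
    """Y = 1 if the weighted input sum exceeds the threshold t, else 0."""
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s - t > 0 else 0

weights, t = [0.3, 0.3, 0.3], 0.4
for x in [(1, 0, 0), (1, 1, 0), (0, 1, 1), (1, 1, 1)]:
    print(x, "->", perceptron_output(x, weights, t))

This particular perceptron fires exactly when at least two of its three inputs are 1.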
In its basic form, the perceptron can only solve linear problems, which limits its applications: it can only express linear decision surfaces.

How Do Perceptrons Learn?

The perceptron uses supervised training (training means learning the weights of the neurons). If the output is not correct, the weights are adjusted according to the formula

    w_i = w_i + α (T - O) x_i,

where T is the desired output, O is the actual output, x_i is the i-th input, and α is the learning rate.
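A short sketch of this training loop in Python, assuming the standard update w_i = w_i + α(T - O)x_i and treating the threshold as a learned quantity; the OR data set is chosen here purely for illustration:

def train_perceptron(samples, alpha=0.1, epochs=20):
    """samples: list of (inputs, target) pairs; returns (weights, threshold)."""
    n = len(samples[0][0])
    w, t = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in samples:
            o = 1 if sum(wi * xi for wi, xi in zip(w, x)) - t > 0 else 0
            err = target - o              # zero when the output is correct
            w = [wi + alpha * err * xi for wi, xi in zip(w, x)]
            t -= alpha * err              # threshold moves opposite to the weights
    return w, t

# Logical OR is linearly separable, so the perceptron converges on it.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(data))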
3 Back Propagation Networks

The following diagram shows a Back Propagation NN. This NN consists of three layers:
1. Input layer with three neurons.
2. Hidden layer with two neurons.
3. Output layer with two neurons.

Generally, the Back Propagation NN is the most common neural network, and it is an extension of the perceptron:
(1) Multiple layers: one or more “hidden” layers are added in between the input and output layers.
(2) The activation function is not simply a threshold; it is usually a sigmoid function (see the sketch after this list).
(3) It is a general function approximator, not limited to linear problems.

For example, a typical multilayer network and decision surface are depicted in the figure.
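The sigmoid (logistic) function mentioned in (2) is σ(s) = 1/(1 + e^(-s)); a minimal sketch of how it behaves compared with a hard threshold:

import math

def sigmoid(s):
    """Smooth, differentiable replacement for the hard threshold."""
    return 1.0 / (1.0 + math.exp(-s))

# Unlike a threshold unit, the output varies smoothly between 0 and 1:
for s in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(s, round(sigmoid(s), 3))

This smoothness is what makes back propagation possible: the error formulas in Section 4 use the derivative O(1 - O) of this function.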
Information flows in one direction: the outputs of one layer act as inputs to the next layer. Note that:
1. The output of a neuron in a layer goes to all neurons in the following layer.
2. Each neuron has its own input weights.
3. The weights for the input layer are assumed to be 1 for each input; in other words, the input values are not changed.
4. The output of the NN is obtained by applying the input values to the input layer and passing the output of each neuron to the following layer as input.
5. The Back Propagation NN must have at least an input layer and an output layer; it can have zero or more hidden layers.

The number of neurons in the input layer depends on the number of possible inputs we have, while the number of neurons in the output layer depends on the number of desired outputs. The number of hidden layers, and the number of neurons in each hidden layer, cannot be well defined in advance; it can change with the network configuration and the type of data. In general, adding a hidden layer allows the network to learn more complex patterns, but at the same time it decreases the network's performance. You could start with a network configuration that uses a single hidden layer, and add more hidden layers if you notice that the network is not learning as well as you would like.

For example, suppose we have a bank credit application with ten questions whose answers determine the credit amount and the interest rate. To use a Back Propagation NN, the network will have ten neurons in the input layer and two neurons in the output layer; a sketch of such a configuration follows.
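A minimal sketch of writing down such a configuration in Python, with all weights and biases initialized to small random numbers as Step 1 of the algorithm below prescribes (the layer sizes 10 and 2 come from the credit example; the hidden layer of 6 neurons is an arbitrary starting choice, not from the text):

import random

def init_network(layer_sizes, scale=0.05):
    """One (weight matrix, bias vector) pair per non-input layer."""
    layers = []
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[random.uniform(-scale, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [random.uniform(-scale, scale) for _ in range(fan_out)]
        layers.append((weights, biases))
    return layers

# Credit example: 10 inputs, one hidden layer, 2 outputs.
net = init_network([10, 6, 2])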
4 The Back Propagation Algorithm

The Back Propagation NN uses a supervised training mode. The algorithm learns the weights of a multilayer network. The training can be summarized as follows (a Python sketch of one training step is given after Step 5):

Step 1: Initialize the weights. The connection weight w_ij between every pair of connected neurons is initialized to a small random number, and every neuron j likewise has a bias θ_j that is initialized to a random number. Each input sample x is then processed according to Step 2.

Step 2: Propagate the input forward. The training sample x is presented to the input layer of the network, and the output of every neuron is computed. Each output is obtained from a linear combination of the neuron's inputs passed through the sigmoid function:

    S_j = Σ_i w_ij O_i + θ_j,    O_j = 1 / (1 + e^(-S_j)),

where O_i is the output of unit i in the previous layer.

Step 3: Propagate the error backward. Step 2 works its way forward and finally produces the actual output at the output layer, which can be compared with the expected output to obtain the error of each output unit j (for each output unit j, calculate its error term):

    E_j = O_j (1 - O_j) (T_j - O_j),

where T_j is the expected output of output unit j. The resulting error then has to be propagated from back to front: the error of a unit j in the previous layer is computed from the errors of all units k in the following layer that are connected to it (for each hidden unit j):

    E_j = O_j (1 - O_j) Σ_k E_k w_jk.

In this way the error of every neuron, from the last hidden layer down to the first hidden layer, is obtained in turn.

Step 4: Adjust the network weights and neuron biases. With the errors of all neurons computed, the network weights and the neuron thresholds are adjusted together. The weights are adjusted starting from the connections between the input layer and the first hidden layer and proceeding layer by layer; each connection weight w_ij is updated with the formula

    w_ij = w_ij + Δw_ij = w_ij + l · E_j · O_i,

where l is the learning rate. The bias of each neuron j is updated with the formula

    θ_j = θ_j + Δθ_j = θ_j + l · E_j.

Step 5: Decide whether to stop. For each sample, if the final output error is within the acceptable range, or the iteration count t has reached the given threshold, take the next sample and go back to Step 2. Otherwise, increase the iteration count by 1 and go back to Step 2 to continue training with the current sample.
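A minimal Python sketch of one training step that follows Steps 2-4 literally for a fully connected network (the function names forward and bp_step are mine):

import math

def sigmoid(s):
    """Step 2 activation: O = 1 / (1 + e^(-S))."""
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, layers):
    """Step 2: compute the outputs of every layer, starting from the inputs."""
    outputs = [list(x)]
    for weights, biases in layers:
        prev = outputs[-1]
        outputs.append([sigmoid(sum(w * o for w, o in zip(row, prev)) + b)
                        for row, b in zip(weights, biases)])
    return outputs

def bp_step(x, targets, layers, l):
    """Steps 2-4 on a single sample; layers is a list of
    (weight matrix, bias vector) pairs with mutable entries."""
    outputs = forward(x, layers)
    # Step 3, output layer: E_j = O_j (1 - O_j) (T_j - O_j).
    errors = [[o * (1 - o) * (t - o) for o, t in zip(outputs[-1], targets)]]
    # Step 3, hidden layers: E_j = O_j (1 - O_j) * sum_k E_k w_jk.
    for idx in range(len(layers) - 1, 0, -1):
        next_w, _ = layers[idx]
        next_err = errors[0]
        errors.insert(0, [o * (1 - o) *
                          sum(e * row[j] for e, row in zip(next_err, next_w))
                          for j, o in enumerate(outputs[idx])])
    # Step 4: w_ij += l * E_j * O_i and theta_j += l * E_j.
    for (weights, biases), errs, prev in zip(layers, errors, outputs):
        for j, e in enumerate(errs):
            for i, o_i in enumerate(prev):
                weights[j][i] += l * e * o_i
            biases[j] += l * e
    return outputs[-1]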
5 Worked Example

A feedforward neural network is given, as shown in the figure below. Let the learning rate l be 0.9 and let the current training sample be x = (1, 0, 1) with expected class label 1; the table below gives the current connection weights and neuron biases of the network. Work through the training process of this network on the current sample.
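As a usage illustration only, here is how the bp_step sketch from Section 4 would be invoked on this example, assuming the 3-2-1 network shape suggested by the sample and the single class label; the weight and bias values below are placeholders, to be replaced by the entries of the table:

# Placeholder network: 3 inputs, 2 hidden units, 1 output unit.
# Substitute the connection weights and biases from the table.
hidden_layer = ([[0.1, -0.2, 0.1],    # weights from the 3 inputs to hidden unit 1
                 [-0.1, 0.2, 0.1]],   # weights from the 3 inputs to hidden unit 2
                [0.1, -0.1])          # hidden biases
output_layer = ([[0.2, -0.1]],        # weights from the 2 hidden units to the output
                [0.1])                # output bias
layers = [hidden_layer, output_layer]

x, target, l = (1, 0, 1), [1], 0.9    # sample, expected label, learning rate
out = bp_step(x, target, layers, l)   # runs Steps 2-4 once, updating layers
print(out)                            # network output before the update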