
Limitations of the BP Algorithm: A Brief Account of Its Core Idea

4. Advantages and Limitations of the BP Algorithm

Advantages:


(1) Adaptive, autonomous learning. This is the foundation of the BP algorithm and the source of its strength: following a preset parameter-update rule, BP continually adjusts the network's parameters until the desired output is reached.

(2) Strong nonlinear mapping ability.

(3) A rigorous derivation. The backward propagation of the error relies on the long-established chain rule, so the derivation is rigorous and scientifically sound.

(4) Strong generalization ability: once training finishes, the network can apply the knowledge learned from previous data to solve new problems.

Disadvantages:

(1) A BP network has many parameters, and a large number of weights and thresholds must be updated at every iteration, so convergence is slow.

(2) There is no explicit formula for the number of hidden-layer nodes. Traditionally one sets a candidate count, trains, and uses the resulting network error to settle on the final number of hidden nodes by trial and error.

(3) Mathematically, BP is a comparatively fast gradient-descent algorithm, so it easily gets stuck in local minima. When a local minimum is reached, the error may look acceptable on the surface, but the solution obtained is not necessarily the true solution of the problem. In this sense the BP algorithm is incomplete.
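A minimal illustration of point (3). The quartic f(x) = x^4 - 3x^2 + x below is an invented toy objective, not from the source; plain gradient descent converges to a different minimum depending only on where it starts:

```python
# Toy non-convex objective: f(x) = x**4 - 3*x**2 + x has a global minimum
# near x = -1.30 and a merely local minimum near x = 1.13.
def f_grad(x):
    return 4 * x**3 - 6 * x + 1

for x0 in (2.0, -2.0):              # two different starting points
    x = x0
    for _ in range(1000):           # plain gradient descent, fixed step
        x -= 0.01 * f_grad(x)
    print(f"start {x0:+.1f} -> converged to x = {x:+.4f}")
```

Starting from 2.0 the iteration settles at the local minimum near 1.13; starting from -2.0 it reaches the global minimum near -1.30, even though the update rule is identical.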

The BP Algorithm and Its Improvements

A major drawback of the traditional BP algorithm and its improved variants is this: the error objective function is non-convex in the connection weights to be learned and has local minima, so once the weights fall into a local minimum of the weight space during training they can hardly escape. The global minimum (i.e., the optimal point) is then unreachable and training fails. To address these defects, a new error objective function with a penalty term is constructed from the properties of convex functions and their conjugates, using the Fenchel inequality and the penalty-function method of constrained optimization theory.

When the feedforward network is trained by optimizing the new objective, the hidden-layer outputs are also treated as optimization variables. The main features of this objective are:

1. With the hidden-layer outputs fixed, the objective is convex in the connection weights; with the connection weights fixed, it is convex in the hidden-layer outputs. Hence when weights and hidden outputs are optimized alternately, each subproblem is a convex one with no local-minimum problem, and the algorithm's sensitivity to the initial weights is reduced;

2. Because the penalty factor is increased gradually, the weight search space stays fairly large, so even large networks can be trained, and the chance of the training process falling into a local minimum is reduced to some extent.

These properties largely overcome the major defect of earlier feedforward training algorithms, namely falling into local minima and failing, and they open a new line of research: studying feedforward learning algorithms with convex optimization theory. During training, the connection weights and hidden-layer outputs can be optimized alternately. Applied to feedforward network training, the new algorithm outperforms traditional training algorithms such as classical BP in learning speed, generalization ability, and training behavior; numerical experiments also confirm its effectiveness.
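The following sketch shows the alternating scheme in its simplest possible form. The thesis builds its penalty objective from Fenchel's inequality; that construction is not reproduced here. Instead the code assumes an invented quadratic stand-in, E(W1, W2, H) = ||W2 H - Y||^2 + beta * ||H - W1 X||^2, which keeps the two structural properties described above: each block subproblem is convex, and the penalty factor beta grows across iterations.

```python
import numpy as np

# Invented quadratic stand-in for the thesis's penalty objective (the real
# construction uses Fenchel's inequality and is not reproduced here):
#   E(W1, W2, H) = ||W2 H - Y||^2 + beta * ||H - W1 X||^2
# Fixing H makes E convex (least squares) in W1 and W2; fixing the weights
# makes E convex in H, so the two blocks can be minimized alternately.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 50))       # 4 input features, 50 samples
Y = rng.standard_normal((2, 50))       # 2 output targets
h = 6                                  # hidden-layer width
W1 = 0.1 * rng.standard_normal((h, 4))
W2 = 0.1 * rng.standard_normal((2, h))
H = W1 @ X                             # hidden outputs as free variables

beta = 0.1
for it in range(100):
    # weight block: two independent convex least-squares problems
    W2 = Y @ np.linalg.pinv(H)
    W1 = H @ np.linalg.pinv(X)
    # hidden-output block: grad_H E = 0 is a linear system in H
    A = W2.T @ W2 + beta * np.eye(h)
    H = np.linalg.solve(A, W2.T @ Y + beta * (W1 @ X))
    beta *= 1.05                       # gradually increase the penalty factor

print("fit error:", np.linalg.norm(W2 @ (W1 @ X) - Y))
```

With a linear coupling each subproblem even has a closed-form solution; the point of the thesis's Fenchel-based objective is that the same block-convex structure survives when the nonlinear hidden units are handled properly.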

By comparing the classical BP algorithm with the new one, this paper reaches a preliminary conclusion about their relationship: it is proved theoretically that as the penalty factor tends to positive infinity, the new algorithm reduces to the BP algorithm, and numerical experiments illustrate the role and meaning of the penalty factor in training. For a three-layer feedforward network, when the penalty factor is small the local gradients of the hidden neurons can vary over a wide range, which favors weight updates; when it is large that range is narrow, which hinders weight updates but improves training accuracy. This explains why the penalty factor is increased from small to large during training, and it also demonstrates the feasibility of the new algorithm, whereas the BP algorithm sometimes suffers the serious defect of being unable to update the connection weights at all.

Orebody prediction occupies an important place in deposit geology. Because the input sample sets are large, earlier feedforward training algorithms predicted orebodies poorly. This paper applies the new feedforward algorithm to orebody prediction and obtains the expected good results.

Finally, the paper points out the new algorithm's strengths and the aspects that remain to be improved.

Keywords: feedforward neural networks, convex optimization theory, training algorithm, orebody prediction, application

A Feedforward Neural Network Training Algorithm Based on Convex Optimization and Its Application in Deposit Forecasting

JIA Wen-chen (Computer Application)

Supervised by YE Shi-wei

Abstract

This paper primarily studies the application of convex optimization theory and algorithms to the training and convergence behavior of feedforward neural networks.

It reviews the history of feedforward neural networks, points out that training such networks is essentially a nonlinear problem, and introduces the BP algorithm, its advantages and disadvantages, and previous improvements to it. One major disadvantage of the BP algorithm and its improved variants is that the error objective function is non-convex in the weights between neurons in different layers and has local minima; once the weights enter a local minimum of the weight space during training, it is difficult to escape it and reach the global minimum (i.e., the optimal point), and training fails. To overcome this essential shortcoming, the paper constructs a new error objective function with a penalty term, using properties of convex functions, the Fenchel inequality for convex conjugates, and the penalty-function method of constrained optimization theory.

When a feedforward network is trained with the new objective function, the hidden layers' outputs are treated as optimization variables as well. The main characteristics of the new objective are as follows:

1. With the hidden-layer outputs fixed, the new objective is convex in the connection weights; with the connection weights fixed, it is convex in the hidden-layer outputs. Thus, when weights and hidden outputs are optimized alternately, each subproblem is convex with no local minima, and the algorithm's sensitivity to the initial weights is reduced.

2. Because the penalty factor is increased gradually, the weight search space remains large, so large networks can also be trained, and the possibility of falling into a local minimum during training is reduced to a certain extent.

These characteristics effectively overcome the major weakness of earlier feedforward training algorithms, namely that training easily falls into local minima, and they open a new line of research on feedforward learning algorithms based on convex optimization theory. During training, the connection weights and hidden-layer outputs can be optimized alternately. The new algorithm clearly outperforms traditional feedforward training algorithms, and the numerical experiments confirm its effectiveness.

Comparing the new algorithm with the traditional ones yields a preliminary conclusion about their relationship: it is proved theoretically that as the penalty factor tends to infinity, the new algorithm reduces to the BP algorithm, and numerical experiments explain the meaning and function of the penalty factor. For three-layer feedforward networks, when the penalty factor is smaller, the hidden-layer outputs can vary over a wider range, which is conducive to updating the connection weights; when it is larger, that range is narrower, which hinders the weight updates but improves the network's accuracy. This explains why the penalty factor should be increased gradually during training; it also demonstrates the feasibility of the new algorithm and the BP algorithm's defect that the connection weights sometimes cannot be updated at all.

Deposit forecasting is very important in deposit geology, but earlier algorithms forecast poorly because of the large number of input samples. The paper applies the new algorithm to deposit forecasting and obtains the expected results.

Finally, the paper points out the new algorithm's strengths as well as the places that remain to be improved.

Keywords: feedforward neural networks, convex optimization theory, training algorithm, deposit forecasting, application

2. The BP Algorithm and Its Improvements

2.1 Steps of the BP algorithm

1° Randomly select the initial weights ω0;

2° Input the training sample pairs (Xp, Yp), the learning rate η, and the error tolerance ε;

3° Compute the outputs of each layer's nodes in turn: opi, opj, opk;

4° Update the weights: ωk+1 = ωk + ηpk, where pk = -∇E(ωk) is the negative gradient of the error function and ωk is the weight vector at the k-th iteration;

5° If the error E < ε, stop; otherwise go to 3°.
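A minimal sketch of steps 1°-5° for a single hidden layer, assuming sigmoid activations, a squared-error function, and invented toy data (the section itself fixes none of these choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))                      # sample inputs Xp (invented)
Y = (X.sum(axis=1, keepdims=True) > 0).astype(float)   # sample targets Yp

W1 = 0.5 * rng.standard_normal((3, 5))                 # 1: random initial weights
W2 = 0.5 * rng.standard_normal((5, 1))
eta, eps = 0.5, 1e-3                                   # 2: learning rate, error level

for k in range(20000):
    o_i = X                                            # 3: node outputs opi, opj, opk
    o_j = sigmoid(o_i @ W1)
    o_k = sigmoid(o_j @ W2)
    E = 0.5 * np.mean((Y - o_k) ** 2)
    if E < eps:                                        # 5: stop when E < eps
        break
    # 4: wk+1 = wk + eta*pk with pk = -grad E(wk), via the chain rule
    d_k = (o_k - Y) * o_k * (1 - o_k)                  # output-layer delta
    d_j = (d_k @ W2.T) * o_j * (1 - o_j)               # hidden-layer delta
    W2 -= eta * (o_j.T @ d_k) / len(X)
    W1 -= eta * (o_i.T @ d_j) / len(X)

print(f"stopped after {k} iterations with E = {E:.5f}")
```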

2.2 Determining the step size ηk

In the algorithm above, the learning rate η is essentially a step-size factor along the negative gradient direction. Choosing, at each iteration, a step ηk that drives the error down fastest is a classic one-dimensional search problem: find ηk such that E(ωk + ηk pk) = min_η E(ωk + η pk). Let Φ(η) = E(ωk + ηpk); then Φ′(η) = dE(ωk + ηpk)/dη = ∇E(ωk + ηpk)ᵀpk. If ηk is a minimizer of Φ(η), then Φ′(ηk) = 0, i.e. ∇E(ωk + ηk pk)ᵀpk = -pk+1ᵀpk = 0. The algorithm for determining ηk runs as follows:

1° Set η0 = 0, h = 0.01, ε0 = 0.00001;

2° Compute Φ′(η0); if Φ′(η0) = 0, set ηk = η0 and stop;

3° Set h = 2h and η1 = η0 + h;

4° Compute Φ′(η1); if Φ′(η1) = 0, set ηk = η1 and stop;

If Φ′(η1) > 0, set a = η0 and b = η1; if Φ′(η1) < 0, set η0 = η1 and go to 3°;

5° Compute Φ′(a); if Φ′(a) = 0, set ηk = a and stop;

6° Compute Φ′(b); if Φ′(b) = 0, set ηk = b and stop;

7° Compute Φ′((a+b)/2); if Φ′((a+b)/2) = 0, set ηk = (a+b)/2 and stop;

If Φ′((a+b)/2) < 0, set a = (a+b)/2; if Φ′((a+b)/2) > 0, set b = (a+b)/2;

8° If |a-b| < ε0, set ηk = (a+b)/2 and stop; otherwise go to 7°.
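In effect, steps 3°-4° bracket a sign change of Φ′ by repeatedly doubling h, and steps 7°-8° bisect the bracket. A sketch in Python, with the exact-zero checks of steps 4°-6° folded into the bisection; the quadratic test function at the end is an invented example:

```python
import numpy as np

def find_eta(w, p, grad_E, h=0.01, eps0=1e-5):
    """Find eta with phi'(eta) = grad E(w + eta*p)^T p ~ 0 along direction p.
    Assumes p is a descent direction and E is bounded below along p."""
    dphi = lambda eta: grad_E(w + eta * p) @ p
    eta0 = 0.0                               # 1: eta0 = 0
    if dphi(eta0) == 0:                      # 2: already stationary along p
        return eta0
    while True:                              # 3-4: double h until phi' changes sign
        h *= 2
        eta1 = eta0 + h
        d1 = dphi(eta1)
        if d1 == 0:
            return eta1
        if d1 > 0:
            a, b = eta0, eta1                # bracket found: phi'(a) < 0 < phi'(b)
            break
        eta0 = eta1
    while abs(a - b) >= eps0:                # 7-8: bisection on the bracket
        m = (a + b) / 2
        dm = dphi(m)
        if dm == 0:
            return m
        if dm < 0:
            a = m
        else:
            b = m
    return (a + b) / 2

# invented test: E(w) = ||w||^2, grad E(w) = 2w, descend along p = -grad E(w)
w = np.array([1.0, -2.0])
p = -2 * w
print(find_eta(w, p, lambda v: 2 * v))       # prints ~0.5, the exact minimizer
```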

2.3 Properties of the improved BP algorithm

In the improved BP algorithm above, the learning rate η is no longer supplied by the user; instead the computer searches for the step size ηk automatically in every iteration. In the procedure for determining ηk we first set η0 = 0. From the definition Φ(η) = E(ωk + ηpk) we have Φ′(η) = dE(ωk + ηpk)/dη = ∇E(ωk + ηpk)ᵀpk, hence Φ′(η0) = ∇E(ωk)ᵀpk = -pkᵀpk ≤ 0. If Φ′(η0) = 0, the descent direction pk is the zero vector, meaning a local extremum has already been reached; otherwise Φ′(η0) < 0, and by the behavior of the one-dimensional function Φ(η), Φ is decreasing in a neighborhood of η0 = 0. It is therefore reasonable to initialize η0 to 0 in every iteration.

Compared with the original BP algorithm, the improved one changes two things: step 2° no longer requires the user to supply a learning rate η, and before each weight update (i.e., before step 4°) the step size ηk has already been computed.

Drawbacks of BP neural networks

1) Local minima: mathematically, the traditional BP network is a local-search optimization method applied to a complex nonlinear problem. The weights are adjusted gradually along locally improving directions, so the algorithm can get trapped at a local extremum, with the weights converging to a local minimum, and training fails. On top of that, BP networks are very sensitive to the initial weights: networks initialized with different weights tend to converge to different local minima, which is the fundamental reason why many researchers obtain different results on every training run.

2) Slow convergence: the BP algorithm is in essence gradient descent, and the objective function it optimizes is very complicated, so a "zigzag" phenomenon inevitably appears, making BP inefficient. Moreover, because the objective is so complicated, flat regions appear when the neuron outputs approach 0 or 1; within these regions the error barely changes with the weights, and training nearly stalls (a numerical sketch follows after this list).

3) No settled choice of network structure: there is still no unified, complete theory guiding the choice of a BP network's structure, which in general can only be picked by experience. Too large a structure trains inefficiently, may overfit, and yields poor performance and reduced fault tolerance; too small a structure may fail to converge at all. Yet the structure directly determines the network's approximation and generalization abilities, so choosing an appropriate structure is an important practical problem.

4) The conflict between problem size and network size: BP networks struggle to reconcile the scale of the application instance with the scale of the network. This touches on the relation between the possibility and feasibility of network capacity, i.e., the problem of learning complexity.

5) The conflict between prediction ability and training ability: prediction ability is also called generalization ability, and training ability is also called approximation or learning ability. In general, when the training ability is poor, the prediction ability is poor too.
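A quick numerical look at the flat-region effect behind drawback 2): the sigmoid derivative o(1-o), which scales every BP weight update, collapses as the unit's output approaches 0 or 1.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
for z in (0.0, 2.0, 5.0, 10.0):
    o = sigmoid(z)
    # every BP weight update is scaled by o*(1-o); near 0 or 1 it vanishes
    print(f"z = {z:5.1f}   o = {o:.6f}   o*(1-o) = {o * (1 - o):.2e}")
```

At z = 10 the derivative is on the order of 1e-5, so a saturated unit effectively stops learning regardless of how large its error is.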

What is the core problem of BP neural networks? What are their advantages and disadvantages?

An artificial neural network is an information-processing system designed to imitate the structure and function of the human brain. Pattern recognition implemented with neural networks can handle problems where the environment is very complex, the background knowledge is unclear, and the inference rules are not explicit, and the method tolerates considerable defects and distortions in the samples. There are many types of neural networks, and when building a model one can consider different network types according to the characteristics of the object under study. The feedforward BP network, i.e., the error back-propagation network, is the most commonly used. Its input-output relation can be viewed as a mapping: each set of inputs corresponds to a set of outputs. BP is the best-known training algorithm for multilayer feedforward networks; although it converges slowly and suffers from local extrema, these can be alleviated by various improvements, and it is simple, easy to implement, computationally light, and highly parallel, so it remains the preferred algorithm for multilayer feedforward networks.

Advantages of multilayer feedforward BP networks:

The network essentially implements a mapping from inputs to outputs, and mathematical theory has proved that it can realize any complex nonlinear mapping, which makes it especially suitable for problems with complex internal mechanisms;

The network can automatically extract "reasonable" solving rules by learning from a set of instances with correct answers, i.e., it can learn by itself;

The network has a certain ability to generalize and abstract.

Problems of multilayer feedforward BP networks:

Mathematically, BP is a local-search optimization method, but the problem it must solve is finding the global extremum of a complex nonlinear function, so the algorithm is quite likely to fall into a local extremum and training fails;

The network's approximation and generalization abilities are closely tied to how typical the training samples are, and selecting a typical set of sample instances from a problem to form the training set is very difficult;

It is hard to reconcile the scale of the application instance with the scale of the network, which touches on the relation between the possibility and feasibility of network capacity, i.e., the learning-complexity problem;

There is no unified, complete theory guiding the choice of network structure, which in general can only be picked by experience; some therefore call structure selection for neural networks an art. Yet the structure directly determines the network's approximation and generalization abilities, so choosing an appropriate structure is an important practical problem;

Newly added samples disturb a network that has already been trained successfully, and the number of features describing each input sample must be the same;

The conflict between the network's prediction ability (also called generalization ability) and its training ability (also called approximation or learning ability): in general, when the training ability is poor, the prediction ability is poor too, and up to a point prediction improves as training improves. But this trend has a limit: beyond it, prediction actually degrades as training keeps improving, the so-called "overfitting" phenomenon. At that point the network has learned too many sample details and can no longer capture the regularities they embody;

Because BP is in essence gradient descent and the objective it optimizes is very complicated, a "zigzag" phenomenon inevitably appears, making BP inefficient;

There is a paralysis phenomenon: because the objective is very complicated, flat regions appear when the neuron outputs approach 0 or 1; within these regions the error barely changes with the weights, and training nearly stalls;

For the network to execute the BP algorithm, the step size at each iteration cannot be found by a traditional one-dimensional search; instead an update rule for the step size must be given to the network in advance, which also makes the algorithm inefficient.

Why the BP algorithm fails on deep neural networks

The BP algorithm is the classic traditional algorithm for training multilayer networks, yet in practice it is already far from ideal for networks with only a few layers: the computation effectively stops propagating further down, so it is not suitable for deep neural networks.

Problems with the BP algorithm:

(1) Gradients become increasingly sparse: the error-correction signal gets smaller and smaller as it moves down from the top layer (see the numerical sketch after this list).

(2) Convergence to local minima: especially when starting far from the optimal region (random initialization tends to cause this).

(3) In general we can only train with labeled data, yet most data are unlabeled, while the brain can learn from unlabeled data.
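Problem (1) can be made visible numerically. The sketch below builds an invented 10-layer random sigmoid network (not from the source), backpropagates a unit error signal, and prints its norm at each layer; since every backward step multiplies by the derivative o(1-o) ≤ 0.25, the signal shrinks rapidly on the way down:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(32)
Ws, outs = [], []
for _ in range(10):                          # invented 10-layer sigmoid net
    W = rng.standard_normal((32, 32)) / np.sqrt(32)
    x = 1.0 / (1.0 + np.exp(-(W @ x)))
    Ws.append(W)
    outs.append(x)

delta = np.ones(32)                          # unit error signal at the top
for layer in range(9, -1, -1):               # backpropagate and watch the norm
    o = outs[layer]
    delta = Ws[layer].T @ (delta * o * (1 - o))
    print(f"layer {layer:2d}: |error signal| = {np.linalg.norm(delta):.3e}")
```

The norms decay by roughly an order of magnitude every few layers, so the bottom layers of a deep network receive almost no corrective signal.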

Characteristics of deep neural networks:

The benefit of many layers is that complex functions can be represented with fewer parameters.

In supervised learning, the problem with earlier multilayer networks is that they easily fall into local extrema. If the training samples cover future samples sufficiently well, the learned multilayer weights can predict new test samples well.

In unsupervised learning, there used to be no effective way to build multilayer networks. The top layer of a multilayer network is a high-level representation of the bottom-layer features: for example, the bottom layer might be pixels, nodes one layer up might represent lines and triangles, and a top-layer node might represent a face. A successful algorithm should make the generated top-level features represent the underlying examples as faithfully as possible.

Training all layers at once has too high a time complexity; training one layer at a time lets the deviation propagate from layer to layer, which runs into the opposite of the supervised-learning problem above: severe underfitting.
