Downloads: 1,493 | Citations: 242 | Reads: 218
Abstract: Since deep learning algorithms attracted widespread attention, academia has made great efforts to improve their optimization performance. As an important component of deep learning algorithms, the activation function introduces non-linear factors into neural networks. Many researchers have, to some extent, improved the optimization and generalization ability of these algorithms by proposing new activation functions or improving existing ones. After surveying current activation functions, this article roughly divides them into S-type activation functions and ReLU-type activation functions, and analyzes the functional characteristics of each family along with phenomena such as saturation, zero-point symmetry, vanishing gradients, and exploding gradients. Typical activation functions, including Sigmoid, Tanh, ReLU, P-ReLU, and L-ReLU, are tested in both Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). In CNN, the classic MNIST and CIFAR-10 datasets are used to compare the different activation functions; in RNN, a soybean grain-and-oil dataset is used to give an early warning of soybean output value. The results show that S-type activation functions converge faster than ReLU-type activation functions, while the ReLU family has an edge in accuracy, with P-ReLU achieving the highest accuracy of 93% in soybean yield prediction.
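To make the two families concrete, the sketch below implements the activation functions named in the abstract together with their derivatives, and prints the derivative magnitudes at large inputs to illustrate the saturation behind the vanishing-gradient phenomenon the article analyzes. This is a minimal illustrative NumPy sketch, not the authors' experimental code; the negative-side slopes `alpha=0.01` (L-ReLU) and `a=0.25` (P-ReLU, normally a learned parameter) are placeholder values, not taken from the paper.

```python
import numpy as np

# S-type activations: smooth and bounded, so they saturate for large |x|.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)            # peaks at 0.25 for x = 0, -> 0 as |x| grows

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2    # peaks at 1 for x = 0, -> 0 as |x| grows

# ReLU-type activations: unbounded on the positive side, no saturation there.
def relu(x):
    return np.maximum(0.0, x)

def l_relu(x, alpha=0.01):          # Leaky ReLU: small fixed slope for x < 0
    return np.where(x > 0, x, alpha * x)

def p_relu(x, a):                   # Parametric ReLU: slope `a` is learned in training
    return np.where(x > 0, x, a * x)

def relu_grad(x):
    return (x > 0).astype(x.dtype)  # exactly 1 for x > 0, 0 otherwise

if __name__ == "__main__":
    xs = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
    print("x        :", xs)
    print("sigmoid' :", np.round(sigmoid_grad(xs), 5))  # ~0 at |x| = 10: saturation
    print("tanh'    :", np.round(tanh_grad(xs), 5))     # ~0 at |x| = 10: saturation
    print("relu'    :", relu_grad(xs))                  # stays 1 for all x > 0
    print("l_relu   :", l_relu(xs))                     # small leak for x < 0
    print("p_relu   :", p_relu(xs, a=0.25))             # learned leak for x < 0
```

The printed derivatives show the two failure modes discussed above: S-type gradients vanish at both tails, so products of such derivatives across many layers shrink rapidly, while plain ReLU has zero gradient for x < 0 (the "dead neuron" problem), which L-ReLU and P-ReLU mitigate with a small fixed or learned negative-side slope.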
Basic information:
DOI:
CLC number: TP18
Citation:
ZHANG Youjian, CHEN Chen, WANG Zaijian. Research on Activation Function of Deep Learning Algorithm[J]. Radio Communications Technology, 2021, 47(01): 115-120.
Funding:
Open Fund of the Key Laboratory of Grain Information Processing and Control, Ministry of Education (KFJJ-2018-205)