1. Introduction
Machine learning algorithms for pattern recognition inherently rely on the characteristics of the extracted features [1]. Deep learning models learn complicated functions from large data sets, extracting high-level features automatically through deep-layered neural network structures. These models typically accomplish this by using unsupervised learning to initialize the interconnection weights, followed by supervised learning with teaching signals.
Conventional deep neural network architectures use multiple hidden layers instead of a single hidden layer. However, it is usually difficult to find a learning algorithm that can adequately train the interconnection weights across multiple hidden layers. When trained successfully, these multilayer interconnection weights are expected to replace the manual, domain-specific feature engineering required by conventional machine learning [2]. Moreover, recent neuroscience research has provided further elucidation and background for efficiently constructing deep feature extractors [1].
Earlier studies focused on the importance of deep architectures [3,4]. Nevertheless, deep learning was not widely used, partly because effective learning methods existed for only a few models [5]. The restricted Boltzmann machine (RBM), invented by Smolensky [6,7], is a generative stochastic neural network that can learn a probability distribution over its set of inputs. After Hinton et al. proposed training RBMs with contrastive divergence [8], deep architectures built from RBM networks became considerably popular in many pattern recognition applications, since they exhibit state-of-the-art performance without complex manual feature engineering. Although well-trained RBM networks show good performance [9,10], the learning algorithm requires careful setting of user-determined meta parameters such as the learning rate, momentum, weight regularization cost, initial weight values, sparsity target, number of hidden units, and mini-batch size [11]. Without careful tuning of these parameters, training fails to reach the best performance or is readily affected by the overfitting problem.
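To make the training procedure concrete, the following is a minimal NumPy sketch of one-step contrastive divergence (CD-1) for a binary RBM. The layer sizes, learning rate, and random mini-batch are illustrative assumptions, not values taken from the cited papers.

```python
# A minimal sketch of CD-1 training for an RBM with binary visible and
# hidden units; all hyperparameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden = 784, 128
W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))  # small random init
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases
lr = 0.1                    # learning rate (one of the meta parameters)

def cd1_update(v0):
    """One CD-1 gradient estimate on a mini-batch v0 of shape (batch, n_visible)."""
    # Positive phase: sample hidden states given the data.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step down to the visible layer and back up.
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # Gradient approximation: data correlations minus reconstruction correlations.
    batch = v0.shape[0]
    dW = (v0.T @ p_h0 - p_v1.T @ p_h1) / batch
    return dW, (v0 - p_v1).mean(axis=0), (p_h0 - p_h1).mean(axis=0)

# Usage on a random binary mini-batch (a stand-in for real data):
v0 = (rng.random((32, n_visible)) < 0.5).astype(float)
dW, db_v, db_h = cd1_update(v0)
W += lr * dW; b_v += lr * db_v; b_h += lr * db_h
```

Using the hidden probabilities rather than sampled binary states in the gradient estimate is a common variance-reduction choice; the sensitivity of this procedure to the learning rate, initialization, and batch size is exactly the meta-parameter issue noted above.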
The overfitting problem is one of the most critical problems in machine learning. It is a phenomenon in which the accuracy of a model on unseen data is significantly worse than its training accuracy. The problem occurs when the model is biased toward the training data, and it is more severe in complex models with a large set of parameters, such as multilayered neural networks. Several methods have been proposed to solve this problem. For example, the weight-sharing technique can be used in neural networks; it is an essential part of the convolutional neural network, which is considered one of the most successful deep structures [12]. In addition, the cross-validation technique divides the training-stage dataset into separate training and validation sets; the model exhibiting the best generalization performance is then selected by testing on the validation set [13]. Similarly, the Dropout method has been proposed specifically for artificial neural networks [14]. This method, which amounts to model averaging over redundant neural networks, reduces overfitting by preventing complex co-adaptations on the training data. The Maxout network was recently proposed by Goodfellow et al. to enhance the advantages of the Dropout method [15]. More recently, DropConnect was proposed by Wan et al. [16]; it is a generalized version of Hinton's Dropout for regularizing large fully connected layers.
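As a concrete illustration of the Dropout idea described above, here is a minimal sketch of "inverted" dropout applied to a single hidden activation, assuming a NumPy setting; the keep probability of 0.5 and the layer shape are illustrative assumptions rather than values prescribed by the cited paper.

```python
# A minimal sketch of inverted Dropout on one hidden layer; the keep
# probability and shapes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
p_keep = 0.5  # probability of keeping a unit during training

def dropout(h, training):
    if not training:
        return h  # inverted scaling means no change is needed at test time
    mask = (rng.random(h.shape) < p_keep).astype(h.dtype)
    return h * mask / p_keep  # rescale so the expected activation is unchanged

h = np.maximum(0.0, rng.normal(size=(4, 8)))  # a ReLU hidden activation
print(dropout(h, training=True))   # roughly half the units zeroed out
print(dropout(h, training=False))  # unchanged at test time
```

Because each training step samples a different mask, the network effectively averages over an exponential family of thinned sub-networks, which is the model-averaging interpretation mentioned above.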
The support vector machine (SVM) is another type of solution to the overfitting problem. Proposed by Vapnik [17], the SVM is a supervised machine learning algorithm. An SVM with a shallow layer structure is commonly used for classification and regression, particularly with the kernel trick, which makes predictions for new inputs depend only on the kernel function evaluated at a sparse subset of the training data points [18]. The SVM is trained with the structural risk minimization (SRM) principle [19]; i.e., it constructs an optimal decision hyperplane with the maximal margin. The maximal margin guarantees high generalization performance because generalization errors can be bounded in terms of margins. The support vector data description (SVDD), a variant of the SVM, builds a minimum sphere around the training data of a class to construct the decision boundary [20]. SVMs are usually sensitive to noise patterns and outliers, because a relatively small number of mislabeled examples or outliers can dramatically decrease performance. In other words, an outlier can critically affect the decision boundary and the calculated margin.
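In practice, the trade-off between the maximal margin and tolerance of noisy patterns or outliers is exposed through a soft-margin regularization parameter. Below is a minimal sketch using scikit-learn's SVC, assuming scikit-learn is available; the RBF kernel, the value of C, and the toy Gaussian data are illustrative assumptions, not the configuration of the cited works.

```python
# A minimal soft-margin kernel SVM sketch; kernel, C, and the toy data
# are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Two Gaussian blobs as toy training data, labeled -1 and +1.
X = np.vstack([rng.normal(-1, 0.5, (50, 2)), rng.normal(1, 0.5, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

clf = SVC(kernel="rbf", C=1.0)  # smaller C -> wider margin, more outlier tolerance
clf.fit(X, y)
# Predictions depend only on the kernel evaluated at the support vectors,
# a sparse subset of the training points.
print(clf.n_support_, clf.predict([[0.0, 0.0]]))
```

Lowering C lets the optimizer ignore a few mislabeled points in exchange for a wider margin, which is precisely the sensitivity-to-outliers issue raised in the paragraph above.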