14, no. 1. — P. 76–86.
39. Rumelhart D. E., Hinton G. E., Williams R. J. Learning representations by back-propagating errors // Nature. — 1986. — Vol. 323, no. 6088. — P. 533–538.
40. Jia Y., Shelhamer E., Donahue J., Karayev S., Long J., Girshick R., Guadarrama S., Darrell T. Caffe: Convolutional architecture for fast feature embedding // ACM ICM. — 2014.
41. Vedaldi A., Lenc K. MatConvNet – Convolutional Neural Networks for MATLAB // arXiv. — 2014.
42. Collobert R., Kavukcuoglu K., Farabet C. Torch7: A Matlab-like Environment for Machine Learning // BigLearn, NIPS Workshop. — 2011.
43. Theano Development Team. Theano: A Python framework for fast computation of mathematical expressions // arXiv. — 2016.
44. Abadi M., Agarwal A., Barham P., Brevdo E., Chen Z., Citro C., Corrado G. S., Davis A., Dean J., Devin M., [et al.]. Tensorflow: Large-scale machine learning on heterogeneous distributed systems // arXiv. — 2016.
45. Robbins H., Monro S. A stochastic approximation method // The Annals of Mathematical Statistics. — 1951. — P. 400–407.
46. Loshchilov I., Hutter F. SGDR: Stochastic Gradient Descent with Restarts // International Conference on Learning Representations. — 2017.
47. Karpathy A. Neural Networks Part 3: Learning and Evaluation / http://cs231n.github.io/neural-networks-3/. — 2015.
48. Hardt M., Recht B., Singer Y. Train faster, generalize better: Stability of stochastic gradient descent // International Conference on Machine Learning. — 2016.
49. Поляк Б. Т. О некоторых способах ускорения сходимости итерационных методов // Журнал вычислительной математики и математической физики. — 1964. — Т. 4, № 5. — С. 791–803.
50. Duchi J., Hazan E., Singer Y. Adaptive subgradient methods for online learning and stochastic optimization // Journal of Machine Learning Research. — 2011. — Vol. 12, Jul. — P. 2121–2159.
51. Zeiler M. D. ADADELTA: an adaptive learning rate method // arXiv. — 2012.
52. Tieleman T., Hinton G. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude // COURSERA: Neural Networks for Machine Learning. — 2012. — Vol. 4, no. 2. — P. 26–31.
53. Kingma D., Ba J. Adam: A method for stochastic optimization // International Conference on Learning Representations. — 2015.
54. Gross S., Wilber M. Training and investigating Residual Nets / http://torch.ch/blog/2016/02/04/resnets.html. — 2016.
55. Wilson A. C., Roelofs R., Stern M., Srebro N., Recht B. The Marginal Value of Adaptive Gradient Methods in Machine Learning // arXiv. — 2017.
56. Glorot X., Bengio Y. Understanding the difficulty of training deep feedforward neural networks // Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. — 2010. — P. 249–256.
57. He K., Zhang X., Ren S., Sun J. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification // Proceedings of the IEEE International Conference on Computer Vision. — 2015. — P. 1026–1034.
58. Mishkin D., Matas J. All you need is a good init // International Conference on Learning Representations. — 2016.
59. Hinton G. E., Salakhutdinov R. R. Reducing the dimensionality of data with neural networks // Science. — 2006. — Vol. 313, no. 5786. — P. 504–507.
60. Bengio Y., Lamblin P., Popovici D., Larochelle H. Greedy layer-wise training of deep networks // Advances in Neural Information Processing Systems. — 2007. — P. 153–160.
61. Smolensky P. Information processing in dynamical systems: foundations of harmony theory // Parallel distributed processing: explorations in the microstructure of cognition. — 1986. — P. 194–281.
62. Freund Y., Haussler D. Unsupervised learning of distributions on binary vectors using two layer networks // Advances in Neural Information Processing Systems. — 1992. — P. 912–919.
63. Erhan D., Bengio Y., Courville A., Manzagol P.-A., Vincent P., Bengio S. Why does unsupervised pre-training help deep learning? // Journal of Machine Learning Research. — 2010. — Vol. 11, Feb. — P. 625–660.
64. Glorot X., Bordes A., Bengio Y. Deep sparse rectifier neural networks // Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. — 2011. — P. 315–323.
65. Sharif Razavian A., Azizpour H., Sullivan J., Carlsson S. CNN features off-the-shelf: an astounding baseline for recognition // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. — 2014. — P. 806–813.
66. Girshick R., Donahue J., Darrell T., Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. — 2014. — P. 580–587.
67. Girshick R. Fast R-CNN // Proceedings of the IEEE International Conference on Computer Vision. — 2015. — P. 1440–1448.
68. Chen L.-C., Papandreou G., Kokkinos I., Murphy K., Yuille A. L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs // arXiv. — 2016.
69. Shelhamer E., Long J., Darrell T. Fully convolutional networks for semantic segmentation // IEEE Transactions on Pattern Analysis and Machine Intelligence. — 2017. — Vol. 39, no. 4. — P. 640–651.
70. He K., Zhang X., Ren S., Sun J. Identity Mappings in Deep Residual Networks // European Conference on Computer Vision. — 2016.
71. Sainath T. N., Kingsbury B., Sindhwani V., Arisoy E., Ramabhadran B. Low-rank matrix factorization for deep neural network training with high-dimensional output targets // IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2013. — P. 6655–6659.
72. LeCun Y., Bottou L., Orr G. B., Müller K.-R. Efficient BackProp // Neural Networks: Tricks of the Trade, This Book is an Outgrowth of a 1996 NIPS Workshop. — London, UK, 1998. — P. 9–50. — URL: http://dl.acm.org/citation.cfm?id=645754.668382.
73. Goodfellow I., Warde-Farley D., Mirza M., Courville A., Bengio Y. Maxout Networks // Proceedings of the 30th International Conference on Machine Learning / ed. by S. Dasgupta, D. McAllester. — Atlanta, Georgia, USA, 2013. — Vol. 28, no. 3. — P. 1319–1327. — (Proceedings of Machine Learning Research). — URL: http://proceedings.mlr.press/v28/goodfellow13.html.
74. LeCun Y. [et al.]. Generalization and network design strategies // Connectionism in perspective. — 1989. — P. 143–155.
75. Masci J., Meier U., Cireşan D., Schmidhuber J. Stacked convolutional auto-encoders for hierarchical feature extraction // Artificial Neural Networks and Machine Learning – ICANN. — 2011. — P. 52–59.
76. Srivastava N., Hinton G., Krizhevsky A., Sutskever I., Salakhutdinov R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting // Journal of Machine Learning Research. — 2014.
77. Gal Y., Ghahramani Z. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning // International Conference on Machine Learning. — 2016. — P. 1050–1059.
78. Kingma D. P., Salimans T., Welling M. Variational dropout and the local reparameterization trick // Advances in Neural Information Processing Systems. — 2015. — P. 2575–2583.
79. Molchanov D., Ashukha A., Vetrov D. Variational Dropout Sparsifies Deep Neural Networks // International Conference on Machine Learning. — 2017.
80. Neklyudov K., Molchanov D., Ashukha A., Vetrov D. Structured Bayesian Pruning via Log-Normal Multiplicative Noise // arXiv. — 2017.
81. Karpathy A., Toderici G., Shetty S., Leung T., Sukthankar R., Fei-Fei L. Large-scale video classification with convolutional neural networks // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. — 2014. — P. 1725–1732.
82. Tran D., Bourdev L., Fergus R., Torresani L., Paluri M. Learning spatiotemporal features with 3D convolutional networks // Proceedings of the IEEE International Conference on Computer Vision. — 2015. — P. 4489–4497.
83. Maturana D., Scherer S. VoxNet: A 3D convolutional neural network for real-time object recognition // IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). — 2015. — P. 922–928.
84. Dumoulin V., Visin F. A guide to convolution arithmetic for deep learning // arXiv. — 2016.
85. Lin M., Chen Q., Yan S. Network in network // International Conference on Learning Representations. — 2014.
86. Springenberg J.