CENG 534

Deep Learning for Natural Language Processing

Natural language processing (NLP) is one of the most important technologies of the information age. In traditional NLP, task-specific feature engineering and language-specific solutions were common. Recently, deep learning approaches have achieved very high performance across many different NLP tasks, and multilingual solutions have been introduced. This course covers cutting-edge research in deep learning applied to NLP. Topics include word vector representations, window-based neural networks, recurrent neural networks, long short-term memory (LSTM) models, recursive neural networks, and convolutional neural networks, as well as some very novel models involving a memory component. Students will complete a term project in which they implement, train, test, and visualize a custom neural network solution to a large-scale NLP problem.
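As a concrete illustration of the first topic above, word vector representations map each word to a dense vector so that semantically related words lie close together; the minimal Python sketch below uses hypothetical toy vectors (real word2vec or GloVe vectors are learned from large corpora and typically have 100 to 300 dimensions) to show how relatedness is measured as cosine similarity.

    import numpy as np

    # Hypothetical 4-dimensional word vectors for illustration only;
    # real models such as word2vec or GloVe learn these from text.
    vectors = {
        "king":  np.array([0.8, 0.3, 0.1, 0.9]),
        "queen": np.array([0.7, 0.4, 0.1, 0.8]),
        "apple": np.array([0.1, 0.9, 0.8, 0.2]),
    }

    def cosine_similarity(a, b):
        """Cosine of the angle between two vectors; 1.0 means identical direction."""
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Related words end up close together in the vector space.
    print(cosine_similarity(vectors["king"], vectors["queen"]))  # high (~0.99)
    print(cosine_similarity(vectors["king"], vectors["apple"]))  # lower (~0.40)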

Course Objectives

To bring students to the cutting edge of research in deep learning applied to NLP.

Recommended or Required Reading

Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.
Goldberg, Y., and Hirst, G. (2017). Neural Network Methods in Natural Language Processing. Morgan & Claypool Publishers.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. (2013). Distributed Representations of Words and Phrases and Their Compositionality. In: Advances in Neural Information Processing Systems, Volume 2, Lake Tahoe, Nevada.
Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. CoRR, abs/1301.3781.
Pennington, J., Socher, R., and Manning, C. D. (2014). GloVe: Global Vectors for Word Representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Huang, E. H., Socher, R., Manning, C. D., and Ng, A. Y. (2012). Improving Word Representations via Global Context and Multiple Word Prototypes. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Long Papers, Volume 1, Jeju Island, Korea.
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., and Kuksa, P. (2011). Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research, 12, 2493-2537.
Hirschberg, J., and Manning, C. D. (2015). Advances in Natural Language Processing. Science, 349, 261-266.
Turney, P. D., and Pantel, P. (2010). From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research, 37(1), 141-188.

Learning Outcomes

Upon completion of this course, the student will be able to:
1. Process different word vector representations
2. Design and implement custom neural network solutions for a given NLP task
3. Tune the experimental settings for a neural network model
4. Present project results

Week Topics
1 Intro to NLP and Deep Learning
2 Simple word vector representations: word2vec, GloVe
3 Advanced word vector representations: language models, softmax, single-layer networks
4 Neural networks and backpropagation (applied to named entity recognition)
5 Practical tips: gradient checks, overfitting, regularization, activation functions, details
6 Recurrent neural networks (for language modeling and other tasks)
7 Recursive neural networks (for parsing)
8 Review Session for Midterm
9 Convolutional neural networks (for sentence classification)
10 Guest Lecture: Speech Recognition
11 Guest Lecture: Machine Translation
12 Guest Lecture: Seq2Seq and Large-Scale Deep Learning
13 The future of Deep Learning for NLP: Dynamic Memory Networks
14 Project Presentations

Grading

Midterm 30%

Research Presentation 35%

Final 35%