CENG 464
Text Mining
This course covers important topics in text mining, including basic natural language processing techniques, document representation, text categorization and clustering, document summarization, sentiment analysis, social network and social media analysis, probabilistic topic models, and text visualization.
Course Objectives
This course provides an opportunity to learn the key components of text mining and analytics, aided by real-world datasets and a text mining toolkit written in Python. Hands-on experience with core text mining techniques, including text preprocessing, sentiment analysis, and topic modeling, helps train students to become data scientists.
Recommended or Required Reading
J. Eisenstein, “Introduction to Natural Language Processing”, MIT Press, 2019.
S. Ghosh & D. Gunning, “Natural Language Processing Fundamentals”, Packt, 2019.
Learning Outcomes
1. Using Python and NLTK (Natural Language Toolkit) to build out your own text classifiers and solve common NLP problems
2. Performing data analysis and machine learning tasks using Python
3. Understanding the basics of computational linguistics
4. Building models for general natural language processing tasks
5. Evaluating the performance of a model with the right metrics
6. Visualizing, quantifying, and performing exploratory analysis from any text data
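As a rough illustration of outcomes 1 and 5, the sketch below builds a bag-of-words representation of a short text and computes precision, recall, and F1 for a set of predicted labels. It uses only the Python standard library rather than NLTK, and the function names (`tokenize`, `bag_of_words`, `precision_recall_f1`) and the toy labels are assumptions made for this example, not course material.

```python
from collections import Counter

PUNCTUATION = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(PUNCTUATION) for word in text.lower().split())
    return [token for token in tokens if token]


def bag_of_words(text):
    """Map each token to the number of times it occurs in the text."""
    return Counter(tokenize(text))


def precision_recall_f1(gold, predicted, positive):
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    print(bag_of_words("Good movie, good cast!"))
    gold = ["pos", "pos", "neg", "neg"]
    predicted = ["pos", "neg", "pos", "neg"]
    print(precision_recall_f1(gold, predicted, positive="pos"))
```

Which metric is the "right" one depends on the task: accuracy can be misleading when classes are imbalanced, which is one reason per-class precision and recall are often reported alongside it.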
| Topics |
| Introduction |
| Steps in NLP |
| Steps in NLP |
| Document Representation |
| Document Representation |
| Text Classification |
| Text Categorization |
| Data Collection |
| Topic Modelling |
| Text Summarization |
| Text Generation |
| Social Media and Network Analysis |
| Sentiment Analysis |
| Text Visualization |
Grading
Midterm 25%
Research Presentation 35%
Final 40%



