SEDS 501
Introduction to Data Science
This course covers the necessary background for data collection and integration, exploratory data analysis, predictive modeling, descriptive modeling, data product creation, evaluation, and effective communication.
Reference book(s):
• Christopher Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
• Jure Leskovec, Anand Rajaraman and Jeffrey Ullman. Mining of Massive Datasets. v2.1, Cambridge University Press. 2014.
• Foster Provost and Tom Fawcett. Data Science for Business: What You Need to Know about Data Mining and Data-analytic Thinking.
• Avrim Blum, John Hopcroft and Ravindran Kannan. Foundations of Data Science.
• Mohammed J. Zaki and Wagner Meira Jr. Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press. 2014.
Course Objectives: To introduce the fundamentals of data science.
| Week | Topics |
| 1 | Introduction: What is Data Science? |
| 2 | Statistical Inference |
| 3 | Exploratory Data Analysis and the Data Science Process |
| 4 | Basic Machine Learning Algorithms |
| 5 | One More Machine Learning Algorithm and Usage in Applications |
| 6 | Feature Generation and Feature Selection |
| 7 | Feature Learning |
| 8 | Recommendation Systems |
| 9 | Mining Social-Network Graphs |
| 10 | Link Analysis |
| 11 | Data Visualization |
| 12 | Natural Language Processing |
| 13 | Image Processing |
| 14 | Data Science and Ethical Issues |
Grading:
Final Exam 40%
Midterm Exam 30%
Practice Assignments 30%
Course Learning Outcomes:
CO1 Describe what Data Science is and the skill sets needed to be a data scientist.
CO2 Carry out exploratory data analysis.
CO3 Apply the Data Science Process in a case study.
CO4 Apply basic machine learning algorithms for predictive modeling.
CO5 Reason about ethical and privacy issues in the conduct of data science, and apply ethical practices.
Contribution to Program Learning Outcomes:
| | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 |
| CO1 | 1 | 1 | |||||
| CO2 | 1 | 1 | |||||
| CO3 | 1 | 1 | 1 | ||||
| CO4 | 1 | 1 | 1 | ||||
| CO5 | 1 | 1 |
Justification of the course:
This is a core course of the Software Engineering and Data Science Master of Science Program. Topics are treated with a focus on breadth rather than depth, with emphasis on integrating and synthesizing concepts and applying them to solve problems.
Overlapping with or complementing topics in courses:
This course overlaps in content with the Data Science elective courses. As noted above, it treats topics in breadth rather than depth.

