CENG 544
Large-Scale Data Management
This course introduces the fundamental concepts and computational paradigms of large-scale data management, including the major methods for storing, updating, and querying large datasets and for data-intensive computing. It covers concepts, algorithms, and system issues in parallel and distributed databases, peer-to-peer data management, MapReduce and its ecosystem, Spark and dataflows, data lakes, and NoSQL databases.
Course Objectives
To introduce students to the current trends in large-scale data management covering concepts, architectures, algorithms and system issues.
Recommended or Required Reading
M. T. Özsu, P. Valduriez. Principles of Distributed Database Systems. Springer, 4th ed., 2020.
M. Kleppmann. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O’Reilly Media, Inc., 2017.
Learning Outcomes
- To understand current research and technological trends in large-scale data management
- To comprehend the fundamental principles of modern database management systems
- To identify bottlenecks in large-scale data management applications and make appropriate design decisions
- To install and utilize open-source software systems and libraries required for meaningful data management operations
| Week | Topics |
| --- | --- |
| 1 | Distributed Database Systems: Distributed Database Design |
| 2 | Distributed Query Processing |
| 3 | Distributed Transaction Processing |
| 4 | Parallel Database Systems: Parallel Architectures and Data Placement |
| 5 | Parallel Query Processing |
| 6 | Peer-to-Peer Data Management: Infrastructure and Schema Mapping |
| 7 | Querying and Replica Consistency |
| 8 | Blockchain |
| 9 | Big Data Processing: Distributed Storage Systems |
| 10 | MapReduce and its Ecosystem |
| 11 | Spark, Dataflows, and Data Lakes |
| 12 | NoSQL, NewSQL and Polystores: Key-Value Stores and Document Stores |
| 13 | Wide-Column Stores and Graph DBMSs |
| 14 | Hybrid Data Stores and Polystores |
Midterm: 30%
Research Presentation: 30%
Final: 40%
Instructor(s)

