CENG 544

Large-Scale Data Management

This course introduces the fundamental concepts and computational paradigms of large-scale data management, including the major methods for storing, updating, and querying large datasets and for data-intensive computing. It covers concepts, algorithms, and system issues in parallel and distributed databases, peer-to-peer data management, MapReduce and its ecosystem, Spark and dataflows, data lakes, and NoSQL databases.

Course Objectives

To introduce students to current trends in large-scale data management, covering concepts, architectures, algorithms, and system issues.

Recommended or Required Reading

M. T. Özsu, P. Valduriez. Principles of Distributed Database Systems. Springer, 4th ed., 2020.

M. Kleppmann. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O’Reilly Media, Inc., 2017.

Learning Outcomes

To understand current research and technological trends in large-scale data management
To comprehend the fundamental principles of modern database management systems
To identify bottlenecks in large-scale data management applications and make appropriate design decisions
To install and utilize open-source software systems and libraries required for meaningful data management operations

Week	Topics
1	Distributed Database Systems: Distributed Database Design
2	Distributed Query Processing
3	Distributed Transaction Processing
4	Parallel Database Systems: Parallel Architectures and Data Placement
5	Parallel Query Processing
6	Peer-to-Peer Data Management: Infrastructure and Schema Mapping
7	Querying and Replica Consistency
8	Blockchain
9	Big Data Processing: Distributed Storage Systems
10	MapReduce and its Ecosystem
11	Spark, Dataflows, and Data Lakes
12	NoSQL, NewSQL and Polystores: Key-Value Stores and Document Stores
13	Wide-Column Stores and Graph DBMSs
14	Hybrid Data Stores and Polystores

Midterm: 30%

Research Presentation: 30%

Final: 40%

Instructor(s)

Assistant Professor / Vice Chair

Damla Oğuz

