SEDS 542
Large-Scale Data Management
This course introduces the fundamental concepts and computational paradigms of large-scale data management, including the major methods for storing, updating, and querying large datasets and for data-intensive computing. It covers concepts, algorithms, and system issues in parallel and distributed databases, peer-to-peer data management, MapReduce and its ecosystem, Spark and dataflows, data lakes, and NoSQL databases.
Topics
Distributed Database Design
Distributed Query Processing
Distributed Transaction Processing
Parallel Architectures and Data Placement
Parallel Query Processing
Infrastructure and Schema Mapping
Querying and Replica Consistency
Blockchain
Distributed Storage Systems
MapReduce and its Ecosystem
Spark, Dataflows, and Data Lakes
Key-Value Stores and Document Stores
Wide-Column Stores and Graph DBMSs
Hybrid Data Stores and Polystores