CS554 Syllabus & Progress

CS554 - Advanced Database Systems
Syllabus and Progress

  1. Introduction

  2. Review of undergraduate database material (my CS377 class notes is here: click here )

    • Review of Relational Algebra:
      • Relations: click here
      • The Relational (Data) Model click here

      • Overview of Relational Algebra
        • Intro: click here
        • The high school set operators (, , ): click here
        • The operators σ, π, × and : click here                                                   (cs377 material)
        • The Aggregate (= set) Functions: click here                                                 (cs377 material)

      • Examples of Relational Algebra: click here                                       (cs377 material)

      • Bags, sets and the δ and γ operators: click here

        Homework 1: click here

    • Review of SQL:
      • The model of the database used in these notes: click here
      • Accessing the MySQL server: click here

      • Intro to SQL: click here                                                                       (cs377 material)
      • The CREATE TABLE command: click here
      • The (abbreviated) INSERT command: click here

      • Intro to the SELECT command                                                       (cs377 material)

      • Qualifying attributes and aliasing: click here                                             (cs377 material)

      • Set operations (∪, ∩, −) in SQL: click here                                       (cs377 material)
      • The tuple conditions that can be used in the where clause: click here                 (cs377 material)

      • Nested queries:                                                 (cs377 material)

      • Correlated nested queries:

    • Neat trick to get other attributes using a set of foreign keys in a nested query: click here

    • Set functions:

    • Forming groups based on grouping attributes and conditions on a group:

    • Virtual table (relation) or view: click here

      Homework 2: click here

  3. Secondary Storage (Disks) management

  4. Indexing

  5. One-pass Algorithms for Query execution (processing a physical query plan)

  6. The nested-loop Join Algorithm

  7. 2-pass Algorithms for Query execution that are based on (TPMMS) sorting

  8. 2-pass Algorithms for Query execution that are based on hashing

    • Introduction: click here

    • Unary operators:
      • Selection σ --- no need: use a one-pass algorithm
      • Projection π --- no need: use a one-pass algorithm
      • Duplicate elimination δ: click here
      • Grouping γ: click here

    • Binary operators:
      • Union ∪
        • Bag union: --- no need: use a one-pass algorithm
        • Set union: click here
      • Intersection ∩
      • Difference −
        • Bag difference: --- same process as set intersection
        • Set difference: --- same process as set intersection

      • Product (cartesian product) × --- same comment as click here

      • Join ⋈: click here

    • Summary: click here

  9. Multi-pass algorithms

  10. Algorithms that are based on indexing:

    Query optimization

  11. Overview: query optimization: click here

  12. Parsing and pre-processing

  13. Converting a Parse Tree into an initial logical query plan (tree)

    • Converting an SQL command that does not contain a sub-query: click here

    • Converting an SQL command that contains a sub-query:
      • Converting an SQL command with sub-query using 2-argument selection: click here
      • Replacing a 2-argument selection in an uncorrelated sub-query: click here
      • Replacing a 2-argument selection in an correlated sub-query: click here

  14. Algebraic Laws used to transform/optimize logical query plans

  15. Cost estimation of a Logical Query Plan (= Relational Algebra Tree)

    Homework 6: click here

  16. Ordering the join (and cartesian product) operations

    • Intro: click here
    • Choosing a join order for R⋈S (R⋈S or S⋈R ?): click here

    • Left-deep trees:

    • Finding the best left-deep join tree:
      • The Dynamic Programming approach (= exhaustive search):

      • A greedy heuristic to find the best left-deep join ordering: click here

  17. The physical query plan

    Homework 7: click here

  18. Recoverability: protecting database from system failure (logging)

  19. Serializability: correctness of concurrent execution of transactions

  20. Serializability and Recoverability

  21. Deadlock

  22. Parallel Data Processing Algorithms

  23. The Map-Reduce Parallelism framework

  24. Distributed Databases: query processing

  25. Commiting Distributed Transactions

  26. Handling "Big Data"