CS 377 - Database Systems

Hw8: Query Processing and Transaction Management

Out: Wednesday, November 16

Due: Wednesday, November 23, 11:59pm

 

 

1)    Consider three relations, R1(A,B,C), R2(C,D,E), R3(E, F, A). We want to estimate the cost (number of disk IO’s) in answering the query R1 ./ R2 ./ R3 where ./ denotes natural join. Assume that we only consider a hash join algorithm (not a hybrid join), as described in class, and assume there is enough memory for running hash join. Also assume pipelined processing is used. We have the following information:

 

• R1 contains 10,000 tuples, stored contiguously in 1,000 blocks.

• R2 contains 20,000 tuples, stored contiguously in 2,000 blocks.

• R3 contains 30,000 tuples, stored contiguously in 3,000 blocks.

• If we compute the partial result R1 ./ R2, it would contain 1,500 tuples that if stored on disk would occupy 300 blocks.

• If we compute the partial result R2 ./ R3, it would contain 1,000 tuples that if stored on disk would occupy 200 blocks.

• If we compute the partial result R1 ./ R3, it would contain 500 tuples that if stored on disk would occupy 100 blocks.

 

a)    Briefly describe the best query plan (order of joins) to compute the query and why.

b)    How many disk block accesses (both reading and writing) are required for this plan? You don’t have to count the cost for writing the final join results. Briefly explain how you obtain your answer or show each cost component.

 

2)    Examine the schedule given below. There are three transactions, T1, T2, and T3. Initially, the salary = 2 and the tax = 1. Assume immediate update and undo logging with nonquiescent checkpointing are used. The assignments happen within the memory space of the transactions and the updates are not reflected in the database until the WRITE operation.

 

 

       T1                                       T2                                             T3             

------------------------------------------------------------------------------------------------------------

0                                                                                                 start

1                                                                                                 READ tax

2                                                                                                 tax := tax + 1

3     start

4     READ salary

5     salary := salary + 1

6                                                                                                 WRITE tax

7                                                                                                 commit

8   ------------------------------ checkpoint start -------------------------------

9                                                start

10                                              READ tax

11                                              READ salary

12                                              tax := tax + salary

13                                              WRITE tax

14                                              commit

15   READ tax

16   tax := tax + salary

17   WRITE salary

18   commit

 

a)    Show the undo log file entries (including checkpoint entries) that would be generated by this execution.  For each log entry, indicate which line above generates it.

b)    Assume the system crashes immediately after statement 7. What are the values of salary and tax in the database? (Assume that all the log records up to this point are on disk.) Which transactions would have to be undone? What are the values of salary and tax in the database after the recovery?

c)    Assume the system crashes just after statement 13. Redo question b) above.

d)    Assume the system crashes just after statement 17. Redo question b) above.

 

 

3)    Consider the three transactions T1, T2, and T3, and the schedules S1 and S2 given below.

 

T1: r1(x); r1(z); w1(x)

T2: r2(z); r2(y); w2(z); w2(y)

T3: r3(x); r3(y); w3(y)

 

S1: r1(x); r2(z); r1(x); r3(x); r3(y); w1(x); w3(y); r2(y); w2(z); w2(y)

S2: r1(x); r2(z); r3(x); r1(z); r2(y); r3(y); w1(x); w2(z); w3(y); w2(y)

 

a)    Draw the serializibility (precedence) graphs for S1 and S2 and state whether each schedule is conflict serializable. If a schedule is serializable, write down the equivalent serial schedule(s).

Submission

·           Please place your solution in your cs377 directory ~/cs377/hw8/hw8.pdf