### The basic (relation) access operators and their cost

• The basic tuple access operators

• There are 2 basic tuple access operators available for the Physical Query Plan:

• Table-Scan(R):

 Read tuples from the relation R by reading data blocks - one block at a time from disk to memory

• Index-Scan(R):

 The relation R must have an index The index is scanned to find blocks that contain the desired tuples All the blocks containing desired tuples are then read into the memory --- one block at a time.

• Recall: clustered and unclustered files/relations

• Clustered file:

• A file that stores records of one relation

• Unclustered file: (less common)

• A file that stores records of multiple relation

• The cost of the basic scan operators

• Situation: the relation R is clustered

Operator I/O cost Explanation
Table-Scan B(R) Read all blocks, and there are B(R) blocks.

Index-Scan ≤ B(R) Depends on number of values in the index is scanned

Single value: I/O cost << B(R)

• Situation: the relation R is unclustered

Operator I/O cost Explanation
Table-Scan ~ T(R) Assume that the next tuple is not found in the current block.

We will read 1 block per tuple (when given the record database address of the tuple)

Index-Scan ≤ T(R) Depends on number of values in the index is scanned

Single value: I/O cost << B(R) << T(R)

• Sort-scan: a possible table scan operator that you can add to the basic operators

• Fact:

 Sometimes, we need the output of the tuples in a relation in an ordered manner I.e.: sorted on some attribute values

• Sort-scan:

• Sort-scanA(R):

 Produces tuples in relation R sorted in the attribute(s) A

• Possible implementations:

• If there is an index on the attibute(s) A:

 Scan the index file Read the block pointers in the index and output the tuples in sorted order

• If there is no index on the attibute(s) A:

 Read all blocks of relation R into memory Sort Then output the sorted tuples

• Note:

 Things gets more complicated if relation R does not fit in the main memory... We will then need to use multi-pass sort algorithms to sort the relation R This topic will be discussed later.