Other join operators: Equi-join, Theta-join and Natural join operations

• Recall: the inner join operation

• Recall the inner join operation (⋈) is defined as:

The condition clause is a boolean expression that only uses:

 Attribute names in relations R1 or R2              Constant values

• The "equi-join" operation

• The most common form of condition used as a join condition is:

• Comparing whether some values are equal to each other

Example:

 ``` employee ⋈(dno = dnumber) department ```

• Definition: Equi-join

• Equi-join = a join operation where the the condition in the join (⋈condition) contain only equality compare operators

Example:

 ``` employee ⋈(dno=dnumber AND sex='F') department ```

• More general join conditions: the "theta (θ) join" operation

• In general, the condition in a join operation can contain any form of comparisons:

 = ≠ < ≤ ≥ >

• Definition:

 Theta(θ)-join = a join operation where the (join) condition contain general forms of comparision operations

• Example of a use of general join condition

• Consider the employee relation:

 ``` employee: +-----------+--------+---------+----------+ | ssn | fname | lname | salary | +-----------+--------+---------+----------+ | 123456789 | John | Smith | 30000.00 | | 333445555 | Frankl | Wong | 40000.00 | | 999887777 | Alicia | Zelaya | 25000.00 | | 987654321 | Jennif | Wallace | 43000.00 | | 666884444 | Ramesh | Narayan | 38000.00 | | 453453453 | Joyce | English | 25000.00 | | 987987987 | Ahmad | Jabbar | 25000.00 | | 888665555 | James | Borg | 55000.00 | +-----------+--------+---------+----------+ ```

• The output of σ fname='John' AND lname='Smith' (employee) is the relation (= set !) of 1 tuple containing information on John Smith:

 ``` σfname='John' AND lname='Smith'(employee) +-----------+--------+---------+----------+ | ssn | fname | lname | salary | +-----------+--------+---------+----------+ | 123456789 | John | Smith | 30000.00 | +-----------+--------+---------+----------+ ```

• The cartesian product of employee and σ fname='John' AND lname='Smith' (employee) is the following relation:

 ``` employee × (σfname='John' AND lname='Smith'(employee)) +-----------+--------+---------+----------+-----------+--------+---------+----------+ | ssn | fname | lname | salary | ssn | fname | lname | salary | +-----------+--------+---------+----------+-----------+--------+---------+----------+ | 123456789 | John | Smith | 30000.00 | 123456789 | John | Smith | 30000.00 | | 333445555 | Frankl | Wong | 40000.00 | 123456789 | John | Smith | 30000.00 | | 999887777 | Alicia | Zelaya | 25000.00 | 123456789 | John | Smith | 30000.00 | | 987654321 | Jennif | Wallace | 43000.00 | 123456789 | John | Smith | 30000.00 | | 666884444 | Ramesh | Narayan | 38000.00 | 123456789 | John | Smith | 30000.00 | | 453453453 | Joyce | English | 25000.00 | 123456789 | John | Smith | 30000.00 | | 987987987 | Ahmad | Jabbar | 25000.00 | 123456789 | John | Smith | 30000.00 | | 888665555 | James | Borg | 55000.00 | 123456789 | John | Smith | 30000.00 | +-----------+--------+---------+----------+-----------+--------+---------+----------+ ```

We can find the employees who have a higher salary than John Smith as follows:

 ``` employee ⋈salary > salary (σfname='John' AND lname='Smith'(employee)) +-----------+--------+---------+----------+-----------+--------+---------+----------+ | ssn | fname | lname | salary | ssn | fname | lname | salary | +-----------+--------+---------+----------+-----------+--------+---------+----------+ | 333445555 | Frankl | Wong | 40000.00 | 123456789 | John | Smith | 30000.00 | | 987654321 | Jennif | Wallace | 43000.00 | 123456789 | John | Smith | 30000.00 | | 666884444 | Ramesh | Narayan | 38000.00 | 123456789 | John | Smith | 30000.00 | | 888665555 | James | Borg | 55000.00 | 123456789 | John | Smith | 30000.00 | +-----------+--------+---------+----------+-----------+--------+---------+----------+ ```

• The "natural join" (*) operation

• Suppose we have the following relations:

 ``` Employee: fname lname dnumber salary ------ -------- ----------- --------- John Smith 5 30000.00 Frankl Wong 5 40000.00 Alicia Zelaya 4 25000.00 Jennif Wallace 4 43000.00 Ramesh Narayan 5 38000.00 Joyce English 5 25000.00 Ahmad Jabbar 4 25000.00 James Borg 1 55000.00 Department: dname dnumber mgrssn mgrstartdate --------------- ----------- --------- ------------ Research 5 333445555 22-MAY-78 Administration 4 987654321 01-JAN-85 Headquarters 1 888665555 19-JUN-71 ```

• The equi-join of the relations on Employee ⋈dnumber=dnumber Department will result in:

 ```fname lname dnumber salary dname dnumber mgrssn mgrstartdate ------ -------- ----------- --------- --------------- ----------- --------- ------------ John Smith 5 30000.00 Research 5 333445555 22-MAY-78 Frankl Wong 5 40000.00 Research 5 333445555 22-MAY-78 Alicia Zelaya 4 25000.00 Administration 4 987654321 01-JAN-85 Jennif Wallace 4 43000.00 Administration 4 987654321 01-JAN-85 Ramesh Narayan 5 38000.00 Research 5 333445555 22-MAY-78 Joyce English 5 25000.00 Research 5 333445555 22-MAY-78 Ahmad Jabbar 4 25000.00 Administration 4 987654321 01-JAN-85 James Borg 1 55000.00 Headquarters 1 888665555 19-JUN-71 ```

Notice that:

• The join attribute attribute dnumber occurs twice in the result relation !!!

Therefore:

 One of the attribute dnumber is redundant

• We can remove one copy of the attribute dnumber using the following project π operation:

 ``` πfname,lname,dnumber,salary,dname,mgrssn,mgrstartdate( ... ) ```

• Natural join (*):

 Natural join (*) = an equi-join operation on all attributes with the same names and                                 followed by a projection operation (π) that remove                                 all duplicate attribute names

Examples:

• Suppose:

 ``` R1 = (A, B, C) R2 = (C, D, E, F) ```

then:

 ``` R1 * R2 = πA,B,C,D,E,F ( R1 ⋈C=C R2 ) ```

• Suppose:

 ``` R1 = (A, B, C, D) R2 = (C, D, E, F) ```

then:

 ``` R1 * R2 = πA,B,C,D,E,F ( R1 ⋈C=C ∧ D=D R2 ) ```