CS457 Syllabus & Progress

## Formulating Set Division Queries using the Contains (Superset) test in SQL

• Set Division Query...

• The set division operation is not part of the SQL standard - and it probably will never be... (too complex to implement)

• Recall that the set division is used to answer on specific type of queries, such as:

 Find employees who work on all projects controlled by the 'Research' department" See: click here

I will now explain how to answer a division query using the CONTAINS (superset) test in a number of examples

• Division Query Example 1

• Example:

 Find fname and lname of employees who works on all projects controlled by department number 4.

• Steps towards the solution:

1. Formulate the solution using a CONTAINS condition

You can see that:

• An employee works on all projects controlled by department number 4 if and only if:

 The set of projects worked on by the employee CONTAINS the set of projects controlled by department 4

The SQL query can be formulated using the following pseudo code:

 ``` SELECT fname, lname FROM employee WHERE "set of projects worked on by emplyee.ssn" CONTAINS "set of projects controlled by department 4" ```

2. As discussed about, we can re-write the CONTAINS condition into a NOT EXISTS condition as follows:

 ``` SELECT fname, lname FROM employee WHERE "set of projects controlled by department 4" - "set of projects worked on by emplyee.ssn" = empty set Or: SELECT fname, lname FROM employee WHERE NOT EXISTS ("set of projects controlled by department 4" - "set of projects worked on by emplyee.ssn") Apply the Set Subtraction technique: SELECT fname, lname FROM employee WHERE NOT EXISTS (SELECT pnumber FROM project /* = the universe set of pnumber) */ WHERE pnumber IN "set of projects controlled by department 4" and pnumber NOT IN "set of projects worked on by emplyee.ssn") ```

3. We can work out the inner 2 set as 2 queries as follows:

 ``` "set of projects controlled by department 4": (SELECT pnumber FROM project WHERE dnum = 4) "set of projects worked on by emplyee.ssn (= a specific employee with this ssn)": (SELECT pno FROM works_on WHERE essn = employee.ssn) ```

4. The complete SQL solution is then:

 ``` SELECT ssn, fname, lname FROM employee e WHERE NOT EXISTS (SELECT pnumber FROM project /* Universe set used in the subtract operation */ WHERE pnumber IN (SELECT pnumber FROM project WHERE dnum = 4) and pnumber NOT IN (SELECT pno FROM works_on WHERE essn = e.ssn) ) (I aliased Employee as e for convenience) ```

• Output:

 ``` ssn fname lname --------- ------ -------- 999887777 Alicia Zelaya 987987987 Ahmad Jabbar ```

because:

 ```Projects control by department 4: pname pnumber plocation dnum --------------- ----------- ---------- ----------- Computerization 10 Stafford 4 Newbenefits 30 Stafford 4 Employees working on department 4's proejcts: essn pno hours --------- ----------- ------- 333445555 10 10.0 999887777 30 30.0 999887777 10 10.0 987654321 30 20.0 987987987 30 5.0 987987987 10 35.0 ```

• Another example of a set division queries

• Query:

 Find fname and lname of employees who works on all projects that are worked on by John Smith.

• Steps towards the solution:

1. Formulate the solution using a CONTAINS condition

 ``` SELECT fname, lname FROM employee WHERE "set of projects worked on by emplyee.ssn" contains "set of projects worked on by John Smith" ```

2. Re-write CONTAINS as NOT EXISTS:

 ``` Re-write: SELECT fname, lname FROM employee WHERE "set of projects worked on by John Smith" - "set of projects worked on by emplyee.ssn" = empty set Or: SELECT fname, lname FROM employee WHERE NOT EXISTS ("set of projects worked on by John Smith" - "set of projects worked on by emplyee.ssn"); ```

3. Apply the set difference technique to re-write the inner query:

 ``` Use Set Difference technique: SELECT fname, lname FROM employee WHERE NOT EXISTS (SELECT pnumber FROM project /* Universe set of pnumber */ WHERE pnumber IN "set of projects worked on by John Smith" and pnumber NOT IN "set of projects worked on by emplyee.ssn" ) ```

4. Work out the 2 inner sets as 2 different SQL queries:

 ``` SELECT fname, lname FROM employee e WHERE NOT EXISTS (SELECT pnumber FROM project /* Universe set of pnumber */ WHERE pnumber IN (SELECT pno FROM works_on, employee WHERE essn = ssn AND fname = 'John' AND lname = 'Smith') and pnumber NOT IN (SELECT pno FROM works_on WHERE essn = e.ssn) ) ```

• Last example of set division query....

• Query

 Find names of projects that are worked on by all employees in the Administration department

• Steps towards the solution:

1. Formulate the solution using a CONTAINS condition

 ``` SELECT pname FROM project P WHERE "set of employees working on project P (id. by P.pnumber)" contains "set of employees in the Administration department" ```

2. Re-write CONTAINS as NOT EXISTS:

 ``` Re-write: SELECT pname FROM project P WHERE "set of employees in the Administration department" − "set of employees working on project P (id. by P.pnumber)" = empty set Or: SELECT pname FROM project P WHERE NOT EXISTS ( "set of employees in the Administration department" − "set of employees working on project P (id. by P.pnumber)" ) ```

3. Apply the set difference technique to re-write the inner query:

 ``` Use Set Difference technique: SELECT pname FROM project P WHERE NOT EXISTS (SELECT ssn FROM employee /* Universe set of employees */ WHERE ssn IN "set of employees in the Administration department" and ssn NOT IN "set of employees working on project P (id. by P.pnumber)" ) ```

4. Work out the 2 inner sets as 2 different SQL queries:

 ``` SELECT pname FROM project P WHERE NOT EXISTS (SELECT ssn FROM employee /* Universe set of employees */ WHERE ssn IN (SELECT ssn FROM employee, department WHERE dno=dnumber AND dname='Administration') and ssn NOT IN (SELECT essn FROM works_on WHERE works_on.pno = P.pnumber) ) ```

• Result:

 ``` pnumber pname ----------- --------------- 30 Newbenefits ```

becasue:

 ``` The Administration department is: dname dnumber mgrssn mgrstartdate --------------- ----------- --------- ------------ Research 5 333445555 22-MAY-78 Administration 4 987654321 01-JAN-85 Headquarters 1 888665555 19-JUN-71 Employees in the Administration department: ssn fname lname dno --------- ------ -------- ----------- 999887777 Alicia Zelaya 4 987654321 Jennif Wallace 4 987987987 Ahmad Jabbar 4 Projects worked on by these projects: essn pno hours --------- ----------- ------- 999887777 10 10.0 987987987 10 35.0 987654321 20 15.0 999887777 30 30.0 987654321 30 20.0 987987987 30 5.0 ```

Project 30 (Newbenefits) is worked on by all employees in the Administration department !