CS457 Syllabus & Progress

## Doing group by queries in another way: using a sub-query

• Alternative way to formualte "group by" queries

• Fact:

 We can often formulate a query that uses a group by clause as a sub-query without using a group by clause

I will illustrate this technique with some examples

• Formulating group by and related queries

1. Query:

 Find the fname and lname of employees who have more than 2 dependents

Solution:

 ``` The depedent relation: select * from dependent +-----------+-----------+------+-----------+--------------+ | essn | name | sex | bdate | relationship | +-----------+-----------+------+-----------+--------------+ | 333445555 | Alice | F | 05-APR-76 | DAUGHTER | | 333445555 | Theodore | M | 25-OCT-73 | SON | | 333445555 | Joy | F | 03-MAY-48 | SPOUSE | | 987654321 | Abner | M | 29-FEB-32 | SPOUSE | | 123456789 | Micheal | M | 01-JAN-78 | SON | | 123456789 | Alice | F | 31-DEC-78 | DAUGHTER | | 123456789 | Elizabeth | F | 05-MAY-57 | SPOUSE | +-----------+-----------+------+-----------+--------------+ Add in the fname, lname: select fname, lname, name, relationship from dependent, employee where ssn = essn +--------+---------+-----------+--------------+ | fname | lname | name | relationship | +--------+---------+-----------+--------------+ | John | Smith | Micheal | SON | | John | Smith | Alice | DAUGHTER | | John | Smith | Elizabeth | SPOUSE | | Frankl | Wong | Alice | DAUGHTER | | Frankl | Wong | Theodore | SON | | Frankl | Wong | Joy | SPOUSE | | Jennif | Wallace | Abner | SPOUSE | +--------+---------+-----------+--------------+ Employees with > 2 dependents: (count per employee) select fname, lname, count(*) from dependent, employee where ssn = essn group by fname, lname having count(name) > 2 +--------+-------+----------+ | fname | lname | count(*) | +--------+-------+----------+ | Frankl | Wong | 3 | | John | Smith | 3 | +--------+-------+----------+ ```

Alternative solution using a sub-query:

 ``` Logic: select fname, lname from employee E where # { dependents of E } > 2 To find # (cardinality) of a set, we use count( ): select fname, lname from employee E where count { dependents of E } > 2 Work ou the {dependents of E} using SQL: select fname, lname from employee E where count ( select * from dependent where essn = E.ssn ) > 2 We must re-write "count( select * )" in correct SQL syntax: select fname, lname from employee E where ( select count(*) from dependent where essn = E.ssn ) > 2 +--------+-------+ | fname | lname | +--------+-------+ | John | Smith | | Frankl | Wong | +--------+-------+ ```

Comment:

• The outer query select one employee at a time

• For each employee in the outer query, the inner query will find the group of dependents for that employee

Hence:

 The nested query works like a group by clause !!!

2. Query:

 Find the fname and lname of the employees with more than 2 dependents and   work on more than 2 projects

Solution 1: using two group by clauses

 ``` 1. In "pseudo" code: SELECT fname, lname FROM employee WHERE ssn IN { set of ssn of employee with > 2 dependents } AND ssn IN { set of ssn of employee works on > 2 projects } 2. Employees with > 2 dependents: SELECT essn FROM dependent GROUP BY essn HAVING count(name) > 2 3. Employees who work on > 2 projects: SELECT essn FROM works_on GROUP BY essn HAVING count(pno) > 2 4. Combine to find the solution: SELECT fname, lname FROM employee WHERE ssn IN ( SELECT essn FROM dependent GROUP BY essn HAVING count(name) > 2 ) AND ssn IN ( SELECT essn FROM works_on GROUP BY essn HAVING COUNT(pno) > 2 ) FNAME LNAME ------ -------- Frank Wong ```

Alternative solution: using a nested sub-query:

 ``` Logic of the solution: select fname, lname from employee E where # { dependents of E } > 2 and # { projects worked on by E } > 2 To find # (cardinality) of a set, we use count( ): select fname, lname from employee E where count { dependents of E } > 2 and count { projects worked on by E } > 2 Work out the inner queries: select fname, lname from employee E where count ( select * from dependent where essn = E.ssn ) > 2 and count { select * from works_on where essn = E.ssn ) > 2 Re-write the highlighted part in correct SQL syntax: select fname, lname from employee E where (select count(*) from dependent where essn = E.ssn ) > 2 and (select count(*) from works_on where essn = E.ssn ) > 2 Answer: +--------+-------+ | fname | lname | +--------+-------+ | Frankl | Wong | +--------+-------+ ```