Relational Algebra: Selection (σ) and Projection (π)

• The selection operation (σ)

• Syntax:

 ``` σ(Condition)(R) ```

• Effect:

• Selects tuples from a relation R that satisfies the condition Condition

• The Condition expression contains:

 constants and/or attributes of relation R However: the condition expression cannot be a set !!!

• The result of σ (Condition) (R) is a subset of tuples of R that satisfies the boolean condition Condition

• Example 1:

• Retrieve all employee tuples for employees that work for department number 4:

σ (dno = 4) (employee)

• The following diagram shows the result graphically (hopefully, it will illustrates the concept unambiguously):

• Example 2:

• Retrieve all employees earning more than \$30,000:

σ (salary > 30000) (employee)

• Example 3:

• Retrieve all employees working for department number 4 and eaning more than \$30,000:

σ (dno = 4 and salary > 30000) (employee)

OR:

σ (salary > 30000) ( σ (dno = 4) (employee) )

• The projection operation (π)

• Syntax:

 ``` π (attribute-list) (R) ```

Effect:

 "Selects out" only the attribute values given in the attribute-list from all tuples in relation R The attribute-list contains the list of attributes in relation R that will be selected. The result of π (attribute-list) (R) contains the is a subset of tuples of R that satisfies the boolean condition Condition

• Example 1:

• Retrieve only the SSN field from the employees

π ssn (employee)

• The following diagram shows the result graphically (hopefully, it will illustrates the concept unambiguously):

• Example 2:

• Retrieve the sex information from all employees:

π sex (employee)

The following diagram shows the result graphically (hopefully, it will illustrates the concept of projection unambiguously):

Notes:

 The output of a Relational Algebra query is set. A set in Mathematics does not have duplicate elements Therefore, the result of πsex (employee) is the set {F, M} because duplicates are removed In SQL, the user will have the option to remove duplicates That is because removing duplicates require longer time to process.... (The user has the option to incur the cost or not...)

• A taste of the power of Data Manipulation Languages

• Recall the data manipulation languages:

 Relational Algebra Relational Calculus            SQL

• Characteristics of data manipulation languages:

 Data manipulation langugaes can express (specify) what (data) a user wants

Examples:

• Retrieve the names of the female employees

 πfname, lname ( σsex = 'F' ( employee ) )

Example using a specific Employee relation:

• Retrieve the names of the employees who earn more than \$50,000

 πfname, lname ( σsalary > 50000 ( employee ) )

Example using a specific Employee relation:

• Generations of Programming Languages:

• 1st generation programming language:

• The earliest way to program a computer:

 By flipping switches (= bits (0/1) !!!)

• You write a computer program as a sequence of bit patterns

• 2nd generation programming languages:

 You write a computer program using assembler (nmemonic) codes

• 3rd generation programming languages:

• You write a computer program using procedural statements such as:

 Assignment statement Conditional statements Repetition statemanets

• Procedural languages expresses how to get things done

Examples:

 Java, C, Python, ....

• 4th generation programming languages:

• You express what you want in a 4th generation language (and leave the how to get it done to the compiler)

Examples:

 Relational Algebra Relational Calculus SQL