CS457 Syllabus & Progress

## Decomposition algorithm to remove violation of the 3NF

• Recall: condition that guarantee lossless decomposition

• Recall: a sufficient condition that guarantees that a decomposition of a relation R into the relations R1 and R2 is lossless:

• Let R1 and R2 be a decomposition of R

If either   R1 ∩ R2 R1   or   R1 ∩ R2 R2,   then:

 content(R) = content(R1) * content(R2)

(i.e., the decomposition is lossless)

• Recall: Third Normal Form (3NF):

• Definition of the 3NF :

• A relation R is in 3NF iff:

For every functional dependency

 X → B

in relation R, one of the following must be true:

 X   is a superkey ,            or B   is a key attribute    (i.e., B is part of some key )

(If one of the above is true, then the functional dependency is "good" (= harmless))

• Decomposition algorithm to obtain a lossless decomposition using Lemma 2

• Fact:

 The decomposition algorithm is used to remove one violating functional dependency by decomposing (= breaking up) a relation that contain the violating functional dependency

I will illustrate the decomposition algorithm using an example

• Input relation:

 ``` Employee1(SSN, Fname, LName, PNumber, PName, Hours) Key(s) of the relation: (SSN, PNumber) Functional dependencies: SSN → FName, LName PNumber → PName SSN, PNumber → Hours ```

• Relation Employee1 has 3NF violation because of the following functional dependency:

 ``` SSN → FName, LName SSN is not a superkey ```

• Decomposition algorithm: (explained using an example)

• Input relation:

 ``` R = Employee1(SSN, FName, LName, PNumber, PName, Hours) Key: (SSN, PNumber) ```

with violating functional dependency:

 ``` SSN → LName ```

• Step 1:

 Compute SSN+

Result of step 1:

 ``` SSN+ = (SSN, FName, LName) ```

• Step 2:

• Decompose the input relation into the follow 2 relations:

 ``` R1 = (all attributes in SSN+) R2 = R − R1 ```

Result of step 2:

 ``` Employee1(SSN, FName, LName, PNumber, PName, Hours) / \ / \ R1 (= SSN+) R2 (= R - R1) | | R1(SSN, FName, LName) R2(PNumber, PName, Hours) ```

Notice that:

 SSN is a key in relation R1 !!!

• Step 3: to obtain a lossless decomposition :

Final result:

 ``` Employee1(SSN, FName, LName, PNumber, PName, Hours) / \ / \ R1(SSN, FName, LName) R2(SSN, PNumber, PName, Hours) ```

Note:

• R1 ∩ R2 = SSN

• Since:   SSN R1, by Lemma 2:

 The decomposition is lossless !!!

• Applying functional dependencies to decomposed relations

• Fact:

• A functional dependency is applicable to a relation R iff:

 All attributes in the Left Hand Side of the functional dependency is found in the relation R         and     At least one attribute in the Right Hand Side of the functional dependency is found in the relation R

• Example:

• We have the following decomposed relations in the example:

 ``` R1(SSN, FName, LName) R2(SSN, PNumber, PName, Hours) ```

• The given set of functional dependencies are:

 ``` SSN → FName, LName PNumber → PName SSN, PNumber → Hours ```

• Then:

• The functional dependency SSN FName, LName is applicabale to the following relation:

 ``` R1(SSN, FName, LName) ```

• The functional dependency PNumber PName is applicabale to the following relation:

 ``` R2(SSN, PNumber, PName, Hours) ```

• The functional dependency SSN, PNumber Hours is applicabale to the following relation:

 ``` R2(SSN, PNumber, PName, Hours) ```

• We are not done !!!

• Decomposed relations:

 ``` R1(SSN, FName, LName) R2(SSN, PNumber, PName, Hours) ```

We must verify if the decomposed relations satisfy the 3NF criteria !!!

Further processing:

 Are R1 and R2 in the 3NF ???

Recall: the functional dependencies are:

 ``` SSN → FName, LName PNumber → PName SSN, PNumber → Hours ```

• Check for 3NF condition in each decomposed relation:

• R1( SSN, FName, LName):

 ``` R1(SSN, FName, LName) Key(s): (SSN) ```

Functional dependencies that apply to R1:

 SSN → FName, LName Since SSN is a key --- it's a "good" (natural) functional dependency !

Conclusion:

 R1( SSN, FName, LName) is in 3NF

• R2(SSN, PNumber, PName, Hours):

 ``` R2(SSN, PNumber, PName, Hours) Key(s): (SSN, PNumber) ```

Functional dependencies that apply to R2:

 SSN, PNumber → Hours Since (SSN, PNumber) is a key --- it's a good functional dependency !         PNumber → PName Since PNumber is not a key, this is a violation !!!

Conclusion:

 R2( SSN, PName, PNumber, Hours) is not in 3NF !!!!

• Decompose R2 to to remove 3NF violation:

• Input relation:

 ``` R2(SSN, PNumber, PName, Hours) ```

with violating functional dependency:

 ``` PNumber → PName ```

• Compute PNumber+ to remove   all   functional dependent attributes:

 ``` PNumber+ = (PNumber, PName) ```

• Initial decomposition:

 ``` R2(SSN, PNumber, PName, Hours) / \ / \ R21 (= PNumber+) R22 (= R2 - R21) | | R21(PNumber, PName) R22(SSN, Hours) ```

• To obtain a lossless decomposition:

Final decomposition:

 ``` R2(SSN, PNumber, PName, Hours) / \ / \ R21 (= PNumber+) R22 (= R2 - R21) | | R21(PNumber, PName) R22(SSN, PNumber, Hours) ```

Note:

 R21 ∩ R22 = PNumber Since:   PNumber → R21: the decomposition is lossless !!!

• Post-post processing

• Decomposed relations:

 ``` R1(SSN, FName, LName) R21(PNumber, PName) R22(SSN, PNumber, Hours) ```

Question:

 Are R1, R21 and R22 in 3NF ???

 ``` R1 = (SSN, FName, LName) R21 = (PNumber, PName) R22 = (SSN, PNumber, Hours) ```