cp -i /home/cs554000/Handouts/Delete/* ~/cs554
You will copy the following files:
EMP.orig       The EMP database file
catalog.orig   The catalog for EMP
demoexpr.cc    Demo program to show you how to make a parser and interpreter

You can compile demoexpr.cc as follows:
g++ -o demoexpr -g demoexpr.cc
You must work in your ~/cs554 project directory to keep your program files private.

DELETE TABLE <table name> [ WHERE <boolean expression> ] 
Meaning:

The difficulty of this project lies in parsing and evaluating the boolean expression. Examples:

DELETE TABLE student                                              /* Delete all tuples in student */
DELETE TABLE student WHERE name = 'Smith'                         /* Delete all tuples with name='Smith' */
DELETE TABLE student WHERE student.name = 'Smith'                 /* Attributes can be qualified */
DELETE TABLE student WHERE student.name = 'Smith' AND age > 25    /* Compound expr */
DELETE TABLE student WHERE (NOT student.name = 'Smith') AND (age > 25 OR sex = 'M')
The remainder of the handout will explain how to parse and evaluate expressions.
I will illustrate the recursive descent parsing (and evaluation) technique using syntax rules for arithmetic expressions. Parsing boolean expressions follows a similar pattern: if you understand how to parse arithmetic expressions, you can easily adapt what you learned to boolean expressions.
The syntax rules can define the priority of the arithmetic operators:

(1) 45
(2) 45 + 67.9
(3) x + 67.9
(4) x + 67.9 * y       /* Note: * has higher priority ! */
(5) (x + 67.9) * y     /* Note: ( ) has highest priority ! */
I will simply give you the syntax rules for forming arithmetic expressions without explaining how they were derived (take a compiler course if you are interested):

(1) E ::= T [+ T]... | T [- T]...
(2) T ::= F [* F]... | F [/ F]...
(3) F ::= INTNUM | FLOATNUM | ID | ID.ID | ( E )

The symbol E represents an arithmetic expression. The other symbols are helper symbols that define what an arithmetic expression looks like. The meanings of the symbols are:

Briefly, these rules state the following:

The terms expression, term and factor are associated with the priorities of the arithmetic operators +, -, * and /.
Let's take a look at the above arithmetic expressions and see how they fit in the syntax rules.

Examples:
(1) 45        E -> T -> F -> INTNUM
Notice in case (4), the only way to map the expression to the set of rules is by mapping 67.9 * y to a term, so that the expression is an addition of two terms. In other words:

Similarly, in case (5), the only way to map the expression is by mapping (x + 67.9) to a factor, so that the expression is a multiplication of two factors. This forces the expression between ( ) to be evaluated first.
Each node of the tree is either an arithmetic operator or an operand.
If the node is an arithmetic operator, then this node is a parent of exactly 2 other nodes which are the operands of the arithmetic operator.
(This is because arithmetic operators are binary operators)
(4) x + 67.9 * y

Parse Tree:
                +
               / \
              x   *
                 / \
             67.9   y
The tree represents an expression which is the addition of:

(5) (x + 67.9) * y

Parse Tree:
                *
               / \
              +   y
             / \
            x   67.9

Notice the difference in tree structure. The tree represents an expression which is the product of:

(The tree structure is dictated by how the expression is mapped to the syntax rules).
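To make the difference between the two trees concrete, here is a minimal sketch that builds both trees by hand and evaluates them. The `TreeNode` type and the helpers `leaf`, `binary`, `evalTree`, `case4` and `case5` are hypothetical names used only for this illustration; they are not taken from demoexpr.cc.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical node type for illustration only (not the MyNODE class of the demo).
struct TreeNode {
    char op = 'n';                            // 'n' = number leaf, otherwise '+' or '*'
    double num = 0.0;                         // value of a number leaf
    std::unique_ptr<TreeNode> left, right;    // the two operands of an operator node
};
using TreePtr = std::unique_ptr<TreeNode>;

TreePtr leaf(double v) {
    TreePtr n(new TreeNode);
    n->num = v;
    return n;
}

TreePtr binary(char op, TreePtr l, TreePtr r) {
    TreePtr n(new TreeNode);
    n->op = op;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Evaluate a tree bottom-up: operands first, then the operator in the node.
double evalTree(const TreeNode& n) {
    if (n.op == 'n') return n.num;
    assert(n.left && n.right);                // operator nodes have exactly 2 children
    double a = evalTree(*n.left);
    double b = evalTree(*n.right);
    return n.op == '+' ? a + b : a * b;
}

// Case (4): x + 67.9 * y  ->  '+' at the root, '*' below it
double case4(double x, double y) {
    TreePtr t = binary('+', leaf(x), binary('*', leaf(67.9), leaf(y)));
    return evalTree(*t);
}

// Case (5): (x + 67.9) * y  ->  '*' at the root, '+' below it
double case5(double x, double y) {
    TreePtr t = binary('*', binary('+', leaf(x), leaf(67.9)), leaf(y));
    return evalTree(*t);
}
```

With x = 1 and y = 2, `case4` computes 1 + (67.9 * 2) while `case5` computes (1 + 67.9) * 2, so the same operands give different results purely because of the tree shape.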
I will use the following set of rules:
(1) E -> T [+ T | - T ]...
(2) T -> F [* F | / F ]...
(3) F -> FLOATNUM | ( E )

So the arithmetic expressions I am looking for involve only floating point numbers, the usual arithmetic operators and ( ).
Further simplification:


Structure of the demo program:

The main program calls E() to parse an arithmetic expression. The functions T() and F() are support functions for E().
class MyNODE
{
public:
   int type;      /* Can be one of FLOAT, PLUS, MINUS, MULT, DIV */

   union
   {
      float f;
      struct MyNODE *p[2];
   } value;

   .... (other stuff omitted)
};
The variables have the following meaning:

In other words:

This is why this parsing technique is known as:

(The "descent" part of the name indicates that this is a top-down parsing technique.)
(3) F -> FLOATNUM | ( E )
Notice that:

We use this fact to parse (recognize) an arithmetic factor.
struct MyNODE *F()
{
   struct MyNODE *p;

   token = GetToken(s);

   if (token == FLOAT)
   {
      Make a tree node containing a FLOAT constant;
      return its pointer;
   }
   else if (token == '(')
   {
      p = E();      /* E() will recurse and return a pointer to a
                       node representing an arithmetic expression */
      Check for accompanying ')';
      return(p);
   }
   else
   {
      Syntax error;
   }
}
F() returns the pointer to a tree node that is a factor.

The function T() is based on the syntax rule for T:
(2) T -> F [* F]... | F [/ F]...
Since in both cases you must parse a factor F, the function T() will first call F() to parse an arithmetic factor.
Afterwards, T() decides whether the arithmetic term is completed as follows:

Note that because the optional part ("* F" and "/ F") is repeated, each time the function T() finishes parsing a factor F it must check whether the next symbol is * or /.
struct MyNODE *T()
{
   /* ============================================
      We know that T must contain at least one F
      ============================================ */
   help1 = F();      /* Parse a Factor */

   nextToken = peek at the next token in input;

   while ( nextToken == '*' || nextToken == '/' )
   {
      if ( nextToken == '*' )
      {
         Read and toss away the `*' token;   (because we know it's *)
         Call F() to parse a second factor;
         Make a MULT operation node with the factor in help1 as
            left operand and the second factor as right operand;
         Make help1 point to the MULT operation node (prepare to loop);
      }
      else
      {
         Read and toss away the `/' token;   (because we know it's /)
         Call F() to parse a second factor;
         Make a DIV operation node with the factor in help1 as
            left operand and the second factor as right operand;
         Make help1 point to the DIV operation node (prepare to loop);
      }

      nextToken = peek at the next token in input;
   }

   return help1;
}
The structure of the function E() is based on the syntax rule for E:
(1) E -> T [+ T]... | T [- T]...
The program structure of E() is very similar to that of T() (with T in place of F, and + and - in place of * and /), so I will not spend more time on it.
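The three functions can be put together as follows. This is a minimal sketch under stated assumptions, not the code in demoexpr.cc: the `Expr` node, the `ExprParser` class, its character-level `peek()` scanner and the `evaluate()` wrapper are all hypothetical names, and tokens are read directly from a string instead of through `GetToken`.

```cpp
#include <cctype>
#include <cstdlib>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct Expr {                              // hypothetical tree node, not MyNODE
    char op = 'n';                         // 'n' = number leaf, otherwise + - * /
    double num = 0.0;
    std::unique_ptr<Expr> left, right;
};
using ExprPtr = std::unique_ptr<Expr>;

class ExprParser {
public:
    explicit ExprParser(const std::string& text) : s(text) {}
    ExprPtr parse() {                      // the whole input must be one E
        ExprPtr e = E();
        if (peek() != '\0') throw std::runtime_error("unexpected input after expression");
        return e;
    }
private:
    std::string s;
    std::size_t pos = 0;

    char peek() {                          // skip blanks, look at next char, do not read it
        while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
        return pos < s.size() ? s[pos] : '\0';
    }
    static ExprPtr makeOp(char op, ExprPtr l, ExprPtr r) {
        ExprPtr n(new Expr);
        n->op = op; n->left = std::move(l); n->right = std::move(r);
        return n;
    }
    ExprPtr F() {                          // F -> FLOATNUM | ( E )
        char c = peek();
        if (c == '(') {
            ++pos;
            ExprPtr e = E();               // recurse for the expression between ( )
            if (peek() != ')') throw std::runtime_error("missing ')'");
            ++pos;
            return e;
        }
        if (!std::isdigit(static_cast<unsigned char>(c)) && c != '.')
            throw std::runtime_error("number or '(' expected");
        const char* start = s.c_str() + pos;
        char* end = nullptr;
        double v = std::strtod(start, &end);
        if (end == start) throw std::runtime_error("malformed number");
        pos += static_cast<std::size_t>(end - start);
        ExprPtr n(new Expr);
        n->num = v;
        return n;
    }
    ExprPtr T() {                          // T -> F [* F | / F]...
        ExprPtr help1 = F();
        while (peek() == '*' || peek() == '/') {
            char op = s[pos++];            // read and toss away the operator
            ExprPtr second = F();
            help1 = makeOp(op, std::move(help1), std::move(second));
        }
        return help1;
    }
    ExprPtr E() {                          // E -> T [+ T | - T]...
        ExprPtr help1 = T();
        while (peek() == '+' || peek() == '-') {
            char op = s[pos++];
            ExprPtr second = T();
            help1 = makeOp(op, std::move(help1), std::move(second));
        }
        return help1;
    }
};

double evalExpr(const Expr& e) {           // walk the tree, operands before operator
    if (e.op == 'n') return e.num;
    double a = evalExpr(*e.left), b = evalExpr(*e.right);
    switch (e.op) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        default:  return a / b;
    }
}

double evaluate(const std::string& text) { return evalExpr(*ExprParser(text).parse()); }
```

For example, `evaluate("1 + 2 * 3")` builds a + node whose right operand is a * node, while `evaluate("(1 + 2) * 3")` builds a * node whose left operand is a + node. Because each loop folds the new operand into `help1` before continuing, operators of equal priority group from left to right, so `evaluate("10 - 4 - 3")` is computed as (10 - 4) - 3.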
class MyNODE
{
public:
   int type;      /* Can be one of FLOAT, PLUS, MINUS, MULT, DIV */

   union
   {
      float f;
      struct MyNODE *p[2];
   } value;

   double eval_expr()
   {
      if ( type == FLOAT )
         return value.f;
      else if ( type == PLUS )
         return value.p[0]->eval_expr() + value.p[1]->eval_expr();
      else if ( type == MINUS )
         return value.p[0]->eval_expr() - value.p[1]->eval_expr();
      else if ( type == MULT )
         return value.p[0]->eval_expr() * value.p[1]->eval_expr();
      else if ( type == DIV )
         return value.p[0]->eval_expr() / value.p[1]->eval_expr();

      return 0.0;      /* Unknown node type */
   }
};
The eval_expr() function traverses the binary parse tree and performs the operations given by the type variable in each node.
The enhancement you need to add to make the parser work for both INTEGER and FLOAT data is as follows:
class MyNODE
{
public:
   int type;      /* Can be one of INT, FLOAT, PLUS, MINUS, MULT, DIV */

   union
   {
      int i;      // To store int values
      float f;
      struct MyNODE *p[2];
   } value;

   double eval_expr()
   {
      if ( type == INT )
         return value.i;
      else if ( type == FLOAT )
         return value.f;
      else if ( type == PLUS )
         return value.p[0]->eval_expr() + value.p[1]->eval_expr();
      else if ( type == MINUS )
         return value.p[0]->eval_expr() - value.p[1]->eval_expr();
      else if ( type == MULT )
         return value.p[0]->eval_expr() * value.p[1]->eval_expr();
      else if ( type == DIV )
         return value.p[0]->eval_expr() / value.p[1]->eval_expr();

      return 0.0;      /* Unknown node type */
   }
};
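One consequence of returning a double from eval_expr() is worth seeing in isolation: an INT leaf is converted to double as soon as it is returned, so a division of two INT constants is carried out in floating point. The sketch below is a reduced stand-in for the class above; `NumNode`, the `make...` helpers and `sevenOverTwo()` are hypothetical names used only for this illustration.

```cpp
// Hypothetical, reduced node with only the types needed for the illustration.
enum NumType { INT, FLOAT, DIV };

struct NumNode {
    NumType type;
    union {
        int i;                    // INT leaf
        float f;                  // FLOAT leaf
        const NumNode* p[2];      // operands of DIV
    } value;
};

NumNode makeInt(int v)     { NumNode n; n.type = INT;   n.value.i = v; return n; }
NumNode makeFloat(float v) { NumNode n; n.type = FLOAT; n.value.f = v; return n; }
NumNode makeDiv(const NumNode* a, const NumNode* b) {
    NumNode n;
    n.type = DIV;
    n.value.p[0] = a;
    n.value.p[1] = b;
    return n;
}

// Same idea as eval_expr(): every result is a double, whatever the leaf type.
double evalNum(const NumNode* n) {
    switch (n->type) {
        case INT:   return n->value.i;      // int is widened to double here
        case FLOAT: return n->value.f;
        case DIV:   return evalNum(n->value.p[0]) / evalNum(n->value.p[1]);
    }
    return 0.0;                             // unknown node type
}

// 7 / 2 with two INT leaves
double sevenOverTwo() {
    NumNode a = makeInt(7), b = makeInt(2);
    NumNode d = makeDiv(&a, &b);
    return evalNum(&d);
}
```

Here `sevenOverTwo()` yields 3.5 rather than the 3 that C++ integer division would give. Only the union member that matches `type` is ever read, which is the discipline the union in the class above relies on.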

(1) BE ::= BT | BT [OR BT]...
(2) BT ::= BF | BF [AND BF]...
(3) BF ::= E RELOP E | NOT BF | ( BE )        {RELOP ::= <, <=, >, >=, =, !=}
The rules (1), (2) and (3) define a boolean expression.
The rules (4), (5) and (6), i.e. the rules for E, T and F, define an arithmetic expression that is part of a boolean expression (e.g., 4 < 5).

A logical factor is the most elementary ("atomic") logical form that evaluates to TRUE or FALSE.
BTW: use the value 1 to represent TRUE and 0 for FALSE.
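With this convention, relational and logic nodes can be evaluated by the same kind of tree walk as arithmetic nodes. The following is a minimal sketch; the `BoolNode` type, its `NodeType` values and `evalBoolNode()` are hypothetical names for illustration and cover only a few of the operators.

```cpp
#include <cassert>

// Hypothetical subset of node types: one leaf kind, two RELOPs, three logic ops.
enum NodeType { NUM, LT, EQ, AND, OR, NOT };

struct BoolNode {
    NodeType type;
    double num;               // used by NUM only
    const BoolNode* p[2];     // operands; NOT uses p[0] only
};

// Returns a number for NUM nodes and 1 (TRUE) or 0 (FALSE) for all other nodes.
double evalBoolNode(const BoolNode* n) {
    assert(n != nullptr);
    switch (n->type) {
        case NUM: return n->num;
        case LT:  return evalBoolNode(n->p[0]) <  evalBoolNode(n->p[1]) ? 1 : 0;
        case EQ:  return evalBoolNode(n->p[0]) == evalBoolNode(n->p[1]) ? 1 : 0;
        case AND: return (evalBoolNode(n->p[0]) != 0 && evalBoolNode(n->p[1]) != 0) ? 1 : 0;
        case OR:  return (evalBoolNode(n->p[0]) != 0 || evalBoolNode(n->p[1]) != 0) ? 1 : 0;
        case NOT: return evalBoolNode(n->p[0]) != 0 ? 0 : 1;
    }
    return 0;                 // unknown node type
}

// Builds the tree for 4 < 5 and evaluates it.
double fourLessThanFive() {
    BoolNode four = {NUM, 4.0, {nullptr, nullptr}};
    BoolNode five = {NUM, 5.0, {nullptr, nullptr}};
    BoolNode less = {LT, 0.0, {&four, &five}};
    return evalBoolNode(&less);
}
```

`fourLessThanFive()` yields 1, and wrapping the same `less` node in a NOT node yields 0. Using one numeric return type for both arithmetic and boolean nodes keeps the evaluator a single recursive function.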
(3) BF -> E RELOP E | NOT BF | ( BE )
There is one small problem:

In other words: ( can be the start of:
BF -> ( BE )
BF -> E RELOP E      (E can start with a '(')
We can handle the ambiguity as follows:
Peek (don't read) at the next token

if ( next token == NOT )
{
   read next token (and discard);
   call BF() to parse a boolean factor

   make a NOT node:    NOT
                        |
                       BF()

   return the NOT node
}
else if ( next token == '(' )
{
   Remember the current position;       !!!!

   /* =======================================
      Try: ( BE )
      ======================================= */
   read next token (and discard);
   call BE() to parse a boolean expression
   Check for the enclosing ')'

   if ( successful )
      return the node returned by BE()

   /* =======================================
      We arrive here if ( BE ) FAILED !!!
      Try: E RelOp E
      ======================================= */
   Restore read position to '(';        !!!!

   call E()
   read and save the Relational Op
   call E()

   make a RELOP node:    Relational Op
                            /     \
                          E()     E()

   if ( successful )
      return RELOP node

   return NULL;      // FAIL...
}
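The save/try/restore idea can be sketched in runnable form as follows. To stay short, this sketch makes several simplifying assumptions that differ from the project: operands are numeric constants only (no attributes or strings), results are computed directly instead of building tree nodes, and failure is signalled with an internal exception instead of a NULL return. `BoolParser`, `Fail`, `eat`, `expect`, `relation` and `evalBool` are hypothetical names.

```cpp
#include <cctype>
#include <cstdlib>
#include <stdexcept>
#include <string>

class BoolParser {                         // hypothetical; numeric operands only
public:
    explicit BoolParser(const std::string& text) : s(text) {}
    bool parse() {                         // the whole input must be one BE
        try {
            bool v = BE();
            if (peek() != '\0') throw Fail();
            return v;
        } catch (const Fail&) {
            throw std::runtime_error("syntax error");
        }
    }
private:
    struct Fail {};                        // "this alternative did not match"
    std::string s;
    std::size_t pos = 0;

    char peek() {                          // skip blanks, look at next char
        while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
        return pos < s.size() ? s[pos] : '\0';
    }
    bool eat(const std::string& w) {       // read w if it is next, else read nothing
        peek();
        if (s.compare(pos, w.size(), w) != 0) return false;
        pos += w.size();
        return true;
    }
    void expect(const std::string& w) { if (!eat(w)) throw Fail(); }

    double F() {                           // F -> FLOATNUM | ( E )
        if (eat("(")) { double v = E(); expect(")"); return v; }
        if (!std::isdigit(static_cast<unsigned char>(peek())) && peek() != '.') throw Fail();
        const char* start = s.c_str() + pos;
        char* end = nullptr;
        double v = std::strtod(start, &end);
        if (end == start) throw Fail();
        pos += static_cast<std::size_t>(end - start);
        return v;
    }
    double T() {                           // T -> F [* F | / F]...
        double v = F();
        for (;;) {
            if (eat("*")) v *= F();
            else if (eat("/")) v /= F();
            else return v;
        }
    }
    double E() {                           // E -> T [+ T | - T]...
        double v = T();
        for (;;) {
            if (eat("+")) v += T();
            else if (eat("-")) v -= T();
            else return v;
        }
    }
    bool relation() {                      // E RELOP E (two-character RELOPs first)
        double a = E();
        if (eat("<=")) return a <= E();
        if (eat(">=")) return a >= E();
        if (eat("!=")) return a != E();
        if (eat("<"))  return a <  E();
        if (eat(">"))  return a >  E();
        if (eat("="))  return a == E();
        throw Fail();
    }
    bool BF() {                            // BF -> NOT BF | ( BE ) | E RELOP E
        if (eat("NOT")) return !BF();
        if (peek() == '(') {
            std::size_t saved = pos;       // remember the current position
            try {
                expect("(");
                bool v = BE();
                expect(")");
                return v;                  // ( BE ) succeeded
            } catch (const Fail&) {
                pos = saved;               // restore read position to '('
            }
        }
        return relation();                 // try E RELOP E
    }
    bool BT() {                            // BT -> BF [AND BF]...
        bool v = BF();
        while (eat("AND")) { bool r = BF(); v = v && r; }
        return v;
    }
    bool BE() {                            // BE -> BT [OR BT]...
        bool v = BT();
        while (eat("OR")) { bool r = BT(); v = v || r; }
        return v;
    }
};

bool evalBool(const std::string& text) { return BoolParser(text).parse(); }
```

In `evalBool("(1 + 2) * 3 > 5")` the first attempt treats `(` as the start of ( BE ), fails when no relational operator follows `1 + 2`, restores the position, and then succeeds with E RELOP E. In `evalBool("(1 < 2) AND (3 < 4)")` the first attempt succeeds immediately. Note that both operands of AND and OR are always parsed before the results are combined, so the read position is correct even when the first operand already decides the outcome.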
Notes:


Here is a suggestion on what to do:

class DeleteCmd
{
   ScanTable *R;      // Relation in the delete table command
                      // R contains buf[ ] variable to hold tuples
                      // of the relation.
                      // Methods in the ScanTable class can help you
                      // locate the attributes in the tuple

   Node *BE;          // Boolean expression of the Delete cmd
};
Note:

Here is what a Node structure would look like:
class Node
{
   int node_type;                /* Contains the type of the node */

   union
   {
      int    i;                  /* INT_CONST */
      float  f;                  /* FLOAT_CONST */
      char   c[100];             /* STRING_CONST */

      int   * i_attr;            /* Attribute of type INT */
      float * f_attr;            /* Attribute of type FLOAT */
      char  * c_attr;            /* Attribute of type CHAR */

      struct Node * p[2];        /* Points to the operands of a binary
                                    arithmetic, relational or logic
                                    operation. */
      struct Node * q;           /* Points to the operand of a unary
                                    arithmetic (e.g.: -x) or logic
                                    (e.g.: NOT x) operation */
   } value;
};
How to use the union variable value:

The values for the node_type variable can be one of the following (symbolic constants):


The pointer variables i_attr, f_attr and c_attr must be set up during the parse phase:
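The point of setting the pointers up once, at parse time, is that the same tree can then be evaluated again for every tuple that is read into the buffer. The sketch below illustrates this for a condition of the form DNO = constant. It rests on several assumptions that are not from the handout: the record size `TUPLE_SIZE`, the reduced `AttrNode` type, and the function `countMatching()` are hypothetical, and the attribute is kept as a byte pointer and copied out with memcpy (instead of being read through an `int *` as in the Node class) to avoid depending on the alignment of the buffer.

```cpp
#include <cstring>

const int TUPLE_SIZE = 32;    // ASSUMED record size, for this sketch only
const int DNO_POS    = 20;    // startpos of EMP.DNO as listed in the catalog

// Hypothetical, reduced version of the Node class.
struct AttrNode {
    enum Kind { INT_CONST, INT_ATTR, EQ } kind;
    int i;                    // INT_CONST: the constant
    const char* attr;         // INT_ATTR: address of the attribute inside the tuple buffer
    const AttrNode* p[2];     // EQ: the two operands
};

int evalAttr(const AttrNode* n) {
    switch (n->kind) {
        case AttrNode::INT_CONST:
            return n->i;
        case AttrNode::INT_ATTR: {
            int v;
            std::memcpy(&v, n->attr, sizeof v);    // fetch the CURRENT buffer content
            return v;
        }
        case AttrNode::EQ:
            return evalAttr(n->p[0]) == evalAttr(n->p[1]) ? 1 : 0;
    }
    return 0;
}

// Counts the tuples in `records` (count records of TUPLE_SIZE bytes) with DNO == wanted.
int countMatching(const char* records, int count, int wanted) {
    char buf[TUPLE_SIZE];     // plays the role of the tuple buffer of the relation

    // "Parse phase": the tree is built once; the attribute node points INTO buf.
    AttrNode dno  = {AttrNode::INT_ATTR,  0,      buf + DNO_POS, {nullptr, nullptr}};
    AttrNode lit  = {AttrNode::INT_CONST, wanted, nullptr,       {nullptr, nullptr}};
    AttrNode test = {AttrNode::EQ,        0,      nullptr,       {&dno, &lit}};

    // "Execution phase": refill buf for each tuple and evaluate the SAME tree.
    int hits = 0;
    for (int k = 0; k < count; ++k) {
        std::memcpy(buf, records + k * TUPLE_SIZE, TUPLE_SIZE);
        hits += evalAttr(&test);
    }
    return hits;
}
```

Nothing in the tree changes between tuples; only the content of `buf` does. That is why the attribute nodes must hold the address of the attribute inside the buffer rather than a copy of its value.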


lseek( relationFile, -1, SEEK_CUR );
write( relationFile, "N", 1); 
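The following sketch shows these two calls in context, with error checking. It assumes that the valid flag is the last byte of each record and that the record to delete has just been read, so that stepping back one byte (offset -1 from the current position) lands on its flag. The function names, the 4-byte record layout and the temporary file are hypothetical and exist only to make the sketch self-contained.

```cpp
#include <fcntl.h>
#include <stdlib.h>
#include <string>
#include <unistd.h>

// Overwrites the valid flag of the record that was just read with 'N'.
// ASSUMES the flag is the last byte of the record and the file position is
// immediately after that record.
bool markJustReadTupleDeleted(int relationFile) {
    if (lseek(relationFile, -1, SEEK_CUR) == static_cast<off_t>(-1))
        return false;                               // could not step back
    return write(relationFile, "N", 1) == 1;        // position is after the record again
}

// Demo on a temporary file with two hypothetical 4-byte records "AAAY" and "BBBY".
// Returns the file content after deleting the first record ("" on any I/O error).
std::string demoMarkFirstRecord() {
    char path[] = "/tmp/markdelXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return "";
    unlink(path);                                   // file is removed once closed

    std::string result;
    char rec[4];
    char all[9] = {0};
    bool ok = write(fd, "AAAYBBBY", 8) == 8
           && lseek(fd, 0, SEEK_SET) == 0
           && read(fd, rec, 4) == 4                 // read first record, position = 4
           && markJustReadTupleDeleted(fd)          // flag at offset 3 becomes 'N'
           && lseek(fd, 0, SEEK_SET) == 0
           && read(fd, all, 8) == 8;
    if (ok) result = all;
    close(fd);
    return result;
}
```

After `markJustReadTupleDeleted()` returns, the file position is again just past the deleted record, so a scan loop can continue with the next read without any further seeking.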
cp EMP.orig EMP
cp catalog.orig catalog
The content of the EMP table is as follows:
The relation contains these fields:

   EMP.FNAME    type = C   startpos = 0    size = 7
   EMP.LNAME    type = C   startpos = 8    size = 9
   EMP.SEX      type = C   startpos = 16   size = 2
   EMP.DNO      type = I   startpos = 20   size = 4
   EMP.SALARY   type = F   startpos = 24   size = 8

`Johnny'  `Smith'  `M'  5  30000.000000     Valid flag: Y
`Frankl'  `Wongs'  `M'  5  40000.000000     Valid flag: Y
`Alicia'  `Zelay'  `F'  4  25000.000000     Valid flag: Y
`Jennif'  `Walla'  `F'  4  43000.000000     Valid flag: Y
`Ramesh'  `Naray'  `M'  5  38000.000000     Valid flag: Y
`Joyces'  `Engli'  `F'  5  25000.000000     Valid flag: Y
`Ahmads'  `Jabba'  `M'  4  25000.000000     Valid flag: Y
`Jamess'  `Borgo'  `M'  1  55000.000000     Valid flag: Y

DELETE TABLE EMP GO

DELETE TABLE EMP WHERE FNAME = 'Alicia' GO
DELETE TABLE EMP WHERE LNAME <= 'Bx' GO
DELETE TABLE EMP WHERE DNO = 4 GO

DELETE TABLE EMP WHERE SALARY >= 54000.0 GO
DELETE TABLE EMP WHERE SALARY >= 42000 GO
DELETE TABLE EMP WHERE 2*SALARY >= 79000 GO
DELETE TABLE EMP WHERE (SALARY + 10000) >= 47000 GO
DELETE TABLE EMP WHERE SEX = 'M' AND SALARY <= 30000 GO

DELETE TABLE EMP WHERE (SEX = 'F' OR SALARY <= 30000) AND DNO = 5 GO
mkdir ~/cs554/TURNIN 
Make sure the program is present at the deadline.
I will make a copy of the main program and grade it.