CS 584 - Stream Database Systems
Homework 5

Due: See class webpage

## 1. Student's programs

• Implement Greenwald's ε-quantile approximation algorithm to find the φ-quantile in a stream, for an arbitrary φ.

You should implement the algorithm given at the end of the lecture notes. This version does not use bands...

• Programming required in this assignment:

• Command line to invoke program:

```
Greenwald ε < InputFile
or:
java Greenwald ε < InputFile
```

• Make sure the capitalization of the command "Greenwald" is exactly as given above.

• Meaning of the input arguments:

• InputFile = input data file.

I will provide a test data file later in this handout. The data file will have the following format:

```
v1    (data point 1)
v2    (data point 2)
v3    (data point 3)
......
vN    (data point N)
```

• ε = the precision parameter in Greenwald's algorithm
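A minimal sketch of reading these two inputs, written in Python purely for illustration (the submitted program must still be invoked as `Greenwald`). It assumes one numeric value per line on standard input and ε as the only command-line argument. `parse_inputs` is a hypothetical helper name, not something the handout prescribes.

```python
def parse_inputs(argv, lines):
    """Return (epsilon, values) from the command line and the input lines.

    Assumptions (not stated in the handout): values are numeric, there is
    one value per line, and blank lines are skipped.
    """
    if len(argv) != 2:
        raise SystemExit("usage: Greenwald ε < InputFile")
    epsilon = float(argv[1])
    # float() ignores surrounding whitespace, including the trailing newline.
    values = [float(line) for line in lines if line.strip()]
    return epsilon, values
```

In a real program this could be called as `parse_inputs(sys.argv, sys.stdin)` after `import sys`.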

• Output that you need to generate:

At each deletion point in Greenwald's algorithm, print the content of the data structure before the space reduction step as:
```
S: (e1, g1, Δ1) (e2, g2, Δ2) ...
```
and as:
```
S: e1[rmin(e1)..[rmax(e1)], e2[rmin(e2)..[rmax(e2)] ...
```
Then perform the space reduction step and print the new content of the data structure again. (You can learn a lot from this output.)

Then print the answers to these quantile queries: find the element at rank 3n, for 3n ≤ N.
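One common way to answer a rank query from the summary is sketched below in Python. It uses the standard Greenwald-Khanna definitions rmin(e_i) = g_1 + ... + g_i and rmax(e_i) = rmin(e_i) + Δ_i, and returns an element whose bounds both lie within εn of the requested rank. The lecture-notes version may differ in its details, so `query_rank` and its closest-match fallback are assumptions rather than the required method.

```python
def query_rank(summary, r, eps, n):
    """Return an element whose rank bounds lie within eps*n of rank r.

    summary: list of (e, g, delta) tuples sorted by e.
    n: number of stream items seen so far.
    If no tuple satisfies both bounds (for example because of rounding),
    fall back to the tuple that misses by the least; this fallback is an
    assumption, not part of the handout.
    """
    tolerance = eps * n
    rmin = 0
    best, best_miss = None, None
    for e, g, delta in summary:
        rmin += g                       # rmin(e_i) = g_1 + ... + g_i
        rmax = rmin + delta             # rmax(e_i) = rmin(e_i) + delta_i
        miss = max(r - rmin, rmax - r)  # worst-case distance from r
        if miss <= tolerance:
            return e
        if best_miss is None or miss < best_miss:
            best, best_miss = e, miss
    return best
```

For example, with `summary = [(1, 1, 0), (4, 3, 0), (9, 5, 0)]`, `eps = 0.2` and `n = 9`, a query for rank 3 returns the element 4, whose rank bounds are both 4.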

• At the end of the algorithm, print:

• The content of the data structure at the end of the algorithm as:
```
S: (e1, g1, Δ1) (e2, g2, Δ2) ...
```
and as:
```
S: e1[rmin(e1)..[rmax(e1)], e2[rmin(e2)..[rmax(e2)] ...
```
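The two printed views carry the same information: the rank bounds follow from the tuples by a running sum. The Python sketch below assumes the standard definitions rmin(e_i) = g_1 + ... + g_i and rmax(e_i) = rmin(e_i) + Δ_i. The helper names `rank_bounds` and `format_tuples` are hypothetical.

```python
def rank_bounds(summary):
    """Return [(e, rmin, rmax)] for a summary of (e, g, delta) tuples."""
    bounds = []
    rmin = 0
    for e, g, delta in summary:
        rmin += g                               # running sum of the g values
        bounds.append((e, rmin, rmin + delta))  # rmax = rmin + delta
    return bounds


def format_tuples(summary):
    """Render the summary in the 'S: (e1, g1, Δ1) (e2, g2, Δ2) ...' form."""
    return "S: " + " ".join(f"({e}, {g}, {d})" for e, g, d in summary)
```

For the summary `[(10, 1, 0), (20, 2, 1), (30, 1, 0)]`, `rank_bounds` gives `[(10, 1, 1), (20, 3, 4), (30, 4, 4)]`, which can then be printed in the second format.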

• Print the following list:

Print εN once, followed by one line per rank containing:

• The ranking index (1, 2, ..., N)
• The actual item at that rank position (store the items from the input in an array and sort it!)
• The item that Greenwald's algorithm will output when given the ranking index
• The actual error that the algorithm made (how many positions off)

The output format will look like this:

```
εN       (the value of εN lets us know what the max error is)

rank     Actual     Answer     Actual err of answer
---------------------------------------------------------------
1        v1         G1         err1
2        v2         G2         err2
3        v3         G3         err3
....
N        vN         GN         errN
```

How to compute the actual error:

The correct answer for rank r is vr. The provided answer is Gr. Suppose the value Gr is the value vs. Then the actual error is: s − r
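The error computation above can be sketched as follows, in Python for illustration. The handout does not say which s to use when the answer value occurs more than once in the input. Picking the occurrence closest to r is an assumption made here, and `actual_error` and `error_rows` are hypothetical names.

```python
from bisect import bisect_left, bisect_right


def actual_error(sorted_values, r, answer):
    """Return s - r, where s is a 1-based rank at which `answer` occurs.

    With duplicates, `answer` occupies a range of ranks; choosing the rank
    in that range closest to r is an assumption, not from the handout.
    """
    lo = bisect_left(sorted_values, answer) + 1   # first rank holding answer
    hi = bisect_right(sorted_values, answer)      # last rank holding answer
    if lo > hi:
        raise ValueError("answer does not occur in the input")
    s = min(max(r, lo), hi)                       # clamp r into [lo, hi]
    return s - r


def error_rows(values, answers):
    """Yield (rank, actual, answer, err) for ranks 1..N.

    answers[r - 1] is the item the algorithm returned for rank r.
    """
    ordered = sorted(values)
    for r, (actual, answer) in enumerate(zip(ordered, answers), start=1):
        yield r, actual, answer, actual_error(ordered, r, answer)
```

Each yielded row corresponds to one line of the table; the absolute value of the largest error can then be compared against the printed εN.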

## 2. Help Material

• Try running it with ε=0.1

I don't have the answers, but I can tell you to make sure that the summary always contains the smallest value and the largest value in the input.
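This sanity check is easy to automate while debugging. The Python sketch below assumes the summary is a list of (e, g, Δ) tuples sorted by e; `check_endpoints` is a hypothetical helper name.

```python
def check_endpoints(summary, values):
    """Return True if the summary's first and last elements are the
    smallest and largest input values seen so far."""
    if not values:
        return not summary            # nothing seen: summary must be empty
    return (bool(summary)
            and summary[0][0] == min(values)
            and summary[-1][0] == max(values))
```

Calling it after every insertion and after every space reduction step helps pinpoint the operation that drops an endpoint.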

## 3. Turn in

• Turn in a Makefile using the command:
```
/home/cs584000/turnin Makefile hw5
```
(If you use Java, you still need a Makefile. Let me know if you don't know how to create one.)

• Turn in each header and program file using this command:
```
/home/cs584000/turnin Filename hw5-?
```
where "?" is a number from 0 to 9.