CS 584 - Stream Database Systems
Homework 4

Due: See class webpage

## 1. Student's programs

• Programming required in this assignment: implement Manku's ε-approximation algorithm to count the number of occurrences of each item in a stream.

• Command line to invoke program:

```
Manku InputFile s ε
```
or, if you use Java:
```
java Manku InputFile s ε
```

• Make sure the capitalization of the command "Manku" is exactly as given above.

• Meaning of the input arguments:

• InputFile = input data file.

The data file must conform to the following format:

```
v1   (data point 1)
v2   (data point 2)
v3   (data point 3)
......
vN   (data point N)
```
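• s = support level, given as a fraction of the stream length (for example 0.2)

• ε = the precision parameter in Manku's algorithm, also given as a fraction (for example 0.1)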
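
As a starting point, reading the data file and deriving the bucket width might look like the sketch below. It is only an illustration: the class name MankuInput and its method names are made up, items are kept as strings (one per non-blank line), and the bucket width w = ⌈1/ε⌉ comes from the usual description of Manku's lossy counting, not from this handout.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is not prescribed by the assignment.
class MankuInput {

    /** Reads one item per line, trimming whitespace and skipping blank lines. */
    static List<String> readItems(Reader source) {
        List<String> items = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    items.add(line);
                }
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
        return items;
    }

    /** Bucket width w = ceil(1/epsilon). Assumption: standard lossy counting. */
    static int bucketWidth(double epsilon) {
        if (epsilon <= 0.0 || epsilon >= 1.0) {
            throw new IllegalArgumentException("epsilon must be in (0, 1)");
        }
        return (int) Math.ceil(1.0 / epsilon);
    }
}
```

A main method could pass new java.io.FileReader(args[0]) to readItems and Double.parseDouble(args[2]) to bucketWidth. Floating-point rounding can in principle push 1/ε just past an integer, so check the width for the ε values you actually use.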

• Output that you need to generate:

• At each deletion point, before the space reduction step, print the content of the data structure as:
```
D: (e1, f1, Δ1) (e2, f2, Δ2) ...
```

Then perform the space reduction step and print the new content of the data structure again in the same format. (You can learn a lot from this output.)
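
The trace at a deletion point can be produced along the following lines. This is a sketch, not the required design: the class LossyCounter and its members are hypothetical names, and it assumes the standard lossy-counting rules (bucket width w = ⌈1/ε⌉, a new item enters D as (e, 1, b − 1) where b is the current bucket, and at every bucket boundary the entries with f + Δ ≤ b are deleted). New entries are placed at the head of the list, which is just one possible ordering. The labels "Before deletion" and "After deletion" follow the sample output in Section 2.

```java
import java.util.LinkedList;

// Hypothetical class; names and layout are illustrative only.
class LossyCounter {

    /** One entry (e, f, delta) of the data structure D. */
    static final class Entry {
        final String item;
        int freq;
        final int delta;

        Entry(String item, int freq, int delta) {
            this.item = item;
            this.freq = freq;
            this.delta = delta;
        }
    }

    final LinkedList<Entry> d = new LinkedList<>();
    final int width;   // bucket width w = ceil(1/epsilon) (assumed rule)
    long seen = 0;     // N, the number of items processed so far

    LossyCounter(double epsilon) {
        this.width = (int) Math.ceil(1.0 / epsilon);
    }

    /**
     * Processes one item. Returns the trace text for a deletion point,
     * or an empty string when no deletion happens on this item.
     */
    String add(String item) {
        seen++;
        int bucket = (int) ((seen + width - 1) / width); // ceil(N / w)

        Entry hit = null;
        for (Entry e : d) {
            if (e.item.equals(item)) {
                hit = e;
                break;
            }
        }
        if (hit != null) {
            hit.freq++;
        } else {
            d.addFirst(new Entry(item, 1, bucket - 1));
        }

        if (seen % width != 0) {
            return "";
        }
        StringBuilder trace = new StringBuilder();
        trace.append("Before deletion\n").append(format()).append('\n');
        d.removeIf(e -> e.freq + e.delta <= bucket);   // space reduction step
        trace.append("After deletion\n").append(format()).append('\n');
        return trace.toString();
    }

    /** Formats D as "D: (e1, f1, d1) (e2, f2, d2) ...". */
    String format() {
        StringBuilder sb = new StringBuilder("D:");
        for (Entry e : d) {
            sb.append(" (").append(e.item).append(", ")
              .append(e.freq).append(", ").append(e.delta).append(')');
        }
        return sb.toString();
    }
}
```

For example, feeding the ten made-up items 1 2 3 4 6 4 6 4 7 8 with ε = 0.1 returns an empty string for the first nine calls; the tenth call returns a trace whose last line is D: (6, 2, 0) (4, 3, 0).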

• At the end of the algorithm, print:

• The items found by Manku's algorithm that satisfy the support s with maximum error ε, together with their estimated frequencies.

• For each item found by Manku's algorithm, the actual number of occurrences in the input. To determine the actual frequency, keep an exact count in a linked list of (item, frequency) pairs. This is slow and space-consuming, but it lets us study how good Manku's algorithm is.

• The output format is:
```
(e1, f1, real1)
(e2, f2, real2)
(e3, f3, real3)
...
```
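
The final report could be assembled roughly as sketched below. The class MankuReport and its members are hypothetical names. The selection rule f ≥ (s − ε)·N is the usual output rule of lossy counting and is an assumption as far as this handout is concerned, and the small 1e-9 tolerance is an addition to guard against floating-point rounding at the threshold. The exact counts use a linked list of (item, frequency) pairs, as asked above.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical class; names are illustrative only.
class MankuReport {

    /** An (item, frequency) pair, used for both estimates and exact counts. */
    static final class Count {
        final String item;
        int freq;

        Count(String item, int freq) {
            this.item = item;
            this.freq = freq;
        }
    }

    /** Exact counts via a linear scan of a linked list (slow on purpose). */
    static LinkedList<Count> exactCounts(List<String> stream) {
        LinkedList<Count> exact = new LinkedList<>();
        for (String v : stream) {
            Count hit = null;
            for (Count c : exact) {
                if (c.item.equals(v)) {
                    hit = c;
                    break;
                }
            }
            if (hit != null) {
                hit.freq++;
            } else {
                exact.add(new Count(v, 1));
            }
        }
        return exact;
    }

    /** Returns one "(e, f, real)" line per estimate with f >= (s - epsilon) * n. */
    static String report(List<Count> estimates, List<Count> exact,
                         double s, double epsilon, long n) {
        double threshold = (s - epsilon) * n;   // assumed output rule
        StringBuilder out = new StringBuilder();
        for (Count est : estimates) {
            if (est.freq < threshold - 1e-9) {  // tolerance is an assumption
                continue;
            }
            int real = 0;
            for (Count c : exact) {
                if (c.item.equals(est.item)) {
                    real = c.freq;
                    break;
                }
            }
            out.append('(').append(est.item).append(", ")
               .append(est.freq).append(", ").append(real).append(")\n");
        }
        return out.toString();
    }
}
```

Here estimates stands for the (e, f) part of the entries left in D at the end of the run, and n is the total number of items read.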

## 2. Help Material

• Output of the command Manku small.inp 0.2 0.1:
```
Before deletion
D: (8, 1, 0) (7, 1, 0) (6, 2, 0) (4, 3, 0) (3, 1, 0) (2, 1, 0) (1, 1, 0)
After deletion
D: (6, 2, 0) (4, 3, 0)

Before deletion
D: (1, 1, 1) (2, 3, 1) (9, 1, 1) (7, 2, 1) (6, 3, 0) (4, 5, 0)
After deletion
D: (2, 3, 1) (7, 2, 1) (6, 3, 0) (4, 5, 0)

Before deletion
D: (8, 1, 2) (9, 1, 2) (1, 1, 2) (5, 1, 2) (2, 3, 1) (7, 4, 1) (6, 3, 0) (4, 9, 0)
After deletion
D: (2, 3, 1) (7, 4, 1) (4, 9, 0)

Before deletion
D: (9, 1, 3) (5, 1, 3) (1, 1, 3) (8, 4, 3) (2, 3, 1) (7, 4, 1) (4, 12, 0)
After deletion
D: (8, 4, 3) (7, 4, 1) (4, 12, 0)

D structure at end of execution D: (8, 4, 3) (7, 4, 1) (4, 12, 0)

Output:
(8, 4, 6)
(7, 4, 5)
(4, 12, 12)
```

• This input file will help you test the program for correctness.

• The output (items that exceed the threshold) of the command Manku data1 0.05 0.01 is:
```
(999,798,798)
(888,532,532)
(777,531,531)
(111,527,527)
```

## 3. Turn in

• Turn in a Makefile using the command:
```
/home/cs584000/turnin Makefile hw4
```
(If you use Java, you still need a Makefile. Let me know if you don't know how to create one.)

• Turn in each header and program file using this command:
```
/home/cs584000/turnin Filename hw4-?
```
where "?" is a number from 0 to 9.