### Intro to bitmap indexes

• Preliminary

• Assumption:

• Records in a file/relation occupy a permanent location in the file/relation

• I.e.:

 A record is uniquely identified by a position ID

• Definition: Current value set

• Current Value Set:

 F = the current set of values stored in a field f in the records

• Example:

```
Records:     field f
---------+--------+------
Record 1: ..... 5 ...
Record 2: ..... 7 ...
Record 3: ..... 8 ...

F = {5, 7, 8}
```
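As a minimal sketch, the current value set can be computed by collecting the distinct values of the field. The record layout (a list of dicts) and the helper name `current_value_set` are assumptions made for illustration only.

```python
# Hypothetical records; only field f matters for this example.
records = [
    {"f": 5},
    {"f": 7},
    {"f": 8},
]


def current_value_set(records, field):
    """Return the set of distinct values currently stored in `field`."""
    return {rec[field] for rec in records}


F = current_value_set(records, "f")
print(sorted(F))  # [5, 7, 8]
```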

• Bitmap index

• Bitmap index of a field f:

• is a collection of bit vectors of length n (where n is the number of records)

• there is one bit vector for each value v that appears in field f

• The bit vector for the value v is equal to:

```
x1 x2 x3 x4 ..... xi ...... xn

xi = 1   if the i-th record's field f = v
   = 0   otherwise
```
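The definition above can be sketched in a few lines of Python. This is an illustrative sketch, not a production index: the function name `build_bitmap_index`, the use of plain lists of 0/1 as bit vectors, and the sample values `[5, 7, 8, 7]` are all assumptions.

```python
def build_bitmap_index(values):
    """Map each distinct value v to its bit vector (a list of 0/1 of length n).

    `values` holds field f of records 1..n, in position order.
    """
    n = len(values)
    index = {}
    for i, v in enumerate(values):
        # Create an all-zero vector the first time a value is seen.
        vector = index.setdefault(v, [0] * n)
        vector[i] = 1  # the (i+1)-th record has f = v
    return index


# Hypothetical field values for four records.
index = build_bitmap_index([5, 7, 8, 7])
for v, bits in sorted(index.items()):
    print(v, "".join(map(str, bits)))
# 5 1000
# 7 0101
# 8 0010
```

There is one vector per value in the current value set, and every vector has exactly n bits.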

• Examples:

• A file has 6 records

```
Fields:      A       B
--------------------------------
record 1:    30     foo
record 2:    30     bar
record 3:    40     baz
record 4:    50     foo
record 5:    40     bar
record 6:    30     baz
```

The bitmap index for the field A is:

```
value    123456
----------------------------
30       110001   <---- bit vector
40       001010
50       000100

Explanation:

  The value 30 appears in the records: 1, 2, 6
  So: bit #1, #2 and #6 are set
```

The bitmap index for the field B is:

```
value    123456
----------------------------
foo      100100   <---- bit vector
bar      010010
baz      001001

Explanation:

  The value foo appears in the records: 1, 4
  So: bit #1 and #4 are set
```
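Reading a bit vector back into record positions is the reverse of the explanations above. The sketch below stores each vector as a string of `0`/`1` characters, which is a representation chosen for readability; the helper name `records_with_bit_set` is hypothetical.

```python
def records_with_bit_set(bit_vector):
    """Return the 1-based record positions whose bit is 1."""
    return [pos for pos, bit in enumerate(bit_vector, start=1) if bit == "1"]


# The two bitmap indexes from the 6-record example.
index_a = {30: "110001", 40: "001010", 50: "000100"}
index_b = {"foo": "100100", "bar": "010010", "baz": "001001"}

print(records_with_bit_set(index_a[30]))     # [1, 2, 6]
print(records_with_bit_set(index_b["foo"]))  # [1, 4]
```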

• Bigger example: people who buy jewelry

• Data on people who buy jewelry:

```
(age, salary (in $1,000))

 1(25,60)     2(45,60)     3(50,75)     4(50,100)
 5(50,120)    6(70,110)    7(85,140)    8(30,260)
 9(25,400)   10(45,350)   11(50,275)   12(60,260)
```

• The bitmap index on age:

```
Value    123456789012
------------------------------
25       100000001000
30       000000010000
45       010000000100
50       001110000010
60       000000000001
70       000001000000
85       000000100000
```

The bitmap index on salary:

```
Value    123456789012
-------------------------------
60       110000000000
75       001000000000
100      000100000000
110      000001000000
120      000010000000
140      000000100000
260      000000010001
275      000000000010
350      000000000100
400      000000001000
```

(The index bits for two of the records are highlighted in red.)
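The two indexes for the jewelry data can be rebuilt and checked with a short script. This is a sketch under stated assumptions: records are `(age, salary)` tuples in position order, bit vectors are strings, and the helper name `bitmap_index` is hypothetical.

```python
# (age, salary in $1,000) for records 1..12, in position order.
people = [
    (25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
    (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260),
]


def bitmap_index(values):
    """Return {value: bit string}, one bit per record, values in sorted order."""
    return {
        v: "".join("1" if x == v else "0" for x in values)
        for v in sorted(set(values))
    }


age_index = bitmap_index([age for age, _ in people])
salary_index = bitmap_index([salary for _, salary in people])

for name, index in (("age", age_index), ("salary", salary_index)):
    print(name)
    for value, bits in index.items():
        print(f"  {value:>3}  {bits}")

# Each record holds exactly one value per field, so each bit position
# is set in exactly one vector of each index.
for index in (age_index, salary_index):
    for pos in range(len(people)):
        assert sum(bits[pos] == "1" for bits in index.values()) == 1
```

The final check reflects a property that follows directly from the definition: within one index, the bit vectors never overlap, and together they cover every record.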