
CST363 project 2 – implementing a database index

Part 2

In part 2 you will complete the implementation of the storage component of a DBMS. The buffer manager
moves data blocks to and from the disk, and the access methods maintain the indexes and return rows to
the execution component.

Design

 a table consists of a series of blocks; each block is 4096 bytes in size


 the current implementation does not actually keep the blocks on disk but in a HashMap in the
Blocks class. However, it would be straightforward to change this class to read/write the blocks
to a file.

 the first block (0) is reserved to record schema information in the future when the blocks are
written to disk
 the second block (1) is a bitmap of 8 * 4096 = 32,768 bits. Each bit indicates whether the
corresponding block has any free space to insert a new row.
 0 bit = space for at least one row is available in the block. 1 bit = block is full.
 blocks 2 and beyond are data blocks.
 note that there can be at most 32,768 blocks.
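The free-space bitmap in block 1 can be sketched as follows. This is only an illustration, not the project's code: the class name FreeSpaceMap, its method names, and the bit order within a byte (lowest bit first) are assumptions made for the example.

```java
/** Sketch of the block-1 free-space bitmap: one bit per block, 0 = has room, 1 = full. */
final class FreeSpaceMap {
    static final int BLOCK_SIZE = 4096;            // bytes per block, from the design notes
    static final int MAX_BLOCKS = 8 * BLOCK_SIZE;  // 32,768 bits, one per block
    static final int FIRST_DATA_BLOCK = 2;         // blocks 0 and 1 are reserved

    // Stand-in for the contents of block 1 (hypothetical; the real code uses the Blocks class).
    private final byte[] bits = new byte[BLOCK_SIZE];

    /** Returns true when the block is marked full (bit = 1). */
    boolean isFull(int blockNo) {
        // Assumption: bit (blockNo % 8) of byte (blockNo / 8), lowest bit first.
        return (bits[blockNo / 8] & (1 << (blockNo % 8))) != 0;
    }

    /** Marks the block as full (1) or as having free space (0). */
    void setFull(int blockNo, boolean full) {
        if (full) {
            bits[blockNo / 8] |= (byte) (1 << (blockNo % 8));
        } else {
            bits[blockNo / 8] &= (byte) ~(1 << (blockNo % 8));
        }
    }

    /** Returns the first allocated data block with free space, or -1 if every one is full. */
    int firstWithSpace(int allocatedBlocks) {
        for (int b = FIRST_DATA_BLOCK; b < allocatedBlocks && b < MAX_BLOCKS; b++) {
            if (!isFull(b)) {
                return b;
            }
        }
        return -1; // caller would allocate a new block at the end of the table
    }
}
```

When firstWithSpace returns -1, the insert path would allocate a new block at the end of the table, as long as the total stays within the 32,768-block limit.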


 the format of a data block is shown below

 the block is divided up into equal size records. Each record has a length determined by the
tupleSize in the Schema.
 There is a BitMap at the beginning of the block. There is one bit for each Record. 0 bit means
the Record is empty and available to insert a new record. 1 bit means the Record contains a
Tuple of data.
 the data block in java is a byte array of size 4096.
 To find the index of the start of record n in the data block (records are numbered from 1):

index = bitmap length + (n-1)*record length
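The offset formula can be tried out with a small sketch. The handout does not say how the bitmap length is derived, so the rule used here (one bit per record, rounded up to whole bytes, with as many records as fit in 4096 bytes) is an assumption, as are the class and field names.

```java
/** Sketch of the data block layout: a record bitmap followed by equal-size records. */
final class DataBlockLayout {
    static final int BLOCK_SIZE = 4096;

    final int recordLength;  // tupleSize from the Schema
    final int recordCount;   // number of records that fit in one block
    final int bitmapLength;  // bytes reserved for the bitmap at the start of the block

    DataBlockLayout(int recordLength) {
        if (recordLength <= 0 || recordLength >= BLOCK_SIZE) {
            throw new IllegalArgumentException("record length must be 1.." + (BLOCK_SIZE - 1));
        }
        this.recordLength = recordLength;
        // Assumption: each record costs recordLength bytes plus one bitmap bit.
        int n = (8 * BLOCK_SIZE) / (8 * recordLength + 1);
        // Step down if rounding the bitmap up to whole bytes overflows the block.
        while (n > 0 && (n + 7) / 8 + n * recordLength > BLOCK_SIZE) {
            n--;
        }
        this.recordCount = n;
        this.bitmapLength = (n + 7) / 8;
    }

    /** Byte index of the start of record n, with n counted from 1 as in the formula above. */
    int recordStart(int n) {
        if (n < 1 || n > recordCount) {
            throw new IndexOutOfBoundsException("record " + n);
        }
        return bitmapLength + (n - 1) * recordLength;
    }
}
```

Under these assumptions a 100-byte record gives 40 records per block and a 5-byte bitmap, so record 1 starts at index 5 and record 2 at index 105.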

Step 1. OrdIndex

 implement the binsearch method using the binary search algorithm. This method's argument is
an int key value. Return the index in the ArrayList<Entry> entries of the entry that has this key
value or, for the insert case, the index where the key should be inserted.
 implement the insert method. Remember that the entries must be kept sorted by key for
the binary search algorithm to work.
 implement the delete and lookup methods.
 Run the Deindexes method and fix any problems so that all tests pass before going on.
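One possible shape for the binary search is sketched below. It is not the project's OrdIndex: the Entry fields (key, blockNo), the method signatures, and the return conventions are hypothetical, since the handout names the class but does not list them. The search returns the position of the first entry whose key is not less than the argument, which is both the match position and the insert position.

```java
import java.util.ArrayList;

/** Sketch of an ordered index over int keys, kept sorted so binary search works. */
final class OrdIndexSketch {
    /** Hypothetical entry layout: a key and the block that holds its row. */
    static final class Entry {
        final int key;
        final int blockNo;
        Entry(int key, int blockNo) { this.key = key; this.blockNo = blockNo; }
    }

    final ArrayList<Entry> entries = new ArrayList<>();

    /** Index of the entry with this key, or where the key would be inserted. */
    int binsearch(int key) {
        int lo = 0;
        int hi = entries.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (entries.get(mid).key < key) {
                lo = mid + 1;   // key lies to the right of mid
            } else {
                hi = mid;       // key is at mid or to its left
            }
        }
        return lo;
    }

    private boolean matches(int i, int key) {
        return i < entries.size() && entries.get(i).key == key;
    }

    /** Inserts in sorted position; returns false if the key is already present. */
    boolean insert(int key, int blockNo) {
        int i = binsearch(key);
        if (matches(i, key)) return false;
        entries.add(i, new Entry(key, blockNo));
        return true;
    }

    /** Returns the block number for the key, or -1 if the key is absent. */
    int lookup(int key) {
        int i = binsearch(key);
        return matches(i, key) ? entries.get(i).blockNo : -1;
    }

    /** Removes the entry for the key; returns false if the key is absent. */
    boolean delete(int key) {
        int i = binsearch(key);
        if (!matches(i, key)) return false;
        entries.remove(i);
        return true;
    }
}
```

Because insert, lookup and delete all call binsearch first, the sorted order is established in one place and the three methods stay short.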

Step 2. Complete Table methods.

 the methods in Table class are marked with // YOUR CODE HERE to indicate you need to
complete this part of the code.
 complete the implementation of the insert method. The insert method checks that the primary key
of the Tuple to be inserted does not already exist in the table. If there is no index on the
primary key, the method does a linear scan of the table. If there is an index, you need to
complete this part so that it uses the OrdIndex lookup method.
 if the primary key passes the check for a duplicate value, then the insert method finds a data
block with empty space or allocates a new block at the end of the table. It serializes the tuple
into this empty space and then updates the block bitmap. You must complete the code so that
all indexes that exist (if any) are updated for this new row.
 complete the implementation of findDeleteTuple. The block containing the row is found and
the tuple is deleted, and the bitmap is updated to indicate this row is now free space. But the
indexes must also be updated.
 complete the lookup method. If there is an index, use the index to find the Tuple for this key
value.


 complete the create Index method. This method is called when a new index is defined. It scans
all rows in the table and inserts Entries into the index.
 run the TableTest method and fix any bugs so all tests pass.
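The way the Table methods and the index have to stay in step can be illustrated with a toy model. This is not the project's Table class: MiniTable is hypothetical, rows are plain int arrays whose first element is the primary key, a list of slots stands in for the data blocks and their bitmaps, and a java.util.TreeMap stands in for OrdIndex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy table: shows where the index is consulted and where it must be updated. */
final class MiniTable {
    private final List<int[]> slots = new ArrayList<>();  // null = free slot (bitmap bit 0)
    private TreeMap<Integer, Integer> pkIndex;            // key -> slot; null until created

    /** Called when an index is defined: scan every stored row and add an entry for it. */
    void createIndex() {
        TreeMap<Integer, Integer> idx = new TreeMap<>();
        for (int s = 0; s < slots.size(); s++) {
            int[] row = slots.get(s);
            if (row != null) idx.put(row[0], s);
        }
        pkIndex = idx;
    }

    /** Uses the index when there is one, otherwise falls back to a linear scan. */
    private int findSlot(int key) {
        if (pkIndex != null) {
            Integer s = pkIndex.get(key);
            return s == null ? -1 : s;
        }
        for (int s = 0; s < slots.size(); s++) {
            int[] row = slots.get(s);
            if (row != null && row[0] == key) return s;
        }
        return -1;
    }

    /** Rejects duplicate keys, stores the row, then updates the index. */
    boolean insert(int[] row) {
        if (findSlot(row[0]) >= 0) return false;   // duplicate primary key
        int s = slots.indexOf(null);               // reuse free space if any
        if (s < 0) {
            slots.add(row.clone());                // otherwise allocate at the end
            s = slots.size() - 1;
        } else {
            slots.set(s, row.clone());
        }
        if (pkIndex != null) pkIndex.put(row[0], s);
        return true;
    }

    /** Frees the slot and removes the matching index entry. */
    boolean delete(int key) {
        int s = findSlot(key);
        if (s < 0) return false;
        slots.set(s, null);
        if (pkIndex != null) pkIndex.remove(key);
        return true;
    }

    /** Returns the row for the key, or null if it is not in the table. */
    int[] lookup(int key) {
        int s = findSlot(key);
        return s < 0 ? null : slots.get(s);
    }
}
```

The point of the sketch is the ordering: the duplicate check comes before any storage change, and every change to the stored rows is followed by the matching index update.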

What to submit for this assignment

 only one member of the group needs to submit


 Submit the Table and OrdIndex files.
 If you do extra credit, submit the other java classes that you modified or created.

Check your understanding and Extra Credit

1. Write down the steps that Table insert must do for a row (tuple) with a primary key, assuming
there is an index on the primary key.
2. What if Records were not all the same length? String columns are currently stored with the
maximum length rather than the actual length. What changes would be needed to handle
variable length Records?
3. The implementation of OrdIndex does not allow for duplicate key values. This is ok for the
primary key column, but not for other columns that are not unique.
4. How would you modify OrdIndex to handle non unique key values?
5. For extra credit #1 make the modifications to handle non unique keys. Create a test method
where there are duplicate key values for a column and test your code.
6. How could the implementation of the naturalJoin method in SelectQuery be modified to take
advantage of an index on table2?
7. For extra credit #2, modify naturalJoin to take advantage of an index.
8. How would you implement a HashIndex, similar to OrdIndex, but using a HashMap instead of an
ordered ArrayList to keep the collection of Entry objects for the index?
9. For extra credit #3, create a HashIndex class and test it out.
