Wednesday, September 20, 2006

Thesis Paper

Last week I started my Master thesis. I didn't start writing anything yet of course, but I am doing a research on the relevant theory. What I am going to write about is Hash tables.

My supervisor, Rasmus Pagh, has already published his research and his proposal on better hashing algorithms. An easy to read version of his idea and proposal could be downloaded from here.

We will now go further and investigate other benefits and advantages of his proposals. In particular, we are going to use his algorithmic idea to make comparisons of two sets for duplicated words faster if possible.

The problem could be described as the following. Suppose that we have 2 sets with words. We want to find words in the first set that appear in the second set as well.

The two main competitive algorithms are the Merge algorithm and the use of hash tables. The Merge algorithm is the same one that merge sort utilizes. The use of hash tables, allows us to use one hash table for accessing words and the other one for searching for existence of words.
Both algorithms run in linear time. The proposed algorithm also runs in linear time, but I am trying to make it faster than the others by avoiding branch mispredictions(take advantage of branch predictions) and decreasing cache faults.

Wish me good luck :)

5 comments:

Neura said...

We wish you all the luck in the world. !!!!

Unknown said...

hash tables; this sounds suspicious... I thought you didn't do any drugs any more...

Stavros Amanatidis said...

Thanks neura...

Zaf, I am still trying to quit :)

guitarlikas said...

Yo man! sounds interesting! Good luck!

Stavros Amanatidis said...

Yo man!
Thanks a lot..