Logistics
Go...
- Worth 4 points, or 4% of your class grade
- Assigned Saturday May 6, 2006
- Due Saturday May 13, 2006 by the end of the day...
a 1-weeker!
- Covers hash tables
Description
In Program #4, you'll implement your own hash table. Web site
URL's and IP addresses will provide a nice testbed for some hash
table experimentation.
So, you're surfing the net with your favorite browser. You enter a
web site name, called a
Universal Resource Locator (URL), and load in the web page. Easy
enough.
Well, under the hood, each URL must be converted to an
Internet Protocol (IP)
address so that the web site can be located on the net. For example:
www.wikipedia.com has an IP
address of 207.142.131.248.
In Program #4, we will simulate this process and write our own hash
table to do it. Here are the steps your program should take:
- Load a file of "fake" URLs and IP addresses that I have created
for you
- Query the user for a URL
- Find the corresponding IP address for the URL in your hash table
- If the IP address was found, then report it
- Else ask the user if he/she would like to add the URL to your
table... if so, do it
- Go to step 2
Implementation
You'll find a file of web sites in the common area of the k:
drive. Here's a copy if you're working at home:
websites.txt. Please implement your own
hash table, complete with get(),
put() and
remove() methods. The book has a nice partial implementation of a hash table. Here
are some side-comments:
- I used chaining to resolve collisions. I used the generic
LinkedList<> class from the
Collections API as my linked list.
- I simplified things (old school) by assuming that the
elements of my hash table have their keys included in them. I did
this because it's a different than our book, and it's easier and
less expensive, memory-wise. So,
MyHashTable<E> has only one generic parameter, rather than
two.
- Please implement your own hash function. I would get your
program running using Java's standard
hashCode() method and then replace it with your own. Let's
use the hash function given at the bottom of page 473 in our text...
and use Horner's Rule (see below) to make the calculation more
palatable and efficient.
- You can see some experimentation in my implementation. The
MyHasher interface is an alternate
way of communicating a hash function to your table. In fact, my hash
table does both. I first look whether a
MyHasher has been defined, and if not, then I just use the
hashCode() method. This is similar
to the Comparator interface shown on page 444 that can be used for
sorting.
- I rehash my table whenever it hits the load threshold (a
fraction between 0 and 1) specified.
- It's fun to keep track of statistics like # collisions, #
rehashes, etc.
- Concentrate on your table and getting it working and such,
rather than a fancy UI. I just used that handy
JOptionPane.showInputDialog() from
Program #2. I wrote statistics and such to the console.
- I want to see some preliminary test code in your hash
table or other classes. Don't expect to write your hash table and
then just load it up with 20,000 items, expecting it to work. Also,
exercise removal in your small test main().
- I have this goofy warning I keep getting, that I can't
seem to silence (for now!). If you get something like this,
investigate it but don't let it stop your work.
What is Horner's Rule? It's a little niftier way to calculate
your hash function with all those exponents and such. If you want to
calculate value of the polynomial:
A3X3 + A2X2
+ A1X1 + A0X0
it can be done like this:
(((A3)X + A2)X + A1)X + A0
With a nice prime multiplier like 37, your Java to hash
String str using Horner's Rules
might look something like this:
int hashValue = 0; |
for( int i = 0; i <
str.length(); i++) { |
hashValue
= (hashValue * 37) |
+ (int) str.charAt( i); |
} |
return (hashValue %
tableSize); |
Please peruse my Javadoc. If you have questions about my
choices, then please email me. Remember, the book and I are
just resources... implement things as you want.
Here's a code fragment that I used to read the
websites.txt file:
// The reader
parameter is our input file |
void loadFile(
BufferedReader reader) { |
String url, ip; |
Scanner sc =
new Scanner( reader); |
while(
sc.hasNext()) { |
url
= sc.next(); |
ip
= sc.next(); |
if(
url != null && ip != null) { |
// valid url, ip to process |
} |
} |
} |
Grading
Same deal, guys... By the due date, please place your work for Program
#4 in your folder on the k: drive. I'll be looking for:
- A README file describing the state of your
program... what works, what doesn't, etc.
- All your *.java Java source code files
- All your *.class files... please compile
your Java
- A javadoc folder with the nifty auto-magic
documentation of your classes
Of course, your code must be beautiful and follow
the class coding guidelines. Code that does not meet
this metric will be served a harsh brand of grading
justice.
Etc
Hey,
www.register.com does something similar to our
Program #4... but for real.
A little debug check for you... my program #4 says that
websites.txt has 22,145 URL's in
there.
BTW, I've been looking for a "real" file of web addresses but had
difficulty finding it. the file is actually called the
hosts file.
It's format is similar to my websites.txt,
except it allows comments and the URL and IP fields are switched.
Just FYI.
good luck... yow, bill |