Hi all,
Just to get it out of the way, I'm a bit of a Java noob but I have a guide on reading and writing files through RandomAccessFiles, BufferedOutput(or Input)Streams and DataOutput(or Input)Streams.
I'm trying to code a Java program to sort any file of ints as fast as possible. The file could be any size, and for each instance of the problem I'm given another file (of the same size as the one I have to sort) to use to store intermediate results. I think they'll also restrict the amount of memory available to the JVM but it hasn't specified how, so I'll deal with that as I run it against the test harness.
I was considering writing some kind of external mergesort but I was wondering if it'd be possible to do a counting sort?
Here are the problems I don't know how to deal with:
1. The test harness may give me a file containing the largest negative int and the largest positive int. This would mean my algorithm would be unable to make a "count" array big enough. Is it possible to get around this using some kind of sparse data structure instead of an array, while still preserving the O(n) time complexity of counting sort?
2. I think this is actually unlikely, but the test harness may give a file that simply has so many of a certain integer that the "count" array/data structure overflows. Actually I think this is REALLY unlikely but worth mentioning.
So, any ideas?