Two ways to sort externally are Multi-Line Merging Sort and Bucket Sort, both with two passes. The Bucket Sort burdens the CPU less and is more efficient, while its usage is restricted heavily by the High-Bit scheme that distributes records into subfiles: the keys have to be integers; the sizes of subfiles may vary too much; the number of subfiles cannot be chosen freely. Based on statistical theory, this paper presents a sample-seperators scheme to broaden the ussage of bucket sort algorithm. A brief discussion on the convergance of sample-seperator estimation is given and the probability to avoid memory overflow is calculated. This scheme enables the bucket sort algorithm to be applied in the SheenkSort system to win the 2003 PennySort (the Indy category) competition.
[1]Owen A. Bubble sort: An archaeological algorithmic analysis. In: Grissom S, Knox D, Joyce D, Dann W, eds. Proc. of the 34th SIGCSE Technical Symp. on Computer Science Education. New York: ACM Press, 2003.1-5.
[2]Hore CAR. Quicksort. The Computer Journal, 1962,5(1):10-16.
[3]Tang XY. Fast sorting method of separating segment. Journal of Software, 1993,4(2):53-57 (in Chinese with English abstract).
[4]Wang XY. A new sorting method by base distribution and linking. Chinese Journal of Computers, 2002,23(7):774-777(in Chinese with English abstract).
[5]Chen JC. Proportion extend sort. SIAM Journal on Computing, 2001,(31)1:323-330.
[6]Lorin H. Sorting and Sort Systems. Reading: Addison-Wesley Publishing Company, 1975.
[7]Knuth DE. The Art of Computer Programming, Vol 3: Sorting and Searching. Reading: Addison-Wesley Publishing Company,1973.
[8]Gray J, Coates J, Nyberg C. Performance/Price sort and PennySort. Technical Report, MS-TR-98-45, Microsoft Research, 1998.
[10]Shi Y, Zhang L, Liu P. THSORT: A single-processor parallel sorting algorithm. Journal of Software, 2003,14(2):159-165 (in Chinese with English abstract). http//www.jos.org.cn/1000-9825/15/159.htm
[11]Wegner LM, Teuhola JI. The external heapsort. IEEE Trans. on Software Engineering, 1989,15(7):917-925.