toil.test.sort.restart_sort
A demonstration of Toil. Sorts the lines of a file into ascending order using a parallel merge sort. This version is intentionally buggy: it does not include restart(), so that it can be used for testing.
Attributes
- defaultLines
- defaultLineLen
- sortMemory
Functions
- setup: Sets up the sort.
- down: Input is a file, a subdivision size N, and a path in the hierarchy of jobs.
- up: Merges the two files and places them in the output.
- Sorts the given file.
- merge: Merges together two files maintaining sorted order.
- copySubRangeOfFile: Copies the range (in bytes) between fileStart and fileEnd to the given output file handle.
- getMidPoint: Finds the point in the file to split.
Module Contents
- toil.test.sort.restart_sort.defaultLines = 1000
- toil.test.sort.restart_sort.defaultLineLen = 50
- toil.test.sort.restart_sort.sortMemory = '600M'
- toil.test.sort.restart_sort.setup(job, inputFile, N, downCheckpoints, options)
Sets up the sort. Returns the FileID of the sorted file.
- toil.test.sort.restart_sort.down(job, inputFileStoreID, N, path, downCheckpoints, options, memory=sortMemory)
Input is a file, a subdivision size N, and a path in the hierarchy of jobs. If the range is larger than the threshold N, it is divided recursively and a follow-on job is created to merge the results back together; otherwise, the file is sorted and placed in the output.
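The divide-then-merge recursion described above can be illustrated without Toil at all. The following is a minimal in-memory sketch, not the module's implementation: `sort_lines` and `threshold` are hypothetical names, the threshold counts lines rather than bytes, and plain recursion stands in for child and follow-on jobs.

```python
from heapq import merge as heap_merge


def sort_lines(lines, threshold):
    """Recursively sort a list of lines (hypothetical helper, not Toil API).

    Ranges no larger than `threshold` are sorted directly; larger ranges
    are split in half, sorted recursively, and merged back together.
    """
    # Guard against a threshold below 1, which would never terminate.
    if len(lines) <= max(threshold, 1):
        return sorted(lines)
    mid = len(lines) // 2
    left = sort_lines(lines[:mid], threshold)
    right = sort_lines(lines[mid:], threshold)
    # heapq.merge combines two already-sorted iterables in sorted order.
    return list(heap_merge(left, right))


unsorted_lines = ["d\n", "a\n", "c\n", "b\n"]
sorted_lines = sort_lines(unsorted_lines, threshold=1)
```

In a workflow engine, the two recursive calls would typically run as independent jobs and the merge as a step that waits for both; here they simply run one after the other.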
- toil.test.sort.restart_sort.up(job, inputFileID1, inputFileID2, path, options, memory=sortMemory)
Merges the two files and places the result in the output.
- toil.test.sort.restart_sort.merge(fileHandle1, fileHandle2, outputFileHandle)
Merges together two files maintaining sorted order.
All handles must be text-mode streams.
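As an illustration of merging two sorted text streams, here is a minimal sketch. It assumes both inputs are already sorted and that every line ends with a newline; `merge_sorted_streams` is a hypothetical name and the code is not taken from the module's source.

```python
import io


def merge_sorted_streams(handle1, handle2, output_handle):
    """Merge two sorted text-mode streams line by line (hypothetical helper)."""
    line1 = handle1.readline()
    line2 = handle2.readline()
    # readline() returns "" at end of file, which is falsy.
    while line1 and line2:
        if line1 <= line2:
            output_handle.write(line1)
            line1 = handle1.readline()
        else:
            output_handle.write(line2)
            line2 = handle2.readline()
    # One stream is exhausted; copy whatever remains of the other.
    while line1:
        output_handle.write(line1)
        line1 = handle1.readline()
    while line2:
        output_handle.write(line2)
        line2 = handle2.readline()


left = io.StringIO("apple\ncherry\n")
right = io.StringIO("banana\ndate\n")
merged = io.StringIO()
merge_sorted_streams(left, right, merged)
merged_text = merged.getvalue()
```

Only one line from each input is held in memory at a time, so the approach works for files much larger than available RAM.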
- toil.test.sort.restart_sort.copySubRangeOfFile(inputFile, fileStart, fileEnd)
Copies the range (in bytes) between fileStart and fileEnd to the given output file handle.
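Copying a byte range can be sketched as follows. This is an assumption-laden illustration rather than the module's code: `copy_byte_range` is a hypothetical helper that takes an explicit output handle, opens the file in binary mode so that offsets are exact byte positions, and treats the range as half-open (fileEnd excluded).

```python
import io
import os
import tempfile


def copy_byte_range(path, start, end, output_handle):
    """Write bytes [start, end) of the file at `path` to `output_handle`."""
    with open(path, "rb") as source:
        source.seek(start)
        # Reads the whole range at once; fine for a sketch, but a chunked
        # loop would be kinder to memory for very large ranges.
        output_handle.write(source.read(end - start))


# Demonstrate on a small temporary file.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"aa\nbb\ncc\n")
    tmp_path = tmp.name
try:
    sink = io.BytesIO()
    copy_byte_range(tmp_path, 3, 6, sink)
    copied = sink.getvalue()
finally:
    os.remove(tmp_path)
```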
- toil.test.sort.restart_sort.getMidPoint(file, fileStart, fileEnd)
Finds the point in the file to split. Returns an int i such that fileStart <= i < fileEnd.
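One way to pick a split point that never cuts a line in half is to jump to the middle byte and advance to the end of the line found there. The sketch below is a hypothetical `find_split_point`, not the module's source. It assumes the range is non-empty, starts at a line boundary, and ends with a newline.

```python
import os
import tempfile


def find_split_point(path, start, end):
    """Return the offset i of a newline byte with start <= i < end."""
    with open(path, "rb") as f:
        mid = (start + end) // 2
        f.seek(mid)
        line = f.readline()
        # Use the newline ending the line at the midpoint, unless that
        # line is the last one in the range.
        if line and mid + len(line) < end:
            return mid + len(line) - 1
        # Fall back to the newline ending the first line of the range.
        f.seek(start)
        line = f.readline()
        return start + len(line) - 1


with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"aa\nbb\ncc\ndd\n")
    tmp_path = tmp.name
try:
    split = find_split_point(tmp_path, 0, 12)
    single_line_split = find_split_point(tmp_path, 0, 3)
finally:
    os.remove(tmp_path)
```

Under these assumptions the returned offset always points at a newline, so the two halves `[start, i]` and `[i + 1, end)` each contain only whole lines.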