Updated March 2011
Random file syntax:
xcall BASORT, ch, reccnt, recsiz,
k1siz, k1pos, k1ord
{,k2siz, k2pos, k2ord,
k3siz, k3pos, k3ord,
k1typ, k2typ, k3typ,
k4siz, k4pos, k4ord,
k5siz, k5pos, k5ord,
k6siz, k6pos, k6ord,
k4typ, k5typ, k6typ}
Sequential file syntax:
xcall BASORT, chin, chout, recsiz,
k1siz, k1pos, k1ord
{,k2siz, k2pos, k2ord,
k3siz, k3pos, k3ord,
k1typ, k2typ, k3typ}
Note that in the syntax shown above, spaces and line breaks have no meaning—i.e., they do not and should not exist in the actual program. They are used here strictly as visual aides to illustrate the structure of the commands.
BASORT.SBR sorts files, both random (aka contiguous) and sequential (aka stream).For sorting arrays in memory, see SORTIT.
All BASORT parameters are [in] and Num.
All parameters after k1ord are optional and if not specified, default to zero. Keys with 0 size or position are ignored. Keys with no explicit type default to string (0).
Param |
Description |
ch |
File channel (used only with random files). |
chin |
Input file channel (used only with sequential files) |
chout |
Output file channel (used only with sequential files) |
reccnt |
Number of records in the file. |
recsiz |
Record length. For sequential files, this should be equal to or greater than the longest line in the file. |
k?siz |
Size of the sort key, in bytes. May use .SIZEOF to rely on compiler to specify correct size. |
k?pos |
First character position occupied by the key. May use .OFFSET macro to have the compiler calculate the correct offset. |
k?ord |
Sort order of the key, where 0=ascending, 1=descending. |
k?typ |
Data type of the key: 0=String, 1=Floating Point (random files only), 2=Binary (random files only), 3=Integer (random only), 4=Natural Sort, 5=case-insensitive natural sort. See History note for A-Shell 5.1.1135 below for more details on types 3-5. |
A-Shell's implementation of BASORT.SBR is fully compatible with the original, optimized to make the best use of memory available. The sorting process is based around the quicksort algorithm, but up to three different types of sort may be performed depending on the amount of free memory. In the default case, the free memory is taken from the A-Shell partition. However, under systems with large amounts of efficient virtual memory available (e.g. UNIX and Windows), you may specify the system parameter SBR=MALLOCSORT to cause A-Shell to dynamically allocate, if possible, a chunk of memory large enough to sort the entire file in one quicksort operation. See the SBR system parameter topic for more details on the SBR options.
In the case of the sequential file sort, there is no option for key types because the keys must be string type. Also, since the record lengths can be variable, you should specify a recsiz as large or larger than the largest line in the file, up to a maximum of 4096. The file to be sorted must be open for input on chin, and the program must supply the channel (chout) of an output file into which you want the file sorted. If you want to sort the file on itself, you may specify chout as the same value as chin, in which case it only needs to be open once for input. If you want to re-read the sorted output file, you will need to close it and reopen it for input.
Here is a description of the three sort algorithms:
Type |
Description |
Small |
The entire file is loaded, quicksorted and then saved to disk. This will occur if there is sufficient memory to load the entire file. It is extremely fast. |
Medium |
The keys are extracted from the file, and then quicksorted. A new file is then created by reading the records from the old file in the order of the quicksorted keys. This will occur if there is sufficient memory to load a tag file consisting of the sort keys and record numbers. It is reasonably fast. |
Large |
The file is quicksorted in chunks into two separate files. A disk-based poly-phase merge sort is then used to repeatedly merge these until a single sequence of sorted records results. This type of sort occurs in all other situations. It is quite slow. |
The type of sort being used may be determined by specifying:
TRACE=AMSORT
in the system configuration file or with the command:
SET TRACE AMSORT
from the dot prompt.
Among other uses, the tracing display may be useful for analyzing the performance of a LARGE model sort, since the individual phases are shown.
Records with the same sort key are guaranteed not to be re-ordered in the resulting file. This also applies to direct memory sorting of files in MEM:.
The normal limit on memory allocation with the MALLOCSORT option is 8MB, but you can change this with the MALLOCLIMIT=<bytes>{K}{M} system parameter. Bumping this up, to say, 32MB, on a typical modern desktop PC can make a huge difference when sorting files larger than 8MB but smaller than the limit you specify.
The absolute limit for BASORT file size is 2GB.
Troubleshooting tip: if you're having problems with large sorts failing, the first recommendation is to activate and configure the MALLOCSORT and MALLOCLIMIT options so that the sort can be performed using the Medium, or preferably, the Small algorithm, which is not only the fastest but the simplest and thus the least likely to run into complications.
2009 January A-Shell 5.1.1136: Six sort keys are now allowed, as opposed to three keys in prior versions.
2009 January A-Shell 5.1.1135: BASORT now includes three new sort types:
3: Integer (I,1 or I,2 or I,4)
4: Natural Sort
5: Case insensitive Natural Sort
Warning about using signed integer sort keys (B5, I1, I2 and I4) in files with a control record: traditionally the technique for preventing the control record from getting sorted out of place was to temporarily set it to all nulls, prior to sorting. But this is not adequate for signed integer sort keys, since nulls = 0, which is not at the bottom but in the middle of the range of signed values. The lowest possible signed integer value is -2^N where N is the # of bits in the field (e.g. -128, -32768, etc.) Similarly, for descending sort, the maximum value is 2^N-1.