File: SORT.RO of Tape: Various/Decus/decus-5
(Source file text) 

.! SORT DOCUMENTATION
.! CLYDE G. ROBY, JR.
.! DEPARTMENT OF PHYSIOLOGY AND BIOPHYSICS
.! WEST VIRGINIA UNIVERSITY MEDICAL CENTER
.! MORGANTOWN, WEST VIRGINIA
.! CHANGES IN SEPTEMBER, 1974 FOR VERSION 3 (MULTI-LINE RECORDS)
.! CHANGES IN OCTOBER, 1974 FOR CHAINING ABILITY
.! CHANGES IN MAY, 1975 FOR VERSION 5
.!   (HANDLING LARGER FILES, MULTI-LINE BUG)
.! CHANGES IN MAY, 1976 FOR VERSION 7
.!   (DROPPED FILE SIZE & MADE COMPATIBLE WITH MERGE) JBC
.! CHANGES JULY 1977 ADDED XTRACT DESCRIPTION, CLEANED UP INTRODUCTION TMC
.FLAG CAPITALIZE;.FLAG FIRSTCAPITALIZE;.TR
.HEADER BOTTOM CENTER NUMBER
.CHAPTER <SORT
.T 
.M 10, 70
.B 5
.CM; ^^SORT/MERGE/XTRACT\\
.B 1
^'
.CM; A PACKAGE OF <OS/8 COMPATIBLE SORT - MERGE PROGRAMS
.B 3
.CM; SMALL COMPUTER LABORATORY
.CM; DEPARTMENT \OF PHYSIOLOGY \AND BIOPHYSICS
.CM; WEST VIRGINIA UNIVERSITY MEDICAL CENTER
.CM; MORGANTOWN, WEST VIRGINIA 26506
.B 3
.CM; <SORT VERSION 8
.CM; JUNE 1, 1977
.B 3
.CM; <MERGE VERSION 2
.CM; JUNE 1, 1977
.B 3
.CM; <XTRACT VERSION 1
.CM;JUNE 1, 1977
\'
.PG
.CM;<SORT/MERGE/XTRACT
.P
THIS DOCUMENT DESCRIBES A PACKAGE OF PROGRAMS FOR DEALING WITH <OS/8
<ASCII FILES.
THE PRINCIPAL UTILITY IS <SORT WHICH WAS WRITTEN IN ITS ORIGINAL
VERSION BY ^'C. G. ROBY JR\'.
<MERGE AND <XTRACT ARE COMPANION PROGRAMS TO ASSIST IN THE EFFICIENT
SORTING OF LARGE DATA SETS.
<MERGE WAS WRITTEN BY ^'JAMES B. CORYELL \OF DATAPRODUCTS CORPORATION
\AND <XTRACT \BY THOMAS W. MC^INTYRE \AND ALAN SMOTHERS\' OF <WVU.
<JBC ALSO EXTENSIVELY REVISED <SORT FROM VERSION 6 TO 7.
<TMC IS RESPONSIBLE FOR THE PRESENT VERSIONS OF ALL THE PROGRAMS.
.S 2;.CM;<SORT
.P
<SORT IS A PROGRAM TO SORT <OS/8 COMPATIBLE <ASCII FILES.
THE SORTING IS BY RECORDS.
A RECORD IS DEFINED TO BE A STRING OF NO MORE THAN 512
<ASCII CHARACTERS.
A RECORD IS TERMINATED BY _"N_" LINES
OR A 'RECORD-'MARK (RM).
NORMALLY ANY OF THE CHARACTERS <LF, <FF, OR <VT WILL BE TAKEN AS A LINE TERMINATOR.
<CR IS ALWAYS IGNORED.
IF /<Y IS GIVEN AS AN OPTION, <FF IS ALSO IGNORED.
THE STRING CONSISTS OF ONE OR MORE
LINES.  A LINE IS TERMINATED WITH A LINE FEED.
ALL OTHER CONTROL CHARACTERS EXCEPT TABS ARE IGNORED.
A RECORD WHICH HAS ONLY CONTROL CHARACTERS IS
CALLED A _'NULL_' RECORD.
_'^NULL_' RECORDS ARE IGNORED.
THAT IS, THERE WILL BE NO NULL RECORDS IN THE OUTPUT FILE.
.P
THE USER HAS THE OPTION TO DEFINE FIELDS FOR THE SORTING
EITHER BY FIXED COLUMN POSITIONS OR BOUNDED BY ARBITRARY
DELIMITING CHARACTERS.  THE SORTING CAN BE EITHER ASCENDING
OR DESCENDING WITHIN EACH FIELD.
THE SORTING PROCEDURE USED IS A MULTI-PASS SORT-MERGE WITH
INTERMEDIATE TEMPORARY OUTPUT FILES.
THE DEVICES FOR THE OUTPUT FILES MAY BE SPECIFIED BY THE
USER TO OPTIMIZE THE SORTING.
.P
TO RUN <SORT UNDER <OS/8, TYPE IN RESPONSE TO THE MONITOR _"._":
.TS 15;.NF;.B 1
	<.R <SORT
OR	<.R <SORT (FIELD DEFINITIONS)
OR	<.R <SORT (FIELD DEFINITIONS) N
OR	<.R <SORT (FIELD DEFINITIONS) N _#RM
.F
.P
WHEN <SORT IS LOADED IT CALLS ^COMMAND ^DECODER FOR USER
INPUT OF <I/O SPECIFICATIONS.
IN RESPONSE TO THE ASTERISK THE USER MAY SPECIFY UP TO NINE
INPUT FILES AND UP TO THREE OUTPUT FILES.
THE INPUT FILES WILL BE CONSTRUED TO BE A SINGLE FILE.
THE FIRST OUTPUT FILE IS THE SORTED OUTPUT,
THE NEXT TWO ARE THE INTERMEDIATE WORK FILES.
(<MERGE HAS NO WORK FILES, ONE OUTPUT FILE, AND TWO INPUT FILES ONLY).
IF THE INTERMEDIATE FILES ARE OMITTED, THE FILES <DSK:SRTWK1.TM
AND <DSK:SRTWK2.TM ARE CREATED.
IF ONLY DEVICES ARE SPECIFIED FOR THE SECOND AND THIRD
OUTPUT FILES, THEN <SRTWK1.TM AND <SRTWK2.TM WILL BE CREATED
ON THE USER SPECIFIED DEVICES.
FOR SMALL OR SLOW <I/O DEVICES, THE TEMPORARY FILES SHOULD BE ON DIFFERENT
DEVICES TO IMPROVE THE SORT SPEED.
THE FIRST PASS OUTPUT IS TO THE THIRD
OUTPUT FILE.
THE LENGTHS OF THE INTERMEDIATE FILES WILL BE ONE _& ONE-HALF TO TWICE THE
LENGTH OF ALL THE INPUT FILES.
THE INTERMEDIATE FILES MUST RESIDE ON DIRECTORY DEVICES.
IF <SORT CANNOT OPEN THESE TWO FILES, AN ERROR WILL BE
PRINTED OUT.
 THE OPTION SWITCH _"^S_" MAY BE USED TO SPECIFY <BASIC COMPATIBLE STRIPPED <ASCII COMPARES.
THE INPUT FILES ARE STILL ASSUMED TO BE IN FULL <ASCII, BUT THE STRANGE COMPARES OF <BASIC ARE USED
(I.E. THE CONTROL CHARACTERS AND NUMBERS ARE _"LATER_" IN THE <ASCII SEQUENCE THAN THE ALPHABETICS).
IF THIS OPTION IS USED, THE DELIMITER FORM OF FIELD SPECIFICATION CAN NOT BE USED WITH A NUMERIC FIELD.
'EXAMPLES OF COMMAND DECODER LINES FOLLOW:
.NF;.B 1
	^^*SRTOUT_<INPUT\\
.F
.P
<SRTWK1.TM AND <SRTWK2.TM ARE CREATED ON <DSK:.
<SRTOUT WILL GO TO <DSK: WHEN SORTING IS COMPLETE.
<INPUT IS ASSUMED TO BE ON <DSK:.
.NF;.B 1
	^^*DSK:SRTOUT,SYS:WORKB,DSK:WORKA_<SYS:INPUT/Z\\
.F
.P
ON A SYSTEM WITH SLOW OR SMALL MASS STORE DEVICES THE _'^ZIGZAG_' OPTION WILL OPTIMIZE THE <I/O OPERATIONS.
THE DEVICES FOR THE FILES SHOULD BE ALTERNATED IN THE COMMAND DECODER LINE AS SHOWN AND THE OPTION SPECIFIED.
<SORT WILL MAKE AN EXTRA PASS IF NECESSARY TO AVOID READING AND WRITING ON THE SAME DEVICE.
THIS OPTION SHOULD NOT BE USED ON DISK, SINCE THE ADDITIONAL PASS IT USUALLY REQUIRES WILL GENERALLY TAKE LONGER THAN THE ONE DISK COPY.
THERE NEED NOT BE ROOM ON THE FINAL OUTPUT DEVICE FOR BOTH A WORK FILE AND THE OUTPUT SINCE THE WORK FILE WILL HAVE BEEN DELETED BEFORE THE FINAL PASS.
.NF;.B 1
	^^*SRTOUT,DTA1:,SYS:_<INPUT/S/T\\
.F
.P
<INPUT WILL BE SORTED AND SENT TO <SRTOUT ON DEVICE <DSK:.
THE FILES <DTA1:SRTWK1.TM AND <SYS:SRTWK2.TM WILL BE USED
AS INTERMEDIATE FILES.
THE COMPARISONS WILL BE MADE ON STRIPPED 6 BIT CODES.
THE /'T WILL CAUSE A LINE STARTING WITH CONTROL-'T TO OVER-RIDE THE SORT
FIELD DEFINITIONS.
.B 5
.CM; <RECORD <DEFINITION
.P;THE NUMBER N ON THE COMMAND LINE, IF PRESENT,
IS THE NUMBER OF LINES WHICH <SORT WILL USE AS A SINGLE
RECORD.  IF THIS NUMBER IS OMITTED, THEN <SORT WILL USE
EACH INPUT LINE AS A RECORD.
THE _#RM IF PRESENT WILL CAUSE THE OCTAL CODE (RM) TO
TERMINATE THE RECORD IF FOUND BEFORE N LINES.
(_#RM MUST BE IN THE RANGE OF 201-376 OCTAL).
.P;FOR EXAMPLE:
.NF;.B
	<.R <SORT (FIELD DEFINITIONS) 3 _#244
.B;.F
INDICATES THAT EACH RECORD CONSISTS OF 1 TO 3 LINES, TERMINATED BY
THE THIRD LINE-FEED OR THE FIRST $ SIGN FOUND.
IF YOU USE THE _#RM THEN N SHOULD EXCEED THE NUMBER OF LINES IN THE LONGEST RECORD.
ONE SHOULD NOTE THAT IF THE INPUT FILE HAS EXTRA LINES
WHICH DO NOT CONSTITUTE A FULL RECORD, THEN THAT
RECORD IS LOST.  THAT IS, IF THERE ARE 6 LINES PER RECORD,
AND THERE ARE 4 EXTRA LINES AT THE END OF THE INPUT
FILE, THESE 4 LINES ARE LOST.
IF RECORD MARKS ARE USED AS THE END OF LINE CHARACTER, THE CHARACTER
FOLLOWING THE RECORD MARK BECOMES THE FIRST CHARACTER OF THE NEXT
RECORD.
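.P
THE RECORD RULES ABOVE CAN BE SKETCHED IN <PYTHON AS FOLLOWS.
THIS IS AN ILLUSTRATION ONLY AND NOT THE <SORT SOURCE: EVERY NAME IN IT IS
HYPOTHETICAL, THE RECORD MARK IS GIVEN AS A PLAIN CHARACTER RATHER THAN AN
OCTAL CODE, AND THE 512 CHARACTER LIMIT IS NOT MODELLED.
IT ASSUMES THAT LEFTOVER LINES AT THE END OF THE INPUT ARE DROPPED AND THAT
INTERIOR LINE BREAKS ARE KEPT INSIDE THE RECORD.
.NF;.B 1
```python
def split_records(text, lines_per_record=1, record_mark=None, ignore_ff=False):
    """Split text into records of N lines, ending early at a record mark."""
    # LF, FF and VT end a line; FF is ignored instead when ignore_ff is set (/Y).
    terminators = {"\n", "\v"} if ignore_ff else {"\n", "\v", "\f"}
    records, current, lines = [], [], 0
    for ch in text:
        if record_mark is not None and ch == record_mark:
            # The mark ends the record; the next character starts a new one.
            records.append("".join(current))
            current, lines = [], 0
        elif ch in terminators:
            lines += 1
            if lines == lines_per_record:
                records.append("".join(current))
                current, lines = [], 0
            else:
                current.append("\n")
        elif ch == "\t" or ch >= " ":
            current.append(ch)
        # Any other control character, CR included, is ignored.
    # Assumption: lines left in `current` do not make a full record and are lost.
    # Null records (nothing but line breaks) never reach the output.
    return [r for r in records if r.replace("\n", "")]
```
.F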
.TP 8
.B 5
.CM; ^^FIELD DEFINITIONS\\
.P
THE FIELD DEFINITIONS FOR THE SORTING ARE ENTERED ON THE
LINE WITH THE _".<R#<SORT_" COMMAND TO THE <OS/8 MONITOR, OR AS A LINE
OR LINES IN THE INPUT FILE USING THE _"/_" OPTION WITH
COMMAND DECODER TO SELECT THE DESIRED CONTROL LINE.
THERE ARE TWO POSSIBLE FORMATS FOR FIELD
DEFINITIONS: COLUMN OR DELIMITER.
A TOTAL OF 32 KEYS MAY BE DEFINED BY
COLUMNS OR 42 BY DELIMITER.
.P
TO ENTER DEFINITIONS BY COLUMN, TYPE:
.TS 10;.NF;.B 1
	(STARTING COL., LENGTH, ^C OR ^N, ^A OR ^D; ...)
.F
.P
EACH FIELD TO BE USED IN THE SORTING IS DEFINED AS THE
INITIAL COLUMN OF THE FIELD AND THE FIELD LENGTH.
ADDITIONAL PARAMETERS ARE ^CHARACTER OR ^NUMERIC SORT
SPECIFIED BY ^C OR ^N,
AND ^ASCENDING OR ^DESCENDING ORDER SPECIFIED BY ^A OR ^D.
ALL PARAMETERS WITHIN A FIELD ARE SEPARATED BY COMMAS AND
FIELD DEFINITIONS ARE SEPARATED BY SEMICOLONS.
THE FIELDS ARE DEFINED IN ORDER AS MAJOR FIELD, 1ST MINOR,
2ND MINOR, ETC.
FIELDS ARE NOT REORDERED ON OUTPUT SO THAT THE INDIVIDUAL
RECORDS MAINTAIN THEIR ORIGINAL INTERNAL SEQUENCE.
THE ENTIRE SET OF DEFINITIONS MUST BE ENCLOSED BY PARENTHESES.
TABS IN THE INPUT FILE ARE EXPANDED TO ENOUGH SPACES TO TAB
TO FIXED TAB STOPS OF 9, 17, 25, ...
.P
IF ANY FIELDS ARE DEFINED THE STARTING COLUMN MUST ALWAYS
BE GIVEN.
IF THE LENGTH IS OMITTED IT IS ASSUMED TO BE ONE.
IF THE TYPE IS OMITTED IT IS ASSUMED TO BE ^CHARACTER AND
IF THE ORDER IS OMITTED IT IS ASSUMED TO BE ^ASCENDING.
FOR EXAMPLE:
.TS 15;.NF;.B 1
	.<R <SORT (7,3; 10,4,,^D)
.F
.B 1
DEFINES THE MAJOR FIELD TO START AT COLUMN 7 AND GO TO
COLUMN 9, ^CHARACTER, ^ASCENDING; AND DEFINES ONE MINOR
FIELD STARTING AT COLUMN 10 AND GOING TO COLUMN 13,
^CHARACTER, ^DESCENDING.
NOTE THAT IF A PARAMETER IS OMITTED, THE COMMAS MUST STILL
BE PRESENT IF ANY OTHER PARAMETERS FOLLOW IN THE SAME
FIELD DEFINITION.
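.P
THE DEFAULTING RULES FOR COLUMN DEFINITIONS CAN BE SKETCHED IN <PYTHON.
THIS IS A MINIMAL ILLUSTRATION WITH HYPOTHETICAL NAMES, NOT THE PARSER USED
BY <SORT; IT ASSUMES THE DEFINITION STRING HOLDS NOTHING BUT THE
PARENTHESIZED LIST AND DOES NOT CHECK THE LIMIT ON THE NUMBER OF KEYS.
.NF;.B 1
```python
def parse_column_fields(spec):
    """Parse '(start,length,C|N,A|D; ...)' into (start, length, type, order) keys."""
    body = spec.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("definitions must be enclosed by parentheses")
    keys = []
    # Fields are separated by semicolons, major field first.
    for field in body[1:-1].split(";"):
        parts = [p.strip().upper() for p in field.split(",")]
        parts += [""] * (4 - len(parts))  # trailing parameters may be omitted
        start, length, kind, order = parts[:4]
        if not start:
            raise ValueError("the starting column must always be given")
        # Defaults: length 1, Character type, Ascending order.
        keys.append((int(start), int(length or 1), kind or "C", order or "A"))
    return keys
```
.F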
.P;FOR MULTIPLE-LINE RECORDS, THE LAST CHARACTER OF
A LINE IMMEDIATELY PRECEDES THE FIRST CHARACTER OF THE NEXT LINE.
.TP 7
FOR EXAMPLE, THE THREE-LINE RECORD:
.NF;.B
<ABC
<DEFGHI
<JKLM
.B;.F
IS EQUIVALENT TO THE SINGLE-LINE RECORD:
.B
<ABCDEFGHIJKLM
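.P
A COLUMN KEY FOR SUCH A RECORD CAN BE PICKED OUT AS SKETCHED BELOW IN <PYTHON.
THE NAMES ARE HYPOTHETICAL, AND IT IS AN ASSUMPTION OF THIS SKETCH THAT TAB
STOPS ARE COUNTED FROM THE START OF EACH LINE BEFORE THE LINES ARE JOINED.
.NF;.B 1
```python
def column_key(record, start, length):
    """Return `length` characters starting at 1-based column `start`."""
    # Tabs expand to the fixed stops 9, 17, 25, ... (every 8 columns).
    lines = [line.expandtabs(8) for line in record.split("\n")]
    # For multiple-line records the lines simply follow one another.
    flat = "".join(lines)
    return flat[start - 1:start - 1 + length]
```
.F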
.P
ALTERNATIVELY, FIELDS MAY BE DEFINED BY A DELIMITING CHARACTER.
ANY <ASCII CHARACTER WHICH MAY BE TYPED INTO A MONITOR
LINE MAY BE USED.
THE FORMAT USING DELIMITERS IS:
.TS 10;.NF;.B 1
	(=DELIMITER CHAR., FIELD NO., ^C OR ^N, ^A OR ^D; ...)
.F
.P
THE FIRST ITEM MUST BE THE EQUAL SIGN WHICH IS FOLLOWED
BY THE DELIMITING CHARACTER.
THE _"=DELIMITER CHAR._" CAN BE REPLACED WITH _"_#223_" WHERE
_"223_" IS THE OCTAL VALUE OF THE DELIMITER. ('USED TO GET AROUND
'KEYBOARD 'MONITOR LIMITATIONS).
THIS CONSTRUCTION IS NOT REPEATED IN MINOR FIELD DEFINITIONS.
THE FIELDS ARE THEN SPECIFIED BY NUMBER WITH ONE BEING
THE FIELD TO THE LEFT OF THE FIRST OCCURRENCE OF THE
DELIMITING CHARACTER.
THE REMAINING PARAMETERS ARE THE SAME AS FOR COLUMN
DEFINITIONS.  TABS IN THE INPUT FILE ARE NOT EXPANDED.
FOR EXAMPLE:
.TS 15;.NF;.B 1
	.<R <SORT (=_#, 3; 2,,^D)
OR	.<R <SORT (_#243, 3; 2,,'D)
.F
.B 1
DEFINES THE MAJOR FIELD TO LIE BETWEEN THE SECOND AND
THIRD OCCURRENCE OF THE _# AND TO BE SORTED BY ^CHARACTER,
^ASCENDING; AND DEFINES ONE MINOR FIELD BETWEEN THE FIRST
AND SECOND OCCURRENCES OF _# TO BE SORTED BY ^CHARACTER,
^DESCENDING.  THE TAB CHARACTER IS CONVENIENT FOR SORTING
TABULATED RECORDS BUT IS INCONVENIENT TO USE IN THIS
DOCUMENTATION.
.P;FOR MULTIPLE-LINE RECORDS, THE END OF LINE
IS EQUIVALENT TO PLACING A DELIMITER CHARACTER
AT THAT POSITION.  FOR EXAMPLE, IF _# WAS
THE DELIMITING CHARACTER AND THE FOLLOWING THREE-LINE
RECORD WAS GIVEN TO <SORT:
.NF;.B
<ABC_#DEFGHI
<JKLM
<NOP_#QRS
.F;.B
THEN <SORT WOULD USE IT AS IF THE RECORD HAD BEEN A
ONE-LINE RECORD:
.B
<ABC_#DEFGHI_#JKLM_#NOP_#QRS
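.P
THE SAME RULE CAN BE SKETCHED IN <PYTHON.
THE NAMES ARE HYPOTHETICAL, AND RETURNING AN EMPTY KEY FOR A FIELD NUMBER
BEYOND THE LAST FIELD IS AN ASSUMPTION OF THE SKETCH, NOT DOCUMENTED
BEHAVIOR OF <SORT.
.NF;.B 1
```python
def delimiter_field(record, delimiter, number):
    """Return field `number` (1-based) of a record split on `delimiter`."""
    # Each end of line inside the record counts as one more delimiter.
    flat = record.replace("\n", delimiter)
    fields = flat.split(delimiter)
    # Field 1 lies to the left of the first delimiter.
    return fields[number - 1] if number <= len(fields) else ""
```
.F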
.P
FINALLY, IF NO FIELDS ARE DEFINED AT ALL, <SORT
ASSUMES A FIELD FROM COLUMN 1 TO 80 TO BE SORTED BY
CHARACTER IN ASCENDING ORDER.
THUS:
.TS 25;.NF;.B 1
	.<R <SORT
IS EQUIVALENT TO	.<R <SORT (1,80,^C,^A)
.F
.B 1
^THIS SERVES TO ALPHABETIZE UNIT RECORDS WHICH EITHER
BEGIN AT COLUMN ONE OR HAVE A UNIFORM FIRST CHARACTER
SUCH AS A TAB.
.B 3
.CM; <INFILE <FIELD <DEFINITIONS
.P
THE SAME RULES APPLY TO <INFILE <FIELD <DEFINITIONS AS TO
DEFINITIONS ON THE .<R <SORT LINE.
THE DIFFERENCE IS THAT THE SELECTED CONTROL LINE MUST START
WITH A CONTROL-CHARACTER _^'A THRU _^'H, OR _^'N THRU _^'X.
THE CHARACTERS _^'I (TAB), _^'J (LF), _^'K (VT), _^'L (FF), AND _^'M (CR) ARE NOT USED.
IN MULTILINE RECORDS, THE INFILE FIELD DEFINITION IS CONSIDERED A NON-SORTING RECORD.
THEREFORE IT SHOULD BE TERMINATED BY A RECORD MARK OR PADDED WITH ENOUGH LINES TO MAKE IT ONE RECORD LONG.
IF THERE IS ONLY ONE <INFILE DEFINITION RECORD (I.E. ONLY ONE LINE STARTING
WITH A CONTROL CHARACTER IN THE FILE), THE DEFAULT RECORD DEFINITION OF
ONE LINE RECORDS IS USED TO READ IT AND PADDING IS UNNECESSARY.
IF THERE IS MORE THAN ONE SUCH LINE, THE SECOND MUST CONFORM TO THE
RECORD DEFINITION OF THE FIRST IF THE FIRST IS SELECTED BY THE OPTION
SWITCH.
VERY LITTLE HARM CAN COME FROM USING THE RECORD DEFINITION CHARACTER
AFTER EACH <INFILE DEFINITION  RECORD EXCEPT IN THE USE OF <XTRACT
WHERE IT MAY CAUSE THE ^BOUNDARY ^RECORDS TO BE IMPROPERLY ASSIGNED.
THEREFORE, DO NOT USE MORE THAN ONE <INFILE DEFINITION WITH <XTRACT AND
DO NOT USE A RECORD TERMINATOR FOR IT.
IF YOU ARE CONFUSED, SEE THE EXAMPLE WITH THE <XTRACT WRITEUP AT THE
END OF THIS DOCUMENT.
.P
TO SELECT  AN INFILE DEFINITION AT RUN TIME, TYPE _"/_" FOLLOWED BY THE LETTER WHICH
CORRESPONDS TO THE CONTROL-CHARACTER.
SORTING LINES STARTING WITH _^'A THRU _^'H WILL
BE PASSED TO THE FRONT OF THE OUTPUT FILE IN THE SAME SEQUENCE AS FOUND IN THE FILE.
('THIS IS USEFUL TO FORCE TITLES TO THE FRONT OF THE SORT).
LINES STARTING WITH _^'N THRU _^'X WILL BE DROPPED FROM THE OUTPUT FILE.
IF </S IS USED ON THE COMMAND LINE, THE FILE WILL BE SORTED IN <BASIC STRIPPED <ASCII USING THE INFILE DEFINITION LINE STARTING WITH _^'S.
THE SAME RESTRICTION APPLIES TO THE </R OPTION OF <MERGE.
IF THE CONTROL LINE IS NOT AT THE START OF INPUT FILE THE SORT
PARAMETERS WILL BE CHANGED WHEN THE LINE IS FOUND.
THIS MAY CAUSE STRANGE OUTPUT SORTS AS PASS 2 WILL NOT RE-SORT THE
FIRST PART OF THE FILE.
THERE CAN BE MORE THAN ONE CONTROL LINE IN THE FILE AND EACH DEFINITION SELECTED
WILL CHANGE THE SORT AT THE POINT WHERE IT IS FOUND.
.B 5
.CM; <SORTING <METHODS
.P
<SORT CAN EITHER SORT A FIELD BY CHARACTERS OR BY NUMBERS.
.P
WHEN SORTING BY CHARACTERS, <SORT USES THE 8-BIT
<ASCII SORTING SEQUENCE.
THAT IS, A SPACE (240(8)) IS LOWER IN THE CHARACTER
SEQUENCE THAN A ZERO CHARACTER (260) AND THEY ARE BOTH
LOWER THAN AN _'^A_' (301).
A TAB IS ALWAYS CONVERTED TO SPACES EXCEPT WHEN
IT IS USED AS A DELIMITER CHARACTER.
<SORT DOES NOT USE ANY CHARACTER WHOSE <ASCII
EQUIVALENT IS LESS THAN 240.
.P
WHEN SORTING BY NUMBERS, <SORT CONVERTS THE NUMBER BY
THE FOLLOWING ALGORITHM.  LEADING SPACES IN THE FIELD
ARE IGNORED.  A PLUS OR MINUS SIGN INDICATES WHETHER OR
NOT TO NEGATE THE FINAL RESULT.  THE SIGN CAN BE
DIRECTLY IN FRONT OF THE FIRST DIGIT OR MAY BE SEPARATED
FROM THE NUMBER BY SPACES.
CONVERSION BEGINS WITH A DIGIT 0-9 AND ENDS WHEN THE
FIELD IS EXHAUSTED OR WHEN A NON-NUMERIC CHARACTER IS
RECOGNIZED.  NUMBERS SHOULD BE IN THE RANGE -2048 TO +2047.
IF NUMBERS IN THE RECORD ARE NOT IN THIS RANGE,
BREAK THE NUMERIC FIELD DOWN INTO A NUMERIC FIELD WITH
RANGE AS ABOVE AND A CHARACTER FIELD FOR THE REMAINDER OF IT.
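.P
THE CONVERSION STEPS ABOVE CAN BE SKETCHED IN <PYTHON.
THIS IS AN ILLUSTRATION WITH HYPOTHETICAL NAMES, NOT THE <SORT ROUTINE.
IT ASSUMES A FIELD WITH NO DIGITS CONVERTS TO ZERO, AND IT DOES NOT MODEL
WHAT HAPPENS TO VALUES OUTSIDE THE RANGE -2048 TO +2047.
.NF;.B 1
```python
def numeric_key(field):
    """Convert a numeric sort field to an integer key."""
    i, n = 0, len(field)
    # Leading spaces are ignored.
    while i < n and field[i] == " ":
        i += 1
    # An optional sign may be separated from the digits by spaces.
    negative = False
    if i < n and field[i] in ("+", "-"):
        negative = field[i] == "-"
        i += 1
        while i < n and field[i] == " ":
            i += 1
    # Conversion stops at the end of the field or at the first non-digit.
    value = 0
    while i < n and "0" <= field[i] <= "9":
        value = value * 10 + (ord(field[i]) - ord("0"))
        i += 1
    return -value if negative else value
```
.F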
.B 5;.CM; <SPECIAL <OPERATING <FEATURES
.P;<SORT CAN BE CHAINED TO IN THE NORMAL WAY DESCRIBED IN
THE <OS/8 ^HANDBOOKS, PARTICULARLY IN THE DESCRIPTION OF THE
^USER ^SERVICE ^ROUTINE FUNCTIONS.  WHEN <SORT IS CHAINED TO,
IT ASSUMES THAT THE ^COMMAND ^DECODER AREA OF FIELD 1 HAS BEEN
SET UP WITH APPROPRIATE INPUT AND OUTPUT FILES.
IT ALSO ASSUMES THAT LOCATION 01000+ HAS BEEN INITIALIZED
WITH A COMMAND STRING OF THE FORM THAT <SORT EXPECTS.
THIS IS SIMPLY ANY STRING WHICH HAS A LEFT PARENTHESIS,
THE FIELD DEFINITIONS, AND A RIGHT PARENTHESIS TO END THEM.
AN OPTIONAL NUMBER FOR NUMBER OF LINES PER RECORD CAN
ALSO BE GIVEN.  THE LINE MUST END WITH A ZERO WORD.
<SORT WILL ALSO TAKE THE COMMAND STRING FROM THE INPUT FILE
IF THE _"/_" OPTION IS USED.
.PAGE
.CM; <MERGE
.P; <MERGE WILL MERGE 2 SORTED INPUT FILES INTO 1 OUTPUT FILE.
STRANGE BUT PREDICTABLE RESULTS WILL OCCUR IF
BOTH INPUT FILES ARE NOT SORTED IN THE MERGE SEQUENCE.
<MERGE HAS NO WORK FILES.
.P; <MERGE HAS A REPLACE OPTION _"/'R_".
WITH THIS OPTION IF A LINE IN FILE_#1 MATCHES A LINE IN FILE_#2 IN
THE SORTED FIELDS,
THE LINE FROM FILE_#1 WILL GO TO THE OUTPUT FILE, AND THE LINE FROM FILE_#2
WILL BE DROPPED.
THIS PROVIDES A WAY TO UPDATE RECORDS WITH A NEW FILE_#1 REPLACING THE MATCH IN
OLD FILE_#2.
TO DO A DELETE WITHOUT REPLACING, BRING IN A DUMMY LINE AND LATER USE
<TECO OR <EDIT TO FIND AND DELETE THE DUMMY LINES IN THE OUTPUT FILE.
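.P
THE MERGE AND REPLACE BEHAVIOR CAN BE SKETCHED IN <PYTHON.
THIS IS AN ILLUSTRATION WITH HYPOTHETICAL NAMES, NOT THE <MERGE SOURCE.
THE KEY FUNCTION STANDS IN FOR THE FIELD DEFINITIONS, AND WRITING THE LINE
FROM THE FIRST FILE AHEAD OF AN EQUAL LINE FROM THE SECOND WHEN /R IS NOT
GIVEN IS AN ASSUMPTION OF THE SKETCH.
.NF;.B 1
```python
def merge_sorted(file1, file2, key, replace=False):
    """Merge two lists already sorted by `key`; with replace, file1 wins ties."""
    out, i, j = [], 0, 0
    while i < len(file1) and j < len(file2):
        k1, k2 = key(file1[i]), key(file2[j])
        if k1 < k2:
            out.append(file1[i])
            i += 1
        elif k2 < k1:
            out.append(file2[j])
            j += 1
        else:
            # Keys match in the sorted fields: the file1 line goes out.
            out.append(file1[i])
            i += 1
            if replace:
                j += 1  # /R: the matching file2 line is dropped
    # One input is exhausted; copy the rest of the other.
    out.extend(file1[i:])
    out.extend(file2[j:])
    return out
```
.F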
.B 2
.CM;<XTRACT
.P;<XTRACT CAN BE USED TO REDUCE THE SIZE OF A DATA SET BEFORE SORTING.
THE <RECORD AND <FIELD DEFINITIONS FOR <XTRACT ARE IDENTICAL TO THOSE
FOR <SORT AND <MERGE.
<XTRACT ALSO USES A PAIR OF SENTINEL RECORDS CALLED THE ^LOWER ^BOUND
AND THE ^UPPER ^BOUND.
THESE ARE TAKEN AS THE FIRST TWO RECORDS IN THE INPUT STREAM.
IN THE BASIC <XTRACT OPERATION, ALL RECORDS GREATER THAN OR EQUAL TO THE
LOWER BOUND AND LESS THAN OR EQUAL TO THE UPPER BOUND ARE PASSED TO THE
OUTPUT FILE.
AN OPTION SWITCH (/^V) IS AVAILABLE TO INVERT THE SENSE OF THE OPERATION.
IF /^V IS SPECIFIED ALL RECORDS LOWER THAN THE LOWER BOUND OR GREATER THAN
THE UPPER BOUND ARE PASSED TO THE OUTPUT FILE.
THE FOLLOWING EXAMPLE SHOWS A CONTROL FILE CONTAINING THE RECORD AND
FIELD DEFINITION INFORMATION AND BOUNDARY RECORDS TO EXTRACT ALL
NAMES STARTING WITH THE LETTERS ^G TO ^N:
.NF;.S 1
^^
_^G THIS STARTS WITH CTRL/G (_#240,2,C,A) 15 _#243
DUMMY G
_#
DUMMY NZZZZ
_#
\\

.F
NOTE THAT THE BOUNDARY RECORDS NEED NOT LOOK EXACTLY LIKE THE ACTUAL INPUT
RECORDS AS LONG AS THEY CONFORM TO THE FIELD AND RECORD DEFINITIONS.
THE ACTUAL INPUT RECORDS LIKELY HAVE AN IDENTIFICATION KEY ON THE FIRST
LINE OF THE RECORD WITH THE NAME ON THE SECOND LINE.
THE RECORDS ARE NO MORE THAN 15 LINES LONG AND END WITH THE _# RECORD
TERMINATOR.
IT SHOULD ALSO BE NOTED THAT THE <INFILE SORT KEY SHOULD NOT BE TERMINATED
BY THE RECORD TERMINATOR SINCE AT THE TIME IT IS READ, THE DEFAULT
RECORD DEFINITION IS IN EFFECT (1 LINE RECORDS).
IF A _# RECORD MARK WERE USED THE RECORD COUNT WOULD BE OFF BY ONE
AND THE BOUNDARY RECORDS WOULD BE IMPROPERLY READ.
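.P
THE BASIC EXTRACTION AND ITS /V INVERSION CAN BE SKETCHED IN <PYTHON.
THIS IS AN ILLUSTRATION WITH HYPOTHETICAL NAMES, NOT THE <XTRACT SOURCE.
THE KEY FUNCTION STANDS IN FOR THE FIELD DEFINITIONS, THE INFILE DEFINITION
LINE IS TAKEN TO HAVE BEEN REMOVED ALREADY, AND LEAVING THE TWO BOUNDARY
RECORDS OUT OF THE OUTPUT IS AN ASSUMPTION OF THE SKETCH.
.NF;.B 1
```python
def xtract(records, key, invert=False):
    """Keep records whose key lies within the bounds given by the first two."""
    # The first two records in the input stream are the lower and upper bound.
    lower, upper = key(records[0]), key(records[1])
    out = []
    for rec in records[2:]:
        inside = lower <= key(rec) <= upper
        # Normally keep records inside the range; with /V keep those outside.
        if inside != invert:
            out.append(rec)
    return out
```
.F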
.PAGE
.CM
<SPECIAL <OPTION <SUMMARY FOR <SORT
.LM +10
.S 2;.I -7
/^S#####^SORT USING <BASIC_'S SIXBIT <ASCII CHARACTERS.
.B;.I -7
/^Y#####^IGNORE FORMFEED CHARACTERS IMBEDDED IN THE INPUT FILE.
.B;.I -7
/^Z#####^OPTIMIZE SORTING FOR SLOW <I/O DEVICES.
.S 2
.CM -10
<SPECIAL <OPTION <SUMMARY FOR <MERGE
.S 2;.I -7
/^R#####^MERGE TWO FILES, REPLACING DUPLICATE LINES IN THE SECOND INPUT
FILE WITH LINES FROM THE FIRST INPUT FILE.
.S 2;.CM -10;<SPECIAL <OPTION FOR <XTRACT
.B 2;.I -7;/<V#####^INVERT THE SENSE OF RECORD EXTRACTION, EXTRACT THOSE RECORDS
FALLING OUTSIDE THE RANGE OF THE BOUNDARY RECORDS.
.LM -10
.S -6
.CM; <NOTE\\
.S 1
'ALL OCTAL NUMBERS MUST BE IN THE RANGE OF 201 TO 376.
('SORT FORCES THE PARITY BIT ON WHILE READING INPUT).
.PG
.B 4
.C;<FATAL <ERROR <MESSAGES
.S 2
.M 18,70
.I -8
<OE?#####^^OUTPUT ERROR
.B
.S 1
.I -8
<OER#####^^NO ROOM FOR OUTPUT FILE
.B
.S 1
.I -8
<IE######<INPUT <ERROR, FILENAME
.B
.S 1
.I -8
<OE######^^CAN_'T OPEN OUTPUT FILE
.B
.S 1
.I -8
<WD######^^WORK DEVICE IS NOT DIRECTORY ORIENTED
.B
.S 1
.I -8
<NRW#####^^NO ROOM FOR WORK FILE
.B
.S 1
.I -8
<FE######^^FATAL SORT PROGRAM ERROR
.B
.S 1
.I -8
<FI######^^INPUT EXCEEDED BUFFER.
.B
.S 1
.I -8
<IS######ILLEGAL SYNTAX IN SORT SPECIFICATION.
.M 1,70
.S 4
.C;<NON-FATAL <ERROR <MESSAGES
.S 2
.I 9
<IE 512##^^INPUT RECORD EXCEEDS 512 CHARACTERS.  <TRUNCATING.
.! END OF SORT.RO