File: SORT.RO of Tape: Various/Decus/decus-5
.! SORT DOCUMENTATION
.! CLYDE G. ROBY, JR.
.! DEPARTMENT OF PHYSIOLOGY AND BIOPHYSICS
.! WEST VIRGINIA UNIVERSITY MEDICAL CENTER
.! MORGANTOWN, WEST VIRGINIA
.! CHANGES IN SEPTEMBER, 1974 FOR VERSION 3 (MULTI-LINE RECORDS)
.! CHANGES IN OCTOBER, 1974 FOR CHAINING ABILITY
.! CHANGES IN MAY, 1975 FOR VERSION 5
.! (HANDLING LARGER FILES, MULTI-LINE BUG)
.! CHANGES IN MAY, 1976 FOR VERSION 7
.! (DROPPED FILE SIZE & MADE COMPATIBLE WITH MERGE) JBC
.! CHANGES JULY 1977 ADDED XTRACT DESCRIPTION, CLEANED UP INTRODUCTION TMC
.FLAG CAPITALIZE;.FLAG FIRSTCAPITALIZE;.TR
.HEADER BOTTOM CENTER NUMBER
.CHAPTER <SORT
.T
.M 10, 70
.B 5
.CM; ^^SORT/MERGE/XTRACT\\
.B 1
^'
.CM; A PACKAGE OF <OS/8 COMPATIBLE SORT - MERGE PROGRAMS
.B 3
.CM; SMALL COMPUTER LABORATORY
.CM; DEPARTMENT \OF PHYSIOLOGY \AND BIOPHYSICS
.CM; WEST VIRGINIA UNIVERSITY MEDICAL CENTER
.CM; MORGANTOWN, WEST VIRGINIA 26506
.B 3
.CM; <SORT VERSION 8
.CM; JUNE 1, 1977
.B 3
.CM; <MERGE VERSION 2
.CM; JUNE 1, 1977
.B 3
.CM; <XTRACT VERSION 1
.CM;JUNE 1, 1977
\'
.PG
.CM;<SORT/MERGE/XTRACT
.P
THIS DOCUMENT DESCRIBES A PACKAGE OF PROGRAMS FOR DEALING WITH <OS/8
<ASCII FILES. THE PRINCIPAL UTILITY IS <SORT WHICH WAS WRITTEN IN ITS
ORIGINAL VERSION BY ^'C. G. ROBY JR\'.. <MERGE AND <XTRACT ARE COMPANION
PROGRAMS TO ASSIST IN THE EFFICIENT SORTING OF LARGE DATA SETS.
<MERGE WAS WRITTEN BY ^'JAMES B. CORYELL \OF DATAPRODUCTS CORPORATION \AND
<XTRACT \BY THOMAS W. MC^INTYRE \AND ALAN SMOTHERS\' OF <WVU.
<JBC ALSO EXTENSIVELY REVISED <SORT FROM VERSION 6 TO 7.
<TMC IS RESPONSIBLE FOR THE PRESENT VERSIONS OF ALL THE PROGRAMS.
.S 2;.CM;<SORT
.P
<SORT IS A PROGRAM TO SORT <OS/8 COMPATIBLE <ASCII FILES.
THE SORTING IS BY RECORDS. A RECORD IS DEFINED TO BE A STRING OF NO MORE
THAN 512 <ASCII CHARACTERS. A RECORD IS TERMINATED BY _"N_" LINES OR A
'RECORD-'MARK (RM). NORMALLY ANY OF THE CHARACTERS <LF, <FF, OR <VT WILL
BE TAKEN AS A LINE TERMINATOR. <CR IS ALWAYS IGNORED.
IF /<Y IS GIVEN AS AN OPTION, <FF IS ALSO IGNORED.
THE STRING CONSISTS OF ONE OR MORE LINES. A LINE IS TERMINATED WITH A
LINE FEED. ALL OTHER CONTROL CHARACTERS EXCEPT TABS ARE IGNORED.
A RECORD WHICH HAS ONLY CONTROL CHARACTERS IS CALLED A _'NULL_' RECORD.
_'^NULL_' RECORDS ARE IGNORED. THAT IS, THERE WILL BE NO NULL RECORDS IN
THE OUTPUT FILE.
.P
THE USER HAS THE OPTION TO DEFINE FIELDS FOR THE SORTING EITHER BY FIXED
COLUMN POSITIONS OR BOUNDED BY ARBITRARY DELIMITING CHARACTERS.
THE SORTING CAN BE EITHER ASCENDING OR DESCENDING WITHIN EACH FIELD.
THE SORTING PROCEDURE USED IS A MULTI-PASS SORT-MERGE WITH INTERMEDIATE
TEMPORARY OUTPUT FILES. THE DEVICES FOR THE OUTPUT FILES MAY BE SPECIFIED
BY THE USER TO OPTIMIZE THE SORTING.
.P
TO RUN <SORT UNDER <OS/8, TYPE IN RESPONSE TO THE MONITOR _"._":
.TS 15;.NF;.B 1
	<.R <SORT
OR
	<.R <SORT (FIELD DEFINITIONS)
OR
	<.R <SORT (FIELD DEFINITIONS) N
OR
	<.R <SORT (FIELD DEFINITIONS) N _#RM
.F
.P
WHEN <SORT IS LOADED IT CALLS ^COMMAND ^DECODER FOR USER INPUT OF <I/O
SPECIFICATIONS. IN RESPONSE TO THE ASTERISK THE USER MAY SPECIFY UP TO
NINE INPUT FILES AND UP TO THREE OUTPUT FILES. THE INPUT FILES WILL BE
CONSTRUED TO BE A SINGLE FILE. THE FIRST OUTPUT FILE IS THE SORTED OUTPUT,
THE NEXT TWO ARE THE INTERMEDIATE WORK FILES. (<MERGE HAS NO WORK FILES,
ONE OUTPUT FILE, AND TWO INPUT FILES ONLY). IF THE INTERMEDIATE FILES ARE
OMITTED, THE FILES <DSK:SRTWK1.TM AND <DSK:SRTWK2.TM ARE CREATED.
IF ONLY DEVICES ARE SPECIFIED FOR THE SECOND AND THIRD OUTPUT FILES, THEN
<SRTWK1.TM AND <SRTWK2.TM WILL BE CREATED ON THE USER SPECIFIED DEVICES.
FOR SMALL OR SLOW <I/O DEVICES, THE TEMPORARY FILES SHOULD BE ON DIFFERENT
DEVICES TO IMPROVE THE SORT SPEED. THE FIRST PASS OUTPUT IS TO THE THIRD
OUTPUT FILE. THE LENGTHS OF THE INTERMEDIATE FILES WILL BE ONE _& ONE-HALF
TO TWICE THE LENGTH OF ALL THE INPUT FILES. THE INTERMEDIATE FILES MUST
RESIDE ON DIRECTORY DEVICES. IF <SORT CANNOT OPEN THESE TWO FILES, AN ERROR
WILL BE PRINTED OUT.
THE OPTION SWITCH _"^S_" MAY BE USED TO SPECIFY <BASIC COMPATIBLE STRIPPED
<ASCII COMPARES. THE INPUT FILES ARE STILL ASSUMED TO BE IN FULL <ASCII,
BUT THE STRANGE COMPARES OF <BASIC ARE USED (I.E. THE CONTROL CHARACTERS
AND NUMBERS ARE _"LATER_" IN THE <ASCII SEQUENCE THAN THE ALPHABETICS).
IF THIS OPTION IS USED, THE DELIMITER FORM OF FIELD SPECIFICATION CAN NOT
BE USED WITH A NUMERIC FIELD.
'EXAMPLES OF COMMAND DECODER LINES FOLLOW:
.NF;.B 1
^^*SRTOUT_<INPUT\\
.F
.P
<SRTWK1.TM AND <SRTWK2.TM ARE CREATED ON <DSK:. <SRTOUT WILL GO TO <DSK:
WHEN SORTING IS COMPLETE. <INPUT IS ASSUMED TO BE ON <DSK:.
.NF;.B 1
^^*DSK:SRTOUT,SYS:WORKB,DSK:WORKA_<SYS:INPUT/Z\\
.F
.P
ON A SYSTEM WITH SLOW OR SMALL MASS STORE DEVICES THE _'^ZIGZAG_' OPTION
WILL OPTIMIZE THE <I/O OPERATIONS. THE DEVICES FOR THE FILES SHOULD BE
ALTERNATED IN THE COMMAND DECODER LINE AS SHOWN AND THE OPTION SPECIFIED.
<SORT WILL MAKE AN EXTRA PASS IF NECESSARY TO AVOID READING AND WRITING ON
THE SAME DEVICE. THIS OPTION SHOULD NOT BE USED ON DISK SINCE IT WILL
GENERALLY REQUIRE AN ADDITIONAL PASS WHICH WILL GENERALLY TAKE LONGER THAN
THE ONE DISK COPY. THERE NEED NOT BE ROOM ON THE FINAL OUTPUT DEVICE FOR
BOTH A WORK FILE AND THE OUTPUT SINCE THE WORK FILE WILL HAVE BEEN DELETED
BEFORE THE FINAL PASS.
.NF;.B 1
^^*SRTOUT,DTA1:,SYS:_<INPUT/S/T\\
.F
.P
<INPUT WILL BE SORTED AND SENT TO <SRTOUT ON DEVICE <DSK:.
THE FILES <DTA1:SRTWK1.TM AND <SYS:SRTWK2.TM WILL BE USED AS INTERMEDIATE
FILES. THE COMPARISONS WILL BE MADE ON STRIPPED 6 BIT CODES.
THE /'T WILL CAUSE A LINE STARTING WITH CONTROL-'T TO OVER-RIDE THE SORT
FIELD DEFINITIONS.
.B 5
.CM; <RECORD <DEFINITION
.P;THE NUMBER N ON THE COMMAND LINE, IF PRESENT, IS THE NUMBER OF LINES
WHICH <SORT WILL USE AS A SINGLE RECORD. IF THIS NUMBER IS OMITTED, THEN
<SORT WILL USE EACH INPUT LINE AS A RECORD. THE _#RM IF PRESENT WILL CAUSE
THE OCTAL CODE (RM) TO TERMINATE THE RECORD IF FOUND BEFORE N LINES.
(_#RM MUST BE IN THE RANGE OF 201-376 OCTAL).
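.P
THE RECORD RULES DESCRIBED IN THIS DOCUMENT CAN BE SKETCHED IN <PYTHON AS
FOLLOWS. THIS IS A MINIMAL ILLUSTRATION UNDER STATED ASSUMPTIONS, NOT THE
CODE USED BY <SORT. THE FUNCTION AND PARAMETER NAMES ARE HYPOTHETICAL, AND
THE RECORD MARK IS PASSED AS AN ORDINARY CHARACTER RATHER THAN AS AN OCTAL
CODE.

```python
LINE_TERMINATORS = {"\n", "\f", "\v"}  # LF, FF and VT each end a line


def split_records(text, lines_per_record=1, record_mark=None, ignore_ff=False):
    """Split text into records, each a list of lines (hypothetical helper)."""
    records = []
    lines = []   # finished lines of the record being built
    chars = []   # characters of the line being built
    for ch in text:
        if ch == "\r" or (ignore_ff and ch == "\f"):
            continue  # CR is always ignored; FF is ignored under /Y
        if record_mark is not None and ch == record_mark:
            # A record mark ends the record early; the next character
            # starts the following record.
            lines.append("".join(chars))
            records.append(lines)
            lines, chars = [], []
        elif ch in LINE_TERMINATORS:
            lines.append("".join(chars))
            chars = []
            if len(lines) == lines_per_record:
                records.append(lines)
                lines = []
        elif ch == "\t" or ch >= " ":
            chars.append(ch)  # other control characters are dropped
    # Leftover lines that do not make a full record are lost, and records
    # holding nothing but control characters (null records) are discarded.
    return [r for r in records if any(c != "\t" for line in r for c in line)]
```

IN THIS SKETCH THE LINE TERMINATOR THAT FOLLOWS A RECORD MARK IS COUNTED AS
AN EMPTY FIRST LINE OF THE NEXT RECORD, WHICH IS ONE REASON TO CHOOSE N
LARGER THAN THE LONGEST RECORD WHEN A RECORD MARK IS USED.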
.P;FOR EXAMPLE:
.NF;.B
<.R <SORT (FIELD DEFINITIONS) 3 _#244
.B;.F
INDICATES THAT EACH RECORD CONSISTS OF 1 TO 3 LINES, TERMINATED BY THE
THIRD LINE-FEED OR THE FIRST $ SIGN FOUND. IF YOU USE THE _#RM THEN N
SHOULD BE LONGER THAN THE LARGEST RECORD.
ONE SHOULD NOTE THAT IF THE INPUT FILE HAS EXTRA LINES WHICH DO NOT
CONSTITUTE A FULL RECORD, THEN THAT RECORD IS LOST. THAT IS, IF THERE ARE
6 LINES PER RECORD, AND THERE ARE 4 EXTRA LINES AT THE END OF THE INPUT
FILE, THESE 4 LINES ARE LOST. IF RECORD MARKS ARE USED AS THE END OF LINE
CHARACTER, THE CHARACTER FOLLOWING THE RECORD MARK BECOMES THE FIRST
CHARACTER OF THE NEXT RECORD.
.TP 8
.B 5
.CM; ^^FIELD DEFINITIONS\\
.P
THE FIELD DEFINITIONS FOR THE SORTING ARE ENTERED ON THE LINE WITH THE
"_<R#SORT_" COMMAND TO THE <OS/8 MONITOR, OR AS A LINE OR LINES IN THE
INPUT FILE USING THE _"/_" OPTION WITH COMMAND DECODER TO SELECT THE
DESIRED CONTROL LINE. THERE ARE TWO POSSIBLE FORMATS FOR FIELD
DEFINITIONS: COLUMN OR DELIMITER. A TOTAL OF 32 KEYS MAY BE DEFINED BY
COLUMNS OR 42 BY DELIMITER.
.P
TO ENTER DEFINITIONS BY COLUMN, TYPE:
.TS 10;.NF;.B 1
	(STARTING COL., LENGTH, ^C OR ^N, ^A OR ^D; ...)
.F
.P
EACH FIELD TO BE USED IN THE SORTING IS DEFINED AS THE INITIAL COLUMN OF
THE FIELD AND THE FIELD LENGTH. ADDITIONAL PARAMETERS ARE ^CHARACTER OR
^NUMERIC SORT SPECIFIED BY ^C OR ^N, AND ^ASCENDING OR ^DESCENDING ORDER
SPECIFIED BY ^A OR ^D. ALL PARAMETERS WITHIN A FIELD ARE SEPARATED BY
COMMAS AND FIELD DEFINITIONS ARE SEPARATED BY SEMICOLONS.
THE FIELDS ARE DEFINED IN ORDER AS MAJOR FIELD, 1ST MINOR, 2ND MINOR, ETC.
FIELDS ARE NOT REORDERED ON OUTPUT SO THAT THE INDIVIDUAL RECORDS MAINTAIN
THEIR ORIGINAL INTERNAL SEQUENCE. THE ENTIRE SET OF DEFINITIONS MUST BE
ENCLOSED BY PARENTHESES. TABS IN THE INPUT FILE ARE EXPANDED TO ENOUGH
SPACES TO TAB TO FIXED TAB STOPS OF 9, 17, 25, ...
.P
IF ANY FIELDS ARE DEFINED THE STARTING COLUMN MUST ALWAYS BE GIVEN.
IF THE LENGTH IS OMITTED IT IS ASSUMED TO BE ONE.
IF THE TYPE IS OMITTED IT IS ASSUMED TO BE ^CHARACTER AND IF THE ORDER IS
OMITTED IT IS ASSUMED TO BE ^ASCENDING. FOR EXAMPLE:
.TS 15;.NF;.B 1
	.<R <SORT (7,3; 10,4,,^D)
.F
.B 1
DEFINES THE MAJOR FIELD TO START AT COLUMN 7 AND GO TO COLUMN 9,
^CHARACTER, ^ASCENDING; AND DEFINES ONE MINOR FIELD STARTING AT COLUMN 10
AND GOING TO COLUMN 13, ^CHARACTER, ^DESCENDING. NOTE THAT IF A PARAMETER
IS OMITTED, THE COMMAS MUST STILL BE PRESENT IF ANY OTHER PARAMETERS FOLLOW
IN THE SAME FIELD DEFINITION.
.P;FOR MULTIPLE-LINE RECORDS, THE LAST CHARACTER OF A LINE IMMEDIATELY
PRECEDES THE FIRST CHARACTER OF THE NEXT LINE.
.TP 7
FOR EXAMPLE, THE THREE-LINE RECORD:
.NF;.B
<ABC
<DEFGHI
<JKLM
.B;.F
IS EQUIVALENT TO THE SINGLE-LINE RECORD:
.B
<ABCDEFGHIJKLM
.P
ALTERNATIVELY, FIELDS MAY BE DEFINED BY A DELIMITING CHARACTER.
ANY <ASCII CHARACTER WHICH MAY BE TYPED INTO A MONITOR LINE MAY BE USED.
THE FORMAT USING DELIMITERS IS:
.TS 10;.NF;.B 1
	(=DELIMITER CHAR., FIELD NO., ^C OR ^N, ^A OR ^D; ...)
.F
.P
THE FIRST ITEM MUST BE THE EQUAL SIGN WHICH IS FOLLOWED BY THE DELIMITING
CHARACTER. THE _"=DELIMITER CHAR._" CAN BE REPLACED WITH _"_#223_" WHERE
_"223_" IS THE OCTAL VALUE OF THE DELIMITER. ('USED TO GET AROUND
'KEYBOARD 'MONITOR LIMITATIONS). THIS CONSTRUCTION IS NOT REPEATED IN
MINOR FIELD DEFINITIONS. THE FIELDS ARE THEN SPECIFIED BY NUMBER WITH ONE
BEING THE FIELD TO THE LEFT OF THE FIRST OCCURRENCE OF THE DELIMITING
CHARACTER. THE REMAINING PARAMETERS ARE THE SAME AS FOR COLUMN
DEFINITIONS. TABS IN THE INPUT FILE ARE NOT EXPANDED. FOR EXAMPLE:
.TS 15;.NF;.B 1
	.<R <SORT (=_#, 3; 2,,^D)
OR
	.<R <SORT (_#243, 3; 2,,'D)
.F
.B 1
DEFINES THE MAJOR FIELD TO LIE BETWEEN THE SECOND AND THIRD OCCURRENCE OF
THE _# AND TO BE SORTED BY ^CHARACTER, ^ASCENDING; AND DEFINES ONE MINOR
FIELD BETWEEN THE FIRST AND SECOND OCCURRENCES OF _# TO BE SORTED BY
^CHARACTER, ^DESCENDING. THE TAB CHARACTER IS CONVENIENT FOR SORTING
TABULATED RECORDS BUT IS INCONVENIENT TO USE IN THIS DOCUMENTATION.
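.P
THE TWO FIELD DEFINITION FORMATS CAN BE ILLUSTRATED WITH A SHORT <PYTHON
SKETCH. IT IS NOT THE PARSER USED BY <SORT. ALL CLASS AND FUNCTION NAMES
ARE HYPOTHETICAL. TWO FURTHER ASSUMPTIONS ARE MADE: THE HIGH BIT OF AN
OCTAL DELIMITER CODE IS DROPPED TO OBTAIN A 7-BIT CHARACTER, AND A
DELIMITED FIELD THAT DOES NOT EXIST IN A RECORD IS TREATED AS EMPTY.

```python
from dataclasses import dataclass


@dataclass
class Field:
    position: int             # starting column, or field number with a delimiter
    length: int = 1           # column form only; defaults to one
    numeric: bool = False     # N = numeric, C = character (default)
    descending: bool = False  # D = descending, A = ascending (default)


def parse_field_definitions(spec):
    """Return (delimiter or None, list of Field) for a parenthesized spec."""
    body = spec.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("definitions must be enclosed in parentheses")
    body = body[1:-1].strip()
    delimiter = None
    if body.startswith("="):            # =X form: X is the delimiter itself
        delimiter, body = body[1], body[2:]
    elif body.startswith("#"):          # #ooo form: octal code of the delimiter
        octal, _, rest = body[1:].partition(",")
        delimiter = chr(int(octal, 8) & 0o177)  # assumption: drop the high bit
        body = "," + rest
    if delimiter is not None:
        body = body.lstrip()
        if not body.startswith(","):
            raise ValueError("a comma must follow the delimiter")
        body = body[1:]
    fields = []
    for definition in body.split(";"):
        parts = [p.strip().upper() for p in definition.split(",")]
        if delimiter is None:
            position, length, kind, order = (parts + [""] * 4)[:4]
        else:
            position, kind, order = (parts + [""] * 3)[:3]
            length = ""
        if not position:
            raise ValueError("starting column or field number is required")
        fields.append(Field(int(position), int(length or 1), kind == "N", order == "D"))
    return delimiter, fields


def extract_key(record, field, delimiter=None):
    """Return the text of one sort field from a single-line record."""
    if delimiter is None:
        text = record.expandtabs(8)  # tab stops at columns 9, 17, 25, ...
        start = field.position - 1
        return text[start:start + field.length]
    pieces = record.split(delimiter)  # tabs are not expanded in this form
    # Assumption: a field past the last delimiter is treated as empty.
    return pieces[field.position - 1] if field.position <= len(pieces) else ""
```

BOTH DELIMITER EXAMPLES ABOVE PARSE TO THE SAME RESULT IN THIS SKETCH: A
MAJOR KEY ON FIELD 3 AND A DESCENDING MINOR KEY ON FIELD 2.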
.P;FOR MULTIPLE-LINE RECORDS, THE END OF LINE IS EQUIVALENT TO PLACING A
DELIMITER CHARACTER AT THAT POSITION. FOR EXAMPLE, IF _# WAS THE
DELIMITING CHARACTER AND THE FOLLOWING THREE-LINE RECORD WAS GIVEN TO <SORT:
.NF;.B
<ABC_#DEFGHI
<JKLM
<NOP_#QRS
.F;.B
THEN <SORT WOULD USE IT AS IF THE RECORD HAD BEEN A ONE-LINE RECORD:
.B
<ABC_#DEFGHI_#JKLM_#NOP_#QRS
.P
FINALLY, IF NO FIELDS ARE DEFINED AT ALL, <SORT ASSUMES A FIELD FROM
COLUMN 1 TO 80 TO BE SORTED BY CHARACTER IN ASCENDING ORDER. THUS:
.TS 25;.NF;.B 1
	.<R <SORT
IS EQUIVALENT TO
	.<R <SORT (1,80,^C,^A)
.F
.B 1
^THIS SERVES TO ALPHABETIZE UNIT RECORDS WHICH EITHER BEGIN AT COLUMN ONE
OR HAVE A UNIFORM FIRST CHARACTER SUCH AS A TAB.
.B 3
.CM; <INFILE <FIELD <DEFINITIONS
.P
THE SAME RULES APPLY TO <INFILE <FIELD <DEFINITIONS AS TO DEFINITIONS ON
THE .<R <SORT LINE. THE DIFFERENCE IS THAT THE SELECTED CONTROL LINE MUST
START WITH A CONTROL-CHARACTER _^'A THRU _^'H, OR _^'N THRU _^'X.
THE CHARACTERS _^'I (TAB), _^'J (LF), _^'K (VT), _^'L (FF), AND _^'M (CR)
ARE NOT USED. IN MULTILINE RECORDS, THE INFILE FIELD DEFINITION IS
CONSIDERED A NON-SORTING RECORD. THEREFORE IT SHOULD BE TERMINATED BY A
RECORD MARK OR PADDED WITH ENOUGH LINES TO MAKE IT ONE RECORD LONG.
IF THERE IS ONLY ONE <INFILE DEFINITION RECORD (I.E. ONLY ONE LINE
STARTING WITH A CONTROL CHARACTER IN THE FILE), THE DEFAULT RECORD
DEFINITION OF ONE LINE RECORDS IS USED TO READ IT AND PADDING IS
UNNECESSARY. IF THERE IS MORE THAN ONE SUCH LINE, THE SECOND MUST CONFORM
TO THE RECORD DEFINITION OF THE FIRST IF THE FIRST IS SELECTED BY THE
OPTION SWITCH. VERY LITTLE HARM CAN COME FROM USING THE RECORD DEFINITION
CHARACTER AFTER EACH <INFILE DEFINITION RECORD EXCEPT IN THE USE OF <XTRACT
WHERE IT MAY CAUSE THE ^BOUNDARY ^RECORDS TO BE IMPROPERLY ASSIGNED.
THEREFORE, DO NOT USE MORE THAN ONE <INFILE DEFINITION WITH <XTRACT AND DO
NOT USE A RECORD TERMINATOR FOR IT. IF YOU ARE CONFUSED, SEE THE EXAMPLE
WITH THE <XTRACT WRITEUP AT THE END OF THIS DOCUMENT.
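.P
THE RULE FOR RECOGNIZING AN <INFILE CONTROL LINE CAN BE SKETCHED IN <PYTHON.
THIS IS AN ILLUSTRATION ONLY; THE FUNCTION NAMES ARE HYPOTHETICAL AND THE
LINES ARE ASSUMED TO BE ORDINARY 7-BIT STRINGS.

```python
def control_letter(line):
    """Return the letter of the leading control character, or None.

    Only CTRL/A thru CTRL/H and CTRL/N thru CTRL/X mark a control line;
    CTRL/I thru CTRL/M (TAB, LF, VT, FF, CR) are not used.
    """
    if not line:
        return None
    code = ord(line[0])
    if 0o001 <= code <= 0o010 or 0o016 <= code <= 0o030:
        return chr(code + 0o100)  # CTRL/A (001) maps to "A" (101), and so on
    return None


def control_lines(lines):
    """Map each control letter to the first line that starts with it."""
    found = {}
    for line in lines:
        letter = control_letter(line)
        if letter is not None and letter not in found:
            found[letter] = line
    return found
```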
.P
TO SELECT AN INFILE DEFINITION AT RUN TIME, TYPE _"/_" FOLLOWED BY THE
LETTER WHICH CORRESPONDS TO THE CONTROL-CHARACTER. SORTING LINES STARTING
WITH _^'A THRU _^'H WILL BE PASSED TO THE FRONT OF THE OUTPUT FILE IN THE
SAME SEQUENCE AS FOUND IN THE FILE. ('THIS IS USEFUL TO FORCE TITLES TO
THE FRONT OF THE SORT). LINES STARTING WITH _^'N THRU _^'X WILL BE DROPPED
FROM THE OUTPUT FILE. IF </S IS USED ON THE COMMAND LINE, THE FILE WILL BE
SORTED IN <BASIC STRIPPED <ASCII USING THE INFILE DEFINITION LINE STARTING
WITH _^'S. THE SAME RESTRICTION APPLIES TO THE </R OPTION OF <MERGE.
IF THE CONTROL LINE IS NOT AT THE START OF THE INPUT FILE THE SORT
PARAMETERS WILL BE CHANGED WHEN THE LINE IS FOUND. THIS MAY CAUSE STRANGE
OUTPUT SORTS AS PASS 2 WILL NOT RE-SORT THE FIRST PART OF THE FILE.
THERE CAN BE MORE THAN ONE CONTROL LINE IN THE FILE AND EACH DEFINITION
SELECTED WILL CHANGE THE SORT AT THE POINT WHERE IT IS FOUND.
.B 5
.CM; <SORTING <METHODS
.P
<SORT CAN EITHER SORT A FIELD BY CHARACTERS OR BY NUMBERS.
.P
WHEN SORTING BY CHARACTERS, <SORT USES THE 8-BIT <ASCII SORTING SEQUENCE.
THAT IS, A SPACE (240(8)) IS LOWER IN THE CHARACTER SEQUENCE THAN A ZERO
CHARACTER (260) AND THEY ARE BOTH LOWER THAN AN _'^A_' (301).
A TAB IS ALWAYS CONVERTED TO SPACES EXCEPT WHEN IT IS USED AS A DELIMITER
CHARACTER. <SORT DOES NOT USE ANY CHARACTER WHOSE <ASCII EQUIVALENT IS
LESS THAN 240.
.P
WHEN SORTING BY NUMBERS, <SORT CONVERTS THE NUMBER BY THE FOLLOWING
ALGORITHM. LEADING SPACES IN THE FIELD ARE IGNORED. A PLUS OR MINUS SIGN
INDICATES WHETHER OR NOT TO NEGATE THE FINAL RESULT. THE SIGN CAN BE
DIRECTLY IN FRONT OF THE FIRST DIGIT OR MAY BE SEPARATED FROM THE NUMBER BY
SPACES. CONVERSION BEGINS WITH A DIGIT 0-9 AND ENDS WHEN THE FIELD IS
EXHAUSTED OR WHEN A NON-NUMERIC CHARACTER IS RECOGNIZED.
NUMBERS SHOULD BE IN THE RANGE -2048 TO +2047. IF NUMBERS IN THE RECORD
ARE NOT IN THIS RANGE, BREAK THE NUMERIC FIELD DOWN INTO A NUMERIC FIELD
WITH RANGE AS ABOVE AND A CHARACTER FIELD FOR THE REMAINDER OF IT.
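.P
THE NUMERIC CONVERSION DESCRIBED ABOVE CAN BE SKETCHED IN <PYTHON AS
FOLLOWS. THIS IS A MINIMAL ILLUSTRATION, NOT THE ROUTINE USED BY <SORT.
THE FUNCTION NAMES ARE HYPOTHETICAL. TWO POINTS ARE ASSUMPTIONS BECAUSE THE
TEXT DOES NOT COVER THEM: A FIELD WITH NO DIGITS IS TAKEN AS ZERO, AND
NUMBERS OUTSIDE THE STATED RANGE ARE ONLY REPORTED, NOT WRAPPED OR CLIPPED.

```python
def parse_numeric_field(field):
    """Convert the text of a numeric sort field to an integer."""
    i, end = 0, len(field)
    while i < end and field[i] == " ":      # leading spaces are ignored
        i += 1
    negative = False
    if i < end and field[i] in "+-":        # optional sign
        negative = field[i] == "-"
        i += 1
        while i < end and field[i] == " ":  # spaces may follow the sign
            i += 1
    value = 0
    while i < end and field[i] in "0123456789":
        value = value * 10 + int(field[i])
        i += 1                              # loop ends at the first non-digit
    return -value if negative else value


def in_sort_range(value):
    """True if value fits the documented range of -2048 to +2047."""
    return -2048 <= value <= 2047
```

USED AS A SORT KEY, THE FIRST FUNCTION ORDERS FIELDS BY THEIR NUMERIC VALUE
RATHER THAN BY THEIR CHARACTERS.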
.B 5;.CM; <SPECIAL <OPERATING <FEATURES
.P;<SORT CAN BE CHAINED TO IN THE NORMAL WAY DESCRIBED IN THE <OS/8
^HANDBOOKS, PARTICULARLY IN THE DESCRIPTION OF THE ^USER ^SERVICE ^ROUTINE
FUNCTIONS. WHEN <SORT IS CHAINED TO, IT ASSUMES THAT THE ^COMMAND ^DECODER
AREA OF FIELD 1 HAS BEEN SET UP WITH APPROPRIATE INPUT AND OUTPUT FILES.
IT ALSO ASSUMES THAT LOCATION 01000+ HAS BEEN INITIALIZED WITH A COMMAND
STRING OF THE FORM THAT <SORT EXPECTS. THIS IS SIMPLY ANY STRING WHICH HAS
A LEFT PARENTHESIS, THE FIELD DEFINITIONS, AND A RIGHT PARENTHESIS TO END
THEM. AN OPTIONAL NUMBER FOR NUMBER OF LINES PER RECORD CAN ALSO BE GIVEN.
THE LINE MUST END WITH A ZERO WORD. <SORT WILL ALSO TAKE THE COMMAND STRING
FROM THE INPUT FILE IF THE _"/_" OPTION IS USED.
.PAGE
.CM; <MERGE
.P; <MERGE WILL MERGE 2 SORTED INPUT FILES INTO 1 OUTPUT FILE.
STRANGE BUT PREDICTABLE RESULTS WILL OCCUR IF BOTH INPUT FILES ARE NOT
SORTED IN THE MERGE SEQUENCE. <MERGE HAS NO WORK FILES.
.P; <MERGE HAS A REPLACE OPTION _"/'R_". WITH THIS OPTION IF A LINE IN
FILE_#1 MATCHES A LINE IN FILE_#2 IN THE SORTED FIELDS, THE LINE FROM
FILE_#1 WILL GO TO THE OUTPUT FILE, AND THE LINE FROM FILE_#2 WILL BE
DROPPED. THIS PROVIDES A WAY TO UPDATE RECORDS WITH A NEW FILE_#1
REPLACING THE MATCH IN OLD FILE_#2. TO DO A DELETE WITHOUT REPLACING,
BRING IN A DUMMY LINE AND LATER USE <TECO OR <EDIT TO FIND AND DELETE THE
DUMMY LINES IN THE OUTPUT FILE.
.B 2
.CM;<XTRACT
.P;<XTRACT CAN BE USED TO REDUCE THE SIZE OF A DATA SET BEFORE SORTING.
THE <RECORD AND <FIELD DEFINITIONS FOR <XTRACT ARE IDENTICAL TO THOSE FOR
<SORT AND <MERGE. <XTRACT ALSO USES A PAIR OF SENTINEL RECORDS CALLED THE
^LOWER ^BOUND AND THE ^UPPER ^BOUND. THESE ARE TAKEN AS THE FIRST TWO
RECORDS IN THE INPUT STREAM. IN THE BASIC <XTRACT OPERATION, ALL RECORDS
GREATER THAN OR EQUAL TO THE LOWER BOUND AND LESS THAN OR EQUAL TO THE
UPPER BOUND ARE PASSED TO THE OUTPUT FILE. AN OPTION SWITCH (/^V) IS
AVAILABLE TO INVERT THE SENSE OF THE OPERATION.
IF /^V IS SPECIFIED ALL RECORDS LOWER THAN THE LOWER BOUND OR GREATER THAN
THE UPPER BOUND ARE PASSED TO THE OUTPUT FILE. THE FOLLOWING EXAMPLE SHOWS
A CONTROL FILE CONTAINING THE RECORD AND FIELD DEFINITION INFORMATION AND
BOUNDARY RECORDS TO EXTRACT ALL NAMES STARTING WITH THE LETTERS ^G TO ^N:
.NF;.S 1
^^
_^G THIS STARTS WITH CTRL/G (_#240,2,C,A) 15 _#243
DUMMY G _#
DUMMY NZZZZ _#
\\
.F
NOTE THAT THE BOUNDARY RECORDS NEED NOT LOOK EXACTLY LIKE THE ACTUAL INPUT
RECORDS AS LONG AS THEY CONFORM TO THE FIELD AND RECORD DEFINITIONS.
THE ACTUAL INPUT RECORDS LIKELY HAVE AN IDENTIFICATION KEY ON THE FIRST
LINE OF THE RECORD WITH THE NAME ON THE SECOND LINE. THE RECORDS ARE NO
MORE THAN 15 LINES LONG AND END WITH THE _# RECORD TERMINATOR.
IT SHOULD ALSO BE NOTED THAT THE <INFILE SORT KEY SHOULD NOT BE TERMINATED
BY THE RECORD TERMINATOR SINCE AT THE TIME IT IS READ, THE DEFAULT RECORD
DEFINITION IS IN EFFECT (1 LINE RECORDS). IF A _# RECORD MARK WERE USED
THE RECORD COUNT WOULD BE OFF BY ONE AND THE BOUNDARY RECORDS WOULD BE
IMPROPERLY READ.
.PAGE
.CM
<SPECIAL <OPTION <SUMMARY FOR <SORT
.LM +10
.S 2;.I -7
/^S#####^SORT USING <BASIC_'S SIXBIT <ASCII CHARACTERS.
.B;.I -7
/^Y#####^IGNORE FORMFEED CHARACTERS IMBEDDED IN THE INPUT FILE.
.B;.I -7
/^Z#####^OPTIMIZE SORTING FOR SLOW <I/O DEVICES.
.S 2
.CM -10
<SPECIAL <OPTION <SUMMARY FOR <MERGE
.S 2;.I -7
/^R#####^MERGE TWO FILES, REPLACING DUPLICATE LINES IN THE SECOND INPUT
FILE WITH LINES FROM THE FIRST INPUT FILE.
.S 2;.CM -10;<SPECIAL <OPTION FOR <XTRACT
.B 2;.I -7;/<V#####^INVERT THE SENSE OF RECORD EXTRACTION, EXTRACT THOSE
RECORDS FALLING OUTSIDE THE RANGE OF THE BOUNDARY RECORDS.
.LM -10
.S -6
.CM; <NOTE\\
.S 1
'ALL OCTAL NUMBERS MUST BE IN THE RANGE OF 201 TO 376.
('SORT FORCES THE PARITY BIT ON WHILE READING INPUT).
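.P
THE OCTAL CODES QUOTED IN THIS DOCUMENT ARE 7-BIT <ASCII CODES WITH THE
HIGH (PARITY) BIT FORCED ON. THE <PYTHON SKETCH BELOW SHOWS THE ARITHMETIC
ONLY. THE FUNCTION NAMES ARE HYPOTHETICAL AND ARE NOT PART OF <SORT.

```python
def os8_code(ch):
    """Return the code of a 7-bit ASCII character with the high bit on."""
    code = ord(ch)
    if code > 0o177:
        raise ValueError("not a 7-bit ASCII character")
    return code | 0o200


def octal_argument(ch):
    """Return the #ooo text for ch, checking the 201 to 376 octal range."""
    code = os8_code(ch)
    if not 0o201 <= code <= 0o376:
        raise ValueError("code is outside the range 201 to 376 octal")
    return "#" + format(code, "o")
```

FOR EXAMPLE, A SPACE GIVES 240, A ZERO GIVES 260, THE LETTER A GIVES 301,
AND A DOLLAR SIGN GIVES THE RECORD MARK ARGUMENT 244.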
.PG
.B 4
.C;<FATAL <ERROR <MESSAGES
.S 2
.M 18,70
.I -8
<OE?#####^^OUTPUT ERROR
.B
.S 1
.I -8
<OER#####^^NO ROOM FOR OUTPUT FILE
.B
.S 1
.I -8
<IE######<INPUT <ERROR, FILENAME
.B
.S 1
.I -8
<OE######^^CAN_'T OPEN OUTPUT FILE
.B
.S 1
.I -8
<WD######^^WORK DEVICE IS NOT DIRECTORY ORIENTED
.B
.S 1
.I -8
<NRW#####^^NO ROOM FOR WORK FILE
.B
.S 1
.I -8
<FE######^^FATAL SORT PROGRAM ERROR
.B
.S 1
.I -8
<FI######^^INPUT EXCEEDED BUFFER.
.B
.S 1
.I -8
IS######ILLEGAL SYNTAX IN SORT SPECIFICATION.
.M 1,70
.S 4
.C;<NON-FATAL <ERROR <MESSAGES
.S 2
.I 9
<IE 512##^^INPUT RECORD EXCEEDS 512 CHARACTERS. <TRUNCATING.
.! END OF SORT.RO