HeapUniqueX

Preamble

Some people enjoy Sudoku, and some people enjoy devising, and endlessly fiddling with, minor routines to make them as useful and efficient as possible. Im one of the latter persuasion. While Sudoku may be useful for oiling the cogs of your mind, the grinding of your gears are of little direct consequence to others. I present these modest offerings borne of my toils in the hope that they may be useful to others as well as myself.

These routines do not come about due to some random brain fart, but mostly arise out of a larger project, when some functionality seems to be general enough to lend itself to wrapping up and documenting for later re-use (a fact well enough known to those who do this for a living). All that is then wanted is knowledge of the inputs and outputs.

What it does

This suite builds on previous routines under the Sorting and Searching banner. Its purpose is to take an incoming stream of strings, possibly of very varying lengths and, if they are unique, add them to a heap in memory. Should the heap fill up, a new chunk of memory is added so the process can continue uninterrupted until the end of the program, or until either free memory is exhausted or the number of items exceeds the index array you, the programmer, have set. All this faffing around happens automagically, without much input from you who, no doubt, would rather be focused on matters of your project.

Limitations

As it stands, the program expects SMSQ/E and SBASIC, but the main routines should all run under Qdos + TK2 too with minor modifications, eg integer FOR loops and SELect..

While you can have numerous logical heaps by maintaining a separate index for each heap, in fact there is only one heap. No attempt is made to manage the heap other than by initiating it, stretching it according to a single rule, and releasing it all in one go. Without additional (and complicating) code, the heap cannot be split, shrunk or selectively released.

To test

The code comes with a primitive harness and some suggestive instructions as to how the code could be slotted into a larger project. To try it out, in SMSQ/E execute the BASIC file directly (using FEW() to capture any errors returned from the program), or run it from QD with the SBAS/QD Thing. Qdos users will have to make some small changes to get it to work. The external commands referred to can all be found in Knoware.no's Toolkit section.

Once up and running, type some strings and see how they are ordered as you add each one. Only unique strings are added to the heap. Press ENTER on an empty input to terminate, or keep going until the index limit is reached (here 10 items). On termination the program will output the contents of the heap, with strings alphabetically ordered, together with some meta information, to ram1_otest_txt.


10 rem + ************************************************************************ +
12 rem *<                              HeapUniqueX                               >*
14 rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
16 rem *                 Create a list of unique items in memory                  *
18 rem *                                                                          *
20 rem * Scan a list using binary search routine. If item found return its        *
22 rem * position in the index. (The relative positions in index change!)         *
24 rem * If not found, add item to list thus creating a growing list of unique    *
26 rem * items.                                                                   *
28 rem *                                                                          *
30 rem * Should the list grow beyond the initial heap, the heap is expanded by    *
32 rem * the addition of chunks of memory, until free memory is exhausted         *
34 rem *                                                                          *
36 rem * This version uses an index of pointers to address items in memory.       *
38 rem * Note: Only the index is sorted; the heap items remain in the order       *
40 rem * they were added to the heap.                                             *
42 rem *                                                                          *
44 rem * In this version most GLOBal variables are passed as parameters, so one   *
46 rem * could operate multiple lists in parallel.                                *
48 rem *                                                                          *
50 rem * Uses external commands: CMP%, PEEKSTR$, POKESTR$, EVEN                   *
52 rem + ------------------------------------------------------------------------ +
54 rem * V0.01, pjw, 2020 Mar 10                                                  *
56 rem * V0.01, pjw, 2022 Jul 01, Absolute pointers (ie no offsets)               *
58 rem * V0.02, pjw, 2022 Jul 03, Integrated version, all GLOBals as parameters   *
60 rem + ************************************************************************ +
62 :
64 :
100 EXT_PROC 'POKESTR$'
102 EXT_FN   'PEEKSTR$', 'EVEN', 'CMP%'
104 :
106 rem                            Simple Test Harness
108 :
110 rem $$off
112 IF PEEK$(\\ -4, 4) = 'SBAS' AND JOBID THEN
114  LRESPR 'dev4_str_cmp_CMP_BIN'
116  LRESPR 'dev4_mem_pk_PEEKSTR_BIN'
118  LRESPR 'dev4_num_etc_EVEN2_BIN'
120 END IF
122 rem $$on
124 :
126 :
128 rem     Initialise GLOBal values (User)
130 hp_hsz  = 100:                          rem Initial size of heap
132 hp_isz% =  10:                          rem Index size
134 hp_blk  = 100:                          rem Additional chunks of memory
136 hp_sav  =   1:                          rem Save data flag (0/1)
138 :
140 rem     Initialise GLOBal values (Program)
142 hp_chk% =   1:                          rem Chunk count, including initial
144 hp_mem  = hp_hsz:                       rem Amount of memory used
146 DIM hp_idx(hp_isz%):                    rem Init index
148 hp_adr  = ALCHP(hp_hsz):                rem Init heap
150 hp_hsz  = hp_adr + hp_hsz
152 hp_cnt% = 0:                            rem Current number of items
154 :
156 rem     Set sentinel to lowest value
158 hp_idx(0) = hp_adr
160 POKE_W hp_adr, 0: hp_ptr = hp_adr + 2
162 :
164 :
166 rem     Test harness
168 cd = FOPEN("con"): BORDER#cd; 1, 2: CLS#cd
170 REPeat lp
172  INPUT#cd; 'Enter item'! n$: IF n$ = '': EXIT lp
174  er = AddUnique(n$, hp_idx, hp_hsz, hp_ptr, hp_cnt%)
176  SELect ON er
178   = -4: PRINT#cd; '** Array full **'
180   = -3: PRINT#cd; '** Out of memory **'
182   = -22 TO -1: PRINT#cd; 'Error'! er
184   = REMAINDER : PRINT#cd; er; '>'! PEEKSTR$(hp_idx(er))
186     :
188     rem    Print list of heap items in order:
190     FOR i% = 1 TO hp_cnt%: PRINT#cd; i%! PEEKSTR$(hp_idx(i%))
192     NEXT lp
194  END SELect
196  :
198  PAUSE#cd: EXIT lp
200 END REPeat lp
202 :
204 rem     Save ordered heap items
206 IF hp_sav THEN
208  co = FOP_OVER("ram1_otest_txt")
210  PRINT#co; 'Items:'! hp_cnt%
212  PRINT#co; 'Chunks:'! hp_chk%
214  PRINT#co; 'Memory:'! hp_mem
216  PRINT#co; 'Error:'! er
218  FOR i% = 1 TO hp_cnt%: PRINT#co; PEEKSTR$(hp_idx(i%))
220  CLOSE#co
222 END IF
224 :
226 CLCHP
228 IF JOBID: QUIT er: ELSE : ERT er
230 :
232 :
1000 rem + ------------------------------------------------------------------------ +
1002 rem |<                               AddUnique                                >|
1004 rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
1006 rem |                      Add item to heap if unique                          |
1008 rem |                                                                          |
1010 rem | Uses array of pointers to point to strings in a heap                     |
1012 rem | Adds only unique items to list, in alphabetical order (by index)         |
1014 rem | Hard-wired case-agnostic comparisons (or change relevant CMP% parameter) |
1016 rem | Returns item's logical position in index - or negative error code        |
1018 rem | This version requires a sentinel at position 0.                          |
1020 rem |                                                                          |
1022 rem | Note: Parameters r.xxx updated by IndInsert subroutine!                  |
1024 rem | Dependencies: IndInsert, (AddChunk)                                      |
1026 rem | Uses external commands: CMP%, PEEKSTR$, POKESTR$, EVEN                   |
1028 rem + ------------------------------------------------------------------------ +
1030 rem | V0.01, From TAOCP 6.2.1                                                  |
1032 rem | V0.01, pjw, 2020 Mar 10, Indirect offset version                         |
1034 rem | V0.01, pjw, 2022 Jul 01, Indirect absolute pointer version               |
1036 rem | V0.02, pjw, 2022 Jul 03, Uses parameters instead of GLOBals, SELect      |
1038 rem + ------------------------------------------------------------------------ +
1040 :
1042 DEFine FuNction AddUnique(item$, Arr, max, r.ptr, r.cnt%)
1044 LOCal loop, u%, l%, m%, e%
1046 l% = 0: u% = r.cnt%
1048 REPeat loop
1050  IF u% < l% THEN
1052   er = IndInsert(item$, Arr, l%)
1054   IF er < 0: RETurn er
1056   RETurn l%
1058  END IF
1060  :
1062  m% = INT((l% + u%) / 2)
1064  e% = CMP%(item$, PEEKSTR$(Arr(m%)); 1)
1066  SELect ON e%
1068   = -1: u% = m% - 1:            rem <
1070   =  0: EXIT loop:              rem =
1072   =  1: l% = m% + 1:            rem >
1074  END SELect
1076 END REPeat loop
1078 RETurn m%
1080 END DEFine AddUnique
1082 :
1084 :
1086 rem + ------------------------------------------------------------------------ +
1088 rem |<                               IndInsert                                >|
1090 rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
1092 rem |                             Indirect Insert                              |
1094 rem |                                                                          |
1096 rem | Appends an item into a heap, updating the index accordingly              |
1098 rem | This version only for strings. Index consists of absolute pointers       |
1100 rem | Does basic checking on index and heap that buffers dont overflow:        |
1102 rem | Returns -4 => index overflow, -3 => free memory used up, 0 => ok         |
1104 rem |                                                                          |
1106 rem | Subroutine of AddUnique, above                                           |
1108 rem | Dependency: AddChunk                                                     |
1110 rem | Uses external PROCs/FNs POKESTR$ & EVEN                                  |
1112 rem + ------------------------------------------------------------------------ +
1114 rem | V0.01, pjw, 2020 Mar 10                                                  |
1116 rem | V0.01, pjw, 2022 Jul 01, Indirect absolute pointer version               |
1118 rem | V0.02, pjw, 2022 Jul 03, Parameters instead of GLOBals, adjusted cnt%    |
1120 rem + ------------------------------------------------------------------------ +
1122 :
1124 DEFine FuNction IndInsert(item$, Arr, at%)
1126 LOCal i%, l%
1128 :
1130 r.cnt% = r.cnt% + 1
1132 IF r.cnt% > DIMN(Arr): r.cnt% = DIMN(Arr): RETurn -4
1134 l% = EVEN(LEN(item$)) + 2
1136 IF (r.ptr + l%) > max THEN
1138  er = AddChunk: IF er < 0: RETurn -3
1140 END IF
1142 :
1144 IF at% = r.cnt% THEN
1146  Arr(at%) = r.ptr
1148 ELSE
1150  FOR i% = r.cnt% TO at% STEP -1
1152   Arr(i%) = Arr(i% - 1)
1154  END FOR i%
1156  Arr(at%) = r.ptr
1158 END IF
1160 POKESTR$ r.ptr, item$
1162 r.ptr = r.ptr + l%
1164 RETurn 0
1166 END DEFine IndInsert
1168 :
1170 :
1172 rem + ------------------------------------------------------------------------ +
1174 rem |<                               AddChunk                                 >|
1176 rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
1178 rem |                        Add another block to heap                         |
1180 rem |                                                                          |
1182 rem | In this version, however many threads, there is only one heap            |
1184 rem |                                                                          |
1186 rem | Uses caller's LOCals: max, r.ptr                                         |
1188 rem | GLOBals hp_blk, hp_chk%                                                  |
1190 rem + ------------------------------------------------------------------------ +
1192 rem | V0.01, pjw, 2022 Jul 03                                                  |
1194 rem + ------------------------------------------------------------------------ +
1196 :
1198 DEFine FuNction AddChunk
1200 IF FREE_MEM < (4096 + hp_blk): RETurn -3
1202 r.ptr    = ALCHP(hp_blk)
1204 max      = r.ptr + hp_blk
1206 hp_chk% = hp_chk% + 1
1208 hp_mem = hp_mem + hp_blk
1210 RETurn 0
1212 END DEFine AddChunk
1214 :
1216 :

  
Generated with sb2htm on 2022 Jul 04
©pjwitte 2oo1 - 2o22