UniqueIndOfs

This routine is in essence identical to Unique, except that the items (here, strings) are not kept in an array but in a heap in memory. Only the index is kept in an array. In this case, to keep the index compact, only the offsets from some datum within the heap (here, its start address addr) are maintained. This particular arrangement might be useful for a data set with a smallish number of strings of a very variable length. But this is just an example. There are so many variations and tweaks that could be made, depending on requirements and circumstance.

To test, EXecute it (SBASIC only) or run it directly from QD with the SBAS/QD Thing. It will probably also work directly from SuperBASIC: Just replace line 120 with: cd = 1: CLS, and line 168 with: RECHP addr, and then just RUN it.

Note: You need to LRESPR the extra commands from my toolkits first! (or modify the code accordingly).


10 rem + ************************************************************************ +
12 rem *<                              UniqueIndOfs                              >*
14 rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
16 rem *                 Create a list of unique items in memory                  *
18 rem *                                                                          *
20 rem * Scan a list using binary search routine and return +ve if found.         *
22 rem * If not found, add item to list, creating a list of unique items.         *
24 rem *                                                                          *
26 rem * This version uses an index of offsets to address items in memory.        *
28 rem * Note: Only the index is sorted; the heap items remain in the order       *
30 rem * they were added to the heap                                              *
32 rem *                                                                          *
34 rem * Dependencies: POKESTR$, PEEKSTR$, EVEN, CMP%                             *
36 rem + ------------------------------------------------------------------------ +
38 rem * V0.02, pjw, 2022 Jul 03                                                  *
40 rem + ************************************************************************ +
42 :
44 :
46 EXT_PROC 'POKESTR$'
48 EXT_FN   'PEEKSTR$', 'EVEN', 'CMP%'
50 :
100 rem     Initialise GLOBal values
102 max = 512: max% = 10:           rem Size of heap & index
104 DIM index%(max%):               rem Init index
106 addr = ALCHP(max):              rem Heap
108 cnt% = 0:                       rem Current number of items
110 :
112 rem     Set sentinel to lowest value
114 index%(0) = 0: POKE_W addr, 0: ofs = 2
116 :
118 rem     Test harness
120 cd = FOPEN("con"): BORDER#cd; 1, 2: CLS#cd
122 REPeat lp
124  INPUT#cd; 'Enter item'! n$: IF n$ = '': EXIT lp
126  IF cnt% = 0 THEN
128   rem   First item is special
130   cnt% = 1: index%(cnt%) = ofs
132   POKESTR$ addr + ofs, n$: ofs = ofs + 2 + EVEN(LEN(n$))
134  ELSE
136   n% = IndBSearch%(n$, index%(0 TO cnt%))
138   IF n% >= 0 THEN
140    PRINT#cd; n%! PEEKSTR$(addr + index%(n%))
142   ELSE
144    PRINT#cd; n%! '['; n$; ']'
146    er = IndInsert(n$, index%, ABS(n%))
148   END IF
150  END IF
152  IF er < 0: PRINT#cd; '** Array full **': PAUSE#cd: EXIT lp
154  :
156  rem    Print list of heap items in order:
158  FOR i = 1 TO cnt%: PRINT#cd; i! PEEKSTR$(addr + index%(i))
160 END REPeat lp
162 :
164 rem     Save heap (unordered)
166 rem SBYTES_O 'ram1_test_bin', addr, ofs
168 :
170 :
1000 rem + ------------------------------------------------------------------------ +
1002 rem |<                              IndBSearch                                >|
1004 rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
1006 rem |                              Binary Search                               |
1008 rem |                                                                          |
1010 rem | This version uses an integer array of offsets. The offsets point to      |
1012 rem | strings in a heap.                                                       |
1014 rem | Returns position if found, or -position where item should go. Thus it    |
1016 rem | can be used to add new, unique items to a list, in alphabetical order.   |
1018 rem |                                                                          |
1020 rem | GLObal addr                                                              |
1022 rem | Uses external FNs CMP% & PEEKSTR$                                        |
1024 rem + ------------------------------------------------------------------------ +
1026 rem | V0.01, From TAOCP 6.2.1                                                  |
1028 rem | V0.01, pjw, 2020 Mar 10, Indirect offset version                         |
1030 rem | V0.02, pjw, 2022 Jul 03, Tweak to use CMP% and SELect                    |
1032 rem + ------------------------------------------------------------------------ +
1034 :
1036 DEFine FuNction IndBSearch%(item$, Arr%)
1038 LOCal loop, u%, l%, i%, e%
1040 l% = 0: u% = DIMN(Arr%)
1042 REPeat loop
1044  IF u% < l%: i% = -l%: EXIT loop
1046  e% = CMP%(item$, PEEKSTR$(Arr(i%)); 1)
1048  SELect ON e%
1050   = -1: u% = i% - 1:               rem <
1052   =  0: EXIT loop:                 rem =
1054   =  1: l% = i% + 1:               rem >
1056  END SELect
1058 END REPeat loop
1060 RETurn i%
1062 END DEFine BSearch%
1064 :
1066 :
1068 rem + ------------------------------------------------------------------------ +
1070 rem |<                               IndInsert                                >|
1072 rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
1074 rem |                             Indirect Insert                              |
1076 rem |                                                                          |
1078 rem | Appends an item into a heap, updating its offset index accordingly       |
1080 rem | This version only for strings. Index items are offsets from addr.        |
1082 rem | Index could be integer or float array.                                   |
1084 rem | Does basic checking on index and heap that buffers dont overflow:        |
1086 rem | Returns -4 => index overflow, -11 => heap space used up, 0 => ok         |
1088 rem |                                                                          |
1090 rem | GLOBal cnt%, addr, ofs, max                                              |
1092 rem | Uses external PROCs/FNs POKESTR$ & EVEN                                  |
1094 rem + ------------------------------------------------------------------------ +
1096 rem | V0.01, pjw, 2020 Mar 10                                                  |
1098 rem + ------------------------------------------------------------------------ +
1100 :
1102 DEFine FuNction IndInsert(item$, Arr, at%)
1104 LOCal i%, l%
1106 cnt% = cnt% + 1
1108 IF cnt% > DIMN(Arr): RETurn -4
1110 l% = EVEN(LEN(item$)) + 2
1112 IF (ofs + l%) > max: RETurn -11
1114 :
1116 IF at% = cnt% THEN
1118  Arr(at%) = ofs
1120 ELSE
1122  FOR i% = cnt% TO at% STEP -1
1124   Arr(i%) = Arr(i% - 1)
1126  END FOR i%
1128  Arr(at%) = ofs
1130 END IF
1132 POKESTR$ addr + ofs, item$
1134 ofs = ofs + l%
1136 RETurn 0
1138 END DEFine IndInsert
1140 :
1142 :

  
Generated with sb2htm on 2022 Jul 12
©pjwitte 2oo1 - 2o22