COMPM

                                COMPM%
                                ======

             Compare two values of any type stored in memory


Compatibility:
==============

SBASIC (ie SMSQ/E) or compiled (all systems).


Usage:
======

        result% = COMPM%((X), (Y), <comp%>)

Where

        result% is either  1 => (X) > (Y) or
                           0 => (X) = (Y) or
                          -1 => (X) < (Y)
                          unless reversed

        (X) and (Y) are pointers to (= addresses of) the two values stored
                somewhere in memory that are to be compared. These values
                must be of the same type as no coercion (conversion) takes
                place with direct memory values.

        <comp%> = <sign><cmptype.b><vartype.b>

        <cmptype.b> in bits 0 to 2 in the most significant byte [msb] of
                the word is the Qdos code for string comparison types
                0..3 (see below for details)

                C-strings and N-strings only accept 0 or <> 0 as
                comparison types, with 0 => case sensitive and <> 0 =>
                case agnostic (more details below).

                Apart from the top bit, <cmptype> is only significant for
                string comparisons and is otherwise ignored.

                For C- and N-strings you can also specify a skip number.
                This is the number of C- or N-strings that must be skipped
                to reach the string we want. The figure can be 0 (default)
                to 15, and is supplied in bits 3 to 6 in the msb of the
                parameter.

                Finally, if bit #7 of the msb is set, the result of the
                comparison is reversed.

        <vartype.b> in the least significant byte of the parameter is a
                code defining the kind of parameters to be compared:


Variable types (vartype%):
==========================

        sbyte = $00  =  0       signed byte
        ubyte = $02  =  2       unsigned byte
        sword = $04  =  4       signed 16 bit word
        uword = $06  =  6       unsigned 16 bit word
        slong = $08  =  8       signed 32 bit long word
        ulong = $0A  = 10       unsigned 32 bit long word
        fltpt = $0C  = 12       48 bit float
        qstrg = $0E  = 14       Q-string (len.w + bytes)
        cstrg = $10  = 16       C-string (zero-terminated)
        nstrg = $12  = 18       Name string (1 byte length)
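
The layout of <comp%> and the type codes above can be modelled in a few
lines. The following is only an illustrative Python sketch, not the toolkit's
code: the names decode_comp and VARTYPES are made up for the illustration,
and the bit positions are taken from the description above.

```python
# Illustrative model of the <comp%> word; not part of the toolkit.
VARTYPES = {
    0: "sbyte", 2: "ubyte", 4: "sword", 6: "uword", 8: "slong",
    10: "ulong", 12: "fltpt", 14: "qstrg", 16: "cstrg", 18: "nstrg",
}

def decode_comp(comp):
    """Split a signed 16-bit comp% value into its documented fields."""
    word = comp & 0xFFFF               # view the signed integer as a raw word
    msb, lsb = word >> 8, word & 0xFF
    return {
        "reverse": bool(msb & 0x80),   # bit 7 of the msb: reverse the result
        "skip": (msb >> 3) & 0x0F,     # bits 3 to 6 of the msb: strings to skip
        "cmptype": msb & 0x07,         # bits 0 to 2 of the msb: comparison type
        "vartype": VARTYPES.get(lsb, "unknown"),
    }

print(decode_comp(1 * 256 + 14))
```

Here 1 * 256 + 14 decodes to a Q-string compared with comparison type 1,
no skip and no reversal.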

Note: Except for types 0, 2, 16 and 18 (signed byte, unsigned byte, C-strings
      and N-strings), all addresses must be even! (This does not necessarily
      apply where the CPU >= MC68020, but for the sake of compatibility it is
      better to stick with that rule.)

Note: S*BASIC doesn't normally deal with unsigned integer words or
      long words, so, for example, the unsigned word 64302 has to be
      entered as -1234 (= 64302 - 65536). Bytes in S*BASIC are normally
      treated as unsigned. COMPM% lets you treat them as signed or unsigned.
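
The mapping from an unsigned value to the signed number with the same bit
pattern is ordinary two's complement arithmetic. A small Python sketch of it
(the function names are invented for this illustration):

```python
def as_signed_word(u):
    """Signed 16-bit value with the same bit pattern as u (0..65535)."""
    if not 0 <= u <= 0xFFFF:
        raise ValueError("expected an unsigned 16-bit value")
    return u - 0x10000 if u >= 0x8000 else u

def as_signed_long(u):
    """Signed 32-bit value with the same bit pattern as u (0..2^32 - 1)."""
    if not 0 <= u <= 0xFFFFFFFF:
        raise ValueError("expected an unsigned 32-bit value")
    return u - 0x100000000 if u >= 0x80000000 else u

print(as_signed_word(64302))   # -1234
print(as_signed_word(0x9110))  # -28400
```

Values below $8000 (or $80000000 for long words) are entered unchanged.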

A few other types are planned, such as negative types => the current pair
of variables are just long word pointers to two fields of abs(type). This
would be most useful for q-strings of widely variable lengths kept outside
the record, but might also have other uses.


String comparison (cmptype%):
=============================

Comparisons may be:

Type 0  Made directly on a character by character basis

Type 1  Made ignoring the case of the letters

Type 2  Made using the value of any embedded numbers

Type 3  Both ignoring the case of letters and using the value of embedded
        numbers.

More detail of the order of characters etc, may be found in the
various QL Concepts manuals, or in the text accompanying my CMP% keyword
at Knoware.no


Examples:
=========

If case% = 1 (case agnostic string comparison) and var% = 14
(variable type Q-string), and if adr1 -> 03,'abc' and
adr2 -> 03,'ABC', then use

        type% = case% * 256 + var%

r% = COMPM%(adr1, adr2, case% * 256 + var%) returns r% = 0 and
r% = COMPM%(adr1, adr2, var%) returns r% = -1 (case% = 0: case sensitive)

C-strings and N-strings can only take 0 and 1 as comparison type codes:
0 => case sensitive and 1 => case agnostic. These are not lexical
comparisons as for the Qdos types above, but straight
character-by-character comparisons.
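
As a rough picture of what skipping and straight comparison mean for C- and
N-strings, here is a Python sketch. It is a model under stated assumptions,
not the toolkit's implementation: the helper names are invented, case folding
is assumed to be plain ASCII folding, and a string that is a prefix of the
other is assumed to compare as the lesser one.

```python
def nth_cstring(mem, addr, skip=0):
    """C-string found after skipping `skip` zero-terminated strings."""
    for _ in range(skip):
        addr = mem.index(0, addr) + 1        # step past the terminator
    return bytes(mem[addr:mem.index(0, addr)])

def nth_nstring(mem, addr, skip=0):
    """N-string (1 length byte + characters) found after skipping `skip`."""
    for _ in range(skip):
        addr += 1 + mem[addr]                # length byte plus characters
    return bytes(mem[addr + 1:addr + 1 + mem[addr]])

def compare_bytes(x, y, ignore_case=False, reverse=False):
    """Return 1, 0 or -1, byte by byte; optionally fold case or reverse."""
    if ignore_case:
        x, y = x.lower(), y.lower()          # assumption: ASCII folding only
    r = (x > y) - (x < y)
    return -r if reverse else r

mem = b"one\0two\0Three\0" + b"ONE\0TWO\0three\0"
x = nth_cstring(mem, 0, skip=2)              # third string after the pointer
y = nth_cstring(mem, 14, skip=2)
print(compare_bytes(x, y))                   # -1 (case sensitive)
print(compare_bytes(x, y, ignore_case=True)) # 0  (case agnostic)
```

With a skip of 2 both pointers end up at the third string, so the comparison
is between 'Three' and 'three'.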


cmptype% example:
-----------------

Variable type   = $10 - c-string = 16
Comparison type =   1 - case agnostic
Skip strings    =   2 - we want the third string after the pointer

In the formula below the result of the comparison can be straight or
reversed.

Straight: (a2z = 0)

  type% = (skip% * 8 + case% - a2z) * 256 + var%

       => type% = 4368 = $1110 = %0001000100010000

If the result is to be reversed (eg for sorting purposes):


Reversed: (a2z = $80 = 128, ie the sign bit of the msb)

  type% = (skip% * 8 + case% - a2z) * 256 + var%

       => type% = -28400 = $9110 = %1001000100010000


For non-string comparisons, to reverse the order of the comparison:

  eg: var% = 4: a2z = $8000 (= 32768, ie the sign bit of the word)

  type% = var% - a2z

       => -32764 = $8004 = %1000000000000100 = signed word reverse comp
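
The formulas in this section can be checked with a short Python sketch. The
function name make_type is made up for the illustration. The reverse flag is
subtracted rather than added because bit 7 of the msb is also the sign bit of
the 16 bit word, so subtracting keeps the result inside the signed integer
range while producing the same bit pattern.

```python
def make_type(var, case=0, skip=0, reverse=False):
    """Pack the fields into a signed 16-bit type% value."""
    if not (0 <= var <= 0xFF and 0 <= case <= 7 and 0 <= skip <= 15):
        raise ValueError("field out of range")
    a2z = 0x80 if reverse else 0             # sign bit of the msb
    return (skip * 8 + case - a2z) * 256 + var

print(make_type(16, case=1, skip=2))                 # 4368
print(make_type(16, case=1, skip=2, reverse=True))   # -28400
print(make_type(4, reverse=True))                    # -32764
```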


Example of use:
===============

Included in the zip with this toolkit is an SBASIC program called Sort_bas.
It is not a complete, stand-alone program, although it works perfectly in the
context where it is being used.

To make it work you need some data. The data would consist of Records, like
in a database. Each record contains a number of fields, such as
First_name, Surname, Address, Country, Telephone, Customer number, etc.
Fields may be of different types - some text, some numeric, some of fixed
length and some of variable length.

Each record is stored somewhere in memory. It may be convenient that records
are neither of fixed size nor stored in consecutive locations in memory. So,
to keep track of them, you need to make an index, which is updated each time
a new record is created or loaded from disk.

So, base is some heap space in memory that contains the index - a series
of long words. Each long word points to (is the address of) the base of a
record in some other heap in memory. Offset from the base of each record
are the different fields. The field offsets and their respective types are
the same for every record.


                      +-------> [rec 0]
                      |         : offs 00 [field 1] (float)
location    index     |         : offs 06 [field 2] (uword)
- - - - -  - - - -    |         : offs 08 [field 3] (qstrg, variable length)
base + 00  [rec 0] ---+
                         +--------------------> [rec 2]
                         |                      : offs 00
base + 04  [rec 1] ------)--+                   : offs 06
                         |  |                   : offs 08
                         |  |
base + 08  [rec 2] ------+  +---> [rec 1]
                                  :
...                               :
                                  :


To find the Q-string of record #1, you start with the index base, which is
known by the program or person that created it, and go to its 1st (one
past the 0th) position: ptr = peek_l(base + 1 * 4). ptr now contains the
address of record #1. Add to that the offset of the Q-string field (which is
known to the creator of the database) to find the Q-string:
qstr$ = peekstr$(ptr + [offs 08])
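
The lookup can be tried out on a small simulated memory. The Python sketch
below is only a model: the bytearray stands in for the heap, the record
addresses and names are invented, the float field is left as six zero bytes,
and peek_l and peek_qstr are stand-ins written for this illustration.

```python
import struct

OFS_UWORD, OFS_QSTR = 6, 8                   # field offsets from the diagram

def peek_l(mem, addr):
    """Read a big-endian long word, as on the 68000."""
    return struct.unpack_from(">L", mem, addr)[0]

def peek_qstr(mem, addr):
    """Read a Q-string: a word length followed by that many characters."""
    (length,) = struct.unpack_from(">H", mem, addr)
    return bytes(mem[addr + 2:addr + 2 + length]).decode("latin-1")

def make_record(number, text):
    data = text.encode("latin-1")
    # 6 byte float placeholder, uword, then the Q-string
    return bytes(6) + struct.pack(">HH", number, len(data)) + data

mem = bytearray(256)
base = 0                                     # the index starts here
addresses = [100, 160, 40]                   # records need not be in order
for n, (addr, name) in enumerate(zip(addresses, ["Smith", "Jones", "Brown"])):
    rec = make_record(n, name)
    mem[addr:addr + len(rec)] = rec
    struct.pack_into(">L", mem, base + n * 4, addr)

ptr = peek_l(mem, base + 1 * 4)              # address of record #1
print(ptr, peek_qstr(mem, ptr + OFS_QSTR))   # 160 Jones
```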

To search for stuff you need to be able to compare fields against your
search criteria. The same applies if you need to order your records for
whatever reason (making searches faster is one very good reason).

COMPM% can help with all that, provided you tell it where your records
are, how much each field is offset from the base of each record, and what
type of data each field contains.

The sort routine, BISM, makes use of that information to perform a simple
Binary Insertion Sort in Memory. It doesn't rearrange any records, but it
does rearrange the pointers to those records (the index) based on the results
of the comparisons.

rem + ------------------------------------------------------------------------ +
rem |<                                  BISM                                  >|
rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
rem |                      Binary Insertion Sort (Memory)                      |
rem |                                                                          |
rem | TAOCP 5.2.1. Modified to sort a long word memory index. The indexes      |
rem | point to records, also in memory. Individual fields are offset from      |
rem | the main record address.                                                 |
rem |                                                                          |
rem |         ix      is the base of the index                                 |
rem |         n       is the number of items in index (0..n-1)                 |
rem |         ofs     is an array of offsets pointing to the various fields    |
rem |                    to be included in the sort                            |
rem |         typ%    is an array of corresponding types for those fields      |
rem |                    In the case of strings the typ also specifies the     |
rem |                    comparison type in the msb of the lsw.                |
rem |                                                                          |
rem | Dependencies: McmpMo% and tk COMPM%                                      |
rem + ------------------------------------------------------------------------ +
rem | V0.01, pjw, 2023 Jul 31                                                  |
rem | V0.02, pjw, 2023 Aug 08, changed to nstrg, COMPM% has new parameters     |
rem + ------------------------------------------------------------------------ +
:
DEFine PROCedure BISM(ix, n, ofs, typ%)
LOCal j, i, sl, t
FOR j = 1 TO n - 1
 rem t$ = arr(j%): i% = j% - 1
 t = PEEK_L(ix + j * 4): i = j - 1
 REPeat sl
  rem IF t$ >= arr(i%): EXIT sl
  IF McmpMo%(t, PEEK_L(ix + i * 4)) >= 0: EXIT sl
  rem arr(i% + 1) = arr(i%)
  POKE_L ix + (i + 1) * 4, PEEK_L(ix + i * 4)
  rem i% = i% - 1: IF i% <= -1: EXIT sl
  i = i - 1: IF i <= -1: EXIT sl
 END REPeat sl
 rem arr(i% + 1) = t$
 POKE_L ix + (i + 1) * 4, t
END FOR j
END DEFine BISM
:
:
rem + ------------------------------------------------------------------------ +
rem |<                                McmpMo%                                 >|
rem + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
rem |         Compare multiple elements stored in memory using offsets         |
rem |                                                                          |
rem | Subroutine of BISM                                                       |
rem |                                                                          |
rem | Compare each element/field in a record: Comparison ends when the         |
rem | result is either "greater" or "less" or until there are no further       |
rem | elements left to compare in which case the result is "equal".            |
rem |                                                                          |
rem | baseX/Y are abs location of pointers to a record                         |
rem | ofs     is an array containing offsets to fields within record           |
rem | typ%    is an array holding the type of each pair of fields              |
rem |                                                                          |
rem | Dependency: COMPM%                                                       |
rem + ------------------------------------------------------------------------ +
rem | V0.01, pjw, 2023 Jul 30                                                  |
rem | V0.02, pjw, 2023 Aug 08, changed to nstrg, COMPM% has new parameters     |
rem + ------------------------------------------------------------------------ +
:
DEFine FuNction McmpMo%(baseX, baseY)
LOCal i%, r%
FOR i% = 0 TO DIMN(ofs)
 r% = COMPM%(baseX + ofs(i%), baseY + ofs(i%), typ%(i%))
 IF r%: RETurn r%:    rem Don't return yet if zero
END FOR i%
RETurn 0:             rem Equals!
END DEFine McmpMo%
:
:

Binary Insertion Sorts are not terribly efficient on lots of unsorted data,
but for a small number of records - or a large number of records that are
nearly in order - they are reasonably fast. For the purpose of illustration,
the main point here is that the routine is simple.
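
For readers who want to experiment away from a QL, the two routines can be
modelled in Python. This is a sketch under assumptions, not a port: compm
below is a toy stand-in for COMPM% that handles only uword fields and
Q-strings with comparison type 0 or 1 (using plain ASCII case folding), the
bytearray stands in for memory, and the sample records are invented. Unlike
the SBASIC listing, ofs and typ are passed on explicitly to the comparison
function.

```python
import struct

UWORD, QSTRG = 6, 14                         # vartype codes from the table

def peek_l(mem, addr):
    return struct.unpack_from(">L", mem, addr)[0]

def poke_l(mem, addr, value):
    struct.pack_into(">L", mem, addr, value)

def peek_qstr(mem, addr):
    (length,) = struct.unpack_from(">H", mem, addr)
    return bytes(mem[addr + 2:addr + 2 + length])

def compm(mem, x, y, comp):
    """Toy stand-in for COMPM%: uword and Q-string (cmptype 0 or 1) only."""
    word = comp & 0xFFFF
    vartype, cmptype, reverse = word & 0xFF, (word >> 8) & 7, bool(word & 0x8000)
    if vartype == UWORD:
        a = struct.unpack_from(">H", mem, x)[0]
        b = struct.unpack_from(">H", mem, y)[0]
    elif vartype == QSTRG and cmptype in (0, 1):
        a, b = peek_qstr(mem, x), peek_qstr(mem, y)
        if cmptype == 1:
            a, b = a.lower(), b.lower()      # assumption: ASCII case folding
    else:
        raise NotImplementedError("not modelled in this sketch")
    r = (a > b) - (a < b)
    return -r if reverse else r

def mcmp(mem, rec_x, rec_y, ofs, typ):
    """Compare field by field; the first non-zero result decides."""
    for offset, comp in zip(ofs, typ):
        r = compm(mem, rec_x + offset, rec_y + offset, comp)
        if r:
            return r
    return 0

def bism(mem, ix, n, ofs, typ):
    """Insertion sort of the n long word pointers starting at ix."""
    for j in range(1, n):
        t = peek_l(mem, ix + j * 4)
        i = j - 1
        while i >= 0 and mcmp(mem, t, peek_l(mem, ix + i * 4), ofs, typ) < 0:
            poke_l(mem, ix + (i + 1) * 4, peek_l(mem, ix + i * 4))
            i -= 1
        poke_l(mem, ix + (i + 1) * 4, t)

# Invented sample data: (uword at offset 6, Q-string at offset 8)
rows = [(3, b"smith"), (1, b"Brown"), (2, b"Smith"), (7, b"brown")]
mem = bytearray(32 + 32 * len(rows))
for k, (number, name) in enumerate(rows):
    addr = 32 + 32 * k
    struct.pack_into(">HH", mem, addr + 6, number, len(name))
    mem[addr + 10:addr + 10 + len(name)] = name
    poke_l(mem, k * 4, addr)                 # index entry k

# Sort by name (case agnostic), then by number in reverse order
bism(mem, 0, len(rows), ofs=[8, 6], typ=[1 * 256 + QSTRG, UWORD - 32768])

order = []
for k in range(len(rows)):
    rec = peek_l(mem, k * 4)
    number = struct.unpack_from(">H", mem, rec + 6)[0]
    order.append((number, peek_qstr(mem, rec + 8).decode("ascii")))
print(order)
```

Only the four index entries change during the sort; the records themselves
stay where they were placed.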


ToDo:
=====

The next step will be to modify my Quicksort toolkit to work with the same
sort of data types as COMPM% does. That should make for a fast and versatile
sorting solution.


Status of This software:
========================

V0.01, pjw, 2023 Jul 28, first release

                  Conditions and DISCLAIMER as per Knoware.no
Generated by QuickHTM, 2023 Sep 19