• tcl with "unsigned int"

    From aotto1968@21:1/5 to All on Mon Aug 22 10:20:49 2022
    Below is a hash function from the Tcl source.
    I want to use its "unsigned int" result as my own hash for internal purposes, so I tried to translate it to Tcl:

    proc pHash { str } {
        set result 0
        if {[string length $str] > 0} {
            scan [string index $str 0] %c result
            Print str@30 result
            for {set idx 1} {$idx < [string length $str]} {incr idx} {
                scan [string index $str $idx] %c value
                set result [expr {$result + ($result << 3) + $value}]
                Print idx result
            }
        }
        format %08X $result
    }

    The result is frustrating:

    pHash SrvMkBufferCreateTLS

    pHash -> str<SrvMkBufferCreateTLS>, | result<83>,
    pHash -> idx<1>, | result<861>,
    pHash -> idx<2>, | result<7867>,
    pHash -> idx<3>, | result<70880>,
    pHash -> idx<4>, | result<638027>,
    pHash -> idx<5>, | result<5742309>,
    pHash -> idx<6>, | result<51680898>,
    pHash -> idx<7>, | result<465128184>,
    pHash -> idx<8>, | result<4186153758>,
    pHash -> idx<9>, | result<37675383923>,
    pHash -> idx<10>, | result<339078455421>,
    pHash -> idx<11>, | result<3051706098856>,
    pHash -> idx<12>, | result<27465354889818>,
    pHash -> idx<13>, | result<247188194008463>,
    pHash -> idx<14>, | result<2224693746076264>,
    pHash -> idx<15>, | result<20022243714686492>,
    pHash -> idx<16>, | result<180200193432178529>,
    pHash -> idx<17>, | result<1621801740889606845>,
    pHash -> idx<18>, | result<14596215668006461681>,
    pHash -> idx<19>, | result<131365941012058155212>,
    1F11935808A528CC

    The CORE problem is that in C the result is limited to 32 bits (unsigned int), and an ADD to a 32-bit integer
    that overflows the upper bound wraps around and starts again at 0. The number will ALWAYS be restricted to 32 bits.

    → *But* in Tcl the result just keeps growing and is *not* restricted to 32 bits.

    Question: how do I add REAL 32-bit unsigned integers in Tcl?

    Kind regards



    unsigned int
    TclHashObjKey(
        Tcl_HashTable *tablePtr,    /* Hash table. */
        void *keyPtr)               /* Key from which to compute hash value. */
    {
        Tcl_Obj *objPtr = keyPtr;
        int length;
        const char *string = TclGetStringFromObj(objPtr, &length);
        unsigned int result = 0;

        /*
         * I tried a zillion different hash functions and asked many other people
         * for advice. Many people had their own favorite functions, all
         * different, but no-one had much idea why they were good ones. I chose
         * the one below (multiply by 9 and add new character) because of the
         * following reasons:
         *
         * 1. Multiplying by 10 is perfect for keys that are decimal strings, and
         *    multiplying by 9 is just about as good.
         * 2. Times-9 is (shift-left-3) plus (old). This means that each
         *    character's bits hang around in the low-order bits of the hash value
         *    for ever, plus they spread fairly rapidly up to the high-order bits
         *    to fill out the hash value. This seems works well both for decimal
         *    and non-decimal strings.
         *
         * Note that this function is very weak against malicious strings; it's
         * very easy to generate multiple keys that have the same hashcode. On the
         * other hand, that hardly ever actually occurs and this function *is*
         * very cheap, even by comparison with industry-standard hashes like FNV.
         * If real strength of hash is required though, use a custom hash based on
         * Bob Jenkins's lookup3(), but be aware that it's significantly slower.
         * Tcl does not use that level of strength because it typically does not
         * need it (and some of the aspects of that strength are genuinely
         * unnecessary given the rest of Tcl's hash machinery, and the fact that
         * we do not either transfer hashes to another machine, use them as a true
         * substitute for equality, or attempt to minimize work in rebuilding the
         * hash table).
         *
         * See also HashStringKey in tclHash.c.
         * See also HashString in tclLiteral.c.
         *
         * See [tcl-Feature Request #2958832]
         */

        if (length > 0) {
            result = UCHAR(*string);
            while (--length) {
                result += (result << 3) + UCHAR(*++string);
            }
        }
        return result;
    }

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From aotto1968@21:1/5 to Ralf Fassel on Mon Aug 22 11:09:07 2022
    On 22.08.22 10:58, Ralf Fassel wrote:
    > * aotto1968 <[email protected]>
    > | The CORE problem is that in C the result is limited to 32 bits (unsigned int), and an ADD to a 32-bit integer
    > | that overflows the upper bound wraps around and starts again at 0. The number will ALWAYS be restricted to 32 bits.
    >
    > | → *But* in Tcl the result just keeps growing and is *not* restricted to 32 bits.
    >
    > | Question: how do I add REAL 32-bit unsigned integers in Tcl?
    >
    > I'd say you need an "& 0xffffffff" somewhere in your TCL expr code to
    > limit to 32bits. Finding the correct position for that is left as an
    > exercise ;-)
    >
    > R'

    I chose

    set result [expr {($result + ($result << 3) + $value) & 0xffffffff}]

    and hope it will be fine.
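
    A minimal sketch of the whole proc with that mask applied at every step. The name pHash32 is hypothetical, and the debugging Print calls from the first post are left out. Starting from 0 gives the same result as seeding with the first character, because 0 + (0 << 3) + value is just value.

```tcl
# Sketch of a 32-bit variant of the posted proc; pHash32 is a hypothetical name.
proc pHash32 {str} {
    set result 0
    foreach ch [split $str ""] {
        # scan %c yields the code point of the character (one value per char)
        scan $ch %c value
        # result * 9 + value, truncated to 32 bits like C's unsigned int
        set result [expr {($result + ($result << 3) + $value) & 0xffffffff}]
    }
    return [format %08X $result]
}

puts [pHash32 SrvMkBufferCreateTLS]   ;# prints 08A528CC
```

    The printed value is the low 32 bits of the unmasked output 1F11935808A528CC shown in the first post. For plain ASCII keys this should match what the C function returns.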

  • From Ralf Fassel@21:1/5 to All on Mon Aug 22 10:58:12 2022
    * aotto1968 <[email protected]>
    | The CORE problem is that in C the result is limited to 32 bits (unsigned int), and an ADD to a 32-bit integer
    | that overflows the upper bound wraps around and starts again at 0. The number will ALWAYS be restricted to 32 bits.

    | → *But* in Tcl the result just keeps growing and is *not* restricted to 32 bits.

    | Question: how do I add REAL 32-bit unsigned integers in Tcl?

    I'd say you need an "& 0xffffffff" somewhere in your TCL expr code to
    limit to 32bits. Finding the correct position for that is left as an
    exercise ;-)

    R'

  • From Ralf Fassel@21:1/5 to All on Mon Aug 22 16:51:03 2022
    * aotto1968 <[email protected]>
    | > I'd say you need an "& 0xffffffff" somewhere in your TCL expr code to
    | > limit to 32bits. Finding the correct position for that is left as an
    | > exercise ;-)
    | >
    | I chose

    | set result [expr {($result + ($result << 3) + $value) & 0xffffffff}]

    | and hope it will be fine.

    Note that Tcl's "scan %c" does something different from the C function for
    chars > 127 (notably Unicode): in Tcl you get _one_ value per character
    (which may well be larger than 255), while in C you get two or more
    bytes, depending on the Unicode representation. If you need an exact
    replica of the C function, you probably need to get at the individual
    bytes at the Tcl level.
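
    One way to get at the individual bytes, sketched under an assumption: the string is converted to plain UTF-8 with "encoding convertto" and split into unsigned byte values with "binary scan". Tcl's internal string representation is not guaranteed to be byte-identical to plain UTF-8 in every case (NUL characters are one known difference), so treat this as an approximation of what the C code sees. The name pHashBytes is hypothetical.

```tcl
# Sketch: hash the UTF-8 bytes of a string instead of its code points.
# pHashBytes is a hypothetical name, not part of Tcl.
proc pHashBytes {str} {
    # Assumption: plain UTF-8 is close enough to the bytes the C code hashes
    set bytes [encoding convertto utf-8 $str]
    set values {}
    # cu* = every remaining byte as an unsigned 8-bit integer (0..255)
    binary scan $bytes cu* values
    set result 0
    foreach value $values {
        set result [expr {($result + ($result << 3) + $value) & 0xffffffff}]
    }
    return [format %08X $result]
}

puts [pHashBytes SrvMkBufferCreateTLS]   ;# ASCII only: prints 08A528CC
puts [pHashBytes \u00e4]                 ;# bytes C3 A4: prints 0000077F
```

    For ASCII-only keys the byte values and the code points are the same, so the result does not change. For U+00E4 the two bytes give 195 * 9 + 164 = 1919 (0x77F), whereas hashing the single code point would give 0xE4.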

    Also I'm not sure whether you need another "&..." after the left shift, since
    in C "result << 3" will probably also overflow. I don't know if this
    matters for the final result or for its use as a hash value (it's been a while
    since I did bit shifting).
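
    On the question of a second mask: addition and left shift both behave consistently modulo 2^32, so discarding the high bits once at the end of each step gives the same low 32 bits as discarding them after the shift as well. A small sketch that checks this on a few values where the shift overflows 32 bits (the proc names are hypothetical):

```tcl
# One hash step with a single mask at the end
proc stepMaskOnce {result value} {
    expr {($result + ($result << 3) + $value) & 0xffffffff}
}

# The same step with an extra mask directly after the shift
proc stepMaskTwice {result value} {
    expr {($result + (($result << 3) & 0xffffffff) + $value) & 0xffffffff}
}

# Compare both variants, including inputs whose shift exceeds 32 bits
foreach r {0 83 4186153758 0xffffffff 0x80000001} {
    foreach v {0 65 255} {
        if {[stepMaskOnce $r $v] != [stepMaskTwice $r $v]} {
            error "mismatch for result=$r value=$v"
        }
    }
}
puts "both variants agree"
```

    Because Tcl integers grow as needed, nothing is lost before the mask is applied, which is why the single mask is sufficient here.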

    R'
