• faster Lsearch speed wanted

    From [email protected]@21:1/5 to All on Thu Nov 3 01:28:28 2022
    Hi all

    I am analysing a large set of numbers and need to improve the speed ...

    I hope someone can give me a hint, or suggest another way to search ....

    The code is listed below.


    set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]

    ##### each number +id %39
    proc adv_chg_no_range4 {id lst} {
    set new_lst [lindex $lst 0]
    for {set i 1} {$i < [llength $lst]} {incr i} {
    set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
    if {$tmp_val == 0} {set tmp_val 39}
    lappend new_lst $tmp_val
    }
    return $new_lst
    }




    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x
    }

    }

    ]

    puts "###combine ##"

    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all [adv_chg_no_range4 1 $lst] $x
    }

    }

    ]\n

    puts "#### regexp"
    puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
    puts "###combine ##"
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}]


    output:

    328 microseconds per iteration
    ###combine ##
    4928 microseconds per iteration

    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration


    As the output shows, the single -regexp call performs worse than the loop ....
    but when combined with the other proc it seems better .. Why?

    Could someone point me in a direction to overcome the speed issue?

    thanks
    Rolance

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From [email protected]@21:1/5 to All on Thu Nov 3 04:05:34 2022
    On Thursday, November 3, 2022 at 11:25:02 PM [UTC+13], heinrichmartin wrote:
    On Thursday, November 3, 2022 at 9:28:30 AM UTC+1, Rolance wrote:
    Hi all

    mass numbers analysis , need improve speed ...

    hope someone can give me some hint ,or other way search ....
    Are you looking for the fastest code to solve a problem or for a better understanding of the observed timing.
    If it is the first, then please also state the actual problem.

    Anyway, here are a few comments (hopefully correct, but most likely not covering all aspects).
    set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]
    * $v is always 1 and that factor can be removed from expr.
    * The argument to expr should be braced.
    * "set v" is unnecessary; {expr {int([::tcl::mathfunc::rand]*39)}} will do.
    * You could try whether a simple for-loop is faster than lrepeat+lmap. Tcl increases the internal list size progressively.
    ##### each number +id %39
    proc adv_chg_no_range4 {id lst} {
    I guess there is a reason for a proc, i.e. why are you not creating these values straight away.
    set new_lst [lindex $lst 0]
    The comment says "each number", but you are not processing the first one.
    for {set i 1} {$i < [llength $lst]} {incr i} {
    You seem to know about lmap, why not here?
    set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
    You could [incr id -1] once in advance (inside the proc before the loop).
    if {$tmp_val == 0} {set tmp_val 39}
    You could time whether [expr {(($v+$id)%-39)+39}] is faster to obtain the range 1~39.

    Or the other way round, perform $id%39 in advance and do a conditional range shift in the loop, i.e.
    if {$tmp_val > 39} {incr tmp_val -39} {set tmp_val} ;# assuming lmap

    Please cross-check correctness on your own, i.e. do not blindly trust me who does not know your actual problem statement.
    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x
    Should that be -integer instead of -exact?
    Internally, you are creating string reps of the list items here.
    puts "###combine ##"

    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all [adv_chg_no_range4 1 $lst] $x
    Is there a reason for calling the proc in every loop?
    puts "#### regexp"
    puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
    puts "###combine ##"
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}
    Here, you definitely have a single call to the proc.


    output:

    328 microseconds per iteration
    ###combine ##
    4928 microseconds per iteration
    This could include compile time of the proc. Do the timing more than once (or call the proc once in advance).
    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration


    as the output use single proc -regexp less perfomance ....
    Exact string comparison is way faster. Even glob-matching is faster, e.g. Expect creates glob-gatekeepers for regexp-matching.
    You could try whether {^(?:x1|x2|...)$} matches faster (i.e. if the state-machine is simpler), but in the end the above statement will stand.
    but combine with other proc seem better .. Why....
    could someone give me direction to overcome speed issue
    Double the time for 5 times the calls to the proc seems to proof different. You will find out in a cross-check.
    In the end, most of my comments turn out to be irrelevant for that final finding ... Maybe they help anyway.

    hi heinrichmartin

    thanks for your advice

    1. I am looking for the fastest code for a large amount of data; the run time needs to be cut by at least half.
    This code is part of a project, and the part that affects efficiency most is the lsearch command.
    I have already tried several ways to get the best search time .....
    Or do you have a better way to achieve this, such as dict or array search code, for reference?

    2. Why use (a) --> foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x
    and not (b) ---> "Should that be -integer instead of -exact?"

    (a): 351 microseconds per iteration <<<<
    (b): 583 microseconds per iteration

    3. Single proc vs. combined proc: the time consumed is not proportional ....
    I am still trying to find out what affects memory usage; the simplest code does not necessarily give the best performance ...
    5 times faster than the one-line (-regexp) code ....

    thanks
    Rolance

  • From heinrichmartin@21:1/5 to Rolance on Thu Nov 3 03:25:00 2022
    On Thursday, November 3, 2022 at 9:28:30 AM UTC+1, Rolance wrote:
    Hi all

    mass numbers analysis , need improve speed ...

    hope someone can give me some hint ,or other way search ....

    Are you looking for the fastest code to solve a problem or for a better understanding of the observed timing?
    If it is the former, then please also state the actual problem.

    Anyway, here are a few comments (hopefully correct, but most likely not covering all aspects).

    set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]

    * $v is always 1 and that factor can be removed from expr.
    * The argument to expr should be braced.
    * "set v" is unnecessary; {expr {int([::tcl::mathfunc::rand]*39)}} will do.
    * You could try whether a simple for-loop is faster than lrepeat+lmap. Tcl increases the internal list size progressively.

    ##### each number +id %39
    proc adv_chg_no_range4 {id lst} {

    I guess there is a reason for a proc, i.e. why are you not creating these values straight away.

    set new_lst [lindex $lst 0]

    The comment says "each number", but you are not processing the first one.

    for {set i 1} {$i < [llength $lst]} {incr i} {

    You seem to know about lmap, why not here?

    set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]

    You could [incr id -1] once in advance (inside the proc before the loop).

    if {$tmp_val == 0} {set tmp_val 39}

    You could time whether [expr {(($v+$id)%-39)+39}] is faster to obtain the range 1~39.

    Or the other way round, perform $id%39 in advance and do a conditional range shift in the loop, i.e.
    if {$tmp_val > 39} {incr tmp_val -39} {set tmp_val} ;# assuming lmap

    Please cross-check correctness on your own, i.e. do not blindly trust me who does not know your actual problem statement.
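
    The suggestions above (lmap, a single [incr id -1], and the negative modulus) can be combined into a short sketch. The proc name shift_mod39 is hypothetical, and the sketch assumes a non-empty list whose first element is passed through unchanged, as in adv_chg_no_range4. Treat it as something to time and cross-check, not as a drop-in replacement.

```tcl
# Hypothetical rewrite of adv_chg_no_range4; assumes a non-empty list.
proc shift_mod39 {id lst} {
    # Fold the constant -1 into id once, outside the loop.
    incr id -1
    # In Tcl the result of % takes the sign of the divisor, so
    # (x % -39) lies in -38..0 and adding 39 maps it to 1..39.
    set rest [lmap v [lrange $lst 1 end] {
        expr {(($v + $id) % -39) + 39}
    }]
    # The first element is passed through unchanged.
    return [linsert $rest 0 [lindex $lst 0]]
}

puts [shift_mod39 1 {7 1 39 5 0 38}]
puts [shift_mod39 3 {7 1 39 5 0 38}]
```

    Whether this is actually faster than the indexed for-loop depends on the Tcl version and the data, so it is worth measuring both.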



    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x

    Should that be -integer instead of -exact?
    Internally, you are creating string reps of the list items here.

    puts "###combine ##"

    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all [adv_chg_no_range4 1 $lst] $x

    Is there a reason for calling the proc in every loop?

    puts "#### regexp"
    puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
    puts "###combine ##"
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}

    Here, you definitely have a single call to the proc.



    output:

    328 microseconds per iteration
    ###combine ##
    4928 microseconds per iteration

    This could include compile time of the proc. Do the timing more than once (or call the proc once in advance).

    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration


    as the output use single proc -regexp less perfomance ....

    Exact string comparison is way faster. Even glob-matching is faster, e.g. Expect creates glob-gatekeepers for regexp-matching.
    You could try whether {^(?:x1|x2|...)$} matches faster (i.e. if the state-machine is simpler), but in the end the above statement will stand.

    but combine with other proc seem better .. Why....
    could someone give me direction to overcome speed issue

    Double the time for 5 times the calls to the proc seems to prove otherwise. You will find out in a cross-check.
    In the end, most of my comments turn out to be irrelevant for that final finding ... Maybe they help anyway.
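
    A possible cross-check, as a sketch: call the proc once before timing, give [time] an iteration count, and compare calling the proc per search value against building the shifted list once. The proc is the one from the original post; the iteration count of 200 and the use of -exact in both variants are assumptions made for the comparison.

```tcl
# Test data as in the original post, with the expr argument braced.
set lst [lmap v [lrepeat 3125 1] {expr {int(rand() * 39)}}]

# The proc from the original post, unchanged in behaviour.
proc adv_chg_no_range4 {id lst} {
    set new_lst [lindex $lst 0]
    for {set i 1} {$i < [llength $lst]} {incr i} {
        set tmp_val [expr {([lindex $lst $i] + $id - 1) % 39}]
        if {$tmp_val == 0} {set tmp_val 39}
        lappend new_lst $tmp_val
    }
    return $new_lst
}

# Warm-up call so that byte-compilation is not part of the measurement.
adv_chg_no_range4 1 $lst

# Variant A: the proc is called once per search value (5 calls).
puts "proc per search: [time {
    foreach x {1 12 18 5 6} {
        lsearch -all -exact [adv_chg_no_range4 1 $lst] $x
    }
} 200]"

# Variant B: the shifted list is built once and then searched 5 times.
puts "proc hoisted:    [time {
    set shifted [adv_chg_no_range4 1 $lst]
    foreach x {1 12 18 5 6} {
        lsearch -all -exact $shifted $x
    }
} 200]"
```

    The absolute numbers will differ between machines and Tcl versions; only the ratio between the two variants is of interest.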

  • From heinrichmartin@21:1/5 to Rolance on Thu Nov 3 04:41:51 2022
    On Thursday, November 3, 2022 at 12:05:37 PM UTC+1, Rolance wrote:
    On Thursday, November 3, 2022 at 11:25:02 PM [UTC+13], heinrichmartin wrote:
    On Thursday, November 3, 2022 at 9:28:30 AM UTC+1, Rolance wrote:
    Hi all

    mass numbers analysis , need improve speed ...

    hope someone can give me some hint ,or other way search ....
    Are you looking for the fastest code to solve a problem or for a better understanding of the observed timing.
    If it is the first, then please also state the actual problem.

    Anyway, here are a few comments (hopefully correct, but most likely not covering all aspects).
    set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]
    * $v is always 1 and that factor can be removed from expr.
    * The argument to expr should be braced.
    * "set v" is unnecessary; {expr {int([::tcl::mathfunc::rand]*39)}} will do.
    * You could try whether a simple for-loop is faster than lrepeat+lmap. Tcl increases the internal list size progressively.
    ##### each number +id %39
    proc adv_chg_no_range4 {id lst} {
    I guess there is a reason for a proc, i.e. why are you not creating these values straight away.
    set new_lst [lindex $lst 0]
    The comment says "each number", but you are not processing the first one.
    for {set i 1} {$i < [llength $lst]} {incr i} {
    You seem to know about lmap, why not here?
    set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
    You could [incr id -1] once in advance (inside the proc before the loop).
    if {$tmp_val == 0} {set tmp_val 39}
    You could time whether [expr {(($v+$id)%-39)+39}] is faster to obtain the range 1~39.

    Or the other way round, perform $id%39 in advance and do a conditional range shift in the loop, i.e.
    if {$tmp_val > 39} {incr tmp_val -39} {set tmp_val} ;# assuming lmap

    Please cross-check correctness on your own, i.e. do not blindly trust me who does not know your actual problem statement.
    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x
    Should that be -integer instead of -exact?
    Internally, you are creating string reps of the list items here.
    puts "###combine ##"

    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all [adv_chg_no_range4 1 $lst] $x
    Is there a reason for calling the proc in every loop?
    puts "#### regexp"
    puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
    puts "###combine ##"
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}
    Here, you definitely have a single call to the proc.


    output:

    328 microseconds per iteration
    ###combine ##
    4928 microseconds per iteration
    This could include compile time of the proc. Do the timing more than once (or call the proc once in advance).
    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration


    as the output use single proc -regexp less perfomance ....
    Exact string comparison is way faster. Even glob-matching is faster, e.g. Expect creates glob-gatekeepers for regexp-matching.
    You could try whether {^(?:x1|x2|...)$} matches faster (i.e. if the state-machine is simpler), but in the end the above statement will stand.
    but combine with other proc seem better .. Why....
    could someone give me direction to overcome speed issue
    Double the time for 5 times the calls to the proc seems to proof different. You will find out in a cross-check.
    In the end, most of my comments turn out to be irrelevant for that final finding ... Maybe they help anyway.
    hi heinrichmartin

    thanks for your advice

    Not replying inline makes it hard for me to set your points in context.

    1. I am looking for the fastest code for a large amount of data; the run time needs to be cut by at least half.
    This code is part of a project, and the part that affects efficiency most is the lsearch command.
    I have already tried several ways to get the best search time .....
    Or do you have a better way to achieve this, such as dict or array search code, for reference?

    lsearch requires O(N) unless you can sort the list in advance WLOG. Dict and array (when using the data as keys) make the values unique; if that was acceptable, then why are you using -all? Besides, lsearch -integer should be unbeatable.

    2. Why use (a) --> foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x
    and not (b) ---> "Should that be -integer instead of -exact?"

    (a): 351 microseconds per iteration <<<<
    (b): 583 microseconds per iteration

    My bad, -integer requires -exact (or -sorted) to take effect and that is not implied. (We had that here on clt; it is counter-intuitive.)

    3. Single proc vs. combined proc: the time consumed is not proportional ....
    I am still trying to find out what affects memory usage; the simplest code does not necessarily give the best performance ...
    5 times faster than the one-line (-regexp) code ....

    Let me comment with a quote:

    On Thursday, August 13, 2020 at 9:07:54 AM UTC+2, Arjen wrote:
    [...] I do know that such benchmarks are notoriously difficult to get right. [...]

    And you seem to have missed many very basic hints about possible improvements wrt Tcl/coding/algorithms in my previous answer ...
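
    On the dict question above, one option that has not been proposed in this thread would be to walk the list once and record every position per value. All positions are kept, so the uniqueness concern does not apply. The helper names index_positions and positions_of are hypothetical, and this is a sketch to be timed against lsearch, not a claim that it is faster for your data.

```tcl
# Hypothetical helper: map each value to the list of its positions.
proc index_positions {lst} {
    set idx [dict create]
    set i 0
    foreach v $lst {
        # dict lappend creates the key on first use.
        dict lappend idx $v $i
        incr i
    }
    return $idx
}

# Look up all positions of a value; empty list if the value is absent.
proc positions_of {idx value} {
    if {[dict exists $idx $value]} {
        return [dict get $idx $value]
    }
    return {}
}

set lst {5 1 12 5 6 1}
set idx [index_positions $lst]
foreach x {1 12 18 5 6} {
    puts "$x -> [positions_of $idx $x]"
}
```

    The single pass costs O(N) like one lsearch -all, so it can only pay off when several values are looked up in the same list.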

  • From heinrichmartin@21:1/5 to heinrichmartin on Thu Nov 3 04:49:31 2022
    On Thursday, November 3, 2022 at 12:41:53 PM UTC+1, heinrichmartin wrote:
    On Thursday, November 3, 2022 at 12:05:37 PM UTC+1, Rolance wrote:
    3. Single proc vs. combined proc: the time consumed is not proportional ....
    I am still trying to find out what affects memory usage; the simplest code does not necessarily give the best performance ...
    5 times faster than the one-line (-regexp) code ....
    Let me comment with a quote:

    On Thursday, August 13, 2020 at 9:07:54 AM UTC+2, Arjen wrote:
    [...] I do know that such benchmarks are notoriously difficult to get right. [...]

    Btw, here is my quick take on it (including a few cross-checks):
    expect:~$ set l [lmap x [lrepeat 10000 1] {expr {int(rand()*39)}}]; puts foo
    foo
    expect:~$ ll $l
    10000
    expect:~$ lindex $l 25
    22
    expect:~$ ::tcl::mathfunc::max {*}$l
    38
    expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000
    230.131 microseconds per iteration
    expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
    193.889 microseconds per iteration
    expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000
    230.803 microseconds per iteration
    expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
    193.114 microseconds per iteration
    expect:~$ set s [lsort -integer $l]; puts foo
    foo
    expect:~$ time {lsearch -all -sorted -exact $s [expr {int(rand()*39)}]} 1000
    170.89 microseconds per iteration
    expect:~$ set s [lsort -integer $l]; puts foo
    foo
    expect:~$ time {lsearch -all -sorted -integer $s [expr {int(rand()*39)}]} 1000
    103.633 microseconds per iteration
    expect:~$ time {lsearch -all -integer -exact $s [expr {int(rand()*39)}]} 1000
    104.209 microseconds per iteration
    expect:~$ time {lsearch -all -integer -exact $l [expr {int(rand()*39)}]} 1000
    106.351 microseconds per iteration
    expect:~$

  • From Rich@21:1/5 to [email protected] on Thu Nov 3 14:09:14 2022
    [email protected] <[email protected]> wrote:
    Hi all

    mass numbers analysis , need improve speed ...

    hope someone can give me some hint ,or other way search ....

    As no one here knows what you are trying to achieve, we are limited in
    our ability to offer suggestions.

    code list below


    set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]

    ##### each number +id %39
    proc adv_chg_no_range4 {id lst} {
    set new_lst [lindex $lst 0]
    for {set i 1} {$i < [llength $lst]} {incr i} {
    set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
    if {$tmp_val == 0} {set tmp_val 39}
    lappend new_lst $tmp_val
    }
    return $new_lst
    }




    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x
    }

    }

    ]

    puts "###combine ##"

    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all [adv_chg_no_range4 1 $lst] $x
    }

    }

    ]\n

    puts "#### regexp"
    puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
    puts "###combine ##"
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}


    output:

    328 microseconds per iteration
    ###combine ##
    4928 microseconds per iteration

    Unsurprising here. The first does 5 searches over a static list.

    The second makes 5 calls into a proc that iterates the entire list in a for loop, returning a new list, and that new list is then searched.

    Iterating the entire list, in Tcl, to generate a new list is where most
    of this time difference is being consumed.

    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration

    Also unsurprising, searches using the regex engine will be slower than
    searches using simple comparisons, because the regex engine does far
    more than simple comparisons.

    The time difference between the static list and the newly generated
    list via a proc comments above also apply here.

    could someone give me direction to overcome speed issue

    If your issue is lsearch speed, then:

    1) only search a static list (i.e., don't recreate the list first
    before searching it)

    2) if your desired results can also be obtained from searching a sorted
    list, then first sort the list (sort only once, then search plural
    times). Using the -sorted option to lsearch causes lsearch to
    perform a binary search of the list, which will be faster than
    without -sorted, which causes lsearch to perform a linear search
    (start at first element, look at each in sequence until found).

    But you've failed to describe your actual problem. You've shown a
    solution that does not work for you speed wise, but not described to us
    what you are trying to achieve by this solution you've given. It is
    quite possible there is some alternative way, without using lsearch, to
    achieve your desired result, but we can't read your mind over Usenet to
    know what your actual problem is to even be able to consider some
    alternate.
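
    As a sketch of point 2: as far as I read the lsearch manual page, -sorted is treated exactly like -exact when it is combined with -all, so the binary search only applies to single-match lookups. If counts per value are enough (an assumption, since the actual requirement has not been stated), the first and last position in the sorted list can each be found by binary search. The helper name count_sorted is hypothetical, and -bisect assumes Tcl 8.6.

```tcl
set lst {5 1 12 5 6 1 5}
# Sort once, then reuse the sorted list for every query.
set sorted [lsort -integer $lst]

# Hypothetical helper: count occurrences of x in an increasing integer list.
proc count_sorted {sorted x} {
    # Binary search for the leftmost occurrence (-1 if absent).
    set first [lsearch -sorted -integer $sorted $x]
    if {$first < 0} {return 0}
    # -bisect returns the last index whose element is <= x.
    set last [lsearch -bisect -integer $sorted $x]
    return [expr {$last - $first + 1}]
}

foreach x {1 12 18 5 6} {
    puts "$x occurs [count_sorted $sorted $x] time(s)"
}
```

    Note that the indices refer to the sorted list, not to the original one, so this does not help if the original positions are needed.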

  • From [email protected]@21:1/5 to All on Thu Nov 3 12:25:22 2022
    On Friday, November 4, 2022 at 3:09:19 AM [UTC+13], Rich wrote:
    [email protected] <[email protected]> wrote:
    Hi all

    mass numbers analysis , need improve speed ...

    hope someone can give me some hint ,or other way search ....
    As no one here knows what you are trying to achieve, we are limited in
    our ability to offer suggestions.
    code list below


    set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]

    ##### each number +id %39
    proc adv_chg_no_range4 {id lst} {
    set new_lst [lindex $lst 0]
    for {set i 1} {$i < [llength $lst]} {incr i} {
    set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
    if {$tmp_val == 0} {set tmp_val 39}
    lappend new_lst $tmp_val
    }
    return $new_lst
    }




    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x
    }

    }

    ]

    puts "###combine ##"

    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all [adv_chg_no_range4 1 $lst] $x
    }

    }

    ]\n

    puts "#### regexp"
    puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
    puts "###combine ##"
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}


    output:

    328 microseconds per iteration
    ###combine ##
    4928 microseconds per iteration
    Unsurprising here. The first does 5 searches over a static list.

    The second makes 5 calls into a proc, that iterates the entire list via foreach, returning a new list, and that new list is then searched.

    Iterating the entire list, in Tcl, to generate a new list is where most
    of this time difference is being consumed.
    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration
    Also unsurprising, searches using the regex engine will be slower than searches using simple comparisons, because the regex engine does far
    more than simple comparisons.

    The time difference between the static list and the newly generated
    list via a proc comments above also apply here.
    could someone give me direction to overcome speed issue
    If your issue is lsearch speed, then:

    1) only search a static list (i.e., don't recreate the list first
    before searching it)

    2) if your desired results can also be obtained from searching a sorted list, then first sort the list (sort only once, then search plural
    times). Using the -sorted option to lsearch causes lsearch to
    perform a binary search of the list, which will be faster than
    without -sorted, which causes lsearch to perform a linear search
    (start at first element, look at each in sequence until found).

    But you've failed to describe your actual problem. You've shown a
    solution that does not work for you speed wise, but not described to us
    what you are trying to achieve by this solution you've given. It is
    quite possible there is some alternative way, without using lsearch, to achieve your desired result, but we can't read your mind over Usenet to
    know what your actual problem is to even be able to consider some
    alternate.

    Hi Rich

    thanks for your advice
    It is possibly related to a memory usage issue: I see the running program taking more and more memory after running for a long time ...
    I will try all your suggestions later in the real program and give you a detailed reply.

    BR
    Rolance

  • From [email protected]@21:1/5 to All on Thu Nov 3 12:19:16 2022
    On Friday, November 4, 2022 at 12:49:33 AM [UTC+13], heinrichmartin wrote:
    On Thursday, November 3, 2022 at 12:41:53 PM UTC+1, heinrichmartin wrote:
    On Thursday, November 3, 2022 at 12:05:37 PM UTC+1, Rolance wrote:
    3. Single proc vs. combined proc: the time consumed is not proportional ....
    I am still trying to find out what affects memory usage; the simplest code does not necessarily give the best performance ...
    5 times faster than the one-line (-regexp) code ....
    Let me comment with a quote:

    On Thursday, August 13, 2020 at 9:07:54 AM UTC+2, Arjen wrote:
    [...] I do know that such benchmarks are notoriously difficult to get right. [...]
    Btw, here is my quick take on it (including a few cross-checks):
    expect:~$ set l [lmap x [lrepeat 10000 1] {expr {int(rand()*39)}}]; puts foo
    foo
    expect:~$ ll $l
    10000
    expect:~$ lindex $l 25
    22
    expect:~$ ::tcl::mathfunc::max {*}$l
    38
    expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000
    230.131 microseconds per iteration
    expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
    193.889 microseconds per iteration
    expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000
    230.803 microseconds per iteration
    expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
    193.114 microseconds per iteration
    expect:~$ set s [lsort -integer $l]; puts foo
    foo
    expect:~$ time {lsearch -all -sorted -exact $s [expr {int(rand()*39)}]} 1000
    170.89 microseconds per iteration
    expect:~$ set s [lsort -integer $l]; puts foo
    foo
    expect:~$ time {lsearch -all -sorted -integer $s [expr {int(rand()*39)}]} 1000
    103.633 microseconds per iteration
    expect:~$ time {lsearch -all -integer -exact $s [expr {int(rand()*39)}]} 1000
    104.209 microseconds per iteration
    expect:~$ time {lsearch -all -integer -exact $l [expr {int(rand()*39)}]} 1000
    106.351 microseconds per iteration
    expect:~$


    hi heinrichmartin

    thank for your detail reply and try result

    1. The data comes from a large data file of over 10G. Each data line is prefixed with a title id, which is why the [lindex 0] element is not processed.
    The data has to come from another source.
    2. I need lsearch -all to get the positions of all matching values. I have already tried -inline and -regexp; the overall performance was worse than with plain -all.

    3. I have already looked at Tcl/coding/algorithms and improved my program before. I think it is the best solution, but the customer is not satisfied with the speed,
    so I need the best performance from each proc ....
    lsearch meets all the requirements; or do you have a suggestion for a command to replace it with better performance?

    4. I will try all your suggestions later and report the result.

    thanks
    BR
    Rolance

  • From [email protected]@21:1/5 to All on Fri Nov 4 02:43:41 2022
    On Friday, November 4, 2022 at 3:09:19 AM [UTC+13], Rich wrote:
    [email protected] <[email protected]> wrote:
    Hi all

    mass numbers analysis , need improve speed ...

    hope someone can give me some hint ,or other way search ....
    As no one here knows what you are trying to achieve, we are limited in
    our ability to offer suggestions.
    code list below


    set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]

    ##### each number +id %39
    proc adv_chg_no_range4 {id lst} {
    set new_lst [lindex $lst 0]
    for {set i 1} {$i < [llength $lst]} {incr i} {
    set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
    if {$tmp_val == 0} {set tmp_val 39}
    lappend new_lst $tmp_val
    }
    return $new_lst
    }




    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all -exact $lst $x
    }

    }

    ]

    puts "###combine ##"

    puts [time {

    foreach x {1 12 18 5 6} {
    lsearch -all [adv_chg_no_range4 1 $lst] $x
    }

    }

    ]\n

    puts "#### regexp"
    puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
    puts "###combine ##"
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}


    output:

    328 microseconds per iteration
    ###combine ##
    4928 microseconds per iteration
    Unsurprising here. The first does 5 searches over a static list.

    The second makes 5 calls into a proc, that iterates the entire list via foreach, returning a new list, and that new list is then searched.

    Iterating the entire list, in Tcl, to generate a new list is where most
    of this time difference is being consumed.
    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration
    Also unsurprising, searches using the regex engine will be slower than searches using simple comparisons, because the regex engine does far
    more than simple comparisons.

    The time difference between the static list and the newly generated
    list via a proc comments above also apply here.
    could someone give me direction to overcome speed issue
    If your issue is lsearch speed, then:

    1) only search a static list (i.e., don't recreate the list first
    before searching it)

    2) if your desired results can also be obtained from searching a sorted list, then first sort the list (sort only once, then search plural
    times). Using the -sorted option to lsearch causes lsearch to
    perform a binary search of the list, which will be faster than
    without -sorted, which causes lsearch to perform a linear search
    (start at first element, look at each in sequence until found).

    But you've failed to describe your actual problem. You've shown a
    solution that does not work for you speed wise, but not described to us
    what you are trying to achieve by this solution you've given. It is
    quite possible there is some alternative way, without using lsearch, to achieve your desired result, but we can't read your mind over Usenet to
    know what your actual problem is to even be able to consider some
    alternate.


    Hi Rich

    1. A static list is indeed faster, but I need to recreate lst for each analysis. How can I improve this ... by releasing memory?

    2. The actual problem is the speed: all analysis results meet the requirements except for the time consumed, which the customer wants cut in half ...
    adv_chg_no_range5 has the best performance.
    The 5-iteration loop search is faster than regexp, but not consistently ..
    (I posted the updated code previously; please refer to it.)

    Could you point me to a better or alternative way to meet the speed requirement?

    thanks in advance
    BR
    Rolance

  • From Ralf Fassel@21:1/5 to All on Fri Nov 4 10:48:46 2022
    * "[email protected]" <[email protected]>
    | ##### each number +id %39 first char title id
    | proc adv_chg_no_range4 {id lst} {
    | set new_lst [lindex $lst 0]
    | for {set i 1} {$i < [llength $lst]} {incr i} {
    | set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
    | if {$tmp_val == 0} {set tmp_val 39}
    | lappend new_lst $tmp_val
    | }
    | return $new_lst
    | }
    --<snip-snip>--
    | proc adv_chg_no_range6 {id lst} {
    | set new_lst [lindex $lst 0]
    | incr id -1
    | for {set i 1} {$i < [llength $lst]} {incr i} {
    | lappend new_lst [expr {(([lindex $lst $i]+$id)%-39)+39} ]
    | }
    | return $new_lst
    | }
    --<snip-snip>--
    | could you point me where still can improve .

    Instead of traversing the list by index, use foreach. That is, replace

        for {set i 1} {$i < [llength $lst]} {incr i} {
            set elt [lindex $lst $i]
            ...
        }

    with

        foreach elt [lrange $lst 1 end] {
            ...
        }
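
    Applied to adv_chg_no_range6 quoted above, this might look like the following sketch. The proc name adv_chg_foreach is hypothetical, and a non-empty list is assumed.

```tcl
# Hypothetical foreach-based variant of adv_chg_no_range6;
# assumes a non-empty list whose first element is the title id.
proc adv_chg_foreach {id lst} {
    # Keep the first element unchanged.
    set new_lst [list [lindex $lst 0]]
    incr id -1
    # foreach hands out each element directly, so no lindex or
    # llength call is needed per iteration.
    foreach elt [lrange $lst 1 end] {
        lappend new_lst [expr {(($elt + $id) % -39) + 39}]
    }
    return $new_lst
}

puts [adv_chg_foreach 1 {7 1 39 5 0 38}]
```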

    But I have to admit I did not follow this thread to understand what your ultimate goal is.

    HTH
    R'

  • From [email protected]@21:1/5 to All on Fri Nov 4 02:22:36 2022
    On Friday, November 4, 2022 at 12:49:33 AM [UTC+13], heinrichmartin wrote:
    On Thursday, November 3, 2022 at 12:41:53 PM UTC+1, heinrichmartin wrote:
    On Thursday, November 3, 2022 at 12:05:37 PM UTC+1, Rolance wrote:
    3. Single proc vs. combined procs: the time consumed is not proportional. I am still trying to find out what affects memory usage; the simplest code does not necessarily give the best performance ...
    5 times faster than the one-line (-regexp) code ...
    Let me comment with a quote:

    On Thursday, August 13, 2020 at 9:07:54 AM UTC+2, Arjen wrote:
    [...] I do know that such benchmarks are notoriously difficult to get right. [...]
    Btw, here is my quick take on it (including a few cross-checks):
    expect:~$ set l [lmap x [lrepeat 10000 1] {expr {int(rand()*39)}}]; puts foo
    foo
    expect:~$ ll $l
    10000
    expect:~$ lindex $l 25
    22
    expect:~$ ::tcl::mathfunc::max {*}$l
    38
    expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000
    230.131 microseconds per iteration
    expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
    193.889 microseconds per iteration
    expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000
    230.803 microseconds per iteration
    expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
    193.114 microseconds per iteration
    expect:~$ set s [lsort -integer $l]; puts foo
    foo
    expect:~$ time {lsearch -all -sorted -exact $s [expr {int(rand()*39)}]} 1000
    170.89 microseconds per iteration
    expect:~$ set s [lsort -integer $l]; puts foo
    foo
    expect:~$ time {lsearch -all -sorted -integer $s [expr {int(rand()*39)}]} 1000
    103.633 microseconds per iteration
    expect:~$ time {lsearch -all -integer -exact $s [expr {int(rand()*39)}]} 1000
    104.209 microseconds per iteration
    expect:~$ time {lsearch -all -integer -exact $l [expr {int(rand()*39)}]} 1000
    106.351 microseconds per iteration
    expect:~$
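    A sketch of how such a comparison can be made a little less noisy: check first that both variants return the same positions, then pass an iteration count to time so one-off effects average out. The list size, the searched value 20 and the iteration count are arbitrary assumptions.

```tcl
# Build a test list of 10000 integers in 0..38 (same shape as the session above).
set l [lmap x [lrepeat 10000 1] {expr {int(rand()*39)}}]

# Cross-check: both comparison modes must report identical positions.
set byExact   [lsearch -all -exact   $l 20]
set byInteger [lsearch -all -integer $l 20]
if {$byExact ne $byInteger} {
    error "-exact and -integer disagree"
}

# Time each variant over 1000 iterations instead of a single run.
puts "exact:   [time {lsearch -all -exact   $l 20} 1000]"
puts "integer: [time {lsearch -all -integer $l 20} 1000]"
```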

    Hi heinrichmartin

    Below is the code updated per your comments, along with the test results.

    set lst [lmap v [lrepeat 3125 1] {expr {int($v * [::tcl::mathfunc::rand]*39)}} ]

    ##### shift each number by id (mod 39); the first element is the title id
    proc adv_chg_no_range4 {id lst} {
        set new_lst [lindex $lst 0]
        for {set i 1} {$i < [llength $lst]} {incr i} {
            set tmp_val [expr {([lindex $lst $i] + $id - 1) % 39}]
            if {$tmp_val == 0} {set tmp_val 39}
            lappend new_lst $tmp_val
        }
        return $new_lst
    }

    proc adv_chg_no_range5 {id lst} {
        set new_lst [lindex $lst 0]
        incr id -1
        lappend new_lst {*}[lmap v [lrange $lst 1 end] {
            expr {(($v + $id) % -39) + 39}
        }]
        return $new_lst
    }

    proc adv_chg_no_range6 {id lst} {
        set new_lst [lindex $lst 0]
        incr id -1
        for {set i 1} {$i < [llength $lst]} {incr i} {
            lappend new_lst [expr {(([lindex $lst $i] + $id) % -39) + 39}]
        }
        return $new_lst
    }



    puts [time {
        foreach x {1 12 18 5 6} {
            lsearch -all -exact $lst $x
        }
    }]

    puts "###combine ##"

    puts [time {
        foreach x {1 12 18 5 6} {
            lsearch -all [adv_chg_no_range4 1 $lst] $x
        }
    }]\n

    puts "###combine static lst##"

    puts [time {
        set new [adv_chg_no_range4 1 $lst]
        foreach x {1 12 18 5 6} {
            lsearch -all $new $x
        }
    }]\n


    puts "#### regexp"
    puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
    puts [time {lsearch -all -regexp $lst {^(?:1|12|18|5|6)$}}]
    puts "###combine proc 4##"
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}]
    puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] {^(?:1|12|18|5|6)$}}]
    puts "###combine proc 5##"
    puts [time {lsearch -all -regexp [adv_chg_no_range5 1 $lst] (^1$|^12$|^18$|^5$|^6$)}]
    puts [time {lsearch -all -regexp [adv_chg_no_range5 1 $lst] {^(?:1|12|18|5|6)$}}]
    puts [time {lsearch -all -regexp $new {^(?:1|12|18|5|6)$}}]
    puts "###combine proc 6##"
    puts [time {lsearch -all -regexp [adv_chg_no_range6 1 $lst] (^1$|^12$|^18$|^5$|^6$)}]
    puts [time {lsearch -all -regexp [adv_chg_no_range6 1 $lst] {^(?:1|12|18|5|6)$}}]


    output

    386 microseconds per iteration <<<<<
    ###combine ##
    4807 microseconds per iteration

    ###combine static lst##
    1069 microseconds per iteration <<<<<<<< the 5-search loop beats regexp, but the timing seems unstable

    #### regexp
    1311 microseconds per iteration
    703 microseconds per iteration
    ###combine proc 4##
    2550 microseconds per iteration
    1797 microseconds per iteration
    ###combine proc 5##
    1814 microseconds per iteration
    1426 microseconds per iteration <<<<<< lmap gives the best performance
    942 microseconds per iteration <<<<<<<
    ###combine proc 6##
    2091 microseconds per iteration
    1633 microseconds per iteration


    Could you point out where this can still be improved?

    thanks in advance

    BR
    Rolance

  • From [email protected]@21:1/5 to All on Fri Nov 4 02:58:37 2022
    On Friday, November 4, 2022 at 10:48:51 PM UTC+13, Ralf Fassel wrote:
    * "[email protected]" <[email protected]>
    | ##### each number +id %39 first char title id
    | proc adv_chg_no_range4 {id lst} {
    | set new_lst [lindex $lst 0]
    | for {set i 1} {$i < [llength $lst]} {incr i} {
    | set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
    | if {$tmp_val == 0} {set tmp_val 39}
    | lappend new_lst $tmp_val
    | }
    | return $new_lst
    | }
    --<snip-snip>--
    | proc adv_chg_no_range6 {id lst} {
    | set new_lst [lindex $lst 0]
    | incr id -1
    | for {set i 1} {$i < [llength $lst]} {incr i} {
    | lappend new_lst [expr {(([lindex $lst $i]+$id)%-39)+39} ]
    | }
    | return $new_lst
    | }
    --<snip-snip>--
    | Could you point out where this can still be improved?
    Instead of traversing the list by index, use foreach:
    for {set i 1} {$i < [llength $lst]} {incr i} {
    set elt [lindex $lst $i]
    ...
    }

    foreach elt [lrange $lst 1 end] {
    ...
    }

    But I have to admit I did not follow this thread to understand what your ultimate goal is.

    HTH
    R'

    Hi Ralf

    Thanks for your suggestion. My goal is the best lsearch performance.
    I will try foreach and see how much it improves the multi-analysis run.

    BR
    Rolance

  • From heinrichmartin@21:1/5 to rolance on Fri Nov 4 03:16:25 2022
    On Friday, November 4, 2022 at 10:58:40 AM UTC+1, rolance wrote:
    Thanks for your suggestion. My goal is the best lsearch performance.
    I will try foreach and see how much it improves the multi-analysis run.

    Your claims are inconsistent, lsearch has no foreach.

    Your initial problem ("Given a list of integers, find all matching integers fast.") is solved in this thread.
    But your issues are somewhere else, e.g. reading data, transforming data, memory management.

    Just one example: if you cannot accept multiple copies of your GBs of data, then you must transform the values in place (for & lindex & lset) or process them in a stream (i.e. do not create a list at all).
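    A minimal sketch of the in-place variant (for & lindex & lset), assuming the shift arithmetic of adv_chg_no_range6 and the same layout with the title id at index 0. The proc name adv_chg_in_place is hypothetical. It takes the name of the list variable rather than its value, so no second list is built; Tcl may still copy the list once if the value happens to be shared.

```tcl
# Hypothetical in-place variant: takes the NAME of the list variable.
proc adv_chg_in_place {id lstVar} {
    upvar 1 $lstVar lst
    incr id -1
    set n [llength $lst]
    # Index 0 holds the title id and is left untouched.
    for {set i 1} {$i < $n} {incr i} {
        lset lst $i [expr {(([lindex $lst $i] + $id) % -39) + 39}]
    }
    return
}

set data {7 1 38 39}
adv_chg_in_place 2 data
puts $data
```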

    https://www.youtube.com/watch?v=c33AZBnRHks might describe an unexpectedly close problem (but we still don't know your exact objective). Anyway, the video contains quite some lessons about data handling/analysis.

  • From [email protected]@21:1/5 to All on Fri Nov 4 03:50:54 2022
    On Friday, November 4, 2022 at 11:16:27 PM UTC+13, heinrichmartin wrote:
    On Friday, November 4, 2022 at 10:58:40 AM UTC+1, rolance wrote:
    Thanks for your suggestion. My goal is the best lsearch performance.
    I will try foreach and see how much it improves the multi-analysis run.
    Your claims are inconsistent, lsearch has no foreach.

    Your initial problem ("Given a list of integers, find all matching integers fast.") is solved in this thread.
    But your issues are somewhere else, e.g. reading data, transforming data, memory management.

    Just one example: if you cannot accept multiple copies of your GBs of data, then you must transform the values in place (for & lindex & lset) or process them in a stream (i.e. do not create a list at all).

    https://www.youtube.com/watch?v=c33AZBnRHks might describe an unexpectedly close problem (but we still don't know your exact objective). Anyway, the video contains quite some lessons about data handling/analysis.

    Hi heinrichmartin

    thanks for your advice and reference video

    My exact objective is to improve the lsearch speed. Your suggestion already gave a 20% improvement over the original code; lsearch is the key factor for the analysis program.
    As I mentioned, memory management may have side effects. I already tried processing the data in a stream (over 1.6G there is an alloc issue ..) and keeping multiple copies; neither is better than the existing code.
    If you have a better idea for improving speed, all suggestions are welcome ^^

    thanks

    BR
    Rolance

  • From Rich@21:1/5 to Ralf Fassel on Fri Nov 4 13:19:58 2022
    Ralf Fassel <[email protected]> wrote:
    But I have to admit I did not follow this thread to understand what your ultimate goal is.

    The OP has been asked what they are trying to achieve, multiple times.

    No reply has ever done more than post yet another block of hard to
    follow code with a faint "help me make this faster" request.

  • From Rich@21:1/5 to [email protected] on Fri Nov 4 13:23:00 2022
    [email protected] <[email protected]> wrote:
    On Friday, November 4, 2022 at 10:48:51 PM UTC+13, Ralf Fassel wrote:
    * "[email protected]" <[email protected]>
    | Could you point out where this can still be improved?
    Instead of traversing the list by index, use foreach:
    for {set i 1} {$i < [llength $lst]} {incr i} {
    set elt [lindex $lst $i]
    ...
    }

    foreach elt [lrange $lst 1 end] {
    ...
    }

    But I have to admit I did not follow this thread to understand what your
    ultimate goal is.

    HTH
    R'

    Hi Ralf

    Thanks for your suggestion. My goal is the best lsearch performance.
    I will try foreach and see how much it improves the multi-analysis run.

    No, that is not the answer to Ralf's question.

    That is your current half-baked solution, but tells us nothing about
    what you are trying to do beyond "faster lsearch". If we knew why you
    were doing the lsearch (i.e., what the outer algorithm that is
    searching this list is attempting to do) we might be able to suggest an alternative.

    As it is, you have a solution that does not work for you in your view,
    and are only asking us to help tweak your solution -- without even
    telling us enough to understand what that solution happens to be doing.

  • From Rich@21:1/5 to [email protected] on Fri Nov 4 13:29:57 2022
    [email protected] <[email protected]> wrote:
    On Friday, November 4, 2022 at 3:09:19 UTC+13, Rich wrote:
    [email protected] <[email protected]> wrote:
    Hi all

    mass numbers analysis , need improve speed ...

    Unsurprising here. The first does 5 searches over a static list.

    The second makes 5 calls into a proc, that iterates the entire list via
    foreach, returning a new list, and that new list is then searched.

    Iterating the entire list, in Tcl, to generate a new list is where most
    of this time difference is being consumed.
    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration
    Also unsurprising, searches using the regex engine will be slower than
    searches using simple comparisons, because the regex engine does far
    more than simple comparisons.

    The time difference between the static list and the newly generated
    list via a proc comments above also apply here.
    could someone give me direction to overcome speed issue
    If your issue is lsearch speed, then:

    1) only search a static list (i.e., don't recreate the list first
    before searching it)

    2) if your desired results can also be obtained from searching a sorted
    list, then first sort the list (sort only once, then search plural
    times). Using the -sorted option to lsearch causes lsearch to
    perform a binary search of the list, which will be faster than
    without -sorted, which causes lsearch to perform a linear search
    (start at first element, look at each in sequence until found).
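    A small sketch of point 2, with made-up sample data. The list is sorted once, then searched with -sorted. Two caveats: the returned index refers to the sorted list, so lsort -indices is used here to map it back to an original position, and as far as the lsearch manual page describes it, -sorted is treated like -exact when combined with -all, so the binary search only helps for single-match lookups.

```tcl
set lst {12 5 39 5 18 1 5 27}

# Sort once (numeric order), outside any timing loop.
set sorted [lsort -integer $lst]
# Original positions in sorted order, to map results back.
set order  [lsort -integer -indices $lst]

# Binary search for a single occurrence of 18.
set idx [lsearch -sorted -integer $sorted 18]
puts "18: sorted index $idx, original position [lindex $order $idx]"

# A value that is absent yields -1.
set missing [lsearch -sorted -integer $sorted 20]
puts "20: $missing"
```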

    But you've failed to describe your actual problem. You've shown a
    solution that does not work for you speed wise, but not described to us
    what you are trying to achieve by this solution you've given. It is
    quite possible there is some alternative way, without using lsearch, to
    achieve your desired result, but we can't read your mind over Usenet to
    know what your actual problem is to even be able to consider some
    alternate.


    Hi Rich

    1. A static list is indeed faster, but I need to recreate lst for each
    analysis. How can I improve this? By releasing memory?

    Please explain to us why you need to recreate the list for each
    analysis. We see no reason yet why this is necessary (and don't simply
    say "because I need to do so" -- that will not be an acceptable
    answer).

    You can't 'release memory' manually with Tcl, that is all automatic.

    2. The actual problem is speed: all analysis results meet the requirements
    except for the run time, which the customer wants cut in half.

    No, the actual problem is whatever it is you are trying to do. You've
    never told us that, other than a meaningless "analysis". If you had
    told us the actual problem, we would then be able

    adv_chg_no_range5 gives the best performance.
    Searching in a loop 5 times is faster than regexp, but the timing is not stable.
    (I posted the updated code earlier, please refer to it.)

    Could you point me to a better or alternative way to meet the speed
    requirement?

    Not unless you explain why you need to recreate the list, and what
    outer reason you need to iterate the list each time to recreate it
    instead of re-creating one copy then searching that plural times.

    And, if you'd ever bother to tell us what you are really trying to do,
    there might be a better faster way with a change in how you represent
    the data. But you've never given us anything but your vague
    statements, so there is no way for us to recognize a superior way of
    representing the data that would increase speed (assuming such a way
    does exist).

  • From [email protected]@21:1/5 to All on Fri Nov 4 13:13:33 2022
    On Saturday, November 5, 2022 at 2:30:01 AM UTC+13, Rich wrote:
    [email protected] <[email protected]> wrote:
    On Friday, November 4, 2022 at 3:09:19 UTC+13, Rich wrote:
    [email protected] <[email protected]> wrote:
    Hi all

    mass numbers analysis , need improve speed ...

    Unsurprising here. The first does 5 searches over a static list.

    The second makes 5 calls into a proc, that iterates the entire list via
    foreach, returning a new list, and that new list is then searched.

    Iterating the entire list, in Tcl, to generate a new list is where most
    of this time difference is being consumed.
    #### regexp
    1341 microseconds per iteration
    ###combine ##
    2525 microseconds per iteration
    Also unsurprising, searches using the regex engine will be slower than
    searches using simple comparisons, because the regex engine does far
    more than simple comparisons.

    The time difference between the static list and the newly generated
    list via a proc comments above also apply here.
    could someone give me direction to overcome speed issue
    If your issue is lsearch speed, then:

    1) only search a static list (i.e., don't recreate the list first
    before searching it)

    2) if your desired results can also be obtained from searching a sorted
    list, then first sort the list (sort only once, then search plural
    times). Using the -sorted option to lsearch causes lsearch to
    perform a binary search of the list, which will be faster than
    without -sorted, which causes lsearch to perform a linear search
    (start at first element, look at each in sequence until found).

    But you've failed to describe your actual problem. You've shown a
    solution that does not work for you speed wise, but not described to us
    what you are trying to achieve by this solution you've given. It is
    quite possible there is some alternative way, without using lsearch, to
    achieve your desired result, but we can't read your mind over Usenet to
    know what your actual problem is to even be able to consider some
    alternate.


    Hi Rich

    1. A static list is indeed faster, but I need to recreate lst for each
    analysis. How can I improve this? By releasing memory?
    Please explain to us why you need to recreate the list for each
    analysis. We see no reason yet why this is necessary (and don't simply
    say "because I need to do so" -- that will not be an acceptable
    answer).

    You can't 'release memory' manually with Tcl, that is all automatic.
    2. The actual problem is speed: all analysis results meet the requirements
    except for the run time, which the customer wants cut in half.
    No, the actual problem is whatever it is you are trying to do. You've
    never told us that, other than a meaningless "analysis". If you had
    told us the actual problem, we would then be able
    adv_chg_no_range5 gives the best performance.
    Searching in a loop 5 times is faster than regexp, but the timing is not stable.
    (I posted the updated code earlier, please refer to it.)

    Could you point me to a better or alternative way to meet the speed
    requirement?
    Not unless you explain why you need to recreate the list, and what
    outer reason you need to iterate the list each time to recreate it
    instead of re-creating one copy then searching that plural times.

    And, if you'd ever bother to tell us what you are really trying to do,
    there might be a better faster way with a change in how you represent
    the data. But you've never given us anything but your vague
    statements, so there is no way for us to recognize a superior way of
    representing the data that would increase speed (assuming such a way
    does exist).

    Hi Rich

    Thanks for your advice.

    ("Given a list of integers, find all matching integers fast", with their positions) is what I am looking for.
    The main program is a customized lottery prediction and analysis system. The customer defined some rules to recreate the number list and find the position that best hits the target, also referencing the history records up to "N" previous hits.

    That is why I need to recreate it each time. The original program took up to 7 hr for N phases of 3125x3125x39x1 comparisons.
    With some shortcuts and optimized code it is already down to 396 sec, but the customer is not satisfied ---> he needs 60 sec with a single thread. His ideal calculation quantity is 3125x3125x39x600 in one run.

    Previously he told me that another C++ program can achieve this (but with different rules; my code has more rules).

    I already separated data creation into mass files; only the source file is kept, and each number is changed by +1 across the 39 shift range.

    Is this a limitation of Tcl, or is there some way to calculate faster?
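    One observation that follows from the shift rule in adv_chg_no_range4, sketched under the assumption that all numbers lie in 1..39 and that element 0 is the title id: searching the shifted list for x finds the same positions as searching the unshifted list for the value that the shift maps to x. The shifted list then never has to be built. The helper names shift_val, unshift_target and search_shifted are hypothetical, and whether this fits the customer's other rules is unknown.

```tcl
# Shift rule from the thread (adv_chg_no_range4), for values in 1..39:
#   new = (v + id - 1) % 39, with 0 mapped to 39
proc shift_val {id v} {
    set t [expr {($v + $id - 1) % 39}]
    return [expr {$t == 0 ? 39 : $t}]
}

# Hypothetical helper: which original value becomes $x after shifting by $id?
proc unshift_target {id x} {
    return [expr {(($x - $id) % 39) + 1}]
}

# Positions of $x in the shifted list, found without building that list.
# Element 0 (the title id) is skipped, since the shift never touches it.
proc search_shifted {id lst x} {
    return [lsearch -all -integer -start 1 $lst [unshift_target $id $x]]
}

set lst {7 1 38 39 5 36}
puts [search_shifted 5 $lst 1]
```

    Unlike searching the output of adv_chg_no_range4 directly, this sketch never reports index 0, so a title id that happens to equal x is not counted as a hit.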

    BR
    Rolance

  • From Rich@21:1/5 to [email protected] on Fri Nov 4 21:30:15 2022
    [email protected] <[email protected]> wrote:
    Hi Rich

    Thanks for your advice.

    ("Given a list of integers, find all matching integers fast", with
    their positions) is what I am looking for. The main program is a
    customized lottery prediction and analysis system. The customer
    defined some rules to recreate the number list and find the position
    that best hits the target, also referencing the history records up to
    "N" previous hits.

    That is why I need to recreate it each time. The original program
    took up to 7 hr for N phases of 3125x3125x39x1 comparisons. With some
    shortcuts and optimized code it is already down to 396 sec, but the
    customer is not satisfied ---> he needs 60 sec with a single thread.
    His ideal calculation quantity is 3125x3125x39x600 in one run.

    Do you mean it searches for the positions in 228 515 625 000 (228
    billion) total numbers?

    For a list that takes 396 sec, what is the 'llength' result of running
    llength on that list?

    Previously he told me that another C++ program can achieve this (but
    with different rules; my code has more rules).

    C++ is also a compiled program. That very likely gives it a
    performance edge from the start.

    I already separated data creation into mass files; only the source
    file is kept, and each number is changed by +1 across the 39 shift range.

    Is this a limitation of Tcl, or is there some way to calculate faster?

    So you have a list of length X (what is a typical X value?) of numbers
    from 1 to 39, and you want to find all occurrences of a given number
    and return the positions of all those numbers in that list?

    I.e, consider this code:

    $ rlwrap tclsh
    % package require sqlite3
    3.35.5
    % sqlite3 db :memory:
    % db eval {create table list (id integer primary key, num integer);}

    This below inserts five million numbers that are randomly chosen
    between 1 and 39:

    % for {set i 0} {$i < 5000000} {incr i} { db eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}

    Create an index on the "num" column:

    % db eval {create index idx on list(num);}

    Do some "searching", the 'result' of these queries will be the
    "position" of each value being requested:

    % time {db eval {select id from list where num = 20;}}
    22530 microseconds per iteration
    % time {db eval {select id from list where num = 20;}}
    22827 microseconds per iteration
    % time {db eval {select id from list where num = 27;}}
    22856 microseconds per iteration
    % time {db eval {select id from list where num = 5;}}
    22711 microseconds per iteration

    Pretty consistent, about 22.5ms to find the "positions" of all the
    numbers equal to 20 (or 27 or 5) in the "list" of five million
    possibilities. Memory consumption, as reported by top, is 332Meg.

    This is what I meant when I asked you to tell us the problem you were
    trying to solve, instead of giving us your "solution", keeping the
    actual problem secret, and asking us to help fix the solution. You do
    yourself no favors that way, and prevent us from being able to suggest
    superior alternatives.

    In the code above I am finding the "positions" of numbers in an
    "ordered list" of five million entries in about 22.5ms above. Now,
    granted, I am not doing anything with those positions, but the
    alternate "lsearch" above is extremely fast.
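    The same index-once, look-up-many-times idea can also be sketched in plain Tcl with a dict that maps each number to the positions where it occurs. This is an illustrative alternative, not what the SQLite session above does; build_pos_index is a hypothetical name and the sample data is made up.

```tcl
# Hypothetical helper: map each value to the list of positions where it occurs.
proc build_pos_index {lst} {
    set idx [dict create]
    set pos 0
    foreach v $lst {
        dict lappend idx $v $pos
        incr pos
    }
    return $idx
}

set lst {12 5 39 5 18 1 5 27}
set index [build_pos_index $lst]     ;# one pass over the data

# Each lookup is now a hash lookup instead of a scan of the whole list.
puts [dict get $index 5]
# Values that never occur are simply missing from the dict.
puts [dict exists $index 20]
```

    The index costs one full pass and extra memory, so it only pays off when the same list is queried many times before it changes.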

  • From [email protected]@21:1/5 to All on Fri Nov 4 15:31:25 2022
    On Saturday, November 5, 2022 at 10:30:18 AM UTC+13, Rich wrote:
    [email protected] <[email protected]> wrote:
    Hi Rich

    Thanks for your advice.

    ("Given a list of integers, find all matching integers fast", with
    their positions) is what I am looking for. The main program is a
    customized lottery prediction and analysis system. The customer
    defined some rules to recreate the number list and find the position
    that best hits the target, also referencing the history records up to
    "N" previous hits.

    That is why I need to recreate it each time. The original program
    took up to 7 hr for N phases of 3125x3125x39x1 comparisons. With some
    shortcuts and optimized code it is already down to 396 sec, but the
    customer is not satisfied ---> he needs 60 sec with a single thread.
    His ideal calculation quantity is 3125x3125x39x600 in one run.
    Do you mean it searches for the positions in 228 515 625 000 (228
    billion) total numbers?

    For a list that takes 396 sec, what is the 'llength' result of running llength on that list?
    Previously he told me that another C++ program can achieve this (but
    with different rules; my code has more rules).
    C++ is also a compiled program. That very likely gives it a
    performance edge from the start.
    I already separated data creation into mass files; only the source
    file is kept, and each number is changed by +1 across the 39 shift range.

    Is this a limitation of Tcl, or is there some way to calculate faster?
    So you have a list of length X (what is a typical X value?) of numbers
    from 1 to 39, and you want to find all occurrences of a given number
    and return the positions of all those numbers in that list?

    I.e, consider this code:

    $ rlwrap tclsh
    % package require sqlite3
    3.35.5
    % sqlite3 db :memory:
    % db eval {create table list (id integer primary key, num integer);}

    This below inserts five million numbers that are randomly chosen
    between 1 and 39:

    % for {set i 0} {$i < 5000000} {incr i} { db eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}

    Create an index on the "num" column:

    % db eval {create index idx on list(num);}

    Do some "searching", the 'result' of these queries will be the
    "position" of each value being requested:

    % time {db eval {select id from list where num = 20;}}
    22530 microseconds per iteration
    % time {db eval {select id from list where num = 20;}}
    22827 microseconds per iteration
    % time {db eval {select id from list where num = 27;}}
    22856 microseconds per iteration
    % time {db eval {select id from list where num = 5;}}
    22711 microseconds per iteration

    Pretty consistent, about 22.5ms to find the "positions" of all the
    numbers equal to 20 (or 27 or 5) in the "list" of five million possibilities. Memory consumption, as reported by top, is 332Meg.

    This is what I meant when I asked you to tell us the problem you were
    trying to solve, instead of giving us your "solution", keeping the
    actual problem secret, and asking us to help fix the solution. You do yourself no favors that way, and prevent us from being able to suggest superior alternatives.

    In the code above I am finding the "positions" of numbers in an
    "ordered list" of five million entries in about 22.5ms above. Now,
    granted, I am not doing anything with those positions, but the
    alternate "lsearch" above is extremely fast.

    Hi Rich

    Many thanks for your advice.

    I will change the data structure to SQLite to get faster speed.


    BR
    Rolance

  • From [email protected]@21:1/5 to All on Fri Nov 4 17:28:26 2022
    On Saturday, November 5, 2022 at 11:31:28 AM UTC+13, [email protected] wrote:
    On Saturday, November 5, 2022 at 10:30:18 AM UTC+13, Rich wrote:
    [email protected] <[email protected]> wrote:
    Hi Rich

    Thanks for your advice.

    ("Given a list of integers, find all matching integers fast", with
    their positions) is what I am looking for. The main program is a
    customized lottery prediction and analysis system. The customer
    defined some rules to recreate the number list and find the position
    that best hits the target, also referencing the history records up to
    "N" previous hits.

    That is why I need to recreate it each time. The original program
    took up to 7 hr for N phases of 3125x3125x39x1 comparisons. With some
    shortcuts and optimized code it is already down to 396 sec, but the
    customer is not satisfied ---> he needs 60 sec with a single thread.
    His ideal calculation quantity is 3125x3125x39x600 in one run.
    Do you mean it searches for the positions in 228 515 625 000 (228
    billion) total numbers?

    For a list that takes 396 sec, what is the 'llength' result of running llength on that list?
    Previously he told me that another C++ program can achieve this (but with different rules; my code has more rules).
    C++ is also a compiled program. That very likely gives it a
    performance edge from the start.
    Data creation is already separated into many files; only the source file is kept, and each number is changed by +1 across the 39 shift range...

    Is this a limitation of Tcl, or is there some way to calculate faster?
    So you have a list of length X (what is a typical X value?) of numbers from 1 to 39, and you want to find all occurrences of a given number
    and return the positions of all those numbers in that list?

    I.e, consider this code:

    $ rlwrap tclsh
    % package require sqlite3
    3.35.5
    % sqlite3 db :memory:
    % db eval {create table list (id integer primary key, num integer);}

    This below inserts five million numbers that are randomly chosen
    between 1 and 39:

    % for {set i 0} {$i < 5000000} {incr i} { db eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}

    Create an index on the "num" column:

    % db eval {create index idx on list(num);}

    Do some "searching", the 'result' of these queries will be the
    "position" of each value being requested:

    % time {db eval {select id from list where num = 20;}}
    22530 microseconds per iteration
    % time {db eval {select id from list where num = 20;}}
    22827 microseconds per iteration
    % time {db eval {select id from list where num = 27;}}
    22856 microseconds per iteration
    % time {db eval {select id from list where num = 5;}}
    22711 microseconds per iteration

    Pretty consistent, about 22.5ms to find the "positions" of all the
    numbers equal to 20 (or 27 or 5) in the "list" of five million possibilities. Memory consumption, as reported by top, is 332Meg.

    This is what I meant when I asked you to tell us the problem you were trying to solve, instead of giving us your "solution", keeping the
    actual problem secret, and asking us to help fix the solution. You do yourself no favors that way, and prevent us from being able to suggest superior alternatives.

    In the code above I am finding the "positions" of numbers in an
    "ordered list" of five million entries in about 22.5ms above. Now, granted, I am not doing anything with those positions, but the
    alternate "lsearch" above is extremely fast.
    hi Rich

    Below is my code with my test results.
    Could you point out what is wrong? The same code gives different timings.



    package require sqlite3

    console show

    set lst [lmap v [lrepeat 3125 1] {expr {int($v * [::tcl::mathfunc::rand]*39)}} ]


    proc adv_chg_no_range5 {id lst} {
        set new_lst [lindex $lst 0]
        incr id -1
        lappend new_lst {*}[lmap v [lrange $lst 1 end] {
            expr {(($v+$id)%-39)+39}
        }]
        return $new_lst
    }


    puts "#### db ####"


    sqlite3 db :memory:

    db eval {create table list (id integer primary key, num integer);}

    for {set i 0} {$i < 3125} {incr i} { db eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}

    db eval {create index idx on list(num);}

    puts [time {db eval {select id from list where num = 20;}}]
    puts [time {db eval {select id from list where num = 1;}}]
    puts [time {db eval {select id from list where num = 36;}}]
    puts [time {db eval {select id from list where num = 6;}}]
    puts [time {db eval {select id from list where num = 23;}}]

    db close

    puts "#### db1 ####"

    sqlite3 db1 :memory:
    db1 eval {create table list (id integer primary key, num integer);}

    #### using $lst gave poor performance, so it is commented out and the same random data source is used instead
    #foreach x [adv_chg_no_range5 1 $lst] {
    # db1 eval "insert into list (num) values ($x);"
    #}

    for {set i 0} {$i < 3125} {incr i} { db1 eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}


    puts [time {db1 eval {select id from list where num = 20;}}]
    puts [time {db1 eval {select id from list where num = 1;}}]
    puts [time {db1 eval {select id from list where num = 36;}}]
    puts [time {db1 eval {select id from list where num = 6;}}]
    puts [time {db1 eval {select id from list where num = 23;}}]

    db1 close


    ####output ###

    #### db ####
    36 microseconds per iteration
    86 microseconds per iteration
    89 microseconds per iteration
    92 microseconds per iteration
    93 microseconds per iteration
    #### db1 ####   <<<<<< over 3 times slower
    195 microseconds per iteration
    240 microseconds per iteration
    219 microseconds per iteration
    221 microseconds per iteration
    234 microseconds per iteration


    BR
    Rolance

  • From Rich@21:1/5 to [email protected] on Sat Nov 5 02:06:46 2022
    [email protected] <[email protected]> wrote:
    hi Rich

    below code with my test result
    could you point me what wrong same code with different output

    db and db1 are not the same code.

    Your db1 code omits creating an index on the num column.

    Without the index sqlite has to scan all the rows in the table.

    With the index it can refer to the index and (effectively) directly
    pull out the matching rows.

    That is the reason for the time difference. The index consumes some
    up-front time to build, in order to significantly accelerate lookups
    later.
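
    A minimal sketch of how the effect of the index can be checked from Tcl,
    assuming the sqlite3 Tcl package is available. The table and index names
    follow the example in this thread; the variable names "before" and "after"
    are only illustrative. EXPLAIN QUERY PLAN reports whether sqlite scans the
    whole table or searches via the index; the exact wording of its output
    varies between sqlite versions, so no output is shown here.

    ```tcl
    package require sqlite3

    sqlite3 db :memory:
    db eval {create table list (id integer primary key, num integer)}

    # Load sample data inside one transaction so the inserts commit together.
    db transaction {
        for {set i 0} {$i < 3125} {incr i} {
            set n [expr {int(rand()*39) + 1}]
            db eval {insert into list (num) values ($n)}
        }
    }

    # Query plan without an index: a scan over the table is expected.
    set before [db eval {explain query plan select id from list where num = 20}]

    db eval {create index idx on list(num)}

    # Query plan with the index: the plan should now mention idx.
    set after [db eval {explain query plan select id from list where num = 20}]

    puts "without index: $before"
    puts "with index:    $after"
    ```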

  • From [email protected]@21:1/5 to All on Sat Nov 5 02:52:48 2022
    On Saturday, November 5, 2022 at 3:06:51 PM [UTC+13], Rich wrote:
    db and db1 are not the same code.

    Your db1 code omits creating an index on the num column.

    Without the index sqlite has to scan all the rows in the table.

    With the index it can refer to the index and (effectively) directly
    pull out the matching rows.

    That is the reason for the time difference. The index consumes some
    up-front time to build, in order to significantly accelerate lookups
    later.

    hi Rich

    Thanks for the detailed explanation.
    The single test program performs very well, but in the real multi-program implementation the speed is no better than the previous lsearch code.
    I am still debugging which factors affect the speed.

    Thanks for the great help. If you have any hints about how interaction between programs affects speed, please point me to them.

    BR
    Rolance

  • From Rich@21:1/5 to [email protected] on Sat Nov 5 13:11:13 2022
    [email protected] <[email protected]> wrote:
    hi Rich

    Thanks for the detailed explanation.
    The single test program performs very well, but in the real
    multi-program implementation the speed is no better than the previous
    lsearch code. I am still debugging which factors affect the speed.

    I also asked how long the list was in the real world program, and as
    you have done so many times in this thread, you ignored that question
    entirely. If you continue ignoring direct questions that would allow
    us to help you I will plonk you.

    In the real program, how long is the list that is being searched?

    Thanks for the great help. If you have any hints about how
    interaction between programs affects speed, please point me to them.

    We can't help if you do not give us the information we need to be able
    to effectively help.

  • From [email protected]@21:1/5 to All on Sat Nov 5 14:21:04 2022
    On Sunday, November 6, 2022 at 2:11:17 AM [UTC+13], Rich wrote:
    I also asked how long the list was in the real world program, and as
    you have done so many times in this thread, you ignored that question entirely. If you continue ignoring direct questions that would allow
    us to help you I will plonk you.

    In the real program, how long is the list that is being searched?
    We can't help if you do not give us the information we need to be able
    to effectively help.

    hi Rich

    Thanks for your advice.
    This analyzes only a partial list: 3125 entries, range 1.
    The code and results are below.



    ###############
    package require sqlite3

    console show

    set lst [lmap v [lrepeat 3125 1] {expr {int($v * [::tcl::mathfunc::rand]*39)}} ]


    # Shift every element except the first by id-1, wrapping within 1..39.
    proc adv_chg_no_range5 {id lst} {
        set new_lst [lindex $lst 0]
        incr id -1
        lappend new_lst {*}[lmap v [lrange $lst 1 end] {
            # In Tcl the remainder takes the sign of the divisor, so
            # (x % -39) lies in -38..0 and adding 39 maps it into 1..39.
            expr {(($v+$id)%-39)+39}
        }]
        return $new_lst
    }



    # Collect the list positions of every number in lott using lsearch.
    proc mark_lott_loc_row_no_sort1 {lott lst} {
        set res ""
        foreach cp $lott {
            foreach x [lsearch -all $lst "$cp"] {
                lappend res $x
            }
        }
        return $res
    }



    # Same lookup through sqlite. The database, the rows and the index
    # are rebuilt on every call.
    proc mark_lott_loc_row_no_sort3_p {lott lst} {
        sqlite3 db1 :memory:
        db1 eval {create table list (id integer primary key, num integer);}
        foreach x [lrange $lst 1 end] {
            db1 eval "insert into list (num) values ($x);"
        }
        db1 eval {create index idx on list(num);}

        set res ""
        foreach cp $lott {
            lappend res {*}[db1 eval {select id from list where num = $cp;}]
        }

        db1 eval { delete from list; }
        db1 close

        return $res
    }


    puts "#### lsearch 3125 range 1 ####"
    puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]

    puts "#### sqlite3 3125 range 1####"
    puts [time {mark_lott_loc_row_no_sort3_p {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]


    ##### output #####

    #### lsearch 3125 range 1 ####
    720.2 microseconds per iteration
    #### sqlite3 3125 range 1####
    17818.6 microseconds per iteration


    BR
    Rolance

  • From Rich@21:1/5 to [email protected] on Sun Nov 6 01:40:10 2022
    [email protected] <[email protected]> wrote:
    On Sunday, November 6, 2022 at 2:11:17 AM [UTC+13], Rich wrote:
    [email protected] <[email protected]> wrote:
    On Saturday, November 5, 2022 at 3:06:51 PM [UTC+13], Rich wrote:
    I also asked how long the list was in the real world program, and as
    you have done so many times in this thread, you ignored that
    question entirely. If you continue ignoring direct questions that
    would allow us to help you I will plonk you.

    In the real program, how long is the list that is being searched?
    thanks for great help , if you have some hint about program
    interactive affect , please point me
    We can't help if you do not give us the information we need to be
    able to effectively help.

    hi Rich

    Thanks for your advice.
    This analyzes only a partial list: 3125 entries, range 1.
    The code and results are below.

    Continuing to evade the question. This is your last chance.

    In the real program, how long is the list that is being searched?

    I made a small code change to your example, and I get the following
    timing on my machine after the change:

    #### lsearch 3125 range 1 ####
    1180.9 microseconds per iteration
    #### sqlite3 3125 range 1####
    638.2 microseconds per iteration

    However, because you are being so evasive, and failing to answer
    direct questions with straightforward answers, I am not going to show
    you the change I made.

    I will, however, give you a hint. Look at what is *very different*
    between the data searched in your mark_lott_loc_row_no_sort1 proc vs.
    the data searched in the mark_lott_loc_row_no_sort3_p.

    What does one of those procs do that the other one does not?

    When you see what one of them does, that the other does not, then
    consider changing the code so the one that is presently doing the
    different work no longer is performing that different work. When you
    make the change, you'll get similar timings to what I just got.

    However, fail to be straightforward in your next answer, and you will be
    plonked ( https://en.everybodywiki.com/Plonk_(Usenet) ). This is your
    last and final warning.
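
    One possible reading of this hint, which is not confirmed anywhere in
    the thread as the change that was actually made:
    mark_lott_loc_row_no_sort3_p creates the database, inserts every row and
    builds the index on each call, while mark_lott_loc_row_no_sort1 only
    searches. A sketch that separates the one-time setup from the repeated
    lookup, assuming the data can be loaded once; the proc names build_db and
    query_db are hypothetical:

    ```tcl
    package require sqlite3

    # One-time setup (hypothetical helper): load the list and build the index.
    # The first list element is skipped, as in the thread's code, so that the
    # sqlite row id equals the list position.
    proc build_db {lst} {
        sqlite3 db1 :memory:
        db1 eval {create table list (id integer primary key, num integer)}
        db1 transaction {
            foreach x [lrange $lst 1 end] {
                # $x inside braces is bound as a parameter by the sqlite
                # Tcl interface, not substituted into the SQL text.
                db1 eval {insert into list (num) values ($x)}
            }
        }
        db1 eval {create index idx on list(num)}
    }

    # Repeated lookup (hypothetical helper): only this part would be timed.
    proc query_db {lott} {
        set res {}
        foreach cp $lott {
            lappend res {*}[db1 eval {select id from list where num = $cp}]
        }
        return $res
    }

    build_db {99 5 12 5 7}
    puts [query_db {5 12}]
    ```

    With this split, only query_db belongs inside "time". If the list really
    changes before every search, the setup cost cannot be hidden this way and
    still counts toward the total.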

  • From [email protected]@21:1/5 to All on Sat Nov 5 21:37:53 2022
    On Sunday, November 6, 2022 at 2:41:35 PM [UTC+13], Rich wrote:
    I made a small code change to your example, and I get the following
    timing on my machine after the change:
    #### lsearch 3125 range 1 ####
    1180.9 microseconds per iteration
    #### sqlite3 3125 range 1####
    638.2 microseconds per iteration

    However, because you are being so evasive, and failing to answer
    direct questions with straightforward answers, I am not going to show
    you the change I made.

    I will, however, give you a hint. Look at what is *very different*
    between the data searched in your mark_lott_loc_row_no_sort1 proc vs.
    the data searched in the mark_lott_loc_row_no_sort3_p.

    What does one of those procs do that the other one does not?

    When you see what one of them does, that the other does not, then
    consider changing the code so the one that is presently doing the
    different work no longer is performing that different work. When you
    make the change, you'll get similar timings to what I just got.

    However, fail to be straightforward in your next answer, and you will be plonked ( https://en.everybodywiki.com/Plonk_(Usenet) ). This is your
    last and final warning.

    Hi Rich

    Let me explain in detail.
    For one section of data, the list in the array is 3125x3125x39x1, and the total time consumed is 396 sec.
    (That is with the reverse pyramid optimization, which calculates only the hit positions; without it the total is up to 7 hr.)
    The customer wants 3125x3125x39x600 to take no more than 12 hr (multi-threading is the last option).
    So the target is 3125x3125x39x1 within 60 sec (single thread).

    The demo proc is "the real program" for the first (base) layer of the reverse pyramid; it recalculates each time with a fixed rule that changes the numbers.

    The customer uses different machines with different OSes to try to get the best performance. Buying the latest machine is the last option (and in my experience a new one is only 30%~40% faster).

    #### lsearch 3125 range 1 ####
    720.2 microseconds per iteration

    #### sqlite3 3125 range 1####
    638.2 microseconds per iteration

    The new core speeds things up by around 12%~15%. I am not sure whether memory usage stays stable or keeps rising after repeated use; sometimes I get an alloc error.

    The non-static list affects the search time a great deal.

    Or do you have another way to get faster speed in the combined program (could variable passing slow it down, or is it a memory usage issue)?

    I hope you can show what you achieved; I will implement it in the real main program.

    Thanks in advance.


    BR
    Rolance

  • From heinrichmartin@21:1/5 to rolance on Mon Nov 7 01:02:43 2022
    On Sunday, November 6, 2022 at 5:39:28 AM UTC+1, rolance wrote:
    On Sunday, November 6, 2022 at 2:41:35 PM [UTC+13], Rich wrote:
    rolance wrote:
    #### lsearch 3125 range 1 ####
    720.2 microseconds per iteration
    #### sqlite3 3125 range 1####
    17818.6 microseconds per iteration
    I made a small code change to your example, and I get the following
    timing on my machine after the change:
    #### lsearch 3125 range 1 ####
    1180.9 microseconds per iteration
    #### sqlite3 3125 range 1####
    638.2 microseconds per iteration

    However, because you are being so evasive, and failing to answer
    direct questions with straightforward answers, I am not going to show
    you the change I made.

    Rich, congrats for your patience and that cool way to draw attention to the message :-)

    However, fail to be straightforward in your next answer, and you will be plonked ( https://en.everybodywiki.com/Plonk_(Usenet) ). This is your
    last and final warning.

    Rolance, I cannot/will not plonk users, but I have been ignoring messages without progress - and I will continue to do so.
    But I want to encourage you to *rethink/rephrase* your questions and post them again when you do not see an answer within a few days - writing down the actual problem has helped me quite a few times, even before ever sending it.
    In the end, you can see on clt - even in this thread - that the community is willing to help, even with questions that are not strictly related to Tcl. (We are just not doing your job entirely.)

    Then again, you must start accepting that a full search in an arbitrary list takes O(N) time and O(log N) in a sorted list. Here are a few pointers how to improve the overall timing:

    * At the time of lsearch, does the list contain integers already? Note that Tcl has object types under the hood, i.e. EIAS is correct, but Tcl may improve performance where possible.
    * Are you unnecessarily keeping references to data? Tcl copies on write, i.e. "freeing" resources might help not only memory but also speed.
    * Are you performing identical operations repeatedly/unnecessarily? This does not only refer to the extra work in the loop (that Rich has pointed out), but also think on a larger scale, i.e. can the problem statement be rewritten to allow more efficient algorithms? (Note that brute force is not the smartest approach when time matters.)
    * What data must be calculated live? What parts can be pre-calculated?
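
    As an illustration of the last two pointers, here is a sketch that
    precalculates positions once instead of rebuilding and rescanning the
    list. It assumes the shift rule of adv_chg_no_range5 from this thread
    (every element except the first is moved by id-1, wrapping within 1..39);
    the proc names build_pos_index and shifted_positions are hypothetical.
    Because the rule only shifts values, the positions of a target in the
    shifted list equal the positions of one specific value in the original
    list, so a single O(N) pass can serve every id and every target.

    ```tcl
    # Hypothetical helper: one pass that maps (value mod 39) to the list of
    # positions holding that value. Position 0 is skipped because the
    # thread's adv_chg_no_range5 leaves the first element unchanged.
    proc build_pos_index {lst} {
        set idx [dict create]
        set pos 0
        foreach v $lst {
            if {$pos > 0} {
                dict lappend idx [expr {$v % 39}] $pos
            }
            incr pos
        }
        return $idx
    }

    # Positions at which the shifted list would hold $target, computed
    # without building the shifted list. A shifted value equals $target
    # when the original value v satisfies v = target - id + 1 (mod 39).
    proc shifted_positions {idx id target} {
        set key [expr {($target - $id + 1) % 39}]
        if {[dict exists $idx $key]} {
            return [dict get $idx $key]
        }
        return {}
    }

    set lst {7 12 38 12 39 1}
    set idx [build_pos_index $lst]
    puts [shifted_positions $idx 1 12]   ;# positions of 12 when id is 1
    puts [shifted_positions $idx 3 1]    ;# positions of 1 when id is 3
    ```

    Whether this applies to the real program depends on the real rules being
    pure shifts like the one shown; rules that reorder or regenerate the list
    would need a different precalculation.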

    On your way from being a coder to becoming a developer, you might want to re-watch the video that I posted earlier - and work out techniques to improve the implementation that were mentioned.
    Then compare it to your current problem.

    Or think of a chess computer: it cannot calculate all possible moves to any depth. Efficient algorithms must e.g. have thresholds to cut off search branches, reuse sub-trees, use an efficient representation, order possible moves, ... Modern chess computers also use databases for openings and end games, i.e. they do not recalculate everything, but they know many winning/losing positions in advance - and they have efficient ways to look these up.

    What next?
    If any of my statements was unclear, ask a precise, direct question.
    Reconsider your problem statement/requirements and elaborate the overall solution - not just one particular step of one possible approach.
    If you get stuck while working things out, give the context and ask a precise question.

  • From [email protected]@21:1/5 to All on Mon Nov 7 03:41:05 2022
    On Monday, November 7, 2022 at 10:02:45 PM [UTC+13], heinrichmartin wrote:
    Hi heinrichmartin

    Thanks for your great, detailed explanation.
    Your previous suggestion already improved the program speed by 10%~15%.


    Then again, you must start accepting that a full search in an arbitrary list takes O(N) time and O(log N) in a sorted list. Here are a few pointers how to improve the overall timing:

    * At the time of lsearch, does the list contain integers already? Note that Tcl has object types under the hood, i.e. EIAS is correct, but Tcl may improve performance where possible.
    * Are you unnecessarily keeping references to data? Tcl copies on write, i.e. "freeing" resources might help not only memory but also speed.
    * Are you performing identical operations repeatedly/unnecessarily? This does not only refer to the extra work in the loop (that Rich has pointed out), but also think on a larger scale, i.e. can the problem statement be rewritten to allow more efficient algorithms? (Note that brute force is not the smartest approach when time matters.)
    * What data must be calculated live? What parts can be pre-calculated?

    1.yes , the list with tile-id in first char , and tail with 3125 (customize rule generate integers) and EIAS mean ?

    2.all use unset to clear var when each run get target position , yes increaseing memory will low speed and will get alloc error ...

    3. I already simplified it to calculate only the target data, which is why the original run time of up to 7 hr is now optimized to 396 sec.
    The two procs below are the key factor for the main program. If I simplify adv_chg_no_range5 (bypass it, or leave lst unchanged), I get "14 sec" in one thread, which hits the customer's target.
    It is best to get as much info as possible within 12 hr (the lottery draws once every 24 hr), and the calculation base is 3125x3125x39x600x13 (13 refers to the 13 phases).

    Since I am not familiar with sqlite3, I need some time to research it; the part Rich showed improved performance by 15-20%.

    The lsearch command is the best solution I have for now. I don't know whether sqlite3 slows down after being executed many times ....


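    # Shift every number (all elements after the title id at index 0) by id-1,
    # wrapping the result into the range 1..39.
    proc adv_chg_no_range5 {id lst} {
        set new_lst [lindex $lst 0]
        incr id -1
        lappend new_lst {*}[lmap v [lrange $lst 1 end] {
            expr {(($v+$id)%-39)+39}
        }]
        return $new_lst
    }

    # Collect the positions of every number of lott found in lst.
    proc mark_lott_loc_row_no_sort1 {lott lst} {
        set res ""
        foreach cp $lott {
            foreach x [lsearch -all $lst "$cp"] {
                lappend res $x
            }
        }
        return $res
    }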

    puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
    720.2 microseconds per iteration

    4. The base row data is already calculated separately; each phase's raw data is 26 MB x 13, and not all of the source is kept in memory, only the targeted part.
    Generating the target positions has to be calculated live for the customer's reference; the two procs above are the key to this part.


    On your way from being a coder to becoming a developer, you might want to re-watch the video that I posted earlier - and work out techniques to improve the implementation that were mentioned.
    Then compare it to your current problem.


    I know that searching an arbitrary list takes time; I have already tried several algorithms to test it.
    In a single run they may be better, but combined in the real program the speed is not ideal ...
    So it may be an effect of the system code area, which I don't know how to handle, so I am looking for your help ..

    Is it possible to get the combined run time of these two procs below 120 microseconds per iteration, with 3125 integers?
    Or is there some algorithm that can achieve it?


    thanks in advance

    BR
    Rolance

  • From heinrichmartin@21:1/5 to rolance on Mon Nov 7 05:33:11 2022
    On Monday, November 7, 2022 at 12:41:08 PM UTC+1, rolance wrote:
    On Monday, November 7, 2022 at 10:02:45 PM [UTC+13], heinrichmartin wrote:
    Then again, you must start accepting that a full search in an arbitrary list takes O(N) time and O(log N) in a sorted list. Here are a few pointers how to improve the overall timing:

    * At the time of lsearch, does the list contain integers already? Note that Tcl has object types under the hood, i.e. EIAS is correct, but Tcl may improve performance where possible.
    * Are you unnecessarily keeping references to data? Tcl copies on write, i.e. "freeing" resources might help not only memory but also speed.
    * Are you performing identical operations repeatedly/unnecessarily? This does not only refer to the extra work in the loop (that Rich has pointed out), but also think at large scale, i.e. can the problem statement be rewritten to allow more efficient
    algorithms? (Note that brute force is not the smartest approach when time matters.)
    * What data must be calculated live? What parts can be pre-calculated?
    1. Yes, the list has the title-id as its first element, followed by 3125 integers (generated by a custom rule). What does EIAS mean?

    In my reader, it looks like some text is missing from the middle of this line; I am going to comment on keywords.

    If your list size was 3126, then you would most likely not have an issue, i.e. you are talking about several (thousands of) iterations - you should look closely at what else happens in that loop (explicitly or implicitly!).

    Tcl is specified without data types, i.e. everything is a string (EIAS). Every proc may interpret a value in a different way, e.g. as number, as text, as code, ... Internally (on C level), Tcl generates and caches representations. E.g. Tcl will
    implicitly parse the number for you and it will stick to the number representation when you do calculations only.

    https://wiki.tcl-lang.org/page/everything+is+a+string https://wiki.tcl-lang.org/page/shimmering
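
    The cached representation can be inspected from a script. A minimal sketch, assuming Tcl 8.6 or later, where the diagnostic command ::tcl::unsupported::representation is available (as its namespace says, it is unsupported and its output wording may differ between versions):

    ```tcl
    set s "12"                  ;# a value created from text
    set n [expr {$s + 0}]       ;# a value produced by a calculation

    # Typically reports an integer representation for the calculated value.
    puts [::tcl::unsupported::representation $n]

    # A list whose elements come out of expr already holds number objects,
    # so an integer comparison does not have to re-parse each element.
    set nums [lmap v {11 12 13} {expr {$v + 0}}]
    puts [lsearch -all -integer $nums 12]   ;# prints: 1
    ```

    Note that lsearch without options matches in -glob mode; -exact or -integer state the intended comparison explicitly.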

    2. Every run uses unset to clear its variables once the target positions are obtained. Yes, growing memory use lowers the speed and eventually causes an alloc error ...

    There are hidden catches, but also not-so-obvious options, e.g. https://wiki.tcl-lang.org/page/K#c2a6014c2d129837889d8a8000d05e5c3b44e8f6b46cab777c04df8a927bfad2

    3. I already simplified it to calculate only the target data, which is why the original run time of up to 7 hr is now optimized to 396 sec.
    The two procs below are the key factor for the main program. If I simplify adv_chg_no_range5 (bypass it, or leave lst unchanged), I get "14 sec" in one thread, which hits the customer's target.
    It is best to get as much info as possible within 12 hr (the lottery draws once every 24 hr), and the calculation base is 3125x3125x39x600x13 (13 refers to the 13 phases).

    Since I am not familiar with sqlite3, I need some time to research it; the part Rich showed improved performance by 15-20%.

    The lsearch command is the best solution I have for now. I don't know whether sqlite3 slows down after being executed many times ....
    proc adv_chg_no_range5 {id lst} {
    set new_lst [lindex $lst 0]
    incr id -1
    lappend new_lst {*}[lmap v [lrange $lst 1 end] {
    expr {(($v+$id)%-39)+39}
    }]
    return $new_lst
    }

    That requires three list objects (at least temporarily). Note that this does not mean that all list elements are duplicated in memory, just the list of pointers.

    But before digging into Tcl internals, why not consider not having index 0 in the same list? Then maybe https://wiki.tcl-lang.org/page/VecTcl can help. Do you see how we are _not_ talking about lsearch?
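
    One way to avoid building the shifted list at all, offered only as a sketch and not as something proposed in this thread: the shift in adv_chg_no_range5 is reversible, so instead of shifting 3125 elements and then searching for cp, one can shift each searched number back and search the unchanged list. This assumes that every element after index 0 is an integer in 1..39; the proc names unshift_no and mark_lott_loc_unshifted are hypothetical.

    ```tcl
    # Map a searched number cp back to the value it had before the shift by id.
    # Inverse of v -> ((v + id - 1) wrapped into 1..39); assumes v is in 1..39.
    proc unshift_no {id cp} {
        expr {(($cp - $id) % 39) + 1}
    }

    # Positions in the ORIGINAL list whose shifted value would equal a number
    # of lott. -start 1 skips the title id at index 0, so it is neither
    # matched nor parsed as an integer.
    proc mark_lott_loc_unshifted {id lott lst} {
        set res {}
        foreach cp $lott {
            lappend res {*}[lsearch -all -integer -start 1 $lst [unshift_no $id $cp]]
        }
        return $res
    }

    # With id 2 every number moves up by one: 39 -> 1, 12 -> 13, 1 -> 2, 38 -> 39.
    set demo {T01 39 12 1 38}
    puts [mark_lott_loc_unshifted 2 {1 13} $demo]   ;# prints: 1 2
    ```

    The positions are the same ones the shifted list would give, because the shift changes values but not their order.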

    proc mark_lott_loc_row_no_sort1 {lott lst} {
    set res ""
    foreach cp $lott {
    foreach x [lsearch -all $lst "$cp"] {
    lappend res $x
    }

    lappend res {*}[lsearch -all $lst $cp]

    }
    return $res

    How will you distinguish which element of res is the position of which cp? Are you sure you want lists, not sets?

    }
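
    If the positions ever need to be told apart by number, a dict keyed by cp is one option. A minimal sketch; the proc name mark_lott_loc_by_no is hypothetical, and -exact is used here instead of the default glob matching:

    ```tcl
    # Return a dict: searched number -> list of positions where it occurs.
    proc mark_lott_loc_by_no {lott lst} {
        set res [dict create]
        foreach cp $lott {
            dict set res $cp [lsearch -all -exact $lst $cp]
        }
        return $res
    }

    # T01 stands in for the title id at index 0 (hypothetical value).
    set demo {T01 12 3 12 8 39 3}
    puts [mark_lott_loc_by_no {12 3 8} $demo]   ;# prints: 12 {1 3} 3 {2 6} 8 4
    ```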
    puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
    720.2 microseconds per iteration

    To get the timing of (loop over) lsearch, try
    set lst2 [adv_chg_no_range5 1 $lst]
    puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} $lst2} 10]
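
    Put together as a self-contained script, the separate measurement could look like the sketch below. The test data is an assumption (a made-up title id T01 followed by 3125 random integers in 1..39), so absolute timings will differ from the real data.

    ```tcl
    # Assumed test data: title id at index 0, then 3125 random numbers in 1..39.
    set lst [list T01]
    for {set i 0} {$i < 3125} {incr i} {
        lappend lst [expr {int(rand() * 39) + 1}]
    }

    # The two procs from this thread, with the inner foreach replaced by {*}.
    proc adv_chg_no_range5 {id lst} {
        set new_lst [lindex $lst 0]
        incr id -1
        lappend new_lst {*}[lmap v [lrange $lst 1 end] {
            expr {(($v+$id)%-39)+39}
        }]
        return $new_lst
    }
    proc mark_lott_loc_row_no_sort1 {lott lst} {
        set res ""
        foreach cp $lott {
            lappend res {*}[lsearch -all $lst $cp]
        }
        return $res
    }

    # Time each stage on its own, so it is visible which one dominates.
    puts "shift:  [time {adv_chg_no_range5 1 $lst} 100]"
    set lst2 [adv_chg_no_range5 1 $lst]
    puts "search: [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} $lst2} 100]"
    ```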

    4. The base row data is already calculated separately; each phase's raw data is 26 MB x 13, and not all of the source is kept in memory, only the targeted part.
    Generating the target positions has to be calculated live for the customer's reference; the two procs above are the key to this part.

    I am not going to put effort in trying to understand this. There were several hints that you are working on lottery data.
    Maybe you want to share the original problem. Just guessing: the lottery draws 5 from 39, millions of sold tickets, a dozen winning ranks, calculate the winners.

    Is it possible to get the combined run time of these two procs below 120 microseconds per iteration, with 3125 integers?
    Or is there some algorithm that can achieve it?

    Not a precise question.

  • From [email protected]@21:1/5 to will on Tue Nov 8 02:25:50 2022
    On Tuesday, November 8, 2022 at 2:33:13 AM [UTC+13], heinrichmartin wrote:

    Hi heinrichmartin

    Thanks for the great help, the detailed explanation, and the many reference links.
    I will study those links in depth.

    That requires three list objects (at least temporarily). Note that this does not mean that all list elements are duplicated in memory, just the list of pointers.
    But before digging into Tcl internals, why not consider not having index 0 in the same list? Then maybe https://wiki.tcl-lang.org/page/VecTcl can help. Do you see how we are _not_ talking about lsearch?

    The title-id at lindex 0 is used for verifying mass column data; the customer needs this index as input for his verification program and database.
    I will consider it: if the speed-up is obvious, I will write a conversion program for the customer to use.

    proc mark_lott_loc_row_no_sort1 {lott lst} {
    set res ""
    foreach cp $lott {
    foreach x [lsearch -all $lst "$cp"] {
    lappend res $x
    }

    lappend res {*}[lsearch -all $lst $cp]

    }
    return $res

    Proc updated, thanks for the correction and the performance improvement.

    How will you distinguish which element of res is the position of which cp? Are you sure you want lists, not sets?

    I only need the combined positions as the result; I will try sets if they speed things up.

    puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
    720.2 microseconds per iteration

    To get the timing of (loop over) lsearch, try
    set lst2 [adv_chg_no_range5 1 $lst]
    puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} $lst2} 10]

    Performance is better than the original one:

    716.0 microseconds per iteration
    699.0 microseconds per iteration (lappend res {*}[lsearch -all $lst $cp])
    327 microseconds per iteration (set lst2 [adv_chg_no_range5 1 $lst])
    235.5 microseconds per iteration (puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} $lst2} 10])


    4. The base row data is already calculated separately; each phase's raw data is 26 MB x 13, and not all of the source is kept in memory, only the targeted part.
    Generating the target positions has to be calculated live for the customer's reference; the two procs above are the key to this part.
    I am not going to put effort in trying to understand this. There were several hints that you are working on lottery data.
    Maybe you want to share the original problem. Just guessing: the lottery draws 5 from 39, millions of sold tickets, a dozen winning ranks, calculate the winners.

    This custom rule comes from his 20 years of experience and may not suit anyone else; also, different countries have different parameters and different reference histories of hits.

    Is it possible to get the combined run time of these two procs below 120 microseconds per iteration, with 3125 integers?
    Or is there some algorithm that can achieve it?
    Not a precise question

    Speed is the key factor. Buying the latest machine or using multiple machines is the last option, for when the code's efficiency has already hit its limit ...
    The customer tells me C++ may have better performance; I tell him Tcl can achieve it too ..
    This is why I need "below 120 microseconds per iteration", i.e. 1/6 of the original 720.2 microseconds per iteration (original: puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10])

    Or do I need to open a new topic to discuss the "best algorithm for list search speed", to achieve 6 times the speed of the original (search program)?


    BR
    Rolance

  • From heinrichmartin@21:1/5 to rolance on Tue Nov 8 04:10:21 2022
    On Tuesday, November 8, 2022 at 11:25:53 AM UTC+1, rolance wrote:
    The customer tells me C++ may have better performance; I tell him Tcl can achieve it too ..
    This is why I need "below 120 microseconds per iteration", i.e. 1/6 of the original 720.2 microseconds per iteration (original: puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10])

    Just don't blame Tcl in the end, i.e. make sure that you will be comparing the same algorithm in both languages!

    Anyway, when considering Tcl a layer on top of C, then its value-to-impact-ratio is probably best described as "convenience that tries not to impact speed too much"; but you could also code assembly to remove the overhead of C ...

    Or do I need to open a new topic to discuss the "best algorithm for list search speed", to achieve 6 times the speed of the original (search program)?

    No. https://en.wikipedia.org/wiki/Computational_complexity

    Before leaving: I am quite sure that this thread (also) has a language barrier - I am not so sure whether the language is mathematics, computer science, or English ...
