On Thursday, November 3, 2022 at 9:28:30 AM UTC+1, Rolance wrote:
Hi all
mass numbers analysis , need improve speed ...
hope someone can give me some hint ,or other way search ....Are you looking for the fastest code to solve a problem or for a better understanding of the observed timing.
If it is the first, then please also state the actual problem.
Anyway, here are a few comments (hopefully correct, but most likely not covering all aspects).
set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]* $v is always 1 and that factor can be removed from expr.
* The argument to expr should be braced.
* "set v" is unnecessary; {expr {int([::tcl::mathfunc::rand]*39)}} will do. * You could try whether a simple for-loop is faster than lrepeat+lmap. Tcl increases the internal list size progressively.
##### each number +id %39I guess there is a reason for a proc, i.e. why are you not creating these values straight away.
proc adv_chg_no_range4 {id lst} {
set new_lst [lindex $lst 0]The comment says "each number", but you are not processing the first one.
for {set i 1} {$i < [llength $lst]} {incr i} {You seem to know about lmap, why not here?
set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]You could [incr id -1] once in advance (inside the proc before the loop).
if {$tmp_val == 0} {set tmp_val 39}You could time whether [expr {(($v+$id)%-39)+39}] is faster to obtain the range 1~39.
Or the other way round, perform $id%39 in advance and do a conditional range shift in the loop, i.e.
if {$tmp_val > 39} {incr tmp_val -39} {set tmp_val} ;# assuming lmap
Please cross-check correctness on your own, i.e. do not blindly trust me who does not know your actual problem statement.
puts [time {
foreach x {1 12 18 5 6} {Should that be -integer instead of -exact?
lsearch -all -exact $lst $x
Internally, you are creating string reps of the list items here.
puts "###combine ##"
puts [time {
foreach x {1 12 18 5 6} {Is there a reason for calling the proc in every loop?
lsearch -all [adv_chg_no_range4 1 $lst] $x
puts "#### regexp"Here, you definitely have a single call to the proc.
puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
puts "###combine ##"
puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}
output:
328 microseconds per iterationThis could include compile time of the proc. Do the timing more than once (or call the proc once in advance).
###combine ##
4928 microseconds per iteration
#### regexp
1341 microseconds per iteration
###combine ##
2525 microseconds per iteration
as the output use single proc -regexp less perfomance ....Exact string comparison is way faster. Even glob-matching is faster, e.g. Expect creates glob-gatekeepers for regexp-matching.
You could try whether {^(?:x1|x2|...)$} matches faster (i.e. if the state-machine is simpler), but in the end the above statement will stand.
but combine with other proc seem better .. Why....Double the time for 5 times the calls to the proc seems to proof different. You will find out in a cross-check.
could someone give me direction to overcome speed issue
In the end, most of my comments turn out to be irrelevant for that final finding ... Maybe they help anyway.
Hi all
mass numbers analysis , need improve speed ...
hope someone can give me some hint ,or other way search ....
set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]
##### each number +id %39
proc adv_chg_no_range4 {id lst} {
set new_lst [lindex $lst 0]
for {set i 1} {$i < [llength $lst]} {incr i} {
set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
if {$tmp_val == 0} {set tmp_val 39}
puts [time {
foreach x {1 12 18 5 6} {
lsearch -all -exact $lst $x
puts "###combine ##"
puts [time {
foreach x {1 12 18 5 6} {
lsearch -all [adv_chg_no_range4 1 $lst] $x
puts "#### regexp"
puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
puts "###combine ##"
puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}
output:
328 microseconds per iteration
###combine ##
4928 microseconds per iteration
#### regexp
1341 microseconds per iteration
###combine ##
2525 microseconds per iteration
as the output use single proc -regexp less perfomance ....
but combine with other proc seem better .. Why....
could someone give me direction to overcome speed issue
heinrichmartin 在 2022年11月3日 星期四晚上11:25:02 [UTC+13] 的信中寫道:
On Thursday, November 3, 2022 at 9:28:30 AM UTC+1, Rolance wrote:
Hi all
mass numbers analysis , need improve speed ...
hope someone can give me some hint ,or other way search ....Are you looking for the fastest code to solve a problem or for a better understanding of the observed timing.
If it is the first, then please also state the actual problem.
Anyway, here are a few comments (hopefully correct, but most likely not covering all aspects).
set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]* $v is always 1 and that factor can be removed from expr.
* The argument to expr should be braced.
* "set v" is unnecessary; {expr {int([::tcl::mathfunc::rand]*39)}} will do.
* You could try whether a simple for-loop is faster than lrepeat+lmap. Tcl increases the internal list size progressively.
##### each number +id %39I guess there is a reason for a proc, i.e. why are you not creating these values straight away.
proc adv_chg_no_range4 {id lst} {
set new_lst [lindex $lst 0]The comment says "each number", but you are not processing the first one.
for {set i 1} {$i < [llength $lst]} {incr i} {You seem to know about lmap, why not here?
set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]You could [incr id -1] once in advance (inside the proc before the loop).
if {$tmp_val == 0} {set tmp_val 39}You could time whether [expr {(($v+$id)%-39)+39}] is faster to obtain the range 1~39.
Or the other way round, perform $id%39 in advance and do a conditional range shift in the loop, i.e.
if {$tmp_val > 39} {incr tmp_val -39} {set tmp_val} ;# assuming lmap
Please cross-check correctness on your own, i.e. do not blindly trust me who does not know your actual problem statement.
puts [time {
foreach x {1 12 18 5 6} {Should that be -integer instead of -exact?
lsearch -all -exact $lst $x
Internally, you are creating string reps of the list items here.
puts "###combine ##"
puts [time {
foreach x {1 12 18 5 6} {Is there a reason for calling the proc in every loop?
lsearch -all [adv_chg_no_range4 1 $lst] $x
puts "#### regexp"Here, you definitely have a single call to the proc.
puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
puts "###combine ##"
puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}
output:
328 microseconds per iterationThis could include compile time of the proc. Do the timing more than once (or call the proc once in advance).
###combine ##
4928 microseconds per iteration
#### regexp
1341 microseconds per iteration
###combine ##
2525 microseconds per iteration
hi heinrichmartinas the output use single proc -regexp less perfomance ....Exact string comparison is way faster. Even glob-matching is faster, e.g. Expect creates glob-gatekeepers for regexp-matching.
You could try whether {^(?:x1|x2|...)$} matches faster (i.e. if the state-machine is simpler), but in the end the above statement will stand.
but combine with other proc seem better .. Why....Double the time for 5 times the calls to the proc seems to proof different. You will find out in a cross-check.
could someone give me direction to overcome speed issue
In the end, most of my comments turn out to be irrelevant for that final finding ... Maybe they help anyway.
thanks for your advice
1. looking for the fastest code to mass data , speed issue need cut down consume time as least half..
this code is part of project , the most affect effience part -- > lsearch command
already try serveal way to get best search time .....
or you have other best way to achieve like dict .. or array search code for reference?
2. (a).why use --> foreach x {1 12 18 5 6} {
lsearch -all -exact $lst $x
(b) not ---> Should that be -integer instead of -exact?
(a): 351 microseconds per iteration <<<<
(b): 583 microseconds per iteration
3. single proc Vs combine proc the time consume not proportional ....
still try to find out which affect momory occupy , not the simply code can get the best perforamce ...
5times faster than one line (-regexp ) code ....
[...] I do know that such benchmarks are notoriously difficult to get right. [...]
On Thursday, November 3, 2022 at 12:05:37 PM UTC+1, Rolance wrote:
3. single proc Vs combine proc the time consume not proportional ....Let me comment with a quote:
still try to find out which affect momory occupy , not the simply code can get the best perforamce ...
5times faster than one line (-regexp ) code ....
On Thursday, August 13, 2020 at 9:07:54 AM UTC+2, Arjen wrote:
[...] I do know that such benchmarks are notoriously difficult to get right. [...]
Hi all
mass numbers analysis , need improve speed ...
hope someone can give me some hint ,or other way search ....
code list below
set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]
##### each number +id %39
proc adv_chg_no_range4 {id lst} {
set new_lst [lindex $lst 0]
for {set i 1} {$i < [llength $lst]} {incr i} {
set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
if {$tmp_val == 0} {set tmp_val 39}
lappend new_lst $tmp_val
}
return $new_lst
}
puts [time {
foreach x {1 12 18 5 6} {
lsearch -all -exact $lst $x
}
}
]
puts "###combine ##"
puts [time {
foreach x {1 12 18 5 6} {
lsearch -all [adv_chg_no_range4 1 $lst] $x
}
}
]\n
puts "#### regexp"
puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
puts "###combine ##"
puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}
output:
328 microseconds per iteration
###combine ##
4928 microseconds per iteration
#### regexp
1341 microseconds per iteration
###combine ##
2525 microseconds per iteration
could someone give me direction to overcome speed issue
[email protected] <[email protected]> wrote:
Hi all
mass numbers analysis , need improve speed ...
hope someone can give me some hint ,or other way search ....As no one here knows what you are trying to achieve, we are limited in
our ability to offer suggestions.
code list below
set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]
##### each number +id %39
proc adv_chg_no_range4 {id lst} {
set new_lst [lindex $lst 0]
for {set i 1} {$i < [llength $lst]} {incr i} {
set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
if {$tmp_val == 0} {set tmp_val 39}
lappend new_lst $tmp_val
}
return $new_lst
}
puts [time {
foreach x {1 12 18 5 6} {
lsearch -all -exact $lst $x
}
}
]
puts "###combine ##"
puts [time {
foreach x {1 12 18 5 6} {
lsearch -all [adv_chg_no_range4 1 $lst] $x
}
}
]\n
puts "#### regexp"
puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
puts "###combine ##"
puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}
output:
328 microseconds per iterationUnsurprising here. The first does 5 searches over a static list.
###combine ##
4928 microseconds per iteration
The second makes 5 calls into a proc, that iterates the entire list via foreach, returning a new list, and that new list is then searched.
Iterating the entire list, in Tcl, to generate a new list is where most
of this time difference is being consumed.
#### regexpAlso unsurprising, searches using the regex engine will be slower than searches using simple comparisons, because the regex engine does far
1341 microseconds per iteration
###combine ##
2525 microseconds per iteration
more than simple comparisons.
The time difference between the static list and the newly generated
list via a proc comments above also apply here.
could someone give me direction to overcome speed issueIf your issue is lsearch speed, then:
1) only search a static list (i.e., don't recreate the list first
before searching it)
2) if your desired results can also be obtained from searching a sorted list, then first sort the list (sort only once, then search plural
times). Using the -sorted option to lsearch causes lsearch to
perform a binary search of the list, which will be faster than
without -sorted, which causes lsearch to perform a linear search
(start at first element, look at each in sequence until found).
But you've failed to describe your actual problem. You've shown a
solution that does not work for you speed wise, but not described to us
what you are trying to achieve by this solution you've given. It is
quite possible there is some alternative way, without using lsearch, to achieve your desired result, but we can't read your mind over Usenet to
know what your actual problem is to even be able to consider some
alternate.
On Thursday, November 3, 2022 at 12:41:53 PM UTC+1, heinrichmartin wrote:
On Thursday, November 3, 2022 at 12:05:37 PM UTC+1, Rolance wrote:
3. single proc Vs combine proc the time consume not proportional .... still try to find out which affect momory occupy , not the simply code can get the best perforamce ...Let me comment with a quote:
5times faster than one line (-regexp ) code ....
On Thursday, August 13, 2020 at 9:07:54 AM UTC+2, Arjen wrote:Btw, here is my quick take on it (including a few cross-checks):
[...] I do know that such benchmarks are notoriously difficult to get right. [...]
expect:~$ set l [lmap x [lrepeat 10000 1] {expr {int(rand()*39)}}]; puts foo foo
expect:~$ ll $l
10000
expect:~$ lindex $l 25
22
expect:~$ ::tcl::mathfunc::max {*}$l
38
expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000 230.131 microseconds per iteration
expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
193.889 microseconds per iteration
expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000 230.803 microseconds per iteration
expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
193.114 microseconds per iteration
expect:~$ set s [lsort -integer $l]; puts foo
foo
expect:~$ time {lsearch -all -sorted -exact $s [expr {int(rand()*39)}]} 1000 170.89 microseconds per iteration
expect:~$ set s [lsort -integer $l]; puts foo
foo
expect:~$ time {lsearch -all -sorted -integer $s [expr {int(rand()*39)}]} 1000
103.633 microseconds per iteration
expect:~$ time {lsearch -all -integer -exact $s [expr {int(rand()*39)}]} 1000
104.209 microseconds per iteration
expect:~$ time {lsearch -all -integer -exact $l [expr {int(rand()*39)}]} 1000
106.351 microseconds per iteration
expect:~$
[email protected] <[email protected]> wrote:
Hi all
mass numbers analysis , need improve speed ...
hope someone can give me some hint ,or other way search ....As no one here knows what you are trying to achieve, we are limited in
our ability to offer suggestions.
code list below
set lst [lmap v [lrepeat 3125 1] {set v [expr int($v * [::tcl::mathfunc::rand]*39)]} ]
##### each number +id %39
proc adv_chg_no_range4 {id lst} {
set new_lst [lindex $lst 0]
for {set i 1} {$i < [llength $lst]} {incr i} {
set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
if {$tmp_val == 0} {set tmp_val 39}
lappend new_lst $tmp_val
}
return $new_lst
}
puts [time {
foreach x {1 12 18 5 6} {
lsearch -all -exact $lst $x
}
}
]
puts "###combine ##"
puts [time {
foreach x {1 12 18 5 6} {
lsearch -all [adv_chg_no_range4 1 $lst] $x
}
}
]\n
puts "#### regexp"
puts [time {lsearch -all -regexp $lst (^1$|^12$|^18$|^5$|^6$)}]
puts "###combine ##"
puts [time {lsearch -all -regexp [adv_chg_no_range4 1 $lst] (^1$|^12$|^18$|^5$|^6$)}
output:
328 microseconds per iterationUnsurprising here. The first does 5 searches over a static list.
###combine ##
4928 microseconds per iteration
The second makes 5 calls into a proc, that iterates the entire list via foreach, returning a new list, and that new list is then searched.
Iterating the entire list, in Tcl, to generate a new list is where most
of this time difference is being consumed.
#### regexpAlso unsurprising, searches using the regex engine will be slower than searches using simple comparisons, because the regex engine does far
1341 microseconds per iteration
###combine ##
2525 microseconds per iteration
more than simple comparisons.
The time difference between the static list and the newly generated
list via a proc comments above also apply here.
could someone give me direction to overcome speed issueIf your issue is lsearch speed, then:
1) only search a static list (i.e., don't recreate the list first
before searching it)
2) if your desired results can also be obtained from searching a sorted list, then first sort the list (sort only once, then search plural
times). Using the -sorted option to lsearch causes lsearch to
perform a binary search of the list, which will be faster than
without -sorted, which causes lsearch to perform a linear search
(start at first element, look at each in sequence until found).
But you've failed to describe your actual problem. You've shown a
solution that does not work for you speed wise, but not described to us
what you are trying to achieve by this solution you've given. It is
quite possible there is some alternative way, without using lsearch, to achieve your desired result, but we can't read your mind over Usenet to
know what your actual problem is to even be able to consider some
alternate.
On Thursday, November 3, 2022 at 12:41:53 PM UTC+1, heinrichmartin wrote:
On Thursday, November 3, 2022 at 12:05:37 PM UTC+1, Rolance wrote:
3. single proc Vs combine proc the time consume not proportional .... still try to find out which affect momory occupy , not the simply code can get the best perforamce ...Let me comment with a quote:
5times faster than one line (-regexp ) code ....
On Thursday, August 13, 2020 at 9:07:54 AM UTC+2, Arjen wrote:Btw, here is my quick take on it (including a few cross-checks):
[...] I do know that such benchmarks are notoriously difficult to get right. [...]
expect:~$ set l [lmap x [lrepeat 10000 1] {expr {int(rand()*39)}}]; puts foo foo
expect:~$ ll $l
10000
expect:~$ lindex $l 25
22
expect:~$ ::tcl::mathfunc::max {*}$l
38
expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000 230.131 microseconds per iteration
expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
193.889 microseconds per iteration
expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000 230.803 microseconds per iteration
expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
193.114 microseconds per iteration
expect:~$ set s [lsort -integer $l]; puts foo
foo
expect:~$ time {lsearch -all -sorted -exact $s [expr {int(rand()*39)}]} 1000 170.89 microseconds per iteration
expect:~$ set s [lsort -integer $l]; puts foo
foo
expect:~$ time {lsearch -all -sorted -integer $s [expr {int(rand()*39)}]} 1000
103.633 microseconds per iteration
expect:~$ time {lsearch -all -integer -exact $s [expr {int(rand()*39)}]} 1000
104.209 microseconds per iteration
expect:~$ time {lsearch -all -integer -exact $l [expr {int(rand()*39)}]} 1000
106.351 microseconds per iteration
expect:~$
On Thursday, November 3, 2022 at 12:41:53 PM UTC+1, heinrichmartin wrote:
On Thursday, November 3, 2022 at 12:05:37 PM UTC+1, Rolance wrote:
3. single proc Vs combine proc the time consume not proportional .... still try to find out which affect momory occupy , not the simply code can get the best perforamce ...Let me comment with a quote:
5times faster than one line (-regexp ) code ....
On Thursday, August 13, 2020 at 9:07:54 AM UTC+2, Arjen wrote:Btw, here is my quick take on it (including a few cross-checks):
[...] I do know that such benchmarks are notoriously difficult to get right. [...]
expect:~$ set l [lmap x [lrepeat 10000 1] {expr {int(rand()*39)}}]; puts foo foo
expect:~$ ll $l
10000
expect:~$ lindex $l 25
22
expect:~$ ::tcl::mathfunc::max {*}$l
38
expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000 230.131 microseconds per iteration
expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
193.889 microseconds per iteration
expect:~$ time {lsearch -all -integer $l [expr {int(rand()*39)}]} 1000 230.803 microseconds per iteration
expect:~$ time {lsearch -all -exact $l [expr {int(rand()*39)}]} 1000
193.114 microseconds per iteration
expect:~$ set s [lsort -integer $l]; puts foo
foo
expect:~$ time {lsearch -all -sorted -exact $s [expr {int(rand()*39)}]} 1000 170.89 microseconds per iteration
expect:~$ set s [lsort -integer $l]; puts foo
foo
expect:~$ time {lsearch -all -sorted -integer $s [expr {int(rand()*39)}]} 1000
103.633 microseconds per iteration
expect:~$ time {lsearch -all -integer -exact $s [expr {int(rand()*39)}]} 1000
104.209 microseconds per iteration
expect:~$ time {lsearch -all -integer -exact $l [expr {int(rand()*39)}]} 1000
106.351 microseconds per iteration
expect:~$
* "[email protected]" <[email protected]>
| ##### each number +id %39 first char title id
| proc adv_chg_no_range4 {id lst} {
| set new_lst [lindex $lst 0]
| for {set i 1} {$i < [llength $lst]} {incr i} {
| set tmp_val [expr {([lindex $lst $i] + $id -1) % 39} ]
| if {$tmp_val == 0} {set tmp_val 39}
| lappend new_lst $tmp_val
| }
| return $new_lst
| }
--<snip-snip>--
| proc adv_chg_no_range6 {id lst} {
| set new_lst [lindex $lst 0]
| incr id -1
| for {set i 1} {$i < [llength $lst]} {incr i} {
| lappend new_lst [expr {(([lindex $lst $i]+$id)%-39)+39} ]
| }
| return $new_lst
| }
--<snip-snip>--
| could you point me where still can improve .
Instead of traversing the list by index, use foreach:
for {set i 1} {$i < [llength $lst]} {incr i} {
set elt [lindex $lst $i]
...
}
foreach elt [lrange $lst 1 end] {
...
}
But I have to admit I did not follow this thread to understand what your ultimate goal is.
HTH
R'
thank for your suggestion , my goal is best lsearch performance
will try foreach will improve how much in multi analysis
On Friday, November 4, 2022 at 10:58:40 AM UTC+1, rolance wrote:
thank for your suggestion , my goal is best lsearch performanceYour claims are inconsistent, lsearch has no foreach.
will try foreach will improve how much in multi analysis
Your initial problem ("Given a list of integers, find all matching integers fast.") is solved in this thread.
But your issues are somewhere else, e.g. reading data, transforming data, memory management.
Just one example: if you cannot accept multiple copies of your GBs of data, then you must transform the values in place (for & lindex & lset) or process them in a stream (i.e. do not create a list at all).
https://www.youtube.com/watch?v=c33AZBnRHks might describe an unexpectedly close problem (but we still don't know your exact objective). Anyway, the video contains quite some lessons about data handling/analysis.
But I have to admit I did not follow this thread to understand what your ultimate goal is.
Ralf Fassel ? 2022?11?4? ?????10:48:51 [UTC+13] ??????
* "[email protected]" <[email protected]>
| could you point me where still can improve .
Instead of traversing the list by index, use foreach:
for {set i 1} {$i < [llength $lst]} {incr i} {
set elt [lindex $lst $i]
...
}
foreach elt [lrange $lst 1 end] {
...
}
But I have to admit I did not follow this thread to understand what your
ultimate goal is.
HTH
R'
Hi Ralf
thank for your suggestion , my goal is best lsearch performance
will try foreach will improve how much in multi analysis
Rich ? 2022?11?4? ?????3:09:19 [UTC+13] ??????
[email protected] <[email protected]> wrote:
Hi allUnsurprising here. The first does 5 searches over a static list.
mass numbers analysis , need improve speed ...
The second makes 5 calls into a proc, that iterates the entire list via
foreach, returning a new list, and that new list is then searched.
Iterating the entire list, in Tcl, to generate a new list is where most
of this time difference is being consumed.
#### regexpAlso unsurprising, searches using the regex engine will be slower than
1341 microseconds per iteration
###combine ##
2525 microseconds per iteration
searches using simple comparisons, because the regex engine does far
more than simple comparisons.
The time difference between the static list and the newly generated
list via a proc comments above also apply here.
could someone give me direction to overcome speed issueIf your issue is lsearch speed, then:
1) only search a static list (i.e., don't recreate the list first
before searching it)
2) if your desired results can also be obtained from searching a sorted
list, then first sort the list (sort only once, then search plural
times). Using the -sorted option to lsearch causes lsearch to
perform a binary search of the list, which will be faster than
without -sorted, which causes lsearch to perform a linear search
(start at first element, look at each in sequence until found).
But you've failed to describe your actual problem. You've shown a
solution that does not work for you speed wise, but not described to us
what you are trying to achieve by this solution you've given. It is
quite possible there is some alternative way, without using lsearch, to
achieve your desired result, but we can't read your mind over Usenet to
know what your actual problem is to even be able to consider some
alternate.
Hi Rich
1. static list indeed faster , but need recreate lst for each
analysis , how can improve this ... release memory ?
2 this actual problem is speed issue , all analysis result can fit
request except the consume time , customer want cut half ...
adv_chg_no_range5 is the best performance
5 time loop search is better than regexp , but not stable..
(update code post pre , please refer)
could you point me some better or alterntive way to achieve speed
request
[email protected] <[email protected]> wrote:
Rich ? 2022?11?4? ?????3:09:19 [UTC+13] ??????
[email protected] <[email protected]> wrote:
Hi allUnsurprising here. The first does 5 searches over a static list.
mass numbers analysis , need improve speed ...
The second makes 5 calls into a proc, that iterates the entire list via >> foreach, returning a new list, and that new list is then searched.
Iterating the entire list, in Tcl, to generate a new list is where most >> of this time difference is being consumed.
#### regexpAlso unsurprising, searches using the regex engine will be slower than
1341 microseconds per iteration
###combine ##
2525 microseconds per iteration
searches using simple comparisons, because the regex engine does far
more than simple comparisons.
The time difference between the static list and the newly generated
list via a proc comments above also apply here.
could someone give me direction to overcome speed issueIf your issue is lsearch speed, then:
1) only search a static list (i.e., don't recreate the list first
before searching it)
2) if your desired results can also be obtained from searching a sorted >> list, then first sort the list (sort only once, then search plural
times). Using the -sorted option to lsearch causes lsearch to
perform a binary search of the list, which will be faster than
without -sorted, which causes lsearch to perform a linear search
(start at first element, look at each in sequence until found).
But you've failed to describe your actual problem. You've shown a
solution that does not work for you speed wise, but not described to us >> what you are trying to achieve by this solution you've given. It is
quite possible there is some alternative way, without using lsearch, to >> achieve your desired result, but we can't read your mind over Usenet to >> know what your actual problem is to even be able to consider some
alternate.
Hi Rich
1. static list indeed faster , but need recreate lst for eachPlease explain to us why you need to recreate the list for each
analysis , how can improve this ... release memory ?
analysis. We see no reason yet why this is necessary (and don't simply
say "because I need to do so" -- that will not be an acceptable
answer).
You can't 'release memory' manually with Tcl, that is all automatic.
2 this actual problem is speed issue , all analysis result can fitNo, the actual problem is whatever it is you are trying to do. You've
request except the consume time , customer want cut half ...
never told us that, other than a meaningless "analysis". If you had
told us the actual problem, we would then bee ablle
adv_chg_no_range5 is the best performance
5 time loop search is better than regexp , but not stable..
(update code post pre , please refer)
could you point me some better or alterntive way to achieve speedNot unless you explain why you need to recreate the list, and what
request
outer reason you need to iterate the list each time to recreate it
instead of re-creating one copy then searching that plural times.
And, if you'd ever bother to tell us what you are really trying to do,
there might be a better faster way with a change in how you represent
the data. But you've never given us anything but your vague
statements, so there is no way for us to recognize a superior way of representing the data that would increase speed (assuming such a way
does exist).
Hi Rich
thanks for your advice .
("Given a list of integers, find all matching integers fast." with
position ) is what I look for. this main program is customize
lottary prediction and analysis system , customer made some rule to
recreate list number , and find the best position hit the target ,
also reference history record up to "N" previous hit.
that is what i need recreate it each time.... original program for N
phase 3125x3125x39x1 time comparison upto 7hr.. with some
shortcut and optimize code already down to 396 sec. but customer not satisify ---> need 60 sec with single thread... his ideal cal qty 3125x3125x39x600 for one time...
previousily he tell me other C++ program can achieve (but different
rule, my code more rules)
already separate data create with mass file , only source file and
change each number +1 for 39 sift range...
do it's limitation of tcl , or some way can faster calculate .
[email protected] <[email protected]> wrote:
Hi Rich
thanks for your advice .
("Given a list of integers, find all matching integers fast." with position ) is what I look for. this main program is customize
lottary prediction and analysis system , customer made some rule to recreate list number , and find the best position hit the target ,
also reference history record up to "N" previous hit.
that is what i need recreate it each time.... original program for NDo you mean it searches for the positions in 228 515 625 000 (228
phase 3125x3125x39x1 time comparison upto 7hr.. with some
shortcut and optimize code already down to 396 sec. but customer not satisify ---> need 60 sec with single thread... his ideal cal qty 3125x3125x39x600 for one time...
billion) total numbers?
For a list that takes 396 sec, what is the 'llength' result of running llength on that list?
previousily he tell me other C++ program can achieve (but differentC++ is also a compiled program. That very likely gives it a
rule, my code more rules)
performance edge from the start.
already separate data create with mass file , only source file and
change each number +1 for 39 sift range...
do it's limitation of tcl , or some way can faster calculate .So you have a list of length X (what is a typical X value?) of numbers
from 1 to 39, and you want to find all occurrences of a given number
and return the positions of all those numbers in that list?
I.e, consider this code:
$ rlwrap tclsh
% package require sqlite3
3.35.5
% sqlite3 db :memory:
% db eval {create table list (id integer primary key, num integer);}
This below inserts five million numbers that are randomly chosen
between 1 and 39:
% for {set i 0} {$i < 5000000} {incr i} { db eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}
Create an index on the "num" column:
% db eval {create index idx on list(num);}
Do some "searching", the 'result' of these queries will be the
"position" of each value being requested:
% time {db eval {select id from list where num = 20;}}
22530 microseconds per iteration
% time {db eval {select id from list where num = 20;}}
22827 microseconds per iteration
% time {db eval {select id from list where num = 27;}}
22856 microseconds per iteration
% time {db eval {select id from list where num = 5;}}
22711 microseconds per iteration
Pretty consistent, about 22.5ms to find the "positions" of all the
numbers equal to 20 (or 27 or 5) in the "list" of five million possibilities. Memory consumption, as reported by top, is 332Meg.
This is what I meant when I asked you to tell us the problem you were
trying to solve, instead of giving us your "solution", keeping the
actual problem secret, and asking us to help fix the solution. You do yourself no favors that way, and prevent us from being able to suggest superior alternatives.
In the code above I am finding the "positions" of numbers in an
"ordered list" of five million entries in about 22.5ms above. Now,
granted, I am not doing anything with those positions, but the
alternate "lsearch" above is extremely fast.
Rich 在 2022年11月5日 星期六上午10:30:18 [UTC+13] 的信中寫道:
[email protected] <[email protected]> wrote:
Hi Rich
thanks for your advice .
("Given a list of integers, find all matching integers fast." with position ) is what I look for. this main program is customize
lottary prediction and analysis system , customer made some rule to recreate list number , and find the best position hit the target ,
also reference history record up to "N" previous hit.
that is what i need recreate it each time.... original program for N phase 3125x3125x39x1 time comparison upto 7hr.. with someDo you mean it searches for the positions in 228 515 625 000 (228
shortcut and optimize code already down to 396 sec. but customer not satisify ---> need 60 sec with single thread... his ideal cal qty 3125x3125x39x600 for one time...
billion) total numbers?
For a list that takes 396 sec, what is the 'llength' result of running llength on that list?
previousily he tell me other C++ program can achieve (but different rule, my code more rules)C++ is also a compiled program. That very likely gives it a
performance edge from the start.
already separate data create with mass file , only source file and change each number +1 for 39 sift range...
do it's limitation of tcl , or some way can faster calculate .So you have a list of length X (what is a typical X value?) of numbers from 1 to 39, and you want to find all occurrences of a given number
and return the positions of all those numbers in that list?
I.e, consider this code:
$ rlwrap tclsh
% package require sqlite3
3.35.5
% sqlite3 db :memory:
% db eval {create table list (id integer primary key, num integer);}
This below inserts five million numbers that are randomly chosen
between 1 and 39:
% for {set i 0} {$i < 5000000} {incr i} { db eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}
Create an index on the "num" column:
% db eval {create index idx on list(num);}
Do some "searching", the 'result' of these queries will be the
"position" of each value being requested:
% time {db eval {select id from list where num = 20;}}
22530 microseconds per iteration
% time {db eval {select id from list where num = 20;}}
22827 microseconds per iteration
% time {db eval {select id from list where num = 27;}}
22856 microseconds per iteration
% time {db eval {select id from list where num = 5;}}
22711 microseconds per iteration
Pretty consistent, about 22.5ms to find the "positions" of all the
numbers equal to 20 (or 27 or 5) in the "list" of five million possibilities. Memory consumption, as reported by top, is 332Meg.
This is what I meant when I asked you to tell us the problem you were trying to solve, instead of giving us your "solution", keeping the
actual problem secret, and asking us to help fix the solution. You do yourself no favors that way, and prevent us from being able to suggest superior alternatives.
In the code above I am finding the "positions" of numbers in anhi Rich
"ordered list" of five million entries in about 22.5ms above. Now, granted, I am not doing anything with those positions, but the
alternate "lsearch" above is extremely fast.
great thanks for your advice
will change data structure to sqlite get faster speed .
BR
Rolance
[email protected] ? 2022?11?5? ?????11:31:28 [UTC+13] ??????
Rich ? 2022?11?5? ?????10:30:18 [UTC+13] ??????
I.e, consider this code:
$ rlwrap tclsh
% package require sqlite3
3.35.5
% sqlite3 db :memory:
% db eval {create table list (id integer primary key, num integer);}
This below inserts five million numbers that are randomly chosen
between 1 and 39:
% for {set i 0} {$i < 5000000} {incr i} { db eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}
Create an index on the "num" column:
% db eval {create index idx on list(num);}
Do some "searching", the 'result' of these queries will be the
"position" of each value being requested:
% time {db eval {select id from list where num = 20;}}
22530 microseconds per iteration
% time {db eval {select id from list where num = 20;}}
22827 microseconds per iteration
% time {db eval {select id from list where num = 27;}}
22856 microseconds per iteration
% time {db eval {select id from list where num = 5;}}
22711 microseconds per iteration
Pretty consistent, about 22.5ms to find the "positions" of all the
numbers equal to 20 (or 27 or 5) in the "list" of five million
possibilities. Memory consumption, as reported by top, is 332Meg.
hi Rich
below code with my test result
could you point me what wrong same code with different output
[email protected] <[email protected]> wrote:
[email protected] ? 2022?11?5? ?????11:31:28 [UTC+13] ??????
Rich ? 2022?11?5? ?????10:30:18 [UTC+13] ??????
I.e, consider this code:
$ rlwrap tclsh
% package require sqlite3
3.35.5
% sqlite3 db :memory:
% db eval {create table list (id integer primary key, num integer);}
This below inserts five million numbers that are randomly chosen
between 1 and 39:
% for {set i 0} {$i < 5000000} {incr i} { db eval "insert into list (num) values ([expr {int(rand()*39+1)}]);"}
Create an index on the "num" column:
% db eval {create index idx on list(num);}
Do some "searching", the 'result' of these queries will be the
"position" of each value being requested:
% time {db eval {select id from list where num = 20;}}
22530 microseconds per iteration
% time {db eval {select id from list where num = 20;}}
22827 microseconds per iteration
% time {db eval {select id from list where num = 27;}}
22856 microseconds per iteration
% time {db eval {select id from list where num = 5;}}
22711 microseconds per iteration
Pretty consistent, about 22.5ms to find the "positions" of all the
numbers equal to 20 (or 27 or 5) in the "list" of five million
possibilities. Memory consumption, as reported by top, is 332Meg.
hi Rich
below code with my test resultdb1 and db2 are not the same code.
could you point me what wrong same code with different output
Your db2 code omits creating an index on the num column.
Without the index sqlite has to scan all the rows in the table.
With the index it can refer to the index and (effectively) directly
pull out the matching rows.
That is the reason for the time difference. The index consumes some
up-front time to build, in order to significantly accelerate lookups
later.
Rich ? 2022?11?5? ?????3:06:51 [UTC+13] ??????
[email protected] <[email protected]> wrote:
Your db2 code omits creating an index on the num column.
Without the index sqlite has to scan all the rows in the table.
With the index it can refer to the index and (effectively) directly
pull out the matching rows.
That is the reason for the time difference. The index consumes some
up-front time to build, in order to significantly accelerate lookups
later.
hi Rich
thank for the detail explanation ,
single test program have great performance , but in real
multi-program implement speed not better than previous Lsearch code
..... still debug what factor will affect speed ...
thanks for great help , if you have some hint about program
interactive affect , please point me
[email protected] <[email protected]> wrote:
Rich ? 2022?11?5? ?????3:06:51 [UTC+13] ??????
[email protected] <[email protected]> wrote:
Your db2 code omits creating an index on the num column.
Without the index sqlite has to scan all the rows in the table.
With the index it can refer to the index and (effectively) directly
pull out the matching rows.
That is the reason for the time difference. The index consumes some
up-front time to build, in order to significantly accelerate lookups
later.
hi Rich
thank for the detail explanation ,I also asked how long the list was in the real world program, and as
single test program have great performance , but in real
multi-program implement speed not better than previous Lsearch code
..... still debug what factor will affect speed ...
you have done so many times in this thread, you ignored that question entirely. If you continue ignoring direct questions that would allow
us to help you I will plonk you.
In the real program, how long is the list that is being searched?
thanks for great help , if you have some hint about programWe can't help if you do not give us the information we need to be able
interactive affect , please point me
to effectively help.
Rich ? 2022?11?6? ?????2:11:17 [UTC+13] ??????
[email protected] <[email protected]> wrote:
Rich ? 2022?11?5? ?????3:06:51 [UTC+13] ??????I also asked how long the list was in the real world program, and as
you have done so many times in this thread, you ignored that
question entirely. If you continue ignoring direct questions that
would allow us to help you I will plonk you.
In the real program, how long is the list that is being searched?
thanks for great help , if you have some hint about programWe can't help if you do not give us the information we need to be
interactive affect , please point me
able to effectively help.
hi Rich
thanks for your advice
only analysis partical list :3125 range 1
code and result below
###############
package require sqlite3
console show
set lst [lmap v [lrepeat 3125 1] {expr {int($v * [::tcl::mathfunc::rand]*39)}} ]
proc adv_chg_no_range5 {id lst} {
set new_lst [lindex $lst 0]
incr id -1
lappend new_lst {*}[lmap v [lrange $lst 1 end] {
expr {(($v+$id)%-39)+39}
}]
return $new_lst
}
proc mark_lott_loc_row_no_sort1 {lott lst} {
set res ""
foreach cp $lott {
foreach x [lsearch -all $lst "$cp"] {
lappend res $x
}
}
return $res
}
proc mark_lott_loc_row_no_sort3_p {lott lst} {
sqlite3 db1 :memory:
db1 eval {create table list (id integer primary key, num integer);}
foreach x [lrange $lst 1 end] {
db1 eval "insert into list (num) values ($x);"
}
db1 eval {create index idx on list(num);}
set res ""
foreach cp $lott {
lappend res {*}[db1 eval {select id from list where num = $cp;}]
}
db1 eval { delete from list; }
db1 close
return $res
}
puts "#### lsearch 3125 range 1 ####"
puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
puts "#### sqlite3 3125 range 1####"
puts [time {mark_lott_loc_row_no_sort3_p {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
##### output #####
#### lsearch 3125 range 1 ####
720.2 microseconds per iteration
#### sqlite3 3125 range 1####
17818.6 microseconds per iteration
[email protected] <[email protected]> wrote:
Rich ? 2022?11?6? ?????2:11:17 [UTC+13] ??????
[email protected] <[email protected]> wrote:
Rich ? 2022?11?5? ?????3:06:51 [UTC+13] ??????I also asked how long the list was in the real world program, and as
you have done so many times in this thread, you ignored that
question entirely. If you continue ignoring direct questions that
would allow us to help you I will plonk you.
In the real program, how long is the list that is being searched?
thanks for great help , if you have some hint about programWe can't help if you do not give us the information we need to be
interactive affect , please point me
able to effectively help.
hi Rich
thanks for your adviceContining to evade the question. This is your last chance.
only analysis partical list :3125 range 1
code and result below
In the real program, how long is the list that is being searched?
###############
package require sqlite3
console show
set lst [lmap v [lrepeat 3125 1] {expr {int($v * [::tcl::mathfunc::rand]*39)}} ]
proc adv_chg_no_range5 {id lst} {
set new_lst [lindex $lst 0]
incr id -1
lappend new_lst {*}[lmap v [lrange $lst 1 end] {
expr {(($v+$id)%-39)+39}
}]
return $new_lst
}
proc mark_lott_loc_row_no_sort1 {lott lst} {
set res ""
foreach cp $lott {
foreach x [lsearch -all $lst "$cp"] {
lappend res $x
}
}
return $res
}
proc mark_lott_loc_row_no_sort3_p {lott lst} {
sqlite3 db1 :memory:
db1 eval {create table list (id integer primary key, num integer);} foreach x [lrange $lst 1 end] {
db1 eval "insert into list (num) values ($x);"
}
db1 eval {create index idx on list(num);}
set res ""
foreach cp $lott {
lappend res {*}[db1 eval {select id from list where num = $cp;}]
}
db1 eval { delete from list; }
db1 close
return $res
}
puts "#### lsearch 3125 range 1 ####"
puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
puts "#### sqlite3 3125 range 1####"
puts [time {mark_lott_loc_row_no_sort3_p {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
##### output #####
#### lsearch 3125 range 1 ####I made a small code change to your example, and I get the following
720.2 microseconds per iteration
#### sqlite3 3125 range 1####
17818.6 microseconds per iteration
timing on my machine after the change:
#### lsearch 3125 range 1 ####
1180.9 microseconds per iteration
#### sqlite3 3125 range 1####
638.2 microseconds per iteration
However, because you are being so evasive, and failing to answer
direect questions with straightforward answers, I am not going to show
you the change I made.
I will, however, give you a hint. Look at what is *very different*
between the data searched in your mark_lott_loc_row_no_sort1 proc vs.
the data searched in the mark_lott_loc_row_no_sort3_p.
What does one of those proc's do that the other one does not.
When you see what one of them does, that the other does not, then
consider changing the code so the one that is presently doing the
different work no longer is performing that different work. When you
make the change, you'll get simlar timings to what I just got.
However, fail to be straighforward in your next answer, and you will be plonked ( https://en.everybodywiki.com/Plonk_(Usenet) ). This is your
last and final warning.
#### lsearch 3125 range 1 ####
720.2 microseconds per iteration
#### sqlite3 3125 range 1####
638.2 microseconds per iteration
Rich 在 2022年11月6日 星期日下午2:41:35 [UTC+13] 的信中寫道:
rolance wrote:
#### lsearch 3125 range 1 ####I made a small code change to your example, and I get the following
720.2 microseconds per iteration
#### sqlite3 3125 range 1####
17818.6 microseconds per iteration
timing on my machine after the change:
#### lsearch 3125 range 1 ####
1180.9 microseconds per iteration
#### sqlite3 3125 range 1####
638.2 microseconds per iteration
However, because you are being so evasive, and failing to answer
direect questions with straightforward answers, I am not going to show
you the change I made.
However, fail to be straighforward in your next answer, and you will be plonked ( https://en.everybodywiki.com/Plonk_(Usenet) ). This is your
last and final warning.
On Sunday, November 6, 2022 at 5:39:28 AM UTC+1, rolance wrote:algorithms? (Note that brute force is not the smartest approach when time matters.)
Rich 在 2022年11月6日 星期日下午2:41:35 [UTC+13] 的信中寫道:
rolance wrote:
#### lsearch 3125 range 1 ####I made a small code change to your example, and I get the following timing on my machine after the change:
720.2 microseconds per iteration
#### sqlite3 3125 range 1####
17818.6 microseconds per iteration
#### lsearch 3125 range 1 ####
1180.9 microseconds per iteration
#### sqlite3 3125 range 1####
638.2 microseconds per iteration
Rich, congrats for your patience and that cool way to draw attention to the message :-)However, because you are being so evasive, and failing to answer
direect questions with straightforward answers, I am not going to show you the change I made.
Rolance, I cannot/will not plonk user, but I have been ignoring messages without progress - and I will continue to do so.However, fail to be straighforward in your next answer, and you will be plonked ( https://en.everybodywiki.com/Plonk_(Usenet) ). This is your last and final warning.
But I want to encourage you to *rethink/rephrase* your questions and post them again when you do not see an answer within a few days - writing down the actual problem helped me a lot quite some times, even before ever sending it.
In the end, you can see on clt - even in this thread - that the community is willing to help, even with questions that are not strictly related to Tcl. (We are just not doing your job entirely.)
Then again, you must start accepting that a full search in an arbitrary list takes O(N) time and O(log N) in a sorted list. Here are a few pointers how to improve the overall timing:
* At the time of lsearch, does the list contain integers already? Note that Tcl has object types under the hood, i.e. EIAS is correct, but Tcl may improve performance where possible.
* Are you unnecessarily keeping references to data? Tcl copies on write, i.e. "freeing" resources might help not only memory but also speed.
* Are you performing identical operations repeatedly/unnecessarily? This does not only refer to the extra work in the loop (that Rich has pointed out), but also think at large scale, i.e. can the problem statement be rewritten to allow more efficient
* What data must be calculated live? What parts can be pre-calculated?computers also use databases for openings and end games, i.e. they do not recalculate everything, but they know many winning/losing positions in advance - and they have efficient ways to look these up.
On your way from being a coder to becoming a developer, you might want to re-watch the video that I posted earlier - and work out techniques to improve the implementation that were mentioned.
Then compare it to your current problem.
Or think of a chess computer: it cannot calculate all possible moves to any depth. Efficient algorithms must e.g. have thresholds to cut of search branches, reuse sub-trees, use an efficient representation, order possible moves, ... Modern chess
What next?
If any of my statements was unclear, ask a precise, direct question. Reconsider your problem statement/requirements and elaborate the overall solution - not just one particular step of one possible approach.
If you get stuck while working things out, give the context and ask a precise question.
Then again, you must start accepting that a full search in an arbitrary list takes O(N) time and O(log N) in a sorted list. Here are a few pointers how to improve the overall timing:algorithms? (Note that brute force is not the smartest approach when time matters.)
* At the time of lsearch, does the list contain integers already? Note that Tcl has object types under the hood, i.e. EIAS is correct, but Tcl may improve performance where possible.
* Are you unnecessarily keeping references to data? Tcl copies on write, i.e. "freeing" resources might help not only memory but also speed.
* Are you performing identical operations repeatedly/unnecessarily? This does not only refer to the extra work in the loop (that Rich has pointed out), but also think at large scale, i.e. can the problem statement be rewritten to allow more efficient
* What data must be calculated live? What parts can be pre-calculated?
720.2 microseconds per iteration
On your way from being a coder to becoming a developer, you might want to re-watch the video that I posted earlier - and work out techniques to improve the implementation that were mentioned.computers also use databases for openings and end games, i.e. they do not recalculate everything, but they know many winning/losing positions in advance - and they have efficient ways to look these up.
Then compare it to your current problem.
Or think of a chess computer: it cannot calculate all possible moves to any depth. Efficient algorithms must e.g. have thresholds to cut of search branches, reuse sub-trees, use an efficient representation, order possible moves, ... Modern chess
What next?
If any of my statements was unclear, ask a precise, direct question. Reconsider your problem statement/requirements and elaborate the overall solution - not just one particular step of one possible approach.
If you get stuck while working things out, give the context and ask a precise question.
heinrichmartin 在 2022年11月7日 星期一晚上10:02:45 [UTC+13] 的信中寫道:algorithms? (Note that brute force is not the smartest approach when time matters.)
Then again, you must start accepting that a full search in an arbitrary list takes O(N) time and O(log N) in a sorted list. Here are a few pointers how to improve the overall timing:
* At the time of lsearch, does the list contain integers already? Note that Tcl has object types under the hood, i.e. EIAS is correct, but Tcl may improve performance where possible.
* Are you unnecessarily keeping references to data? Tcl copies on write, i.e. "freeing" resources might help not only memory but also speed.
* Are you performing identical operations repeatedly/unnecessarily? This does not only refer to the extra work in the loop (that Rich has pointed out), but also think at large scale, i.e. can the problem statement be rewritten to allow more efficient
* What data must be calculated live? What parts can be pre-calculated?1.yes , the list with tile-id in first char , and tail with 3125 (customize rule generate integers) and EIAS mean ?
2.all use unset to clear var when each run get target position , yes increaseing memory will low speed and will get alloc error ...
3. already simpify only calculate target data , is why original time upto 7hr --> optimizate 396 sec
below two program is the key factor for the main program , if simplify adv_chg_no_range5 (by pass or no change lst ) , I can get "14 sec" hit customer target one thread
bast get more and more info within 12hr ( lottery 24 hr open one time ) and the calculate base 3125x3125x39x600x13(ref 13phase)
since I am not family with sqlite3 , need sometime to research as Rich show part .. up performance 15~20%
lsearch command is best solution what i can get now. didn't know if sqlite3 will low speed after multi-time excute ....
proc adv_chg_no_range5 {id lst} {
set new_lst [lindex $lst 0]
incr id -1
lappend new_lst {*}[lmap v [lrange $lst 1 end] {
expr {(($v+$id)%-39)+39}
}]
return $new_lst
}
proc mark_lott_loc_row_no_sort1 {lott lst} {
set res ""
foreach cp $lott {
foreach x [lsearch -all $lst "$cp"] {
lappend res $x
}
}
return $res
}
puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
720.2 microseconds per iteration
4. base row data already separate calculated and each phase raw data 26mb x 13 not all source in memory only targeted
generate target position need calculate alive for customer reference , up 2 program is the key for this part.
this two program combine run time is possible below 120 microseconds per iteration ? with 3125 intergers
or some algorithm can achieve it ?
On Monday, November 7, 2022 at 12:41:08 PM UTC+1, rolance wrote:efficient algorithms? (Note that brute force is not the smartest approach when time matters.)
heinrichmartin 在 2022年11月7日 星期一晚上10:02:45 [UTC+13] 的信中寫道:
Then again, you must start accepting that a full search in an arbitrary list takes O(N) time and O(log N) in a sorted list. Here are a few pointers how to improve the overall timing:
* At the time of lsearch, does the list contain integers already? Note that Tcl has object types under the hood, i.e. EIAS is correct, but Tcl may improve performance where possible.
* Are you unnecessarily keeping references to data? Tcl copies on write, i.e. "freeing" resources might help not only memory but also speed.
* Are you performing identical operations repeatedly/unnecessarily? This does not only refer to the extra work in the loop (that Rich has pointed out), but also think at large scale, i.e. can the problem statement be rewritten to allow more
implicitly parse the number for you and it will stick to the number representation when you do calculations only.In my reader, it looks like some text is missing from the middle of this line; I am going to comment on keywords.* What data must be calculated live? What parts can be pre-calculated?1.yes , the list with tile-id in first char , and tail with 3125 (customize rule generate integers) and EIAS mean ?
If your list size was 3126, then you would most likely not have an issue, i.e. you are talking about a several (thousands of) iterations - you should look closely what else happens in that loop (explicitly or implicitly!).
Tcl is specified without data types, i.e. everything is a string (EIAS). Every proc may interpret a value in a different way, e.g. as number, as text, as code, ... Internally (on C level), Tcl generates and caches representations. E.g. Tcl will
https://wiki.tcl-lang.org/page/everything+is+a+string https://wiki.tcl-lang.org/page/shimmering
2.all use unset to clear var when each run get target position , yes increaseing memory will low speed and will get alloc error ...There are hidden catches, but also not-so-obvious options, e.g. https://wiki.tcl-lang.org/page/K#c2a6014c2d129837889d8a8000d05e5c3b44e8f6b46cab777c04df8a927bfad2
3. already simpify only calculate target data , is why original time upto 7hr --> optimizate 396 sec
below two program is the key factor for the main program , if simplify adv_chg_no_range5 (by pass or no change lst ) , I can get "14 sec" hit customer target one thread
bast get more and more info within 12hr ( lottery 24 hr open one time ) and the calculate base 3125x3125x39x600x13(ref 13phase)
since I am not family with sqlite3 , need sometime to research as Rich show part .. up performance 15~20%
lsearch command is best solution what i can get now. didn't know if sqlite3 will low speed after multi-time excute ....That requires three list objects (at least temporarily). Note that this does not mean that all list elements are duplicated in memory, just the list of pointers.
proc adv_chg_no_range5 {id lst} {
set new_lst [lindex $lst 0]
incr id -1
lappend new_lst {*}[lmap v [lrange $lst 1 end] {
expr {(($v+$id)%-39)+39}
}]
return $new_lst
}
But before digging into Tcl internals, why not consider to no have index 0 in the same list? Then maybe https://wiki.tcl-lang.org/page/VecTcl can help. Do you see how we are _not_ talking about lsearch?
proc mark_lott_loc_row_no_sort1 {lott lst} {lappend res {*}[lsearch -all $lst $cp]
set res ""
foreach cp $lott {
foreach x [lsearch -all $lst "$cp"] {
lappend res $x
}
}
return $res
How will you distinguish which element of res is the position of which cp? Are you sure you want lists, not sets?
}To get the timing of (loop over) lsearch, try
puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10]
720.2 microseconds per iteration
set lst2 [adv_chg_no_range5 1 $lst]
puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} $lst2} 10]
4. base row data already separate calculated and each phase raw data 26mb x 13 not all source in memory only targetedI am not going to put effort in trying to understand this. There were several hints that you are working on lottery data.
generate target position need calculate alive for customer reference , up 2 program is the key for this part.
Maybe you want to share the original problem. Just guessing: the lottery draws 5 from 39, millions of sold tickets, a dozen winning ranks, calculate the winners.
this two program combine run time is possible below 120 microseconds per iteration ? with 3125 intergersNot a precise question.
or some algorithm can achieve it ?
That requires three list objects (at least temporarily). Note that this does not mean that all list elements are duplicated in memory, just the list of pointers.
But before digging into Tcl internals, why not consider to no have index 0 in the same list? Then maybe https://wiki.tcl-lang.org/page/VecTcl can help. Do you see how we are _not_ talking about lsearch?
proc mark_lott_loc_row_no_sort1 {lott lst} {
set res ""
foreach cp $lott {
foreach x [lsearch -all $lst "$cp"] {
lappend res $x
}
lappend res {*}[lsearch -all $lst $cp]
}
return $res
How will you distinguish which element of res is the position of which cp? Are you sure you want lists, not sets?
720.2 microseconds per iteration
To get the timing of (loop over) lsearch, try
set lst2 [adv_chg_no_range5 1 $lst]
puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} $lst2} 10]
4. base row data already separate calculated and each phase raw data 26mb x 13 not all source in memory only targetedI am not going to put effort in trying to understand this. There were several hints that you are working on lottery data.
generate target position need calculate alive for customer reference , up 2 program is the key for this part.
Maybe you want to share the original problem. Just guessing: the lottery draws 5 from 39, millions of sold tickets, a dozen winning ranks, calculate the winners.
this two program combine run time is possible below 120 microseconds per iteration ? with 3125 intergersNot a precise question
or some algorithm can achieve it ?
customer tell me C++ may have better performance , i tell him tcl also can achieve ..
this is why need" below 120 microseconds per iteration" (1/6) > > > 720.2 microseconds per iteration (original : puts [time {mark_lott_loc_row_no_sort1 {12 16 3 8 36} [adv_chg_no_range5 1 $lst]} 10])
or need open new topic to discuss "best algorithm for search list speed" , achieve 6 times speed compare original (search program)
| Sysop: | Keyop |
|---|---|
| Location: | Huddersfield, West Yorkshire, UK |
| Users: | 715 |
| Nodes: | 16 (3 / 13) |
| Uptime: | 19:18:05 |
| Calls: | 12,104 |
| Calls today: | 4 |
| Files: | 15,004 |
| Messages: | 6,518,092 |