Forum: >>> Magnum BBS <<<

speed improve wanted

From [email protected]@21:1/5 to All on Fri Apr 21 01:11:26 2023

Hi all

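Could someone give me some hints or direction on how to improve the speed?

The same code is written in two languages and run on the same machine.
The program scans a matrix of 100 rows by 3125 columns and, for each column, counts the rows needed to reach the target number of 1s, keeping the minimum.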

ex matrix: 0 1 0 0 1 0 1 0 1...
1 0 1 0 0 0 0 1 1...
1 1 1 0 0 1 1 0 1..
....
...

The speed gap is roughly 4 times:

tcl(8.6.12) :
total_time: 266766 microseconds 266ms 0.266766s
Average execution time per iteration: 85 microseconds

python(3.9):
total_time: 59829800 nanoseconds 59.8298ms 0.0598298s
Average execution time per iteration: 19145.536 nanoseconds

###tcl code ###

set num_columns 3125
set num_rows 100
set num_hits 25

# Generate random matrix
for {set i 0} {$i < $num_rows} {incr i} {
    for {set j 0} {$j < $num_columns} {incr j} {
        set matrix([expr {$i + 1}],$j) [expr {int(rand()*2)}]
    }
}

# Find shortest sequence of 1's with given number of hits
set shortest_seq ""
set shortest_len $num_rows
set shortest_pos {}
set total_time 0

for {set j 0} {$j < $num_columns} {incr j} {
    set hits 0
    set seq ""
    set pos {}
    set start_time [clock microseconds]
    for {set i 0} {$i < $num_rows} {incr i} {
        if {$matrix([expr {$i + 1}],$j) == 1} {
            incr hits
            append seq "1"
            lappend pos $i
        } else {
            append seq "0"
        }
        if {$hits >= $num_hits} {
            set seq_len [string length $seq]
            if {$seq_len < $shortest_len} {
                set shortest_len $seq_len
                set shortest_seq $seq
                set shortest_pos $pos
            }
            break
        }
    }
    set end_time [clock microseconds]
    set iteration_time [expr {$end_time - $start_time}]
    set total_time [expr {$total_time + $iteration_time}]
}
set avg_time [expr {$total_time / $num_columns}]
puts "total_time: $total_time microseconds [expr $total_time/1000]ms [expr $total_time/1000000.0]s \nAverage execution time per iteration: $avg_time microseconds"

####python code ###
import random
import time

num_iterations = 1000
num_columns = 3125
num_rows = 100
num_hits = 25

# Generate random matrix
matrix = []
for i in range(num_rows):
    row = []
    for j in range(num_columns):
        row.append(random.randint(0, 1))
    matrix.append(row)

# Find shortest sequence of 1's with given number of hits
shortest_seq = ""
shortest_len = num_rows
shortest_pos = []
total_time = 0

for j in range(num_columns):
    hits = 0
    seq = ""
    pos = []
    start_time = time.perf_counter_ns()
    for i in range(num_rows):
        # NOTE: the list is 0-indexed, so i+1 skips row 0 and would raise
        # IndexError if the loop ever reached i == num_rows - 1
        # (the Tcl version stores its rows under keys 1..num_rows).
        if matrix[i+1][j] == 1:
            hits += 1
            seq += "1"
            pos.append(i)
        else:
            seq += "0"
        if hits >= num_hits:
            seq_len = len(seq)
            if seq_len < shortest_len:
                shortest_len = seq_len
                shortest_seq = seq
                shortest_pos = pos
            break
    end_time = time.perf_counter_ns()
    iteration_time = end_time - start_time
    total_time += iteration_time

avg_time = total_time / num_columns
print(f"total_time: {total_time} nanoseconds {total_time/1000000.0}ms {total_time/1000000000.0}s")
print(f"Average execution time per iteration: {avg_time} nanoseconds")

BR
Rolance

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Ralf Fassel@21:1/5 to All on Fri Apr 21 10:52:27 2023

* "[email protected]" <[email protected]>
| Could some one give me some hit or direction to improve the speed?

Putting the TCL code in a proc so it gets byte-compiled would be my
first attempt on speedup.

R'


From Cecil Westerhof@21:1/5 to Ralf Fassel on Fri Apr 21 11:29:39 2023

Ralf Fassel <[email protected]> writes:

> * "[email protected]" <[email protected]>
> | Could some one give me some hit or direction to improve the speed?
>
> Putting the TCL code in a proc so it gets byte-compiled would be my
> first attempt on speedup.

I did not know that. Good to know.

--
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof


From [email protected]@21:1/5 to All on Fri Apr 21 04:13:59 2023

Cecil Westerhof wrote on Friday, April 21, 2023 at 9:44:06 PM [UTC+12]:

> Ralf Fassel <[email protected]> writes:
>
> > * "[email protected]" <[email protected]>
> > | Could some one give me some hit or direction to improve the speed?
> >
> > Putting the TCL code in a proc so it gets byte-compiled would be my
> > first attempt on speedup.
>
> I did not know that. Good to know.

Hi Ralf
Thanks for your advice. Inside a proc, the time consumed
(163413 microseconds per iteration) is reduced to half.
Because this proc is the key code of the main analysis program, it needs to execute x3125x3125x100 times in the target period.
Is there an extension or method to speed it up without changing to another language?


From [email protected]@21:1/5 to All on Fri Apr 21 05:11:24 2023

Harald Oehlmann wrote on Friday, April 21, 2023 at 11:18:37 PM [UTC+12]:

> Rolance,
> thank you for the question. The principal difference I see between the
> programs is the way the matrix is stored and accessed:
> TCL: matrix([expr {$i + 1}],$j)
> Python: matrix[i+1][j]
>
> So, in TCL, a hash is used to access the elements. I suppose that in Python
> you have index access.
> So it might be advisable to use a list of lists in TCL to have index
> access there as well.
>
> But I am sure numeric wizards will answer here and give solutions using VecTCL, which has a native matrix type.
>
> Harald

Hi Harald
Thanks for your advice. I will try it and report back later.
BR
Rolance


From Harald Oehlmann@21:1/5 to All on Fri Apr 21 13:18:33 2023

Rolance,
thank you for the question. The principal difference I see between the
programs is the way the matrix is stored and accessed:
TCL: matrix([expr {$i + 1}],$j)
Python: matrix[i+1][j]

So, in TCL, a hash is used to access the elements. I suppose that in Python
you have index access.
So it might be advisable to use a list of lists in TCL to have index
access there as well.
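
For illustration only, here is a rough Python sketch of the two storage layouts being contrasted. It is an analogy, not a Tcl measurement: a dict keyed by the string "i,j" stands in for the Tcl array, and a list of lists stands in for nested Tcl lists. The names rows, cells, count_ones_list and count_ones_hash are made up for this example.

```python
import random
import timeit

NUM_ROWS, NUM_COLS = 100, 3125
random.seed(0)  # fixed seed so both layouts hold the same data

# List-of-lists layout: an element is reached by two positional index lookups.
rows = [[random.randint(0, 1) for _ in range(NUM_COLS)] for _ in range(NUM_ROWS)]

# Hash layout, loosely mimicking a Tcl array keyed by the string "i,j":
# every access first builds the key string, then hashes it.
cells = {f"{i},{j}": rows[i][j] for i in range(NUM_ROWS) for j in range(NUM_COLS)}


def count_ones_list(j):
    """Count the 1s in column j using index access."""
    return sum(rows[i][j] for i in range(NUM_ROWS))


def count_ones_hash(j):
    """Count the 1s in column j using string-keyed hash access."""
    return sum(cells[f"{i},{j}"] for i in range(NUM_ROWS))


# Time one full pass over all columns for each layout.
t_list = timeit.timeit(lambda: [count_ones_list(j) for j in range(NUM_COLS)], number=1)
t_hash = timeit.timeit(lambda: [count_ones_hash(j) for j in range(NUM_COLS)], number=1)
print(f"list of lists: {t_list:.4f}s   string-keyed hash: {t_hash:.4f}s")
```

The absolute numbers depend on the machine and interpreter; the point is only that the hash layout pays for key construction and hashing on every element access.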

But I am sure numeric wizards will answer here and give solutions using VecTCL, which has a native matrix type.

Harald


From Ralf Fassel@21:1/5 to All on Fri Apr 21 16:22:23 2023

* "[email protected]" <[email protected]>
| Cecil Westerhof wrote on Friday, April 21, 2023 at 9:44:06 PM [UTC+12]:
| > Ralf Fassel <[email protected]> writes:
| >
| > > * "[email protected]" <[email protected]>
| > > | Could some one give me some hit or direction to improve the speed?
| > >
| > > Putting the TCL code in a proc so it gets byte-compiled would be my
| > > first attempt on speedup.
--<snip-snip>--
| Hi Ralf
| thanks for your advice , in proc the consume time
| 163413 microseconds per iteration reduce to half ,

On my machine, using a list-based approach as Harald pointed out (and
inside a proc) I get comparable speeds for the TCL and python code:

your original code
total_time: 193136 microseconds 193ms 0.193136s
Average execution time per iteration: 61 microseconds

your original code inside proc
total_time: 94736 microseconds 94ms 0.094736s
Average execution time per iteration: 30 microseconds

using list ("lappend rows $row" instead of set matrix([expr]) and
"lindex $row $column" instead of $matrix([]))
total_time: 69328 microseconds 69ms 0.069328s
Average execution time per iteration: 22 microseconds

python as-is (had to change 'perf_counter_ns' to 'perf_counter' since my
python did not know perf_counter_ns):
python3 t.py
total_time: 0.0770318127470091 seconds [...]
Average execution time per iteration: 2.465018007904291e-05 seconds
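
For reference, time.perf_counter_ns was added in Python 3.7, which would explain an older interpreter not knowing it. A minimal sketch of a helper that works on both old and new versions (now_ns is a hypothetical name, not part of the original script):

```python
import time


def now_ns():
    """Return a monotonic timestamp as an integer number of nanoseconds."""
    if hasattr(time, "perf_counter_ns"):
        # Available from Python 3.7 on.
        return time.perf_counter_ns()
    # Older interpreters: convert float seconds to nanoseconds.
    # The float conversion limits the effective resolution.
    return int(time.perf_counter() * 10**9)


start = now_ns()
total = sum(range(100000))  # some work to time
elapsed = now_ns() - start
print(f"elapsed: {elapsed} nanoseconds")
```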

HTH
R'


From [email protected]@21:1/5 to All on Fri Apr 21 08:23:20 2023

Ralf Fassel wrote on Saturday, April 22, 2023 at 2:22:29 AM [UTC+12]:

Hi Ralf
Thanks for your hint and test results.
After changing to the "lappend rows $row" method I get
total_time: 66394 microseconds per iteration

With the multi-proc implementation run at the same time, Tcl still takes twice as long as Python (both generate the same result).
I will do more analysis to find which part has the most effect: data passing, file reading, array transfer, etc.

BR
Rolance


From saitology9@21:1/5 to [email protected] on Fri Apr 21 13:02:54 2023

On 4/21/2023 4:11 AM, [email protected] wrote:

Hi all

Could some one give me some hit or direction to improve the speed?

same code by different language in same machine....
the program find the minimum target 1 appear times in 100 row by 3125 column

How does this version compare?

set num_columns 3125
set num_rows 100
set num_hits 25

# Generate random matrix
for {set i 0} {$i < $num_rows} {incr i} {
    for {set j 0} {$j < $num_columns} {incr j} {
        set matrix([expr {$i + 1}],$j) [expr {int(rand()*2)}]
    }
}

# Find shortest sequence of 1's with given number of hits
set shortest_seq ""
set shortest_len $num_rows
set shortest_pos {}
set total_time 0

# put the main calculation in a proc
proc calculate {num_rows num_columns num_hits shortest_len} {
    global matrix
    global timings

    for {set j 0} {$j < $num_columns} {incr j} {
        set hits 0
        set seq ""
        set pos {}
        for {set i 0} {$i < $num_rows} {incr i} {
            if {$matrix([expr {$i + 1}],$j) == 1} {
                incr hits
                append seq "1"
                lappend pos $i
            } else {
                append seq "0"
            }
            if {$hits >= $num_hits} {
                set seq_len [string length $seq]
                if {$seq_len < $shortest_len} {
                    set shortest_len $seq_len
                    set shortest_seq $seq
                    set shortest_pos $pos
                }
                break
            }
        }
        # no need to do time calculations here
        lappend timings [clock microseconds]
    }
}

# put the timer in a proc so we can run multiple tests
proc time_it {num_rows num_columns num_hits shortest_len} {
    global matrix
    global timings

    set timings [list [clock microseconds]]

    calculate $num_rows $num_columns $num_hits $shortest_len

    # add up the timing differentials
    set total_time [lindex $timings 0]
    for {set i 1} {$i < [llength $timings]} {incr i} {
        set delta [expr {[lindex $timings $i] - [lindex $timings [expr {$i - 1}]]}]
        set total_time [expr {$total_time + $delta}]
    }

    # print it
    set avg_time [expr {$total_time / $num_columns}]
    puts "total_time: $total_time microseconds [expr $total_time/1000]ms [expr $total_time/1000000.0]s \nAverage execution time per iteration: $avg_time microseconds"

}

puts "time_it $num_rows $num_columns $num_hits $shortest_len"
puts "RUN 1: "
puts "================"
time_it $num_rows $num_columns $num_hits $shortest_len
puts "\n\n"
puts "RUN 2:"
puts "================"
time_it $num_rows $num_columns $num_hits $shortest_len


From saitology9@21:1/5 to All on Fri Apr 21 13:06:41 2023

On 4/21/2023 1:02 PM, saitology9 wrote:

# add up the timing differentials
set total_time [lindex $timings 0]

There is a typo in this line which is in time_it.
Please change it as follows:

set total_time 0


From Alex Bochannek@21:1/5 to [email protected] on Fri Apr 21 10:52:27 2023

"[email protected]" <[email protected]> writes:

Could some one give me some hit or direction to improve the speed?

same code by different language in same machine....
the program find the minimum target 1 appear times in 100 row by 3125 column

ex matrix: 0 1 0 0 1 0 1 0 1...
1 0 1 0 0 0 0 1 1...
1 1 1 0 0 1 1 0 1..
....
...

I don't know if this is a synthetic problem or something you are going
to have to maintain. If it is the latter, it seems like an array
programming language would be a better fit (hope that's OK to point out
on c.l.tcl) I did not measure performance, which was your original
question, but maybe this is still a useful perspective.

In APL, finding the lowest numbers of 1s (no less than 25) in your
random array you can do like this: 25+⌊/(+/1-⍨?3125 100⍴2)-25
I am using Dyalog APL in these examples and the default ⎕IO of 1. I am
sure there are better, more efficient ways.

If you need to know the column, an easy to maintain solution is this
(comments inline):

⍝ 3125 by 100 array of random 0s and 1s
array←1-⍨?3125 100⍴2
⍝ Sum up columns
ones←+/array
⍝ Minimum number above 25
lowest←25+⌊/ones-25
⍝ Index of lowest numbers
⍸lowest⍷ones

As a one-liner it could look like this:

⍸({⍵+⌊/A-⍵}25)⍷A←+/1-⍨?3125 100⍴2

Again, apologies for a non-Tcl answer! I haven't written Tcl in many
years, but continue to follow this group, because I am curious about
progress of the Tcl 9.0 release.

--
Alex.


From [email protected]@21:1/5 to All on Fri Apr 21 14:18:57 2023

saitology9 wrote on Saturday, April 22, 2023 at 5:06:47 AM [UTC+12]:

On 4/21/2023 1:02 PM, saitology9 wrote:

# add up the timing differentials
set total_time [lindex $timings 0]

There is a typo in this line which is in time_it.
Please change it as follows:

set total_time 0

Hi saitology9

thanks for your advice
test result :
time_it 100 3125 25 100
RUN 1:
================
total_time: 115791 microseconds 115ms
0.115791s
Average execution time per iteration:
37 microseconds

RUN 2:
================
total_time: 105553 microseconds 105ms
0.105553s
Average execution time per iteration:
33 microseconds

For now, the method below is still the best solution:
changing to the "lappend rows $row" method gives
total_time: 66394 microseconds per iteration

BR
Rolance


From [email protected]@21:1/5 to All on Fri Apr 21 16:04:38 2023

[email protected] wrote on Saturday, April 22, 2023 at 9:19:00 AM [UTC+12]:

Hi all

I have now moved the code into a proc (function) in both languages.

tcl   : total time 0.0968s (twice as long...)
python: total time: 0.0496496

Could someone point out the key factor? Is it a data-passing issue?

#### tcl ####

#set num_iterations 1000
set num_columns 3125
set num_rows 100
set num_hits 25

# Generate random matrix
for {set i 0} {$i < $num_rows} {incr i} {
    for {set j 0} {$j < $num_columns} {incr j} {
        lappend matrix([expr {$i + 1}]) [expr {int(rand()*2)}]
    }
}

proc find_shortest_hit_org {num_hits mx} {
    array set matrix $mx

    set num_columns [llength $matrix(1)]
    set num_rows [llength [array names matrix]]

    # Find shortest sequence of 1's with given number of hits
    set shortest_seq ""
    set shortest_len $num_rows
    set shortest_pos ""

    set seq_lst ""

    for {set j 0} {$j < $num_columns} {incr j} {
        set hits 0
        set seq ""
        set pos_list ""
        set act_hit 0
        for {set i 0} {$i < $num_rows} {incr i} {
            if {[lindex $matrix([expr {$i + 1}]) $j] == 1} {
                incr hits
                append seq "1"
                lappend pos_list $i
            } else {
                append seq "0"
            }
            if {$hits >= $num_hits} {
                set seq_len [string length $seq]
                if {$seq_len < $shortest_len} {
                    set shortest_len $seq_len
                    set shortest_seq $seq
                    set shortest_pos [join $pos_list ","]
                }
                if {[string index $seq end] == 0} {set act_hit 1 ; break}

            }
        }
        if {$act_hit == 1} {
            lappend seq_lst "$j [string length $seq]"
        }
    }

    return [lrange [lsort -index 1 -increasing -integer $seq_lst] 0 9]

}

set be [clock microseconds]
find_shortest_hit_org 25 [array get matrix]
puts "total time [expr {([clock microseconds] -$be)/1000000.0}]s"

### python ####

import random
import time

num_iterations = 1000
num_columns = 3125
num_rows = 100
num_hits = 25

# Generate random matrix
matrix = []
for i in range(num_rows):
    row = []
    for j in range(num_columns):
        row.append(random.randint(0, 1))
    matrix.append(row)

def find_shortest_hit(num_hits, mx):
    #matrix = dict(mx)
    matrix = mx
    #print(matrix[1])
    num_columns = len(matrix[1])
    num_rows = len(matrix)

    shortest_seq = ''
    shortest_len = num_rows
    shortest_pos = ''

    seq_lst = []

    for j in range(num_columns):
        hits = 0
        seq = ''
        pos_list = []
        act_hit = 0

        for i in range(num_rows):
            #print(matrix[i+1][j])
            # NOTE: the list is 0-indexed, so i+1 skips row 0 and would
            # raise IndexError if the loop ever reached i == num_rows - 1.
            if matrix[i+1][j] == 1:
                hits += 1
                seq += '1'
                pos_list.append(str(i))
            else:
                seq += '0'

            if hits >= num_hits:
                seq_len = len(seq)
                if seq_len < shortest_len:
                    shortest_len = seq_len
                    shortest_seq = seq
                    shortest_pos = ','.join(pos_list)

                if seq[-1] == '0':
                    act_hit = 1
                    break

        if act_hit == 1:
            seq_lst.append((j, len(seq)))

    return sorted(seq_lst, key=lambda x: x[1])[:10]

start_time = time.perf_counter_ns()

find_shortest_hit(25, matrix)

end_time = time.perf_counter_ns()
iteration_time = end_time - start_time
print(f"total time: {iteration_time/1000000000}")

BR
Rolance


From et99@21:1/5 to [email protected] on Fri Apr 21 17:44:44 2023

On 4/21/2023 4:04 PM, [email protected] wrote:

You could consider using tcl threads.

It would appear that the calculation can be factored
into several parts that could be done concurrently.

You could set up the matrix using tsv::set one time
and call on several threads to each search some
set of rows or columns, and then collate the results.

Even my 2 year old intel computer has 12 cores.
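
A rough sketch of the split-and-collate shape in Python, only to show how the columns can be partitioned and the partial results merged. It is not Tcl thread code and does not use tsv; the helper names are made up here. Note that CPython threads will not speed up a pure-Python loop like this because of the global interpreter lock, so this shows the structure, not a speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def rows_needed(matrix, j, num_hits):
    """Rows scanned from the top of column j until num_hits ones are seen."""
    hits = 0
    for i, row in enumerate(matrix):
        hits += row[j]
        if hits >= num_hits:
            return i + 1
    return None  # the column never reaches num_hits


def best_in_chunk(matrix, columns, num_hits):
    """Return (rows_needed, column) for the best column in this chunk."""
    best = None
    for j in columns:
        n = rows_needed(matrix, j, num_hits)
        if n is not None and (best is None or n < best[0]):
            best = (n, j)
    return best


def best_column(matrix, num_hits, workers=4):
    """Split the columns across workers, then collate the partial results."""
    num_columns = len(matrix[0])
    chunks = [range(k, num_columns, workers) for k in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda cols: best_in_chunk(matrix, cols, num_hits), chunks))
    found = [p for p in partial if p is not None]
    return min(found) if found else None


rng = random.Random(42)
matrix = [[rng.randint(0, 1) for _ in range(3125)] for _ in range(100)]
print(best_column(matrix, 25))
```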


From saitology9@21:1/5 to [email protected] on Fri Apr 21 21:53:50 2023

On 4/21/2023 5:18 PM, [email protected] wrote:

as now below method still be the best solution:
change to "lappend rows $row" method can be
total_time: 66394 microseconds per iteration

You can definitely combine the two suggestions. Switching from the
matrix as an array to a lists/sub-lists version, we get a slight
improvement. See three sample runs below:

% time_it 100 3125 25 100
total_time: 30661 microseconds 30ms 0.030661s
Average execution time per iteration: 9 microseconds

% time_it 100 3125 25 100
total_time: 35541 microseconds 35ms 0.035541s
Average execution time per iteration: 11 microseconds

% time_it 100 3125 25 100
total_time: 31111 microseconds 31ms 0.031111s
Average execution time per iteration: 9 microseconds


From saitology9@21:1/5 to [email protected] on Fri Apr 21 21:37:44 2023

On 4/21/2023 5:18 PM, [email protected] wrote:

Are you sure you ran the version that I posted? Because on a mid-range
laptop, I am getting numbers that are about 1/3rd of what you posted.

time_it 100 3125 25 100
RUN 1:
================
total_time: 43239 microseconds 43ms 0.043239s
Average execution time per iteration: 13 microseconds

RUN 2:
================
total_time: 41028 microseconds 41ms 0.041028s
Average execution time per iteration: 13 microseconds


From saitology9@21:1/5 to All on Fri Apr 21 22:05:31 2023

On 4/21/2023 8:44 PM, et99 wrote:

You could consider using tcl threads.

It would appear that the calculation can be factored
into several parts that could be done concurrently.

You could setup the matrix using tsv::set one time
and call on several threads to each search some
set of rows or columns, and then collate the results.

Even my 2 year old intel computer has 12 cores.

The OP is concerned about a run-time difference of less than 0.05 of a
second. No reason to time it now but I would suspect that the thread
setup would eat a good portion of any savings it may generate at this threshold.

The question could be why does this difference matter?

@Rolance:
Another potential reason for the difference: Tcl's random number
generator may be better than Python's so the algorithm spends more time
on each row/column, whereas the Python version may be hitting the break
quite early in each loop. I would suggest keeping the data set
identical for both programs. You can save the matrix to a file and have
both programs read it in before doing their calculations.
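
A minimal sketch of that idea, assuming a plain text file with one row per line and space-separated 0/1 values. The file name, the seed and the helper names write_matrix and read_matrix are assumptions for this example.

```python
import os
import random
import tempfile


def write_matrix(path, num_rows, num_columns, seed=12345):
    """Write a reproducible random 0/1 matrix, one row per line."""
    rng = random.Random(seed)  # fixed seed: every run produces the same file
    with open(path, "w") as fh:
        for _ in range(num_rows):
            fh.write(" ".join(str(rng.randint(0, 1)) for _ in range(num_columns)))
            fh.write("\n")


def read_matrix(path):
    """Read the matrix back as a list of lists of ints."""
    with open(path) as fh:
        return [[int(tok) for tok in line.split()] for line in fh if line.strip()]


path = os.path.join(tempfile.gettempdir(), "matrix_100x3125.txt")
write_matrix(path, 100, 3125)
matrix = read_matrix(path)
print(len(matrix), len(matrix[0]))  # prints: 100 3125
```

Each line of such a file is also a valid Tcl list, so the Tcl side could read it line by line without extra parsing, and both programs would then time the search on identical data.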


From et99@21:1/5 to All on Fri Apr 21 22:57:05 2023

On 4/21/2023 7:05 PM, saitology9 wrote:

Yes, you are right, it depends on whether the OP's example was a
small test case or not.

I guessed if OP was concerned with mere 100s of microseconds,
it was small and/or one of many to be computed.

I substituted in a tsv shared matrix to see what overhead was there,
which was only about 15% when I bumped up the rows/columns.

This is also an example of where an efficient bit/bitfield array
mechanism would be useful. Maybe Brian's lseq technology might
be adapted in the future.

From [email protected]@21:1/5 to All on Fri Apr 21 23:26:47 2023

Hi saitology9,

Thanks for your advice. In a single run of the test program the result can be
---> total_time: 46189 microseconds
But once the code is moved into a proc that loads its data from a file, as in the real application, Python is more stable and about twice as fast as Tcl.
That is why I posted the two real programs and asked for help. I have tried several different data structures, and the result is still not ideal.

The random data is only there to exercise the proc, and its speed is the same as with real data. Could you point out where the last proc I posted can be improved?

BR
Rolance

From [email protected]@21:1/5 to All on Fri Apr 21 23:34:49 2023

Hi et99,

Thanks for your advice.
The real application already runs multiple threads to accelerate the calculation. The proc in my last post is the base unit, and it has to run in a single thread.
If I still cannot find a faster way, I may rewrite that base unit in Python.

BR
Rolance

From [email protected]@21:1/5 to All on Sat Apr 22 08:04:20 2023

You can improve speed in TCL (and probably python as well) by:

stop searching down any column once the position is past the current shortest sequence found.

when a new short sequence is found, save only enough information to generate the result. Only when all columns are searched, do the work to generate the result. Saving copies of even large structures is very fast in TCL (the C code just copies a pointer and increments a reference count).

In TCL if the data is organized as a list of sequences (which are lists of 0 and 1) the foreach command can be used to process the list of columns and the list of bits in each column.

I'm seeing about a 10x speed up over the original code by using these suggestions, where both are procedures (to get them compiled).
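
A minimal sketch of these suggestions, not Dave's actual code. It assumes the data is a list of columns, each a list of 0/1 values; the proc name find_shortest and its return format are made up for illustration.

```tcl
# columns: list of columns, each a list of 0/1 values (assumed layout).
# Returns {column_index length} for the shortest prefix that holds
# num_hits ones, or {-1 -1} if no column qualifies.
proc find_shortest {columns num_hits} {
    set best_len -1      ;# -1 means no column has qualified yet
    set best_col -1
    set j -1
    foreach col $columns {
        incr j
        set hits 0
        set i 0
        foreach bit $col {
            incr i
            # Early out: this column can no longer beat the best one.
            if {$best_len >= 0 && $i >= $best_len} break
            if {$bit && [incr hits] >= $num_hits} {
                # Save only what is needed to rebuild the result later.
                set best_len $i
                set best_col $j
                break
            }
        }
    }
    return [list $best_col $best_len]
}

set columns {{0 1 0 1 1} {1 1 0 0 0} {1 0 1 0 0}}
lassign [find_shortest $columns 2] col len
# Build the printable sequence once, after the search is finished.
puts "col=$col len=$len seq=[join [lrange [lindex $columns $col] 0 $len-1] ""]"
```

The sequence string and position list are rebuilt only for the winning column, so the inner loop does no string or list appends.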

Dave B

From [email protected]@21:1/5 to All on Sat Apr 22 02:28:03 2023

Hi Dave,

Thanks for your advice. I have already changed the search method as you describe, and it gives the speed I hoped for.
I posted this topic to look for further help: if the search method itself has no more room to improve, how can Tcl's performance be improved?
There is a performance gap between the languages.
Could you point out where my code can be improved while keeping the same search method?

BR
Rolance

From saitology9@21:1/5 to [email protected] on Sat Apr 22 10:59:31 2023

On 4/22/2023 2:26 AM, [email protected] wrote:

The random data is only there to exercise the proc, and its speed is the same as with real data.
Could you point out where the last proc I posted can be improved?

You can squeeze out a few microseconds from these changes:

- instead of sending the whole matrix to find_shortest_hit_org, you can
use a global or upvar statement.

- you can improve the looping over the rows and eliminate the expr on
the row index. Your statement would look like this:

---
for {set i 1} {$i <= $num_rows} {incr i} {
if {[lindex $matrix($i) $j] == 1} {
---
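
A small sketch of both changes together, assuming the matrix is an array of row lists keyed 1..num_rows as in the fragment above. The proc name count_hits_in_column is hypothetical and only counts hits; it is not the OP's find_shortest_hit_org.

```tcl
# matrixName is the NAME of the caller's array, not its contents.
proc count_hits_in_column {matrixName j num_rows} {
    upvar 1 $matrixName matrix   ;# alias the caller's array, nothing is copied
    set hits 0
    # Loop from 1 so the row key needs no [expr {$i + 1}].
    for {set i 1} {$i <= $num_rows} {incr i} {
        if {[lindex $matrix($i) $j] == 1} {incr hits}
    }
    return $hits
}

array set matrix {1 {0 1 0} 2 {1 1 0} 3 {0 1 1}}
puts [count_hits_in_column matrix 1 3]   ;# column 1 holds 1, 1, 1
```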

Other than that, I am not sure what the code does. Others have asked
for a description and have gotten none. So, don't know what else to say.

Finally, you may want to check out a solution based on regexp. Maybe
someone with more regexp chops can help.

From et99@21:1/5 to [email protected] on Sat Apr 22 09:39:50 2023

As Dave suggested, using a list of lists is likely the best bet. Then
you could use [lindex $matrix $i $j] for the test. Don't compare the
value against 1; just use it as a boolean.

You can see the bytecode with

tcl::unsupported::disassemble proc find_shortest_hit_org

or

tcl::unsupported::disassemble script {if {[lindex $lst 1 1]} {set x 1}}

where one would see (partially shown):
...
Command 1: "if {[lindex $lst 1 1]} {set x 1}"
Command 2: "lindex $lst 1 1..."
(0) push1 0 # "lst"
(2) loadStk
(3) push1 1 # "1"
(5) push1 1 # "1"
(7) lindexMulti 3
(12) nop
(13) jumpFalse1 +9 # pc 22

So there is a specific bytecode for an lindex with more than one index.
You can also use the [time] command to test small pieces of script, say
to compare an array lookup vs. a list lookup.
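
A rough sketch of such a comparison. The proc names, the 100 x 100 size, and the fill pattern are assumptions chosen for illustration; the timings depend on the machine, so none are quoted here.

```tcl
# Build the same 100 x 100 data twice: as an array keyed "row,col"
# and as a list of lists.
set rows 100
set cols 100
set lmatrix {}
for {set i 0} {$i < $rows} {incr i} {
    set row {}
    for {set j 0} {$j < $cols} {incr j} {
        set v [expr {($i + $j) % 2}]
        set amatrix($i,$j) $v
        lappend row $v
    }
    lappend lmatrix $row
}

# Count the 1 cells through an array lookup.
proc count_array {name rows cols} {
    upvar 1 $name m
    set hits 0
    for {set i 0} {$i < $rows} {incr i} {
        for {set j 0} {$j < $cols} {incr j} {
            if {$m($i,$j)} {incr hits}
        }
    }
    return $hits
}

# Count the 1 cells through a two-index lindex.
proc count_list {m rows cols} {
    set hits 0
    for {set i 0} {$i < $rows} {incr i} {
        for {set j 0} {$j < $cols} {incr j} {
            if {[lindex $m $i $j]} {incr hits}
        }
    }
    return $hits
}

puts "array: [time {count_array amatrix $rows $cols} 50]"
puts "list:  [time {count_list $lmatrix $rows $cols} 50]"
```

Both loops live in procs so that they are byte-compiled, which keeps the comparison fair.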

From [email protected]@21:1/5 to All on Sat Apr 22 14:50:39 2023

Hi saitology9,

Thanks for the advice. This is a big-data analysis program.
The data has already been pre-scanned according to the customer's rule (1: hit, 0: non-hit)
and is already split into several parts to save time.
The original program was developed in Tcl. The customer tells me its search speed may be lower than that of other languages (C++).
Given the budget, it is not worth rewriting the whole program in another language just to improve speed.
As I posted before, the same data structure shows an obvious speed gap between languages.

Looking at resource usage, Tcl seems to use more RAM than the others while putting less load on the machine. Maybe the original code has room to improve, or maybe there is an extension that would speed it up.

BR
Rolance

From [email protected]@21:1/5 to All on Sat Apr 22 14:54:30 2023

Hi et99,

Thanks for your suggestion. I will try it and report back later.

BR
Rolance

From et99@21:1/5 to [email protected] on Sat Apr 22 15:23:26 2023

I did a test using the [lindex $matrix $i $j] and found about a 40% reduction in time.

It appears that the overhead is beginning to get down to the "if test" and the "for loop".

This is why Dave is suggesting you use foreach, which has less to do on each iteration, since the test for the end and incrementing the index is effectively done in the C code whereas a for loop does that in script code:

% tcl::unsupported::disassemble script {for {set i 1} {$i < $num_rows} {incr i} {doit}}
ByteCode 0x05902750, refCt 1, epoch 17, interp 0x00939060 (epoch 17)
Source "for {set i 1} {$i < $num_rows} {incr i} {doit}"
Cmds 4, src 46, inst 39, litObjs 5, aux 0, stkDepth 2, code/src 0.00
Exception ranges 2, depth 1:
0: level 0, loop, pc 8-11, continue 13, break 36
1: level 0, loop, pc 13-25, continue -1, break 36
Commands 4:
1: pc 0-37, src 0-45 2: pc 0-4, src 5-11
3: pc 8-11, src 41-44 4: pc 13-25, src 32-37
Command 1: "for {set i 1} {$i < $num_rows} {incr i} {doit}"
Command 2: "set i 1..."
(0) push1 0 # "i"
(2) push1 1 # "1"
(4) storeStk
(5) pop
(6) jump1 +21 # pc 27
Command 3: "doit..."
(8) push1 2 # "doit"
(10) invokeStk1 1
(12) pop
Command 4: "incr i..."
(13) startCommand +13 1 # next cmd at pc 26, 1 cmds start here
(22) push1 0 # "i"
(24) incrStkImm +1
(26) pop
(27) push1 0 # "i"
(29) loadStk
(30) push1 3 # "num_rows"
(32) loadStk
(33) lt
(34) jumpTrue1 -26 # pc 8
(36) push1 4 # ""
(38) done

% tcl::unsupported::disassemble script {foreach i $list {doit}}
ByteCode 0x05902650, refCt 1, epoch 17, interp 0x00939060 (epoch 17)
Source "foreach i $list {doit}"
Cmds 1, src 22, inst 12, litObjs 4, aux 0, stkDepth 4, code/src 0.00
Commands 1:
1: pc 0-10, src 0-21
Command 1: "foreach i $list {doit}"
(0) push1 0 # "foreach"
(2) push1 1 # "i"
(4) push1 2 # "list"
(6) loadStk
(7) push1 3 # "doit"
(9) invokeStk1 4
(11) done
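
To see the difference on your own machine, a sketch along these lines could be timed. The proc names and the list size are assumptions for illustration, and no timings are claimed here.

```tcl
set data [lrepeat 100000 1]

# Index-driven loop: the end test and the increment run as bytecode
# on every iteration, plus an lindex to fetch the element.
proc sum_for {lst} {
    set n [llength $lst]
    set total 0
    for {set i 0} {$i < $n} {incr i} {
        incr total [lindex $lst $i]
    }
    return $total
}

# foreach hands over each element directly.
proc sum_foreach {lst} {
    set total 0
    foreach v $lst {
        incr total $v
    }
    return $total
}

puts "for:     [time {sum_for $data} 20]"
puts "foreach: [time {sum_foreach $data} 20]"
```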

From et99@21:1/5 to All on Sat Apr 22 18:05:22 2023

One last thought on using foreach: the loop variable is the item itself, 0 or 1, so the if test doesn't have to do an array or lindex lookup.
Something like this:

# $matrix is the list of rows
foreach row $matrix {
    foreach column $row {
        if {$column} {incr hits}
    }
}

Actually, I couldn't really tell if you are searching a row or a column,
so adjust as needed.

From [email protected]@21:1/5 to All on Sun Apr 23 05:48:10 2023

I've timed the execution from entry to the proc to just before writing the result to the screen.

On my machine for the original code as a proc:
proc ex time 184 ms
inner loop time 58 ms
overhead time 126 ms

For my optimized case (list of lists, foreach and early out):
proc ex time 87 ms
inner loop time 8 ms
overhead time 79 ms

The overhead is creating the data to be analyzed, and the outer loops.
In the optimized case about 90% of the run time is overhead, formatting the data.
This likely applies to Python as well.

I suspect this is the case in your actual application too.
If you want more speed, you need to minimize the work done to process
the raw data into what you analyze.
The test code should be modified to use that raw data
(or at least minimize the processing of the raw data before analysis).

Tcl has very good string processing commands, and can handle byte strings as well.
Perhaps using those data formats directly in the analysis code would increase overall speed.
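
As one hedged illustration of working on strings directly: if a column were kept as a plain string of "0" and "1" characters (an assumption, not the OP's format), the position of the Nth hit could be found with string first instead of a per-element loop. The proc name prefix_len_for_hits is made up for this sketch.

```tcl
# Return the length of the shortest prefix of bits that contains n "1"
# characters, or -1 if bits holds fewer than n of them.
proc prefix_len_for_hits {bits n} {
    set pos -1
    for {set k 0} {$k < $n} {incr k} {
        # Jump straight to the next "1" after the previous one.
        set pos [string first 1 $bits [expr {$pos + 1}]]
        if {$pos < 0} {return -1}
    }
    return [expr {$pos + 1}]
}

puts [prefix_len_for_hits 0100110 3]   ;# the ones sit at offsets 1, 4 and 5
```

The loop runs once per hit rather than once per cell, so the scanning itself happens inside string first.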

Trying to speed up the inner loop at this point is likely wasted effort.

Dave B

From [email protected]@21:1/5 to All on Sun Apr 23 02:47:09 2023

Hi Dave,
Thanks for your advice. I will optimize my code following your suggestions.

BR
Rolance

From saitology9@21:1/5 to [email protected] on Sun Apr 23 07:54:04 2023

On 4/22/2023 5:50 PM, [email protected] wrote:

Hi saitology9,

Thanks for the advice. This is a big-data analysis program.
The data has already been pre-scanned according to the customer's rule (1: hit, 0: non-hit) and is already split into several parts to save time.
The original program was developed in Tcl. The customer tells me its search speed may be lower than that of other languages (C++).
Given the budget, it is not worth rewriting the whole program in another language just to improve speed.
As I posted before, the same data structure shows an obvious speed gap between languages.

Looking at resource usage, Tcl seems to use more RAM than the others while putting less load on the machine.
Maybe the original code has room to improve, or maybe there is an extension that would speed it up.

Thanks. Alas, you are not describing what the program is trying to do.
Also, we can only test the improvements we suggest against the base version, and it looks like you are testing them in the real app, which
introduces a lot of unknowns for us, so we can't see what helps and what doesn't.

A few suggestions as the code you posted can use improvements:
1) See if you can switch the loop order to rows first. You could go from
3125 loops to 100.
2) Modify your loop indexes to start from 1 and avoid extra "expr"
statements.
3) No reason for "append seq 0/1". "seq" is simply the last character,
which is the value of the cell in the if-statement.
4) Likewise, no reason to do seq_len (twice!), as it is simply the row
index, which is "i".
5) No reason to calculate pos_list and shortest_pos.
6) No reason to do "string length $seq" after breaking out of the loop.
Just put the "lappend ..." right before the "break".

From [email protected]@21:1/5 to All on Sun Apr 23 05:47:11 2023

Hi saitology9,
thanks for your continued support.

On Sunday, April 23, 2023 at 11:54:10 PM [UTC+12], saitology9 wrote:

Thanks. Alas, you are not describing what the program is trying to do.
Also, we can only test the improvements we suggest against the base version, and it looks like you are testing them in the real app, which introduces a lot of unknowns for us, so we can't see what helps and what doesn't.

The real problem is analysis speed. The program I posted is the final, key part that generates the result.
I have already followed your advice and gained more speed.
With this structure and no further effort, Python is still more than twice as fast as Tcl. (I could move this part on its own to another language, but I intend to stay with Tcl.)
I am looking for the root cause of the speed gap: why is Python more than 4 times faster for the same effort?
If a list is the best data structure for Tcl, is there an extension that can search a list like "0 1 0 1 0 1 1 ..." faster?

A few suggestions as the code you posted can use improvements:
1) See if you can switch the loop order to rows first. You could go from 3125 loops to 100.
2) Modify your loop indexes to start from 1 and avoid extra "expr" statements.
3) No reason for "append seq 0/1". "seq" is simply the last character,
which is the value of the cell in the if-statement.

I need this part so the customer can verify the result.

4) Likewise, no reason to do seq_len (twice!), as it is simply the row index, which is "i".
5) No reason to calculate pos_list and shortest_pos.

I need this part so the customer can verify the result.

6) No reason to do "string length $seq" after breaking out of the loop.
Just put the "lappend ..." right before the "break".

This part is required by the customer's rule: I need to record it and check whether the next item is a non-hit.

BR
Rolance

From saitology9@21:1/5 to [email protected] on Sun Apr 23 09:35:40 2023

On 4/23/2023 8:47 AM, [email protected] wrote:

I need this part so the customer can verify the result.

:-)

This part is required by the customer's rule: I need to record it and check whether the next item is a non-hit.

The info is already available at that point. You can put it in the
result you return.

Final suggestion:

See if you can put your data in a list indexed by column.
So your data will look like this:
matrix(0) {0 1 0 0 ....}
matrix(1) {1 0 0 1 ....}

Here is the code to generate this:

for {set j 0} {$j < $num_columns} {incr j} {
    for {set i 0} {$i < $num_rows} {incr i} {
        lappend matrix($j) [expr {int(rand()*2)}]
    }
}

Then you prepare a "key" - this is what you are looking for:

set key [string trim [string repeat "1 " $shortest_len]]

Now, we can simplify your search. This should give you the best results:

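for {set j 0} {$j < $num_columns} {incr j} {
    # string first takes the needle first, then the string to search
    set pos [string first $key $matrix($j)]
    if {$pos >= 0} {
        ## we found the answer
        ## $pos is a character offset; elements are two characters
        ## apart ("0 1 0 ..."), so it is at: row=[expr {$pos / 2}], col=$j
        ## do whatever you need with it
        break
    }
}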
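
A tiny worked example of the lookup, using a made-up six-element column and a key of three 1s. Note that string first takes the key first and the string to search second, and that this matches a run of consecutive 1s, which may or may not be what the original rule wants.

```tcl
set column {0 1 1 1 0 1}
set key [string trim [string repeat "1 " 3]]   ;# "1 1 1"

# Needle first, haystack second.
set pos [string first $key $column]

# Elements are one character plus one space, so halve the offset.
puts "char offset $pos, element index [expr {$pos / 2}]"
```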

From et99@21:1/5 to All on Sun Apr 23 08:02:15 2023

Below is my final version. I too don't really know what
the code is trying to do. My guess was that it is looking for
the tightest cluster of 1 bits in the rows. But all my results begin
at position 0 or 1, so there's some bias toward clusters beginning
earlier in the rows. I also swapped row/column at some point,
so I don't know if it's still what was wanted, and I used
0..n-1 for all the numerical indices.

This is a mod of saitology9's version using Dave's list-of-lists idea,
with results (the puts calls were few enough that they didn't change the timing much):

matrix start 07:45:15.067
matrix done 07:45:15.442
time_it 100 3125 25 3125
RUN 1:
================
j= |0| i= |51| seq_len= |52| seq= |0100000011100000000110111010011100010101110110111011| pos= |1|
j= |1| i= |39| seq_len= |40| seq= |1101010110000111100111111011100101011011| pos= |0|
j= |20| i= |38| seq_len= |39| seq= |101101011001101101110111100000111101111| pos= |0|
j= |94| i= |35| seq_len= |36| seq= |111101010010010111010111111111011011| pos= |0|
j= |389| i= |34| seq_len= |35| seq= |11010111111101110011011100111111001| pos= |0|
j= |974| i= |33| seq_len= |34| seq= |1111111110010111111001011111001101| pos= |0|
j= |1000| i= |30| seq_len= |31| seq= |1100111111111111111111010010111| pos= |0|
j= |2344| i= |29| seq_len= |30| seq= |111101111110111100110111111111| pos= |0|
calc2
total_time: 35133 microseconds 35ms 0.035133s
Average execution time per iteration: 11 microseconds
shortest_len= |30| shortest_seq= |111101111110111100110111111111| shortest_pos= |0|

set num_columns 3125
set num_rows 100
set num_hits 25
# Time is a helper proc not shown in this post; it appears to return a timestamp
puts "matrix start [Time]" ; update

# Generate random matrix but same each time
expr {srand(1)}
set lmatrix {{}}
for {set i 0} {$i < $num_rows} {incr i} {
    for {set j 0} {$j < $num_columns} {incr j} {
        set v [expr {int(rand()*2)}]
        lset lmatrix $j $i $v
    }
}
puts "matrix done [Time]" ; update
# Find shortest sequence of 1's with given number of hits
set shortest_seq ""
set shortest_len $num_columns
set shortest_pos {}
set total_time 0

# put the main calculation in a proc

proc calculate2 {num_rows num_columns num_hits shortest_len} {
    global matrix lmatrix
    global timings shortest_seq shortest_pos

    set j -1
    foreach row $lmatrix {
        incr j
        set hits 0
        set seq ""
        set pos {}
        set i -1
        foreach column $row {
            incr i
            if {$column} {
                incr hits
                append seq "1"
                lappend pos $i
            } else {
                append seq "0"
            }
            if {$hits >= $num_hits} {
                set seq_len [string length $seq]
                if {$seq_len < $shortest_len} {
                    set shortest_len $seq_len
                    set ::shortest_len $seq_len
                    set shortest_seq $seq
                    set shortest_pos $pos
                    puts "j= |$j| i= |$i| seq_len= |$seq_len| seq= |$seq| pos= |[lindex $pos 0 ]| "
                }
                break
            }
        }
        # no need to do time calculations here
        lappend timings [clock microseconds]
    }
    puts calc2
}

# put the timer in a proc so we can run multiple tests
proc time_it {num_rows num_columns num_hits shortest_len} {
    global matrix
    global timings

    set timings [list [clock microseconds]]

    # calculate $num_rows $num_columns $num_hits $shortest_len
    calculate2 $num_rows $num_columns $num_hits $shortest_len

    # add up the timing differentials
    set total_time 0
    for {set i 1} {$i < [llength $timings]} {incr i} {
        set delta [expr {[lindex $timings $i] - [lindex $timings [expr {$i - 1}]]}]
        set total_time [expr {$total_time + $delta}]
    }

    # print it
    set avg_time [expr {$total_time / $num_columns}]
    puts "total_time: $total_time microseconds [expr {$total_time / 1000}]ms [expr {$total_time / 1000000.0}]s \nAverage execution time per iteration: $avg_time microseconds"

}

puts "time_it $num_rows $num_columns $num_hits $shortest_len"
puts "RUN 1: "
puts "================"
time_it $num_rows $num_columns $num_hits $shortest_len

puts "shortest_len= |$shortest_len| shortest_seq= |$shortest_seq| shortest_pos= |[lindex $shortest_pos 0 ]| "
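
The manual `clock microseconds` bookkeeping above could probably also be replaced by Tcl's built-in `time` command, which runs a script a given number of times and reports the average per iteration. A minimal sketch, assuming a hypothetical proc `count_to_hits` and made-up data (neither is from the thread):

```tcl
# Hypothetical helper (not from the thread): returns how many elements of
# $column must be read before $num_hits ones have been seen, or -1 if the
# column does not contain that many ones.
proc count_to_hits {column num_hits} {
    set hits 0
    set i 0
    foreach v $column {
        incr i
        if {$v && [incr hits] >= $num_hits} {
            return $i
        }
    }
    return -1
}

set column {1 0 1 1 0 1}
puts [count_to_hits $column 3]   ;# third 1 is the 4th element

# time runs the script 1000 times and prints
# "<n> microseconds per iteration"
puts [time {count_to_hits $column 3} 1000]
```

Because `time` averages over many runs, it tends to be less sensitive to clock resolution than timing a single short loop.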

From Ralf Fassel@21:1/5 to All on Mon Apr 24 12:07:26 2023

* "[email protected]" <[email protected]>
| already transfer to proc in two lang.

| tcl : total time 0.0968s (twice....)
| python: total time: 0.0496496

| Could someone point me the key affect factor ? data passing issue ?
--<snip-snip>--

| # Generate random matrix
| for {set i 0} {$i < $num_rows} {incr i} {
| for {set j 0} {$j < $num_columns} {incr j} {
| lappend matrix([expr {$i + 1}]) [expr {int(rand()*2)}]
--<snip-snip>--
| for {set i 0} {$i < $num_rows} {incr i} {
| if {[lindex $matrix([expr {$i + 1}]) $j] == 1} {

You are still using a hash-based data struct here (matrix()), this will
slow things down compared to plain list-of-lists (variable 'rows' in
below code):

proc do_it_2 {} {
    set num_columns 3125
    set num_rows 100
    set num_hits 25

    # Generate random matrix
    for {set i 0} {$i < $num_rows} {incr i} {
        set row [list]
        for {set j 0} {$j < $num_columns} {incr j} {
            lappend row [expr {int(rand()*2)}]
        }
        lappend rows $row
    }

    # Find shortest sequence of 1's with given number of hits
    set shortest_seq ""
    set shortest_len $num_rows
    set shortest_pos {}
    set total_time 0

    for {set j 0} {$j < $num_columns} {incr j} {
        set hits 0
        set seq ""
        set pos {}
        set start_time [clock microseconds]
        for {set i 0} {$i < $num_rows} {incr i} {
            set elt [lindex $rows $i $j]
            if {$elt == 1} {
                incr hits
                append seq "1"
                lappend pos $i
            } else {
                append seq "0"
            }
            if {$hits >= $num_hits} {
                set seq_len [string length $seq]
                if {$seq_len < $shortest_len} {
                    set shortest_len $seq_len
                    set shortest_seq $seq
                    set shortest_pos $pos
                }
                break
            }
        }
        set end_time [clock microseconds]
        set iteration_time [expr {$end_time - $start_time}]
        set total_time [expr {$total_time + $iteration_time}]
    }
    set avg_time [expr {$total_time / $num_columns}]
    puts "total_time: $total_time microseconds [expr {$total_time / 1000}]ms [expr {$total_time / 1000000.0}]s \nAverage execution time per iteration: $avg_time microseconds"
}

R'
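
To check the array-versus-list claim on a given machine, the two layouts can be timed side by side. The following is only a sketch: the procs `sum_array` and `sum_lists`, the reduced matrix size, and the repeat count are assumptions made for illustration, and no particular timing result is implied.

```tcl
set num_rows 100
set num_columns 200

# Build the same random data in both layouts
set rows {}
for {set i 0} {$i < $num_rows} {incr i} {
    set row {}
    for {set j 0} {$j < $num_columns} {incr j} {
        set v [expr {int(rand()*2)}]
        lappend row $v
        set matrix($i,$j) $v      ;# array (hash table) keyed by "i,j"
    }
    lappend rows $row             ;# plain list of lists
}

# Hypothetical procs: sum every cell through each access path,
# walking column by column as the search loop does
proc sum_array {arrName nr nc} {
    upvar 1 $arrName m
    set sum 0
    for {set j 0} {$j < $nc} {incr j} {
        for {set i 0} {$i < $nr} {incr i} {
            incr sum $m($i,$j)
        }
    }
    return $sum
}

proc sum_lists {rows nr nc} {
    set sum 0
    for {set j 0} {$j < $nc} {incr j} {
        for {set i 0} {$i < $nr} {incr i} {
            incr sum [lindex $rows $i $j]
        }
    }
    return $sum
}

puts "array: [time {sum_array matrix $num_rows $num_columns} 20]"
puts "lists: [time {sum_lists $rows $num_rows $num_columns} 20]"
```

Both procs visit the cells in the same order, so any difference in the reported times should come from the lookup itself rather than from the loop structure.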

From [email protected]@21:1/5 to All on Mon Apr 24 05:02:25 2023

Hi Ralf

Thanks for the code.
It is the most efficient way; I will change the data structure to fit it. The speed loss may be in passing the raw data between multiple procs.
In the real program it is still twice as slow as Python; I will do more research ...

@saitology9, et99
Thanks for your detailed explanations and valuable suggestions.
I will optimize each proc.

By the way, the Python version was generated by ChatGPT, with some small modifications,
so that I could easily compare the performance and resource usage of the different languages.

BR
Rolance
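
On the data-passing question: as far as I know, Tcl shares values by reference counting and only copies a list when a shared value is modified, so handing a large list to a proc that merely reads it should be cheap. A sketch of the two usual patterns, with hypothetical proc names `count_ones` and `set_first_by_name` that are not from the thread:

```tcl
# Hypothetical procs, for illustration only
proc count_ones {data} {
    # Read-only use: the list value is shared with the caller, not copied
    set n 0
    foreach v $data { incr n $v }
    return $n
}

proc set_first_by_name {varName value} {
    # upvar links to the caller's variable, so lset changes it in place
    upvar 1 $varName data
    lset data 0 $value
}

set big [lrepeat 100000 1]
puts [count_ones $big]        ;# 100000
set_first_by_name big 0
puts [count_ones $big]        ;# 99999
```

If a proc has to modify a large list, passing the variable name and using `upvar` avoids working on a private copy and returning it.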

From et99@21:1/5 to [email protected] on Mon Apr 24 09:10:34 2023

On 4/24/2023 5:02 AM, [email protected] wrote:

By the way, the Python version was generated by ChatGPT, with some small modifications,
so that I could easily compare the performance and resource usage of the different languages.

BR
Rolance

ChatGPT, now that's interesting. If it can write a Python
program, I wonder if it could write a C program, in particular
a Tcl C extension. That would likely be the best solution
for performance while still staying within Tcl and a single process.

