I have a program where I simulate throwing 2 dices several times a
number of times.
In the code below nrOfTries is 25 million and nrOfRolls is 7. So I
simulate rolling 2 dices seven times and registering the value of each
roll 25 million times. The code is:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
lappend diceRoll [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
dict incr diceRollDictSeq ${diceRoll}
}
This takes almost 9 GB of RAM. Is there a way to reduce the amount of
RAM significantly by using another data-structure?
El 12/03/2022 a las 17:09, Cecil Westerhof escribió:
I have a program where I simulate throwing 2 dices several times a
number of times.
In the code below nrOfTries is 25 million and nrOfRolls is 7. So I
simulate rolling 2 dices seven times and registering the value of each
roll 25 million times. The code is:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
lappend diceRoll [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
dict incr diceRollDictSeq ${diceRoll}
}
This takes almost 9 GB of RAM. Is there a way to reduce the amount of
RAM significantly by using another data-structure?
Each roll of 2 dices gives a value 1-12. You can consider each value as
an hexadecimal digit and pack the result of 7 rolls as a single 7 digit hexadecimal value. This would reduce the memory amount by a factor of 7.
At the cost of unpacking and retrieving the individual values when
needed.
El 12/03/2022 a las 17:09, Cecil Westerhof escribió:
I have a program where I simulate throwing 2 dices several times a
number of times.
In the code below nrOfTries is 25 million and nrOfRolls is 7. So I
simulate rolling 2 dices seven times and registering the value of each
roll 25 million times. The code is:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
lappend diceRoll [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
dict incr diceRollDictSeq ${diceRoll}
}
This takes almost 9 GB of RAM. Is there a way to reduce the amount of
RAM significantly by using another data-structure?
Each roll of 2 dices gives a value 1-12. You can consider each value as
an hexadecimal digit and pack the result of 7 rolls as a single 7 digit
hexadecimal value. This would reduce the memory amount by a factor of 7.
At the cost of unpacking and retrieving the individual values when
needed.
My info was not correct. :'-( After the rolls 8 GB was used. I then
use the data structure and the memory usage grew to 9.3 GB.
I rewrote it to:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
set roll [expr {[::math::random 1 7] + [::math::random 1 7]}]
switch $roll {
10 { set roll A }
11 { set roll B }
12 { set roll C }
}
append diceRoll $roll
}
dict incr diceRollDictSeq ${diceRoll}
}
Now the memory used after creating the data-structure is 2.75 GB. So
about a third (not a seventh). The needed time increases with a
quarter. That is acceptable I think.
When using the data-structure the memory usage becomes 4 GB. So the
increase is about the same. I would have expected it to also be a
third of what it was before.
The data-structure is used like this:
proc displayBest {diceRollDict} {
puts "Element count: [dict size ${diceRollDict}] distinct values"
set diceRollList {}
dict for {diceRoll count} ${diceRollDict} {
lappend diceRollList [list ${diceRoll} ${count}]
}
set diceRollList [lsort -index 1 -integer -decreasing ${diceRollList}]
set first [lindex [lindex ${diceRollList} 0] 1]
set second [lindex [lindex ${diceRollList} 1] 1]
set percFirst [expr {100. * ${first} / ${::nrOfTries} }]
set percGreater [expr {100. * ${first} / ${second} - 100}]
puts "First has ${percFirst}% chance"
puts "Chance first ${percGreater}% greater as second"
foreach diceRollElement [lrange ${diceRollList} 0 ${::nrOfResults}] {
set diceRoll [lindex ${diceRollElement} 0]
set count [lindex ${diceRollElement} 1]
puts "[split ${diceRoll} ""]: ${count}"
}
}
Why is the memory usage a third instead of a seventh?
Why does the usage of the smaller data-structure uses the same amount
of extra memory as the greater data-structure?
I have a program where I simulate throwing 2 dices several times a
number of times.
In the code below nrOfTries is 25 million and nrOfRolls is 7. So I
simulate rolling 2 dices seven times and registering the value of each
roll 25 million times. The code is:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
lappend diceRoll [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
dict incr diceRollDictSeq ${diceRoll}
}
This takes almost 9 GB of RAM. Is there a way to reduce the amount of
RAM significantly by using another data-structure?
On Saturday, March 12, 2022 at 5:14:07 PM UTC+1, Cecil Westerhof wrote:
I have a program where I simulate throwing 2 dices several times a
number of times.
In the code below nrOfTries is 25 million and nrOfRolls is 7. So I
simulate rolling 2 dices seven times and registering the value of each
roll 25 million times. The code is:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
lappend diceRoll [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
dict incr diceRollDictSeq ${diceRoll}
}
This takes almost 9 GB of RAM. Is there a way to reduce the amount of
RAM significantly by using another data-structure?
Just an incomplete thought: do you really need all of them in memory?
What will you be doing with the data?
Maybe you want to collect aggregates only. Or you could produce data to
disk (file, pipe, db, ...) and post-process it from there.
I have a program where I simulate throwing 2 dices several times a
number of times.
In the code below nrOfTries is 25 million and nrOfRolls is 7. So I
simulate rolling 2 dices seven times and registering the value of each
roll 25 million times. The code is:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
lappend diceRoll [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
dict incr diceRollDictSeq ${diceRoll}
}
This takes almost 9 GB of RAM. Is there a way to reduce the amount of
RAM significantly by using another data-structure?
On 3/12/22 11:09 AM, Cecil Westerhof wrote:
I have a program where I simulate throwing 2 dices several times a
number of times.
In the code below nrOfTries is 25 million and nrOfRolls is 7. So I simulate rolling 2 dices seven times and registering the value of each roll 25 million times. The code is:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
lappend diceRoll [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
dict incr diceRollDictSeq ${diceRoll}
}
This takes almost 9 GB of RAM. Is there a way to reduce the amount of
RAM significantly by using another data-structure?
Hello,
Interesting problem. My recommendation would be to use an array instead
of a dict which will save you in both in time and space. Here is a
slightly modified version of your code using arrays:
array set diceRollDictSeq {}
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set temp {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
## you can map 10/11/12 to A/B/C here or later when accessing the values lappend temp [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
incr diceRollDictSeq([join $temp ,])
}
I wasn't sure what to expect so I ran it for 10 million and 7You can use a bitfield to hold the values. A bitfield basically holds a 1 or 0 at an index. So the string "001001" as an example would have a value at 2 and 5. So just using a string you can get ( assuming one byte per character and storing a
(inner/outer loop limits). On a Windows laptop, this ran for 6 minutes
and consumed 950MB of memory (while I was busy browsing). It resulted
in 5,346,039 unique rolls which is about a quarter of the total solution space.
You can generate the same output report as your other post indicates
with minor changes.
On Sunday, March 13, 2022 at 11:18:33 PM UTC-5, [email protected] wrote:
On 3/12/22 11:09 AM, Cecil Westerhof wrote:
I have a program where I simulate throwing 2 dices several times a number of times.
In the code below nrOfTries is 25 million and nrOfRolls is 7. So I simulate rolling 2 dices seven times and registering the value of each roll 25 million times. The code is:
set diceRollDictSeq [dict create]
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set diceRoll {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
lappend diceRoll [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
dict incr diceRollDictSeq ${diceRoll}
}
This takes almost 9 GB of RAM. Is there a way to reduce the amount of RAM significantly by using another data-structure?
Hello,
Interesting problem. My recommendation would be to use an array instead
of a dict which will save you in both in time and space. Here is a slightly modified version of your code using arrays:
array set diceRollDictSeq {}
for {set j 0} {$j < ${nrOfTries}} {incr j} {
set temp {}
for {set i 0} {$i < ${nrOfRolls}} {incr i} {
## you can map 10/11/12 to A/B/C here or later when accessing the values lappend temp [expr {[::math::random 1 7] + [::math::random 1 7]}]
}
incr diceRollDictSeq([join $temp ,])
}
string for each value and each attempt , in your case : 2-12 or 10 possible dice values and 7 attempts we get 70 bit strings ) this gives us 25M x 70 or 1750M or 1.75 Gig . But wait we know that a byte is made up of 1's and 0's so each one byte characterI wasn't sure what to expect so I ran it for 10 million and 7
(inner/outer loop limits). On a Windows laptop, this ran for 6 minutes
and consumed 950MB of memory (while I was busy browsing). It resulted
in 5,346,039 unique rolls which is about a quarter of the total solution space.
You can generate the same output report as your other post indicatesYou can use a bitfield to hold the values. A bitfield basically holds a 1 or 0 at an index. So the string "001001" as an example would have a value at 2 and 5. So just using a string you can get ( assuming one byte per character and storing a bit
with minor changes.
tries attempts memory realseveral smaller strings especially when you pre-allocate then its just math to get the correct index into the correct string to update the value.
-------- -------- -------- --------
1000 7 7k 0.078s
10000 7 70k 0.382s
100000 7 700k 3.770s
1000000 7 7M 79.58s
I would expect that for a 10 times increase in tries we'd get a 10 times increase in time spent but as we see after 100000 the time needed to seek to the correct spot causes significant slow down. To speed it up one may want to split up the string into
oo::class create Bitfield {
# get and set taken from https://wiki.tcl-lang.org/page/bitstrings
variable data
variable maxposition
variable lastfree
constructor { { numbits -1 } } {
set data ""
set maxposition $numbits
set lastfree 0
if { $numbits >= 0 } {
# prealloc space
set numbytes [ expr { $numbits / 8 + (( $numbits % 8 ) ? 1 : 0 ) } ]
set data [ string repeat "\0" $numbytes ]
set maxposition [ expr $numbytes * 8 ]
}
}
method alloc { } {
set pos $lastfree
try {
while { [ my get $pos ] == 1 && ( $maxposition == -1 || $pos < $maxposition ) } {
incr pos
}
} trap BAD_INDEX { a } {
throw CAPACITY_ERROR " unable to allocate any more blocks!"
}
set lastfree [ expr $pos + 1 ]
while { [ my get $lastfree ] && $lastfree != $maxposition } { incr lastfree }
my set $pos 1
#puts "allocated block $pos"
return $pos
}
method free { pos } {
my set $pos 0
set lastfree $pos
}
method set { position bit } {
#puts "Bitfield: set $position -> $bit "
if { (! [string is integer $position]) || ($position < 0) } {
throw BAD_INDEX "position must be an integer >= 0"
}
if { ($bit ne "0") && ($bit ne "1") } {
throw BAD_VALUE "bit must be '0' or '1'"
}
set block [ expr { $position / 8 } ]
set bitnum [ expr { $position % 8 } ]
set byteval "00000000"
binary scan $data @${block}b8 byteval
# Compute new string of zeros and ones
set newbyte [string range $byteval 0 [expr $bitnum-1]]$bit[string range $byteval [expr $bitnum+1] end]
# Convert back into binary format
set data [binary format a*@${block}b8 $data $newbyte]
}
method toggle { position } {
set value [ my get $position ]
my set $position [expr { ( $value + 1 ) % 2 } ]
}
method get { position } {
if { (! [string is integer $position]) || ($position < 0) } {
throw BAD_INDEX error "position must be an integer >= 0"
}
if { $maxposition >= 0 && $position >= $maxposition } {
throw BAD_INDEX "position must be an integer < $maxposition"
}
# Compute byte and bit offsets
set block [ expr { $position / 8 } ]
set bitnum [expr $position % 8]
# Scan the btye from the string
# Default value is used if scan fails
set byteval "00000000"
binary scan $data @${block}b8 byteval
# return the interesting bit
set value [string range $byteval $bitnum $bitnum]
return $value
}
method setData { buffer } {
set data $buffer
}
method getData { } {
return $data
}
method length { } {
return [string length $data];
}
method capacity { } {
return [expr { [string length $data ] * 8 } ]
}
method toString { } {
set retval "[format "%10i " [my offset]]"
foreach c [split $data "" ] {
binary scan $c b8 x
append retval "$x"
}
return $retval
}
method offset {} {
return [ expr {[my length] + 11 }]
}
destructor {}
}
package provide Bitfield 1.0
if { 0 } {
set b [ Bitfield new ]
$b set 4 1
$b set 5 1
$b toggle 6
$b toggle 4
$b set 25 1
for { set i 0 } { $i < 30 } { incr i } {
puts "$i . '[$b get $i ]'"
}
puts "length : [$b length ]"
puts "capacity : [$b capacity ]"
}
# your code starts here
package require math ; # from tcllib
set nrOfTries 10000
set nrOfRolls 7
array set diceRoll {}
for {set j 0} {$j < $nrOfRolls} {incr j} {
for {set k 2 } { $k < 13 } { incr k } {
set diceRoll($j,$k) [Bitfield new ${nrOfTries} ]
}
}
proc rolls { trynum numRolls } {
for {set i 0} {$i < $numRolls} {incr i} {
set x [::math::random 1 $numRolls]
incr x [::math::random 1 $numRolls]
$::diceRoll($i,$x) set $trynum 1
}
}
for {set j 0 } {$j < $::nrOfTries} {incr j} {
rolls $j $::nrOfRolls
if { [ expr { $j % 10000 } ] == 0 } {
puts "$j"
}
}
proc getRolls { tryNum } {
set val {}
for {set j 0} {$j < $::nrOfRolls } {incr j} {
for {set k 2 } { $k < 13 } { incr k } {
if { [$::diceRoll($j,$k) get $tryNum ] eq "1" } {
lappend val $k
}
}
}
return $val
}
proc numTimes { value attempt } {
lassign [$::diceRoll($attempt,$value) toString] capacityLen bitStr
return [string length [string map { "0" "" } $bitStr ]]
}
puts "What were the rolls for 2500th try?"
puts "[join [getRolls 2499] \n]"
puts "How many times did 10 get rolled on the each attempt in $::nrOfTries tries?"
puts " Attempt Num Times "
puts " ======== =========="
for { set m 0 } { $m < 7 } { incr m } {
puts [format "% 7i % 5i" [expr { $m + 1 } ] [numTimes 10 $m ]]
}
- Carl
| Sysop: | Keyop |
|---|---|
| Location: | Huddersfield, West Yorkshire, UK |
| Users: | 714 |
| Nodes: | 16 (2 / 14) |
| Uptime: | 133:56:29 |
| Calls: | 12,087 |
| Files: | 14,997 |
| Messages: | 6,517,349 |