• 7.5 million lines in a text widget

    From Luc@21:1/5 to All on Sat Feb 25 20:46:21 2023
    I can't load a 765MB, 7.5 million-line text file into my text editor.

    It's not like I need to do that routinely. I just wanted to see if
    Tcl/Tk would handle it. But my text editor crashes every time.

    I was using a "slurp" method to read the file.

    set _slurp [read -nonewline $_channel]

    Since the app was crashing, I switched to a "streaming" approach:

    proc p.openstream {argTextWidget argChannel {argStreamSize 10000}} {
        $argTextWidget insert end [read $argChannel $argStreamSize]
        if {[eof $argChannel]} {
            close $argChannel
        } else {
            after idle [list after 0 [info level 0]]
        }
    }


    Two problems:

    1. The text loads slowly. It's tolerable, but slow.

    2. It crashes before it's done. It never gets to load everything.

    Terminal output says it's killed, probably because it runs out of
    memory. But 'free' says I have 3GB free and the file has 765MB.

    So is there a way to solve this?

    --
    Luc


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich@21:1/5 to Luc on Sun Feb 26 04:32:06 2023
    Luc wrote:
    I can't load a 765MB, 7.5 million-line text file into my text editor.

    It's not like I need to do that routinely. I just wanted to see if
    Tcl/Tk would handle it. But my text editor crashes every time.

    Then there is something else, besides Tcl/Tk "can't handle it" going on
    with your text editor.

    $ ls -l
    total 8
    -rwxr-xr-x 1 users 118 Feb 25 23:20 loadbigtext*
    -rwxr-xr-x 1 users 228 Feb 25 23:17 makebigtext*

    $ time ./makebigtext

    real 0m13.975s
    user 0m13.026s
    sys 0m0.740s

    $ ls -lh
    total 779M
    -rw-r--r-- 1 users 779M Feb 25 23:23 bigtext
    -rwxr-xr-x 1 users 118 Feb 25 23:20 loadbigtext*
    -rwxr-xr-x 1 users 228 Feb 25 23:17 makebigtext*

    $ ./loadbigtext
    8000002.0

    And in about 6 seconds a text widget appears, containing all 8M lines
    from the 'big file'. And I can scroll around using the mouse wheel.
    Total memory used (per top) is 3007m.

    Contents of "makebigtext":

    $ cat makebigtext
    #!/usr/bin/tclsh

    set fd [open bigtext {WRONLY CREAT TRUNC}]
    for {set i 0} {$i < 8000000} {incr i} {
        puts $fd "The quick brown fox jumped over the lazy dog. Mary had a little lamb, it's fleece was white as snow."
    }
    close $fd

    Note that the above creates an 8M-line file of roughly 800MB.

    Contents of "loadbigtext":

    $ cat loadbigtext
    #!/usr/bin/wish

    text .t
    pack .t

    set fd [open bigtext RDONLY]
    .t insert end [read $fd]
    close $fd
    puts [.t index end]

    So, provided one has sufficient free RAM, Tcl/Tk can load an 8M line
    file that is 779M large (both more lines and a larger file than your
    file).

    So something else is causing your crash.

  • From Rich@21:1/5 to Luc on Wed Mar 1 02:13:47 2023
    Luc wrote:
    What methods do you recommend to make the application become visible
    right away even if the argument file is still not loaded?

    You have to either load in a streaming mode via the event loop, or you
    need to load in a background thread and then transfer the data over and
    load into the text widget (the background thread method will still
    cause a 'pause' while the data is copied from the background thread and
    loaded into the widget).
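The event-loop streaming option described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the names `loadChunked`, `sink`, and `done` are hypothetical, and the 1M-character chunk size is an arbitrary assumption. Each chunk is handed to a callback, and the next read is rescheduled through `after idle`, so pending redraws and input events get a chance to run between chunks.

```tcl
# Hypothetical helper (not from the thread): read a channel in chunks via
# the event loop. Each chunk is passed to the command prefix $sink; the
# command prefix $done runs once the channel reaches end of file.
proc loadChunked {chan sink done {chunkSize 1048576}} {
    set data [read $chan $chunkSize]
    if {$data ne ""} {
        # e.g. with sink = {.t insert end} this runs: .t insert end $data
        {*}$sink $data
    }
    if {[eof $chan]} {
        close $chan
        {*}$done
    } else {
        # Reschedule after pending idle work (such as redraws) has run.
        after idle [list after 0 \
            [list loadChunked $chan $sink $done $chunkSize]]
    }
}
```

In a wish script, assuming a text widget `.t` already exists, a call might look like `loadChunked [open bigtext r] {.t insert end} {puts loaded}`. The window can then appear after the first chunk instead of after the whole file. Total load time will probably be longer than with a single `read`, since every insert goes through the event loop.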

    For ideas you need to read at least this wiki page:

    https://wiki.tcl-lang.org/page/Keep%20a%20GUI%20alive%20during%20a%20long%20calculation

  • From Luc@21:1/5 to Rich on Tue Feb 28 22:28:31 2023
    On Sun, 26 Feb 2023 04:32:06 -0000 (UTC), Rich wrote:

    Then there is something else, besides Tcl/Tk "can't handle it" going on
    with your text editor.

    So something else is causing your crash.


    So, I not only ran your code but also changed mine and now it's not
    crashing when I read the file in one fell swoop.

    Which is strange because I was already doing that before I introduced
    the "streaming" method. It didn't work then and works now.

    Maybe because some time has elapsed and I have rebooted at least once
    since then? Who knows.

    It still crashes with the streaming method though. But I prefer the
    single read method so I won't fret about that now.

    The problem I have now is that the entire application takes a bit
    too long to become visible when it's not running and is called with a
    file as argument. In fact, it's not visible at all until the whole text
    has been loaded. And it takes about 7 seconds.

    What methods do you recommend to make the application become visible
    right away even if the argument file is still not loaded?


    --
    Luc


  • From Paul Walton@21:1/5 to Luc on Tue Mar 14 10:53:44 2023
    On Tuesday, February 28, 2023 at 8:28:36 PM UTC-5, Luc wrote:
    What methods do you recommend to make the application become visible
    right away even if the argument file is still not loaded?


    I would look into reading chunks of the file as needed and displaying
    one chunk of the file at a time in the text widget. Then as the text
    widget is scrolled, load the next chunk. And perhaps you can overlap
    the chunks so the swap is seamless. Or read the entire file at once so
    all of the text is in memory, and then use the same method. Whatever
    works better. I've never tried it, but it's just an idea. Seems like
    it should work and should be fast. But I'm sure it'll be tricky.
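One possible starting point for the chunk-at-a-time idea above is a sparse line index, so that any window of lines can be fetched with a seek instead of rereading the file. This is an untested-in-production sketch under stated assumptions: the file is UTF-8 with LF line endings, and `buildLineIndex`, `readWindow`, and the `stride` parameter are hypothetical names that do not come from the thread. Wiring the result to the text widget's scrolling is left out.

```tcl
# Hypothetical helper: record the byte offset at which every $stride-th
# line starts, so a window of lines can be located without a full reread.
proc buildLineIndex {path {stride 1000}} {
    set fd [open $path rb]          ;# binary mode keeps [tell] in bytes
    set offsets [list 0]
    set n 0
    while {[gets $fd line] >= 0} {
        incr n
        if {$n % $stride == 0} {
            lappend offsets [tell $fd]
        }
    }
    close $fd
    return $offsets
}

# Hypothetical helper: return up to $count lines starting at 0-based line
# number $first, using an index built with the same $stride.
proc readWindow {path offsets stride first count} {
    set block [expr {$first / $stride}]
    if {$block >= [llength $offsets]} {
        return {}
    }
    set fd [open $path rb]
    seek $fd [lindex $offsets $block]
    # Skip from the indexed line up to the first wanted line.
    for {set i [expr {$block * $stride}]} {$i < $first} {incr i} {
        if {[gets $fd line] < 0} break
    }
    set lines {}
    while {[llength $lines] < $count && [gets $fd line] >= 0} {
        # Assumption: the file is UTF-8; decode the raw bytes per line.
        lappend lines [encoding convertfrom utf-8 $line]
    }
    close $fd
    return $lines
}
```

A viewer could then replace the widget contents with something like `.t insert end [join [readWindow $path $idx 1000 $top 200] \n]` whenever the visible range changes, where `$top` is the first line to show. Building the index still requires one full pass over the file, so it does not by itself make startup instant.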

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)