• Event loop and http::geturl

    From Jonathan Kelly@21:1/5 to All on Tue Jun 24 06:55:45 2025
    This is related to my other thread, "too many nested evaluations".

    I have taken a snapshot of my apache logs that caused the "..too many
    nested evaluations" and stripped down my program a bit so I could try
    and track down what was going on.

    So, it looks like ::http::geturl is operating asynchronously, despite my program NOT using -command.

    In the log I'm seeing:
    ----------------------
    A before do_get_abuse 132.226.122.74 and ::last_ip is ##
    do_get_abuse 132.226.122.74
    get_abuse 132.226.122.74
    get_abuse 132.226.122.74 before geturl
    A before do_get_abuse 132.226.122.74 and ::last_ip is ##
    ----------------------

    The "A before ..." line is coming from the proc checkForError that is
    attached to the fileevent

    set ::accessLog [open "|cat access_test.log" r]
    fconfigure $::accessLog -blocking 0 -buffering line
    fileevent $::accessLog readable [list checkForError $::accessLog]

    checkForError eventually calls do_get_abuse

    proc do_get_abuse {ip} {
        log "do_get_abuse $ip"
        if { [catch {get_abuse $ip} result] } {
            puts "get_abuse failed ... $result"
            exit
        }
    }

    and in get_abuse I have

    log " get_abuse $ip before geturl"
    set token [::http::geturl ${url}?${query} -method GET -headers $headers]
    log " get_abuse $ip after geturl"

    It never reaches the "get_abuse $ip after geturl" log line, which it
    should do BEFORE the next fileevent readable event is processed.

    This is at least contrary to the man page. From the man page:

    The ::http::geturl command blocks until the operation completes, unless
    the -command option specifies a callback that is invoked when the HTTP transaction completes.

    I'm going to start building a slow-ish webpage on one of our servers to investigate further with a minimal program.

    Hopefully, someone on the guru team could check the http::geturl code?

    Thanks
    Jonathan.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jonathan Kelly@21:1/5 to All on Tue Jun 24 07:51:25 2025
    Here's my simple test to confirm that ::http::geturl is running
    asynchronously - unless I completely misunderstand what "block" means.
    slow.htm pauses for 2 seconds and outputs the date and time.
    -----------
    #!/usr/bin/tclsh

    package require http
    package require tls

    http::register https 443 [list ::tls::socket -autoservername true]

    proc test1 {} {
        incr ::cnt
        set url "https://congresstravel.com.au/events.cgi/xx/slow.htm"
        puts " test1 $::cnt before geturl "
        set token [::http::geturl ${url} -method GET]
        puts " $::cnt data is [::http::data $token]"
        ::http::cleanup $token
    }

    proc check {chan} {
        if {[gets $chan line] >= 0} {
            test1
        }
    }
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }
    proc doClose {} {
        set ::close "close"
    }

    set fd [open "test.txt" "w"]
    for {set x 1} {$x < 10} {incr x} {
        puts $fd $x
    }
    close $fd
    queue
    after 20000 doClose
    vwait close
    file delete "test.txt"
    -----------
    output
    ===========
    test1 1 before geturl
    test1 2 before geturl
    test1 3 before geturl
    test1 4 before geturl
    test1 5 before geturl
    test1 6 before geturl
    test1 7 before geturl
    test1 8 before geturl
    test1 9 before geturl
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:02
    9 data is 24/06/25 07:47:02
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    ===========

  • From Rich@21:1/5 to Jonathan Kelly on Tue Jun 24 04:21:01 2025
    Jonathan Kelly wrote:
    > So, it looks like ::http::geturl is operating asynchronously, despite
    > my program NOT using -command.

    It does. It is documented as such:

    man n http:

    Note: The event queue is even used without the -command option. As a
    side effect, arbitrary commands may be processed while http::geturl is
    running.


    The code snippets below are from http-2.9.5.tm which was distributed
    (at least) with 8.6.12:

    Buried deep in http::geturl:

            # geturl does EVERYTHING asynchronously, so if the user
            # calls it synchronously, we just do a wait here.
            http::wait $token

    And the implementation of http::wait is:

        proc http::wait {token} {
            variable $token
            upvar 0 $token state

            if {![info exists state(status)] || $state(status) eq ""} {
                # We must wait on the original variable name, not the upvar alias
                vwait ${token}(status)
            }

            return [status $token]
        }

    And the 'vwait' there reenters the event loop and allows other events
    to be processed.
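
The effect can be reproduced without any network access. The following is a minimal sketch, not the http package's code: `fake_geturl` and `other_event` are hypothetical names, and a 50 ms timer stands in for the HTTP transaction. It shows an unrelated event handler running in the middle of a call that looks synchronous to its caller.

```tcl
# Hypothetical stand-in for a "synchronous" http::geturl: start some
# asynchronous work (here a 50 ms timer), then vwait until it completes.
proc fake_geturl {id} {
    after 50 [list set ::status($id) ok]
    vwait ::status($id)          ;# re-enters the event loop
    return $::status($id)
}

set ::trace {}

# An unrelated event handler, comparable to a fileevent script.
proc other_event {} {
    lappend ::trace "other event ran"
}

after 10 other_event             ;# due while fake_geturl is still waiting

lappend ::trace "before fake_geturl"
fake_geturl 1
lappend ::trace "after fake_geturl"

puts [join $::trace \n]
```

The trace lists "other event ran" between the "before" and "after" entries, which is the same interleaving seen in the log of the first post.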

  • From Jonathan Kelly@21:1/5 to Rich on Tue Jun 24 18:01:08 2025
    On 24/6/25 14:21, Rich wrote:
    > And the 'vwait' there reenters the event loop and allows other events
    > to be processed.

    OK. Is there a way to ACTUALLY get geturl to block, or equivalent? I
    need the geturl to finish before anything else happens.

  • From et99@21:1/5 to Jonathan Kelly on Tue Jun 24 17:19:23 2025
    On 6/24/2025 1:01 AM, Jonathan Kelly wrote:
    > OK. Is there a way to ACTUALLY get geturl to block, or equivalent? I
    > need the geturl to finish before anything else happens.

    I would think you can use this option on the geturl call:

    -command callback

  • From et99@21:1/5 to All on Wed Jun 25 00:03:52 2025
    On 6/24/2025 5:19 PM, et99 wrote:
    > I would think you can use this option on the geturl call:
    >
    > -command callback

    I see you already know about -command, so perhaps what you really want is an example of how to use it.


    proc httpCallback {token} { ;# the -command callback - called when transaction completes
        upvar 0 $token state ;# use this to get the results
        set ::urldone 1 ;# block on the setting of this variable
        return
    }

    unset -nocomplain ::urldone ;# rules out a race condition
    http::geturl <your url> -command httpCallback
    if {![info exists ::urldone]} {vwait ::urldone} ;# no need to block if the variable exists

    The above code is being cautious. There may not be any possibility of a race condition, but this way we rule it out, even if they change the code in the future. It never hurts to unset a variable first that you're going to set later anyway.

  • From et99@21:1/5 to All on Wed Jun 25 01:57:28 2025
    It has just now occurred to me that you are running your [test1] proc as a fileevent script. Read the vwait manual under the section:

    "NESTED VWAITS BY EXAMPLE"

    I use geturl synchronously with no issues. But I do a single url request and wait for it, in the main line code - NOT inside an event.

    The code I presented in the prior posting is how you could use -command and get a synchronous result. It is only really useful if you were going to do something between the geturl and the wait for it to be done. Otherwise, you could just call it synchronously - but NOT inside an event, if another fileevent might trigger before the first one is done.

    As you will see with the example in the manual, things have to unwind, so if your fileevents occur fast enough, they may have triggered before earlier geturl calls will have had time to unwind. The event loop works like a stack.

    That's why the timestamps are output in reverse order of when the geturl was called.
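
The stack-like unwinding can be shown with timers alone. This is a sketch under stated assumptions: `handler` is a hypothetical proc standing in for a fileevent script that calls geturl, and each "request" is simulated by a timer that completes 100 ms after the handler starts.

```tcl
set ::order {}

# Hypothetical event handler: starts a simulated request and waits for it
# with vwait, as the synchronous form of geturl does internally.
proc handler {n} {
    lappend ::order "enter $n"
    after 100 [list set ::done($n) 1]   ;# the "request" completes later
    vwait ::done($n)                    ;# nested event loop
    lappend ::order "leave $n"
}

# Three events arrive 10 ms apart, faster than one request completes.
foreach n {1 2 3} {
    after [expr {$n * 10}] [list handler $n]
}

after 500 {set ::finished 1}
vwait ::finished

puts [join $::order \n]
```

All three handlers are entered before any of them leaves, and they leave in the order 3, 2, 1: handler 1's vwait cannot return until the handlers nested inside it have returned, even though its own request finished first.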

    I'm not sure exactly what you want to accomplish. But it sounds to me like you need to do some queuing or co-routines. I have code I wrote that does single queue with 1 or more servers using threads. I sometimes use it for just a single server to get my own queuing of events.

    Unfortunately, I can't use it with tcl 9.0 because of a race condition bug with respect to package requires inside threads that has been ticketed but not yet looked into.

    (sorry for so many postings :)

    -e

  • From Jonathan Kelly@21:1/5 to All on Thu Jun 26 04:20:51 2025
    On 25/6/25 18:57, et99 wrote:
    > I'm not sure exactly what you want to accomplish. But it sounds to me
    > like you need to do some queuing or co-routines.

    Thanks for looking at it. Yes, I had to do a queue - my case is exactly
    like the test1 code I posted ... the events that end up triggering the
    geturl come in quicker than the geturl can process, and the geturl
    re-enters the event loop under the hood enabling more
    geturl-triggering-events to queue up in the event queue(?) - eventually
    enough to crash something. Anyway I did this ...
    ----------------
    set ::test_busy 0
    set ::test_queue {}

    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }
    proc check {chan} {
        if {[gets $chan line] >= 0} {
            queue_test $line
        }
    }
    proc queue_test {n} {
        set task [list test1 $n]
        lappend ::test_queue $task
        maybe_run_test
    }
    proc maybe_run_test {} {
        if {$::test_busy} return
        if {[llength $::test_queue] == 0} return

        set ::test_busy 1

        # get first in queue
        set next [lindex $::test_queue 0]

        # remove first from queue
        set ::test_queue [lrange $::test_queue 1 end]

        # run first
        uplevel #0 $next

        set ::test_busy 0
        maybe_run_test
    }
    ----------------
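
An alternative shape for such a queue, sketched here under stated assumptions, is to drive it entirely from completion callbacks so that no vwait is nested inside an event handler and the queue runner never calls itself while a job is in flight. `start_job`, `enqueue`, `run_next` and `job_done` are hypothetical names; `start_job` uses a timer as a stand-in for an asynchronous call such as `http::geturl $url -command ...`.

```tcl
set ::busy 0
set ::pending {}
set ::finished {}

# Hypothetical async job: returns immediately and invokes the callback
# later from the event loop, as a -command style call would.
proc start_job {n callback} {
    after 20 [list {*}$callback $n]
}

proc enqueue {n} {
    lappend ::pending $n
    run_next
}

# Start the next job only if none is currently in flight.
proc run_next {} {
    if {$::busy || [llength $::pending] == 0} return
    set ::busy 1
    set ::pending [lassign $::pending n]   ;# pop the first element into n
    start_job $n job_done
}

# Completion callback: record the result, then start the next job.
proc job_done {n} {
    lappend ::finished $n
    set ::busy 0
    if {[llength $::pending] == 0} {
        set ::all_done 1
    }
    run_next
}

foreach n {1 2 3 4} { enqueue $n }
vwait ::all_done
puts $::finished
```

Jobs complete strictly one at a time in arrival order (1 2 3 4 here), and the only vwait is the top-level one that keeps the script alive.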

  • From Rich@21:1/5 to Jonathan Kelly on Wed Jun 25 21:32:42 2025
    Jonathan Kelly wrote:
    > proc queue {} {
    >     set ::input [open "|cat test.txt" r]
    >     fconfigure $::input -blocking 0 -buffering line
    >     fileevent $::input readable [list check $::input]
    > }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open test.txt
    directly:

    set ::input [open test.txt r]

    And achieve the same result.

  • From et99@21:1/5 to Rich on Wed Jun 25 18:52:33 2025
    On 6/25/2025 2:32 PM, Rich wrote:
    > Curious why you are opening a pipe to cat, having cat read and print
    > the contents, and then consuming that, when you can just open test.txt
    > directly:
    >
    > set ::input [open test.txt r]

    I was also curious about this. But I'm also wondering why this is even event driven at all? Why not simply, in pseudo code:

    while 1 {
        read...a line
        if end of file, break
        geturl
        do something with the url results
    }

    If there's also a gui that the OP wants to keep alive, it should not be starved, since the synchronous form of geturl is calling vwait, and that would allow gui events to get processed while waiting for the url request to complete.

    -e

  • From Rich@21:1/5 to [email protected] on Thu Jun 26 17:08:29 2025
    et99 wrote:
    > I was also curious about this. But I'm also wondering why this is
    > even event driven at all? Why not simply, in pseudo code:

    My guess: the above was OP's "test case" code. The real code is
    reading an Apache log file as Apache logs to the file, so 'event
    driven' in that scenario does make some sense.

  • From Jonathan Kelly@21:1/5 to Rich on Fri Jun 27 05:09:07 2025
    On 27/6/25 03:08, Rich wrote:
    > My guess: the above was OP's "test case" code. The real code is
    > reading an Apache log file as Apache logs to the file, so 'event
    > driven' in that scenario does make some sense.

    What Rich said. Before I realised geturl is *always* asynchronous, I had
    read the man page for geturl where it said geturl "blocked". I needed to
    simplify my program as a test case to prove something was broken. Turned
    out, the problem was my understanding, though I still think the manual
    page is misleading. The relevant

    "Note: The event queue is even used without the -command option. As a
    side effect, arbitrary commands may be processed while http::geturl is running."

    is in the general description at the top, and I had just been reading
    the geturl function description.

  • From et99@21:1/5 to Jonathan Kelly on Thu Jun 26 14:02:34 2025
    On 6/26/2025 12:09 PM, Jonathan Kelly wrote:
    > Before I realised geturl is *always* asynchronous, I had read the man
    > for geturl where it said geturl "blocked".


    I wonder, if you are reading a file that is being written from another process, sort of like a "tail" program, doesn't tcl's [fileevent <channel> readable <script>] trigger constantly? Isn't this in effect a tight polling loop?

    The manual says:

    "A channel is also considered to be readable if an end of file or error condition is present on the underlying file or device. It is important for script to check for these conditions and handle them appropriately; for example, if there is no special check for end of file, an infinite loop may occur where script reads no data, returns, and is immediately invoked again."

    To avoid this problem, one is normally supposed to close the file or remove the read handler. I've never written a log handler like this one, so I'm not sure what the correct approach would be.
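
For a pipe or socket, the end-of-file check that the manual asks for can look like the following minimal sketch. It assumes Tcl 8.6 for `chan pipe`, which is used here only to get a self-contained pipe; `on_readable` is a hypothetical handler name.

```tcl
# Readable handler that distinguishes "got a line", "end of file" and
# "only a partial line so far".
proc on_readable {chan} {
    if {[gets $chan line] >= 0} {
        lappend ::lines $line
    } elseif {[eof $chan]} {
        # End of input: closing the channel also removes the handler,
        # so it cannot be invoked again in a tight loop.
        close $chan
        set ::done 1
    }
    # Otherwise no complete line is available yet; wait for the next event.
}

# Self-contained pipe standing in for "|cat test.txt".
lassign [chan pipe] rd wr
fconfigure $rd -blocking 0 -buffering line
fconfigure $wr -buffering line
puts $wr "first"
puts $wr "second"
close $wr

set ::lines {}
fileevent $rd readable [list on_readable $rd]
vwait ::done
puts $::lines
```

Without the `eof` branch the handler would keep firing after the writer closes its end, which is the infinite loop the manual warns about.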

  • From Christian Gollwitzer@21:1/5 to All on Fri Jun 27 16:35:08 2025
    On 26.06.25 at 23:02, et99 wrote:
    > I wonder, if you are reading a file that is being written from another
    > process, sort of like a "tail" program, doesn't tcl's [fileevent
    > <channel> readable <script>] trigger constantly? Isn't this in effect a
    > tight polling loop?

    The underlying mechanism is select() or poll(). To my knowledge, this
    only works for pipes/sockets, not for files. The "tail -f" program runs
    stat() in a loop to see if the file date or size has changed.

    On Linux, you could also use inotify (there is a Tcl package) to get
    callbacks when the file is changed

    Christian

  • From Ralf Fassel@21:1/5 to All on Fri Jun 27 18:18:32 2025
    * Christian Gollwitzer
    | On 26.06.25 at 23:02, et99 wrote:
    | > I wonder, if you are reading a file that is being written from
    | > another process, sort of like a "tail" program, doesn't tcl's
    | > [fileevent
    | > <channel> readable <script>] trigger constantly? Isn't this in
    | > effect a tight polling loop?

    | The underlying mechanism is select() or poll(). To my knowledge, this
    | only works for pipes/sockets, not for files.

    select() also works for files; the effect is that it indeed triggers immediately on each call.

    man select(2)
    [...]
    readfds
    The file descriptors in this set are watched to see if they are
    ready for reading.
    ** A file descriptor is ready for reading if a read operation will not block;
    in particular, a file descriptor is also ready on end-of-file.

    (** emphasis by me).

    The effect in Tcl of setting a readable-fileevent on a regular disk file
    is indeed that the event fires repeatedly until EOF, blocking any GUI
    updates which would run "after idle".
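
One way to follow a growing regular file without a constantly firing fileevent is to poll it on a timer, in the spirit of the stat() loop mentioned for "tail -f". This is a sketch, not a tested log follower: `read_new_lines` and `poll_file` are hypothetical names, and the zero-offset seek is used on the assumption that it clears a previously reached end-of-file state on the channel.

```tcl
# Return the lines appended to the file since the previous call.
# Caveat: a line that is only partly written when we hit end of file
# is returned as if it were complete.
proc read_new_lines {chan} {
    seek $chan 0 current      ;# assumption: resets the channel's EOF state
    set lines {}
    while {[gets $chan line] >= 0} {
        lappend lines $line
    }
    return $lines
}

# Hypothetical poller: hands each new line to a callback, then
# reschedules itself instead of relying on fileevent.
proc poll_file {chan interval callback} {
    foreach line [read_new_lines $chan] {
        {*}$callback $line
    }
    after $interval [list poll_file $chan $interval $callback]
}

# Demonstration with a temporary file written by this same script.
set writer [file tempfile path]
puts $writer "line 1"
puts $writer "line 2"
flush $writer

set reader [open $path r]
set first [read_new_lines $reader]

puts $writer "line 3"
flush $writer
set second [read_new_lines $reader]

puts "first batch:  $first"
puts "second batch: $second"

close $reader
close $writer
file delete $path
```

In a long-running script the poller would be started with something like `poll_file $reader 500 puts` followed by `vwait forever`; between polls the event loop is idle, so GUI and "after idle" work is not starved.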

    R'
