• How to deal with HTTP redirects to same URL

    From Alexandru@21:1/5 to All on Mon Jun 12 05:32:21 2023
    Hello,

    I'm trying to find a solution on the web to this issue: a server sends an HTTP response code 303 (redirect) and the new location (URL) is identical to the original location. So far I couldn't find a solution. But obviously there must
    be a solution, since my web browser can handle this.
    Any ideas on your side?

    Many thanks
    Alexandru

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Schelte@21:1/5 to Alexandru on Mon Jun 12 16:02:17 2023
    On 12/06/2023 14:32, Alexandru wrote:
    > I'm trying to find a solution on the web to this issue: a server sends an HTTP response code 303 (redirect) and the new location (URL) is identical to the original location. So far I couldn't find a solution. But obviously there must
    > be a solution, since my web browser can handle this.
    > Any ideas on your side?

    Check carefully. Are the URLs really exactly the same? I remember I had
    to look twice when I fetched a URL like "http://www.tcl.tk/about" and
    got a redirect to "/about/". The trailing slash makes a difference.

    The only way a browser can "handle" a redirect to the exact same URL it requested is by displaying a page that says something like "The page
    isn’t redirecting properly". But I guess that's not what you are
    referring to.


    Schelte.
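
A minimal Tcl sketch of the check described above: resolve the Location header against the requested URL, then compare the two strings exactly. The proc name resolveLocation is hypothetical, and it only handles absolute URLs and absolute paths such as "/about/"; other relative forms are rejected rather than guessed.

```tcl
# Hypothetical helper: turn a Location header value into an absolute URL.
# Only absolute URLs and absolute paths are handled in this sketch.
proc resolveLocation {base location} {
    # Already absolute (scheme://...): use it unchanged
    if {[regexp {^[A-Za-z][A-Za-z0-9+.-]*://} $location]} {
        return $location
    }
    # Absolute path: keep scheme and host of the requested URL
    if {[string index $location 0] eq "/" &&
        [regexp {^([A-Za-z][A-Za-z0-9+.-]*://[^/]+)} $base -> origin]} {
        return $origin$location
    }
    error "unsupported Location form: $location"
}

set requested "http://www.tcl.tk/about"
set target    [resolveLocation $requested "/about/"]

# Exact string comparison: a trailing slash makes the URLs different
if {$target eq $requested} {
    puts "redirect points back to the same URL"
} else {
    puts "redirect goes to a different URL: $target"
}
```

Comparing the resolved URL instead of the raw header value avoids mistaking "/about/" for a redirect to the page that was just requested.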

  • From Alexandru@21:1/5 to Schelte on Mon Jun 12 07:45:58 2023
    I double checked, the URLs are identical, including the trailing slash.
    This is the URL and the simplified code:

    set url https://www.duma-bandzink.com/de/
    set token [::http::geturl $url -binary 1]
    if {[string index $ncode 0]==3} {
        array set meta $state(meta)
        if {![info exists meta(Location)] && ![info exists meta(location)]} {
            http::cleanup $token
            return
        }
        if {[info exists meta(Location)]} {
            set url2 $meta(Location)
        } else {
            set url2 $meta(location)
        }
        http::cleanup $token
        puts "Old: $url"
        puts "New: $url2"
    }

    Just run the code. Thanks.

  • From Schelte@21:1/5 to Alexandru on Mon Jun 12 17:37:51 2023
    On 12/06/2023 16:45, Alexandru wrote:
    > I double checked, the URLs are identical, including the trailing slash.
    > This is the URL and the simplified code:

    The code doesn't run as presented. But indeed the URLs are the same.
    However, the first response also returns some cookies. Repeating the
    request with those cookies results in a 200 OK.

    My www package (https://chiselapp.com/user/schelte/repository/www/index)
    takes care of redirects and cookies. Fetching the URL that way (you'll
    need the latest check-in) works correctly:

    package require www
    catch {www get https://www.duma-bandzink.com/de/} data info
    dict get $info status line
    HTTP/1.1 200 OK


    Schelte.
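
For readers who stay with Tcl's bundled http package, the cookie replay described above could be sketched roughly as follows. This is not how the www package does it. The proc names cookieHeader and fetchWithCookies are hypothetical, the retry simply repeats the request to the same URL (the case in this thread), cookie attributes such as Path and Expires are ignored, and the TLS registration is the plainest possible form and may need extra options for some servers.

```tcl
package require http

# Hypothetical helper: build a Cookie request header value from the
# Set-Cookie headers of a response. $meta is the flat name/value list
# returned by http::meta, so repeated Set-Cookie headers are all seen.
proc cookieHeader {meta} {
    set cookies {}
    foreach {name value} $meta {
        if {[string equal -nocase $name "Set-Cookie"]} {
            # Keep only name=value; drop attributes such as Path or Expires
            lappend cookies [string trim [lindex [split $value ";"] 0]]
        }
    }
    return [join $cookies "; "]
}

# Hypothetical helper: fetch a URL and, on a 3xx response that set cookies,
# repeat the request once with those cookies. Returns the final status code.
proc fetchWithCookies {url} {
    package require tls
    ::http::register https 443 ::tls::socket
    set token  [::http::geturl $url]
    set ncode  [::http::ncode $token]
    set cookie [cookieHeader [::http::meta $token]]
    ::http::cleanup $token
    if {[string index $ncode 0] == 3 && $cookie ne ""} {
        set token [::http::geturl $url -headers [list Cookie $cookie]]
        set ncode [::http::ncode $token]
        ::http::cleanup $token
    }
    return $ncode
}
```

Iterating over the header list instead of using array set matters here: an array keeps only the last Set-Cookie value when the server sends several.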

  • From Alexandru@21:1/5 to Alexandru on Mon Jun 12 09:41:03 2023
    Hi Schelte,

    Where should I save your package?
    I unpacked the ZIP archive to C:\Tcl\lib\www but package require www fails...

  • From Alexandru@21:1/5 to Schelte on Mon Jun 12 09:30:37 2023
    Hi Schelte,

    Many thanks, the cookies could be the cause indeed.
    I'll take a look at your code.

    Regards
    Alexandru

  • From saitology9@21:1/5 to Schelte on Mon Jun 12 12:53:53 2023
    On 6/12/2023 11:37 AM, Schelte wrote:
    > The code doesn't run as presented. But indeed the URLs are the same.
    > However, the first response also returns some cookies. Repeating the
    > request with those cookies results in a 200 OK.

    I see that the cookies are there to make sure the GDPR notice gets
    displayed and the user is forced to act on it.

  • From Schelte@21:1/5 to Alexandru on Mon Jun 12 21:27:15 2023
    On 12/06/2023 18:41, Alexandru wrote:
    > Where should I save your package?
    > I unpacked the ZIP archive to C:\Tcl\lib\www but package require www fails...

    As it is a Tcl module, you have to put it in one of the directories
    returned from [tcl::tm::path list], or add the directory where you put
    it to that list, using [tcl::tm::path add C:/Tcl/lib/www].


    Schelte.
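
A small sketch of the two commands mentioned above, using a throwaway module so it can be run anywhere. The directory name and the module name demo are made up for the illustration; with the real package, the directory would be the one that directly contains the .tm file.

```tcl
# Throwaway module directory for the demonstration (hypothetical location)
set dir [file join [pwd] tm-demo-[pid]]
file mkdir $dir

# A Tcl module is a single file named <package>-<version>.tm
set f [open [file join $dir demo-1.0.tm] w]
puts $f {package provide demo 1.0}
puts $f {proc ::demoHello {} { return "hello from demo" }}
close $f

# Directories currently searched for *.tm files
puts [join [::tcl::tm::path list] \n]

# Make the new directory known to the module loader, then load the module
::tcl::tm::path add $dir
set version [package require demo]
puts "loaded demo $version: [demoHello]"

# Clean up the demonstration directory
file delete -force $dir
```

No pkgIndex.tcl is involved: the loader derives the package name and version from the file name alone.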

  • From Alexandru@21:1/5 to Schelte on Tue Jun 13 06:32:10 2023
    Well, I'm a little bit familiar with the Tcl basics. I put your package, like any other package, into the recognized directory C:/Tcl/lib.
    Should your archive have a pkgIndex.tcl?
    This file is missing, so no wonder it is not working.
    I tried to create a pkgIndex.tcl with this content:

    package ifneeded www 2.3 [list source [file join $dir www-2.3.tm]]

    This gives me the error:

    attempt to provide package www 2.3 failed: no version of package www provided

  • From Alexandru@21:1/5 to Alexandru on Tue Jun 13 06:38:50 2023
    Ah sorry, I think I know what the problem is. Tcl modules (.tm files) are expected in a different location than regular packages.
    Forget my last post.

  • From Alexandru@21:1/5 to Alexandru on Tue Jun 13 06:43:16 2023
    Now I put the www directory into one of the paths returned by [tcl::tm::path list] and it still cannot find the package.

  • From Schelte@21:1/5 to Alexandru on Tue Jun 13 16:26:54 2023
    On 13/06/2023 15:43, Alexandru wrote:
    > Now I put the www directory into one of the paths returned by [tcl::tm::path list] and it still cannot find the package.

    The module system doesn't look in subdirectories. That was one of its
    design goals: avoid delays due to scanning large portions of the file
    system. So you should put the contents of the www directory in a path
    returned by [tcl::tm::path list].

    Alternatively, you can indicate that the package is in the www
    subdirectory: package require www::www. That will work for this specific
    case. But it will mess things up if one of the sub packages is needed,
    such as www::digest, which should now be loaded as www::www::digest.

    I didn't include an installation part in the documentation because it is
    just a normal Tcl module. But if even a long-time Tcl user like yourself
    is struggling with it, I guess I may have to add it.


    Schelte.
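
The subdirectory rule above can be tried out with a stand-in module. In this sketch the module name wwwstub and the directory names are invented for the illustration; only the mapping from subdirectory to "::" prefix is the point.

```tcl
# Hypothetical layout: <root>/www/wwwstub-1.0.tm
set root [file join [pwd] tm-layout-[pid]]
file mkdir [file join $root www]

set f [open [file join $root www wwwstub-1.0.tm] w]
puts $f {package provide www::wwwstub 1.0}
close $f

::tcl::tm::path add $root

# Not found: the loader only looks at *.tm files directly inside $root
set plain [catch {package require wwwstub} message]

# Found: the subdirectory becomes a namespace-style prefix of the package name
set qualified [package require www::wwwstub]

puts "plain name failed: $plain"
puts "qualified name loaded version $qualified"

# Clean up the demonstration directory
file delete -force $root
```

Because the prefix becomes part of the package name, a module that itself requires a sibling by its unprefixed name would no longer find it, which is the problem described for www::digest.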

  • From Alexandru@21:1/5 to Schelte on Tue Jun 13 07:54:48 2023
    Thanks, now it loads the package.
    I'll test it.

    Regards
    Alexandru

  • From Alexandru@21:1/5 to Alexandru on Tue Jun 13 08:58:59 2023
    When trying to open another URL (https://www.avi-gmbh.com/), your code throws an error that aborts my program flow, although I use catch:
    *** START OF ERROR MESSAGE ***
    SSL channel "sock0000000015597E60": error: sslv3 alert handshake failure
    SSL channel "sock0000000015597E60": error: sslv3 alert handshake failure
    *** END OF ERROR MESSAGE ***
    *** START OF ERROR MESSAGE ***
    software caused connection abort
    software caused connection abort
    while executing
    "close $fd"
    (class "::www::connection" method "Disconnect" line 6)
    invoked from within
    "my Disconnect"
    (class "::www::connection" method "Failed" line 3)
    invoked from within
    "my Failed {WWW DATA TIMEOUT} "timeout waiting for a response""
    (class "::www::connection" method "Timedout" line 2)
    invoked from within
    "::oo::Obj1007::my Timedout"
    ("after" script)
    *** END OF ERROR MESSAGE ***

    How can I continue in case of an error?

  • From Schelte@21:1/5 to Alexandru on Wed Jun 14 15:40:59 2023
    On 13/06/2023 17:58, Alexandru wrote:
    > When trying to open another URL (https://www.avi-gmbh.com/), your code throws an error that aborts my program flow, although I use catch

    Thank you for this example of a failing scenario. I will investigate why
    it is not handled correctly. But this may take some time.


    Thanks,
    Schelte.

  • From Alexandru@21:1/5 to Schelte on Wed Jul 19 00:55:48 2023
    Hi Schelte,
    Did you find time to look into this issue?
    I have a task where I could use your package if it can catch communication errors.
    Many thanks
    Alexandru

  • From Schelte@21:1/5 to Alexandru on Wed Jul 19 10:46:55 2023
    On 19/07/2023 09:55, Alexandru wrote:
    > Did you find time to look into this issue?
    > I have a task where I could use your package if it can catch communication errors.

    I did have a quick look. But in my simple test, it doesn't behave
    consistently. Sometimes the error is caught. Other times it takes a while
    before it is reported via the background error handler. That makes
    debugging harder. And then I got distracted by other things :-}

    But knowing there is a real need for this, increases the priority. I
    will have another go at it.


    Schelte.
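
The difference between a caught error and one reported via the background error handler can be reproduced in plain Tcl. The sketch below is independent of the www package and is not its fix; the handler name logBackgroundError and the simulated error text are made up. It shows why catch around the calling code does not see an error raised later from an "after" script, and how a handler installed with interp bgerror lets the program carry on.

```tcl
# Collect background errors instead of letting the default handler report them
proc logBackgroundError {message options} {
    lappend ::backgroundErrors $message
    set ::done 1
}
interp bgerror {} logBackgroundError

set backgroundErrors {}

# catch returns 0 here: scheduling the script succeeds, and the error is
# only raised later, when the event loop runs the script
set caught [catch {after 10 {error "simulated timeout failure"}}]

# Run the event loop until the handler has reported the error
vwait ::done

puts "catch result: $caught"
puts "background errors: $backgroundErrors"
```

In a batch run, such a handler can record the failure and let the main loop move on to the next item.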

  • From Alexandru@21:1/5 to Schelte on Wed Jul 19 01:52:46 2023
    Yes, there is a real need. Thanks for the help.
    Regards
    Alexandru

  • From Alexandru@21:1/5 to All on Wed Jul 19 11:57:20 2023
    saitology9 wrote on Wednesday, 19 July 2023 at 20:54:13 UTC+2:
    > Would you care to describe what you are doing with these websites?
    > Depending on your use, there may be other options.
    >
    > Like: are you just downloading them to a local file? or checking
    > whether they are up and running? etc.

    Parsing the HTML content for specific keywords.
    I already have the Tcl scripts ready for work.
    The only 2 issues are described above (error catching or handling of cookies)

  • From saitology9@21:1/5 to Alexandru on Wed Jul 19 14:54:06 2023
    On 7/19/2023 4:52 AM, Alexandru wrote:
    > Yes, there is a real need. Thanks for the help.

    Would you care to describe what you are doing with these websites?
    Depending on your use, there may be other options.

    Like: are you just downloading them to a local file? or checking
    whether they are up and running? etc.

  • From Schelte@21:1/5 to Alexandru on Fri Jul 21 21:48:08 2023
    On 19/07/2023 09:55, Alexandru wrote:
    > Did you find time to look into this issue?
    > I have a task where I could use your package if it can catch communication errors.

    I have found a problem and fixed it. In my testing, the latest committed version of the www package now consistently allows catching the error
    raised when accessing the URL you mentioned. Please let me know if you encounter any
    more problems.


    Thanks,
    Schelte.

  • From Alexandru@21:1/5 to Schelte on Fri Jul 21 12:55:45 2023
    Thanks Schelte, I'll run the code with your updated package right away and give you feedback.

  • From Alexandru@21:1/5 to Schelte on Fri Jul 21 13:11:17 2023
    Your new code breaks earlier now at this link and with this error:

    Getting https://globalautomationtechnologies.com/...(622/7537)
    *** START OF ERROR MESSAGE ***
    can't delete "::www::sock00000000154A58A0": command doesn't exist
    can't delete "::www::sock00000000154A58A0": command doesn't exist
    while executing
    "rename ::www::$fd """
    (class "::www::connection" method "Disconnect" line 5)
    invoked from within
    "my Disconnect"
    (class "::www::connection" method "Failed" line 3)
    invoked from within
    "my Failed [dict get $opts -errorcode] $msg"
    (class "::www::connection" method "Statusline" line 20)
    invoked from within
    "::oo::Obj756::my Statusline"
    *** END OF ERROR MESSAGE ***

  • From saitology9@21:1/5 to Alexandru on Fri Jul 21 17:28:36 2023
    On 7/21/2023 4:11 PM, Alexandru wrote:
    > Your new code breaks earlier now at this link and with this error:
    >
    > Getting https://globalautomationtechnologies.com/...(622/7537)

    I noticed that whenever you get these errors, Firefox and Chrome also
    complain. It is because these sites do not have their certificates set up,
    but the URLs start with https. If you use http instead in the same
    URLs, it works and the redirect works.

  • From Alexandru@21:1/5 to All on Fri Jul 21 15:43:07 2023
    Thanks for the hint.

    Still, changing the URL from https to http won't solve the problem, since other sites will fail because of http.
    Parsing 7000 sites needs an error catching system.
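
One possible shape for such an error catching loop, as a sketch only: each URL is fetched inside try, and failures are recorded instead of stopping the run. The names processAll and fakeFetch are hypothetical, and fakeFetch merely stands in for a real fetch command. Errors that surface through the event loop are not covered by this and need a background error handler as well.

```tcl
# Hypothetical batch driver: call fetchCmd for every URL and keep going
# when a single URL fails. fetchCmd is any command prefix that takes a URL
# and returns the page body, or raises an error.
proc processAll {urls fetchCmd} {
    set bodies   [dict create]
    set failures [dict create]
    foreach url $urls {
        try {
            dict set bodies $url [{*}$fetchCmd $url]
        } on error {message} {
            dict set failures $url $message
        }
    }
    return [dict create ok $bodies failed $failures]
}

# Stand-in fetcher for the demonstration; a real one would do the HTTP request
proc fakeFetch {url} {
    if {[string match "*bad*" $url]} {
        error "handshake failure"
    }
    return "<html>$url</html>"
}

set result [processAll {https://good.example/ https://bad.example/} fakeFetch]
puts "fetched: [dict size [dict get $result ok]]"
puts "failed:  [dict size [dict get $result failed]]"
```

Keeping the failures in a dict makes it easy to write a report of the bad URLs at the end of the run.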

  • From saitology9@21:1/5 to Alexandru on Fri Jul 21 19:03:27 2023
    On 7/21/2023 6:43 PM, Alexandru wrote:
    > Thanks for the hint.
    >
    > Still, changing the URL from https to http won't solve the problem, since other sites will fail because of http.
    > Parsing 7000 sites needs an error catching system.

    What I meant was that you have been given bad data. You may want to raise
    it with whoever gave it to you. In any case, 7,000 sounds like a lot
    to process. I am curious how long it takes.

  • From Alexandru@21:1/5 to All on Fri Jul 21 16:10:37 2023
    Yes, partly bad data, can't do a thing about it, that's how it is.
    The time is not an issue, few hours maybe.

  • From Schelte@21:1/5 to Alexandru on Sat Jul 22 15:44:50 2023
    This bad data is good for stress testing the www library. :-D

    The latest bug you encountered should be fixed now.


    Thanks,
    Schelte

  • From Alexandru@21:1/5 to Schelte on Sat Jul 22 13:07:10 2023
    Just a little bit further, the code hangs at this URL: https://ide-gmbh.com/en/
    This is position 1231 of 7537 URLs in my list.

    Firefox also hangs a while, then returns:
    "Error: Redirect error
    An error occurred while connecting to www.ide-gmbh.com.
    This problem can sometimes occur when cookies are disabled or refused.
    "

  • From Schelte@21:1/5 to Alexandru on Sun Jul 23 12:39:14 2023
    On 22/07/2023 22:07, Alexandru wrote:
    > Just a little bit further, the code hangs at this URL: https://ide-gmbh.com/en/
    > This is position 1231 of 7537 URLs in my list.

    This web site redirects to itself. There already was an (undocumented)
    option to limit the number of redirections. I have made an update to set
    the default limit to 20, instead of allowing unlimited redirections.


    Thanks,
    Schelte.
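
The idea of a redirection limit can be sketched without any network access. The code below is not taken from the www package: followRedirects and fakeRequest are hypothetical names, and the request command is a stand-in that returns a dict with a status and, for redirects, a location. Only the default of 20 comes from the post above.

```tcl
# Hypothetical redirect follower. requestCmd is a command prefix that takes
# a URL and returns a dict with keys "status" and, for redirects, "location".
proc followRedirects {url requestCmd {limit 20}} {
    for {set hops 0} {$hops <= $limit} {incr hops} {
        set response [{*}$requestCmd $url]
        set status [dict get $response status]
        if {$status ni {301 302 303 307 308}} {
            return [dict create url $url status $status hops $hops]
        }
        set url [dict get $response location]
    }
    error "more than $limit redirections"
}

# Stand-in server: /loop redirects to itself, /old redirects once to /new
proc fakeRequest {url} {
    switch -- $url {
        /loop   { return {status 303 location /loop} }
        /old    { return {status 301 location /new} }
        default { return {status 200} }
    }
}

set final  [followRedirects /old fakeRequest]
set failed [catch {followRedirects /loop fakeRequest} message]
puts "final: $final"
puts "loop detected: $failed ($message)"
```

A site that redirects to itself now ends in an ordinary error that the caller can catch, instead of a hang.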

  • From Alexandru@21:1/5 to Schelte on Sun Jul 23 10:42:15 2023
    Perfect! Now, all 7535 URLs run through without any errors.
    Many thanks, Schelte!
    Cheers
    Alexandru

  • From Schelte@21:1/5 to Alexandru on Sun Jul 23 22:17:06 2023
    On 23/07/2023 19:42, Alexandru wrote:
    > Perfect! Now, all 7535 URLs run through without any errors.
    > Many thanks, Schelte!

    Thank you for running the stress test and identifying some weak points
    of the www library.


    Schelte.

  • From saitology9@21:1/5 to Schelte on Sun Jul 23 17:22:08 2023
    Glad for the happy ending :-)

    @Schelte, I am not familiar with the www package. Is it a replacement
    for tcllib's http? or is it something else altogether?

  • From Schelte@21:1/5 to All on Mon Jul 24 22:23:04 2023
    On 23/07/2023 23:22, saitology9 wrote:
    > @Schelte, I am not familiar with the www package. Is it a replacement
    > for tcllib's http? or is it something else altogether?

    There isn't really a tcllib http package. The http directory in the
    tcllib tree only contains the autoproxy package. So I assume you mean
    Tcl's bundled http package. That package carries a lot of historical
    burden, limiting how much it is able to adapt to modern web protocols
    and take advantage of "recent" Tcl features.

    The http package is also very low level. It only does a single
    request-response exchange. It is up to the caller to handle errors at
    different levels (transport layer, session layer, application layer).
    Then there are additional things to take care of: Redirections, cookies, authentication, compression, proxies, and protocol upgrades. The http
    package does none of these for the caller.

    I wanted a package that would be much simpler to use. You should just be
    able to hand it a URL and it will return the resource with all the redirections, cookies, et cetera taken care of. If there is a problem,
    it should just throw an exception and not require you to check multiple
    things to determine if there was any error. Most of the time you are
    only interested in the body. So that is what it normally returns. But
    there should be an easy way to examine the meta data.

    Another design goal was that it should be completely non-blocking, even
    for DNS lookups. Finally, it should also provide a possibility to handle
    modern protocols via protocol upgrades, such as websockets and http/2.

    I created the www package to fulfill these requirements.

    For more information, see https://chiselapp.com/user/schelte/repository/www


    Schelte.

  • From saitology9@21:1/5 to Schelte on Tue Jul 25 11:56:30 2023
    Great! Thank you for your contribution.
