• Re: Problem with tdbc

    From greg@21:1/5 to All on Mon Apr 29 21:28:32 2024
    Am 29.04.24 um 15:03 schrieb Alan Grunwald:
    I'm writing a script to scrape data from a website and save it in a
    database.

    The web site offers games and supports multiple users, who play against
    each other in multiple tournaments every day. The first time the script
    is run, the idea is to capture data, including their opponents, for a specified user over the past 24 hours. On subsequent runs, we capture
    data for all users that we know about.

    So, I expect that I'll be scraping more and more data as time goes by.
    I've run it for a few days writing data to a SQLite database but after a couple of weeks it now takes more than 24 hours to run to completion.
    I'd like to see how long it takes if I use a MySQL database.
    Unfortunately, on the second run the script reports the error:

    "Can't create more that max_prepared_stmt_count statements (current
    value 16382)"

    Now, I could simply increase the value of the MySQL variable, maybe to
    its maximum allowed value of 4194304. However, when it failed, it was
    trying to handle 1000 users. I know from the SQLite prototype that I
    need to be able to hand more than 80000 users. Upping the limit might
    work and might not kill the machine hosting the database, but I'd really
    like to find a better solution.

    As far as I'm aware I'm only creating about a dozen (12) prepared
    statements. I have noticed, however, that I have a large number of tdbc::mysql::resultset objects - 16375 when the server complained, and I wonder whether these objects create a prepared statement that they
    release when they are destroyed. (I saw the same growth - of tdbc::sqlite3::resultset objects - when saving to a SQLite database.)

    I am nearly certain that I'm not creating these resultset objects
    directly myself, I think all my interaction with tdbc is via db prepare
    and statement foreach/allrows calls but I imagine that freach and/or
    allrows will create a temporary resultset that it (should) manage itself.

    Does the above description ring any bells with anyone here? If not, any suggestions as to how I should proceed?

    Alan

    Are you using close?

    https://www.tcl.tk/man/tcl/TdbcCmd/tdbc_statement.htm
    The close object command removes a statement and any result sets that
    it has created. All system resources associated with the objects are freed.

    https://www.tcl.tk/man/tcl/TdbcCmd/tdbc_resultset.htm
    The close object command deletes the result set and frees any associated
    system resources.


    I've run it for a few days writing data to a SQLite database but after a couple of weeks it now takes more than 24 hours to run to completion.

    Are you using transaction and commit? https://www.tcl-lang.org/man/tcl/TdbcCmd/tdbc_connection.htm
    db begintransaction
    db commit

    https://www.magicsplat.com/articles/tdbc.html#_transactions


    a example:
    proc sqlInsertCsv {dbconn datalist} {
    $dbconn begintransaction
    set stmt [$dbconn prepare {
    INSERT INTO csvdaten (one,two,three) VALUES (:one,:two,:three)}]

    foreach dataline $datalist {
    foreach {one two three} $dataline {
    $stmt execute
    }
    }
    $dbconn commit
    $stmt close
    }



    Gregor

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From greg@21:1/5 to All on Tue Apr 30 06:15:03 2024
    Am 30.04.24 um 00:00 schrieb Alan Grunwald:


    The whole web-scraping activity is wrapped in a try statement; the first statement within the try is begintransation; if execution reaches the
    end of the try the transaction is committed; if if there is an error,
    the trasaction is rolled back.

    Yes, the better way is to use try.


    I've just had another look at the tdbc_statement manual page and I see
    that this may well be designed behaviour; I see that [statement execute] returns a resultset object. I had not noticed that before
    and am very sure that I haven't closed these objects.


    According to the manual it should already happen with close of the tdbc statement.

    https://www.tcl.tk/man/tcl/TdbcCmd/tdbc_statement.htm
    The close object command removes a statement and any result sets that
    it has created. All system resources associated with the objects are freed.
    !


    I only got involved with introspection here after your question. I found something new about tdbc for myself.

    https://www.magicsplat.com/articles/tdbc.html#_introspection

    and

    https://www.tcl-lang.org/man/tcl/TdbcCmd/tdbc_connection.htm
    db statements
    The statements object command returns a list of statements that have
    been created by prepare and preparecall statements against the given
    connection and have not yet been closed.

    db resultsets
    The resultsets object command returns a list of result sets that have
    been obtained by executing statements prepared using the given
    connection and not yet closed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)