• Clock scan

    From snosniv@21:1/5 to All on Tue Nov 1 04:54:56 2022
    I'm reading a csv file with date & time in one cell, however, the source file format doesn't seem to be constant, I sometimes get the following alternatives:
    01/11/2022 11:00:27
    and sometimes:
    01 November 2022 11:00:27

    So how to scan to ensure I get the seconds since epoch, using some way to detect the format with a switch or if statement??

    TIA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From snosniv@21:1/5 to snosniv on Tue Nov 1 05:58:04 2022
    On Tuesday, 1 November 2022 at 12:21:36 UTC, snosniv wrote:
    On Tuesday, 1 November 2022 at 11:54:59 UTC, snosniv wrote:
    I'm reading a csv file with date & time in one cell, however, the source file format doesn't seem to be constant, I sometimes get the following alternatives:
    01/11/2022 11:00:27
    and sometimes:
    01 November 2022 11:00:27

    So how to scan to ensure I get the seconds since epoch, using some way to detect the format with a switch or if statement??

    TIA
    My script did work, now failing miserably. I tried pasting some of the lines directly into tcl shell & now get this, which is really confusing me!
    ( I haven't shown the file read, split, etc to get the data rows from the CSV file).

    (FIT_FILE_READ) 13 % puts $row
    "1 November 2022 11:00:27",156.0,24.2,15.7,131.6,13.6,7.0,60.8,54.4,125.0,6.6,19.2,1658.0,64.0,--
    (FIT_FILE_READ) 14 %
    (FIT_FILE_READ) 14 % set row_data [split $row ","]
    {"1 November 2022 11:00:27"} 156.0 24.2 15.7 131.6 13.6 7.0 60.8 54.4 125.0 6.6 19.2 1658.0 64.0 --
    (FIT_FILE_READ) 15 %
    (FIT_FILE_READ) 15 % set my_DateTime [lindex $row_data 0]
    "1 November 2022 11:00:27"
    (FIT_FILE_READ) 16 % set my_tcl_time [clock scan $my_DateTime -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 17 % set my_tcl_time [clock scan $my_DateTime -format "%d %M %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 18 % set my_tcl_time [clock scan $my_DateTime -format "%D %M %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 19 % set my_tcl_time [clock scan $my_DateTime -format "%D %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 20 %
    (FIT_FILE_READ) 20 % set my_tcl_time [clock scan $my_DateTime -format "%d %m %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 21 %
    (FIT_FILE_READ) 21 % set my_tcl_time [clock scan $my_DateTime -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 22 % puts $my_DateTime
    "1 November 2022 11:00:27"
    (FIT_FILE_READ) 23 %

    It looks like the DateTime with the quotes is causing the issue!
    I didn't get these errors before, and nothing seems to have changed in the way the source CSV file gets saved, and script unchanged:
    so I tried this:
    (FIT_FILE_READ) 21 % puts $my_DateTime
    "01 November 2022 11:00:27"
    (FIT_FILE_READ) 23 % set s1 $my_DateTime
    "01 November 2022 11:00:27"
    (FIT_FILE_READ) 24 % set x1 [string length $s1]
    27
    (FIT_FILE_READ) 25 % set s2 [string range $s1 1 25] ; # strip out the quotes from start & end.
    01 November 2022 11:00:27
    (FIT_FILE_READ) 28 % set t1 [clock scan $s1 -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 29 % set t1 [clock scan $s2 -format "%d %B %Y %H:%M:%S"]
    1667300427

    Can someone explain what I'm doing wrong please, I seem to be constantly editing my script to solve what seems to be ever moving formats?

    TIA, Niv.

  • From snosniv@21:1/5 to snosniv on Tue Nov 1 05:21:33 2022
    On Tuesday, 1 November 2022 at 11:54:59 UTC, snosniv wrote:
    I'm reading a csv file with date & time in one cell, however, the source file format doesn't seem to be constant, I sometimes get the following alternatives:
    01/11/2022 11:00:27
    and sometimes:
    01 November 2022 11:00:27

    So how to scan to ensure I get the seconds since epoch, using some way to detect the format with a switch or if statement??

    TIA

    My script did work, now failing miserably. I tried pasting some of the lines directly into tcl shell & now get this, which is really confusing me!
    ( I haven't shown the file read, split, etc to get the data rows from the CSV file).

    (FIT_FILE_READ) 13 % puts $row
    "1 November 2022 11:00:27",156.0,24.2,15.7,131.6,13.6,7.0,60.8,54.4,125.0,6.6,19.2,1658.0,64.0,--
    (FIT_FILE_READ) 14 %
    (FIT_FILE_READ) 14 % set row_data [split $row ","]
    {"1 November 2022 11:00:27"} 156.0 24.2 15.7 131.6 13.6 7.0 60.8 54.4 125.0 6.6 19.2 1658.0 64.0 --
    (FIT_FILE_READ) 15 %
    (FIT_FILE_READ) 15 % set my_DateTime [lindex $row_data 0]
    "1 November 2022 11:00:27"
    (FIT_FILE_READ) 16 % set my_tcl_time [clock scan $my_DateTime -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 17 % set my_tcl_time [clock scan $my_DateTime -format "%d %M %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 18 % set my_tcl_time [clock scan $my_DateTime -format "%D %M %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 19 % set my_tcl_time [clock scan $my_DateTime -format "%D %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 20 %
    (FIT_FILE_READ) 20 % set my_tcl_time [clock scan $my_DateTime -format "%d %m %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 21 %
    (FIT_FILE_READ) 21 % set my_tcl_time [clock scan $my_DateTime -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 22 % puts $my_DateTime
    "1 November 2022 11:00:27"
    (FIT_FILE_READ) 23 %

  • From saitology9@21:1/5 to snosniv on Tue Nov 1 09:21:45 2022
    On 11/1/2022 8:58 AM, snosniv wrote:
    My script did work, now failing miserably. I tried pasting some of the lines directly into tcl shell & now get this, which is really confusing me!
    ( I haven't shown the file read, split, etc to get the data rows from the CSV file).

    (FIT_FILE_READ) 13 % puts $row
    "1 November 2022 11:00:27",156.0,24.2,15.7,131.6,13.6,7.0,60.8,54.4,125.0,6.6,19.2,1658.0,64.0,--
    (FIT_FILE_READ) 14 %
    (FIT_FILE_READ) 14 % set row_data [split $row ","]
    {"1 November 2022 11:00:27"} 156.0 24.2 15.7 131.6 13.6 7.0 60.8 54.4 125.0 6.6 19.2 1658.0 64.0 --

    The problem is with your data, and it starts right here: your [split]
    leaves the double quotes attached to the date field. It seems that your
    data now contains extra characters that were not there before. See if
    you can use the csv package in Tcllib for better results.
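
    As a stopgap that needs no extra package, the stray quotes can be
    trimmed before scanning. This is only a sketch: the sample row is
    modelled on the one in the thread, and `-timezone :UTC` is an
    assumption about how the timestamps were recorded. It does not cope
    with quoted fields that themselves contain commas.

```tcl
# Sample row modelled on the thread; real rows have more columns.
set row {"01 November 2022 11:00:27",156.0,24.2,15.7}

# A plain split keeps the CSV quote characters on the first field.
set fields [split $row ","]
set raw [lindex $fields 0]

# Strip the surrounding double quotes, then scan with an explicit format.
set cleaned [string trim $raw "\""]
set seconds [clock scan $cleaned -format "%d %B %Y %H:%M:%S" -timezone :UTC]

puts "raw=$raw cleaned=$cleaned seconds=$seconds"
```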

    By the way,


    (FIT_FILE_READ) 16 % set my_tcl_time [clock scan $my_DateTime -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 17 % set my_tcl_time [clock scan $my_DateTime -format "%d %M %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 18 % set my_tcl_time [clock scan $my_DateTime -format "%D %M %Y %H:%M:%S"]
    input string does not match supplied format


    I would not recommend going down a list of different formats until one
    sticks; that leaves lots of room for data quality errors. It might be
    better to fix your data file format.

  • From Harald Oehlmann@21:1/5 to All on Tue Nov 1 14:29:08 2022
    On 01.11.2022 at 13:58, snosniv wrote:
    On Tuesday, 1 November 2022 at 12:21:36 UTC, snosniv wrote:
    On Tuesday, 1 November 2022 at 11:54:59 UTC, snosniv wrote:
    I'm reading a csv file with date & time in one cell, however, the source file format doesn't seem to be constant, I sometimes get the following alternatives:
    01/11/2022 11:00:27
    and sometimes:
    01 November 2022 11:00:27

    So how to scan to ensure I get the seconds since epoch, using some way to detect the format with a switch or if statement??

    TIA
    My script did work, now failing miserably. I tried pasting some of the lines directly into tcl shell & now get this, which is really confusing me!
    ( I haven't shown the file read, split, etc to get the data rows from the CSV file).

    (FIT_FILE_READ) 13 % puts $row
    "1 November 2022 11:00:27",156.0,24.2,15.7,131.6,13.6,7.0,60.8,54.4,125.0,6.6,19.2,1658.0,64.0,--
    (FIT_FILE_READ) 14 %
    (FIT_FILE_READ) 14 % set row_data [split $row ","]
    {"1 November 2022 11:00:27"} 156.0 24.2 15.7 131.6 13.6 7.0 60.8 54.4 125.0 6.6 19.2 1658.0 64.0 --
    (FIT_FILE_READ) 15 %
    (FIT_FILE_READ) 15 % set my_DateTime [lindex $row_data 0]
    "1 November 2022 11:00:27"
    (FIT_FILE_READ) 16 % set my_tcl_time [clock scan $my_DateTime -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 17 % set my_tcl_time [clock scan $my_DateTime -format "%d %M %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 18 % set my_tcl_time [clock scan $my_DateTime -format "%D %M %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 19 % set my_tcl_time [clock scan $my_DateTime -format "%D %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 20 %
    (FIT_FILE_READ) 20 % set my_tcl_time [clock scan $my_DateTime -format "%d %m %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 21 %
    (FIT_FILE_READ) 21 % set my_tcl_time [clock scan $my_DateTime -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 22 % puts $my_DateTime
    "1 November 2022 11:00:27"
    (FIT_FILE_READ) 23 %

    It looks like the DateTime with the quotes is causing the issue!
    I didn't get these errors before, and nothing seems to have changed in the way the source CSV file gets saved, and script unchanged:
    so I tried this:
    (FIT_FILE_READ) 21 % puts $my_DateTime
    "01 November 2022 11:00:27"
    (FIT_FILE_READ) 23 % set s1 $my_DateTime
    "01 November 2022 11:00:27"
    (FIT_FILE_READ) 24 % set x1 [string length $s1]
    27
    (FIT_FILE_READ) 25 % set s2 [string range $s1 1 25] ; # strip out the quotes from start & end.
    01 November 2022 11:00:27
    (FIT_FILE_READ) 28 % set t1 [clock scan $s1 -format "%d %B %Y %H:%M:%S"]
    input string does not match supplied format
    (FIT_FILE_READ) 29 % set t1 [clock scan $s2 -format "%d %B %Y %H:%M:%S"]
    1667300427

    Can someone explain what I'm doing wrong please, I seem to be constantly editing my script to solve what seems to be ever moving formats?

    TIA, Niv.

    Dear Niv,

    thank you. The double quotes (") are part of the CSV file format, not
    part of the data. Consider decoding the CSV with an existing parser
    that handles the double quotes for you.

    You may choose:
    - csv module of TCLLIB : https://wiki.tcl-lang.org/page/Tcllib+Csv
    - csv parser on the wiki: https://wiki.tcl-lang.org/page/csv
    - Ashok's binary tclcsv package: https://wiki.tcl-lang.org/page/tclcsv
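
    With the first option, a minimal sketch might look like the
    following. It assumes Tcllib is installed; the sample row is modelled
    on the one in the thread, and `-timezone :UTC` is an assumption about
    the data.

```tcl
package require csv   ;# Tcllib's csv module; must be installed separately

# Sample row modelled on the thread; real rows have more columns.
set row {"01 November 2022 11:00:27",156.0,24.2,15.7}

# csv::split understands CSV quoting, so the quotes do not end up in the data.
set fields [csv::split $row]
set when [lindex $fields 0]

set seconds [clock scan $when -format "%d %B %Y %H:%M:%S" -timezone :UTC]
puts "when=$when seconds=$seconds"
```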

    Take care,
    Harald

  • From Rich@21:1/5 to snosniv on Tue Nov 1 15:03:53 2022
    snosniv <[email protected]> wrote:
    I'm reading a csv file with date & time in one cell, however, the
    source file format doesn't seem to be constant, I sometimes get the
    following alternatives:

    01/11/2022 11:00:27
    and sometimes:
    01 November 2022 11:00:27

    So how to scan to ensure I get the seconds since epoch, using some
    way to detect the format with a switch or if statement??

    If your input CSV file has varying formats for the date, then yes, you
    will have to do one of:

    1) detect what format a given row contains, and then scan that format.

    or

    2) determine how many variant formats the file contains, and scan for
    each in sequence until one does not error. You'll have to wrap the
    clock scan in a [catch] or a [try] block and treat an error as "try
    the next possible format". Also note that until you have found all the
    variant formats, you'll want a final "could not parse this string:
    '.....'" log output so you can add to your list of formats.

    or

    3) get the producer of the csv files to stop creating different date
    formats and always output a standard format for all date/times.
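
    Options 1 and 2 above can be sketched roughly as follows. The proc
    names, the regular expressions and the list of formats are
    illustrative assumptions, not taken from the original script; the
    slash format assumes day/month/year order, and `-timezone :UTC` is
    likewise an assumption about the data.

```tcl
# Option 1: detect the layout first, then scan with the matching format.
proc scan_by_detect {text} {
    switch -regexp -- $text {
        {^\d{1,2}/\d{1,2}/\d{4} }   { set fmt "%d/%m/%Y %H:%M:%S" }
        {^\d{1,2} [A-Za-z]+ \d{4} } { set fmt "%d %B %Y %H:%M:%S" }
        default { error "could not parse this string: '$text'" }
    }
    return [clock scan $text -format $fmt -timezone :UTC]
}

# Option 2: try each known format in turn until one does not error.
proc scan_by_trying {text} {
    set formats {"%d/%m/%Y %H:%M:%S" "%d %B %Y %H:%M:%S"}
    foreach fmt $formats {
        if {![catch {clock scan $text -format $fmt -timezone :UTC} seconds]} {
            return $seconds
        }
    }
    # Nothing matched: report the string so a new format can be added.
    error "could not parse this string: '$text'"
}

foreach sample {"01/11/2022 11:00:27" "01 November 2022 11:00:27"} {
    puts "$sample -> [scan_by_detect $sample] / [scan_by_trying $sample]"
}
```

    Option 1 fails fast on an unknown layout, while option 2 is shorter
    to extend; either way the final error keeps unparsed strings visible.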

    Also, in a followup it looks like you might be using [split] to parse
    the CSV file. DO NOT DO THIS. CSV's quoting rules will cause a
    simple [split] to fail as soon as a row with a quoted field appears.

    ALWAYS use Tcllib's CSV module to parse CSV files; it will handle the
    troublesome details for you.

    Also, ALWAYS use Tcllib's CSV module to output CSV rows, as it will
    handle adding the required quoting for you as well.
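
    For the output side, a small sketch (assuming Tcllib is installed;
    the field values, including the "Smith, J" entry, are made up for
    illustration):

```tcl
package require csv   ;# Tcllib's csv module; must be installed separately

# The second field contains a comma, so it needs CSV quoting on output.
set fields [list "01 November 2022 11:00:27" "Smith, J" 156.0]

# csv::join adds whatever quoting the fields require.
set line [csv::join $fields]
puts $line

# Reading the line back with csv::split should recover the same fields.
set roundtrip [csv::split $line]
puts $roundtrip
```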

  • From Gerald Lester@21:1/5 to snosniv on Tue Nov 1 13:39:07 2022
    On 11/1/22 06:54, snosniv wrote:
    I'm reading a csv file with date & time in one cell, however, the source file format doesn't seem to be constant, I sometimes get the following alternatives:
    01/11/2022 11:00:27
    and sometimes:
    01 November 2022 11:00:27

    So how to scan to ensure I get the seconds since epoch, using some way to detect the format with a switch or if statement??

    Normally, at least in the US, it is mm/dd/YYYY -- are you sure that:
    "01/11/2022 11:00:27" is "01 November 2022 11:00:27"
    and not:
    "01/11/2022 11:00:27" is "11 January 2022 11:00:27"

    --
    +----------------------------------------------------------------------+
    | Gerald W. Lester, President, KNG Consulting LLC |
    | Email: [email protected] |
    +----------------------------------------------------------------------+

  • From Gerald Lester@21:1/5 to Gerald Lester on Tue Nov 1 13:40:38 2022
    On 11/1/22 13:39, Gerald Lester wrote:
    On 11/1/22 06:54, snosniv wrote:
    I'm reading a csv file with date & time in one cell, however, the
    source file format doesn't seem to be constant, I sometimes get the
    following alternatives:
    01/11/2022  11:00:27
      and sometimes:
    01 November 2022  11:00:27

    So how to scan to ensure I get  the seconds since epoch, using some
    way to detect the format with a switch or if statement??

    Normally, at least in the US, it is mm/dd/YYYY -- are you sure that:
      "01/11/2022  11:00:27" is "01 November 2022  11:00:27"
    and not:
      "01/11/2022  11:00:27" is "11 January 2022  11:00:27"


    BTW, if it is:
    "01/11/2022 11:00:27" is "11 January 2022 11:00:27"

    Then clock scan returns the same value either way.
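
    The remark above presumably refers to clock scan without -format,
    which as far as I know reads slash dates as month/day/year. With an
    explicit -format the reading is pinned down either way, as this
    sketch shows (`-timezone :UTC` is an assumption about the data):

```tcl
set text "01/11/2022 11:00:27"

# Read as day/month/year: 1 November 2022.
set dmy [clock scan $text -format "%d/%m/%Y %H:%M:%S" -timezone :UTC]

# Read as month/day/year: 11 January 2022.
set mdy [clock scan $text -format "%m/%d/%Y %H:%M:%S" -timezone :UTC]

# Format both back with the month name to make the difference visible.
puts [clock format $dmy -format "%d %B %Y %H:%M:%S" -timezone :UTC]
puts [clock format $mdy -format "%d %B %Y %H:%M:%S" -timezone :UTC]
```

    The two results differ by roughly ten months, so it is worth
    confirming which order the producer of the file actually uses.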

    --
    +----------------------------------------------------------------------+
    | Gerald W. Lester, President, KNG Consulting LLC |
    | Email: [email protected] |
    +----------------------------------------------------------------------+
