• non-continuous hash

    From Dr Eberhard W Lisse@21:1/5 to All on Sun Jul 3 17:14:35 2022
    Having something like:

    [...]
    my $statement = qq/
    SELECT exchangerate, transdate
    FROM exchangerate
    ORDER BY transdate
    /;
    $statement = $TABLE->prepare ($statement);
    $statement->execute () or die $TABLE->errstr;
    my %exchangrate;
    while (@row = $statement->fetchrow_array) {
    $exchangrate{$row[1]} = $row[0];
    }
    foreach $key ( sort (keys(%exchangrate))) {
    print "$key: $exchangrate{$key}\n"
    }
    [...]

    results in something like

    [...]
    2022-06-19: 0.855
    2022-06-21: 0.8601
    2022-06-22: 0.8589
    2022-06-23: 0.8582
    2022-06-29: 0.8646
    [...]

    which is non-continuous mainly because of weekends and so on.


    How do I fill up the empty key-value pairs by way of the last existing
    pair?

    Ie in this example, the value for 2022-06-20 should also be 0.855, and
    the values for 2022-06-24/5/6/7/8 should all be 0.8582.


    Even cooler would be to take the "closes value, ie 2022-06-24/5 should
    have the (same) value from 2022-06-23 and 2022-06-27/8 should get the
    ones from 2022-06-29. But that's not a requirement,

    greetings, el

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From JLM@21:1/5 to All on Sun Jul 3 19:30:26 2022
    El 3/7/22 a las 17:14, Dr Eberhard W Lisse escribió:
      foreach $key ( sort (keys(%exchangrate))) {
            print "$key: $exchangrate{$key}\n"
        }
        [...]

    results in something like

        [...]
        2022-06-19: 0.855
        2022-06-21: 0.8601
        2022-06-22: 0.8589
        2022-06-23: 0.8582
        2022-06-29: 0.8646
        [...]

    which is non-continuous mainly because of weekends and so on.


    How do I fill up the empty key-value pairs by way of the last existing
    pair?

    Ie in this example, the value for 2022-06-20 should also be 0.855, and
    the values for 2022-06-24/5/6/7/8 should all be 0.8582.


    for $key ( sort keys %exchangerate ) {
    if ($key =~ /(\d+)\-(\d+)\-(\d+)/) {
    my $date = $3;
    if ($date <= 30){
    my $c = 1
    until (exists $exchagerate{"$1-$2-".$date+$c}){ $exchangerate{"$1-$2-".$date+$c} =$exchangerate{$key};
    $c++;
    }
    }
    }
    }

    Try this. Untested.

    --
    http://gamo.sdf-eu.org/index.html # ?
    World peace, no less!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From JLM@21:1/5 to All on Tue Jul 5 07:12:47 2022
    El 3/7/22 a las 19:30, JLM escribió:
    El 3/7/22 a las 17:14, Dr Eberhard W Lisse escribió:
       foreach $key ( sort (keys(%exchangrate))) {
             print "$key: $exchangrate{$key}\n"
         }
         [...]

    results in something like

         [...]
         2022-06-19: 0.855
         2022-06-21: 0.8601
         2022-06-22: 0.8589
         2022-06-23: 0.8582
         2022-06-29: 0.8646
         [...]

    which is non-continuous mainly because of weekends and so on.


    How do I fill up the empty key-value pairs by way of the last existing
    pair?

    Ie in this example, the value for 2022-06-20 should also be 0.855, and
    the values for 2022-06-24/5/6/7/8 should all be 0.8582.


    for $key ( sort keys %exchangerate ) {
          if ($key =~ /(\d+)\-(\d+)\-(\d+)/) {
              my $date = $3;
              if ($date <= 30){
                my $c = 1

    $c = spritf "%02d", $date+$c; # let's try to fix things

                until  (exists $exchagerate{"$1-$2-$c"}){
    $exchangerate{"$1-$2-$c"} =$exchangerate{$key};
                    $c++;

    $c = sprintf "%02d", $c; #

                }
               }
          }
    }

    Try this. Untested.


    Still untested. Performance issues I suppose are not a case.

    --
    http://gamo.sdf-eu.org/index.html # ?
    World peace, no less!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From JLM@21:1/5 to All on Tue Jul 5 13:22:10 2022
    El 5/7/22 a las 7:12, JLM escribió:
    for $key ( sort keys %exchangerate ) {
           if ($key =~ /(\d+)\-(\d+)\-(\d+)/) {
               my $date = $3;
               if ($date <= 30){
                 my $c = 1

    $c = spritf "%02d", $date+$c;   # let's try to fix things

                  until  (exists $exchagerate{"$1-$2-$c"}){
             $exchangerate{"$1-$2-$c"} =$exchangerate{$key};
                     $c++;

    last if ($c > 31);


    $c = sprintf "%02d", $c;       #

                 }
                }
           }
    }

    Try this. Untested.



    A question arises here. For how many days have sense to have
    no data? Is it possible to happen in the edge of one month
    to another? Then, similar arrangements should be done to the
    number of month, and so on.

    Best.

    --
    http://gamo.sdf-eu.org/index.html # ?
    World peace, no less!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bob Nichols@21:1/5 to All on Tue Jul 5 14:00:42 2022
    #!/usr/bin/perl
    use strict;
    use warnings;
    use POSIX qw/mktime strftime/;

    # This is just to get some values into %Rate for the demo:
    # You should populate %Rate properly with your database
    # selection code (as %exchangrate in your post).
    my %Rate = (
    '2022-06-19' => 0.855,
    '2022-06-21' => 0.8601,
    '2022-06-22' => 0.8589,
    '2022-06-23' => 0.8582,
    '2022-06-29' => 0.8646,
    );
    my @Dates = sort keys %Rate;

    sub timestamp {
    # Convert an ISO8601 date string into a UNIX timestamp
    $_[0] =~ /^(\d\d\d\d)-?(\d\d)-?(\d\d)$/a;
    return mktime(0, 0, 12, $3+0, $2-1, $1-1900);
    }

    sub iso8601 {
    # Convert a UNIX timestamp into an ISO8601 date string
    return strftime('%F', localtime($_[0]));
    }

    sub previous_date {
    my $date = $_[0];
    return $date if exists $Rate{$date};
    return (grep $_ lt $date, @Dates)[-1];
    }

    sub nearest_date {
    my $date = $_[0];
    return $date if exists $Rate{$date};
    my $prev_date = (grep $_ lt $date, @Dates)[-1];
    my $next_date = (grep $_ gt $date, @Dates)[0];
    my $prev_gap = timestamp($date) - timestamp($prev_date);
    my $next_gap = timestamp($next_date) - timestamp($date);
    return $prev_gap <= $next_gap ? $prev_date : $next_date;
    }

    # Now we have all we need to add missing values to %Rate.

    printf "Original %%Rate contains %d values:\n",
    scalar keys %Rate;
    printf "$_\t$Rate{$_}\n" foreach sort keys %Rate;

    # Iterate through every date between the first value and
    # the last value in @Dates:
    for (
    my $date = $Dates[0];
    $date le $Dates[-1];
    $date = iso8601( timestamp($date)+86400 )
    ) {
    next if exists $Rate{$date};
    # If you want a use the previous date for a missing value:
    my $previous = previous_date($date);
    $Rate{$date} = $Rate{$previous};
    # or, if you prefer the nearest date instead:
    my $nearest = nearest_date($date);
    $Rate{$date} = $Rate{$nearest};
    }

    printf "\nFilled %%Rate contains %d values:\n",
    scalar keys %Rate;
    printf "$_\t$Rate{$_}\n" foreach sort keys %Rate;

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dr Eberhard Lisse@21:1/5 to Bob Nichols on Tue Jul 5 16:28:50 2022
    Bob

    Thanks, that looks very cool :-)-O


    On 05/07/2022 15:00, Bob Nichols wrote:
    #!/usr/bin/perl
    [...]



    JLM,

    Thanks.

    It's an exchange rate coming from

    http://www.ecb.europa.eu/stats/policy_and_exchange_rates/euro_reference_exchange_rates/html/index.en.html

    That implies it doesn't work on the weekends and sometimes my system is
    off.

    Exactness doesn't really matter for the exchange rate differences in the
    book keeping system where this goes, as they are written off once the
    financial year has closed and have no tex implications.

    Currently I look up the exchange rate for each transaction (between 2
    and 100, last week, last month up to the whole of a calendar year) with
    an SQL statement which gives me the last one before the date if the date doesn't have one.

    This is reasonably fast, but sending a single SQL statement from Perl
    and then doing that with a hash would be obviously faster and some fun
    figuring out

    Bob's solution seems to be putting me onto the right track, anyway.

    greetings, el

    On 05/07/2022 13:22, JLM wrote:
    [..]
    A question arises here. For how many days have sense to have
    no data? Is it possible to happen in the edge of one month
    to another? Then, similar arrangements should be done to the
    number of month, and so on.
    [...]
    --
    To email me replace 'nospam' with 'el'

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dr Eberhard W Lisse on Tue Jul 5 17:29:34 2022
    Dr Eberhard W Lisse <[email protected]> writes:

    [...]

    [...]
    2022-06-19: 0.855
    2022-06-21: 0.8601
    2022-06-22: 0.8589
    2022-06-23: 0.8582
    2022-06-29: 0.8646
    [...]

    which is non-continuous mainly because of weekends and so on.


    How do I fill up the empty key-value pairs by way of the last existing
    pair?

    Not overly complicated way of doing that:

    ------
    #* date generator
    #
    my @m_lens = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

    sub m_len
    {
    my ($y, $m) = @_;
    my $ml;

    $ml = $m_lens[$m - 1];
    ++$ml if $ml < 30 && !($y & 3) && !($y % 100 == 0 && $y % 400);
    return $ml;
    }

    sub date_counter
    {
    my ($y, $m, $d) = split('-', $_[0]);

    return sub {
    ++$d;

    if ($d > m_len($y, $m)) {
    $d = 1;

    if ($m < 12) {
    ++$m;
    } else {
    $m = 1;
    ++$y;
    }
    }

    return sprintf('%u-%02u-%02u', $y, $m, $d);
    }
    }

    #* sample input data
    #
    my %data = qw(2022-06-19 0.855
    2022-06-21 0.8601
    2022-06-22 0.8589
    2022-06-23 0.8582
    2022-06-29 0.8646);

    #* main
    #
    my @dates = sort(keys(%data));
    my $dc = date_counter($dates[0]);
    my ($cur_v, $n_d);

    $cur_v = $data{$dates[0]};

    for (@dates[1 .. $#dates]) {
    $data{$n_d} = $cur_v while ($n_d = $dc->()) ne $_;
    $cur_v = $data{$_};
    }

    print($_, "\t", $data{$_}, "\n") for sort(keys(%data));

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Bob Nichols on Tue Jul 5 18:33:35 2022
    Bob Nichols <[email protected]d> writes:
    # This is just to get some values into %Rate for the demo:
    # You should populate %Rate properly with your database
    # selection code (as %exchangrate in your post).
    my %Rate = (
    '2022-06-19' => 0.855,
    '2022-06-21' => 0.8601,
    '2022-06-22' => 0.8589,
    '2022-06-23' => 0.8582,
    '2022-06-29' => 0.8646,
    );
    my @Dates = sort keys %Rate;

    [...]

    sub previous_date {
    my $date = $_[0];
    return $date if exists $Rate{$date};
    return (grep $_ lt $date, @Dates)[-1];
    }


    [...]

    # Iterate through every date between the first value and
    # the last value in @Dates:
    for (
    my $date = $Dates[0];
    $date le $Dates[-1];
    $date = iso8601( timestamp($date)+86400 )
    ) {
    next if exists $Rate{$date};
    my $previous = previous_date($date);
    $Rate{$date} = $Rate{$previous};

    [...]

    }

    With doing a benchmark, the exact effects of this can't be determined
    (and they might not matter much, anyway) but this is a bad algorithm: It iterates over an array of a certain length and scans the complete array
    once for each iteration, hence, it's running time is proportional to the
    square of the array length (ie, the algorithm is of quadratic
    complexity).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dr Eberhard Lisse@21:1/5 to Rainer Weikusat on Wed Jul 6 14:18:01 2022
    Rainer,

    thank you for your code (in a separate message).

    I have modified both to produce identical output (4489 "completed' pairs
    from 3897 returned by my SQL statement and then a few identical lines
    to measure the time the actual code (without the SQL) takes

    Your's takes indeed around 6 milliseconds whereas Bob's takes some 136 milliseconds (on my iMac) without the (print(f) statement(s)).

    In practical terms (of my use case) this does not matter as you
    suspected quite correctly, because the PostgreSQL server sits on another continent and hence the execution time of the SQL statement to populate
    the hash is the limiting factor, but I will of course now use your's as
    speed was the aim of the exercise.

    Learned something here.

    Thanks, el

    On 05/07/2022 19:33, Rainer Weikusat wrote:
    Bob Nichols <[email protected]d> writes:
    [...]
    With doing a benchmark, the exact effects of this can't be determined
    (and they might not matter much, anyway) but this is a bad algorithm:
    It iterates over an array of a certain length and scans the complete
    array once for each iteration, hence, it's running time is
    proportional to the square of the array length (ie, the algorithm is
    of quadratic complexity).

    On 05/07/2022 18:29, Rainer Weikusat wrote:
    [...]
    Not overly complicated way of doing that:
    [...]

    --
    To email me replace 'nospam' with 'el'

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bo Lindbergh@21:1/5 to Dr Eberhard Lisse on Tue Jul 12 11:29:02 2022
    In article <[email protected]>,
    Dr Eberhard Lisse <[email protected]> wrote:
    Currently I look up the exchange rate for each transaction (between 2
    and 100, last week, last month up to the whole of a calendar year) with
    an SQL statement which gives me the last one before the date if the date doesn't have one.

    This is reasonably fast, but sending a single SQL statement from Perl
    and then doing that with a hash would be obviously faster and some fun figuring out

    You could use a temporary SQLite database to cache an appropriate range
    of data from the remote server. Looking up individual dates in this cache would be very fast, possibly faster than any fill-in-the-holes code written
    in Perl.


    /Bo Lindbergh

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rainer Weikusat@21:1/5 to Dr Eberhard Lisse on Tue Jul 12 19:43:58 2022
    Dr Eberhard Lisse <[email protected]> writes:
    I have modified both to produce identical output (4489 "completed' pairs
    from 3897 returned by my SQL statement and then a few identical lines
    to measure the time the actual code (without the SQL) takes

    Your's takes indeed around 6 milliseconds whereas Bob's takes some 136 milliseconds (on my iMac) without the (print(f) statement(s)).

    That's interesting. I actually expected the opposite result. While I was
    using a linear algorithm, the implementation (with a date enumeration
    closure) wasn't exactly low-overhead and did a lot of stuff in
    Perl. This usually means that the theoretical scalability advantage
    doesn't turn into a pratical advantage unless the amount of input data
    is insanely large.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dr Eberhard W Lisse@21:1/5 to Rainer Weikusat on Fri Jul 15 08:49:22 2022
    Indeed, the limiting factor is the SQL Query (the server is on another continent). And at the most 365 rates per annum doesn't bother
    PostgreSQL nor Perl :-)-O

    Hmm maybe I should measure on my M1 :-)-O

    el

    On 2022-07-12 20:43 , Rainer Weikusat wrote:
    Dr Eberhard Lisse <[email protected]> writes:
    I have modified both to produce identical output (4489 "completed'
    pairs from 3897 returned by my SQL statement and then a few identical
    lines to measure the time the actual code (without the SQL) takes

    Your's takes indeed around 6 milliseconds whereas Bob's takes some
    136 milliseconds (on my iMac) without the (print(f) statement(s)).

    That's interesting. I actually expected the opposite result. While I
    was using a linear algorithm, the implementation (with a date
    enumeration closure) wasn't exactly low-overhead and did a lot of
    stuff in Perl. This usually means that the theoretical scalability
    advantage doesn't turn into a practical advantage unless the amount of
    input data is insanely large.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dr Eberhard Lisse@21:1/5 to Bo Lindbergh on Fri Jul 15 11:01:13 2022
    Thanks, but besides being more complicated I really doubt that creating
    a (temp) SQLite3 table, writing to it and then reading from it before
    even doing any comparison will be faster that Rainer's fill-in-the-hole.

    el


    On 12/07/2022 11:29, Bo Lindbergh wrote:
    In article <[email protected]>, Dr Eberhard Lisse <[email protected]> wrote:

    Currently I look up the exchange rate for each transaction (between 2
    and 100, last week, last month up to the whole of a calendar year)
    with an SQL statement which gives me the last one before the date if
    the date doesn't have one.

    This is reasonably fast, but sending a single SQL statement from Perl
    and then doing that with a hash would be obviously faster and some
    fun figuring out

    You could use a temporary SQLite database to cache an appropriate
    range of data from the remote server. Looking up individual dates in
    this cache would be very fast, possibly faster than any
    fill-in-the-holes code written in Perl.


    /Bo Lindbergh


    --
    To email me replace 'nospam' with 'el'

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)