• Re: Can you help me with this memoization simple example?

    From MRAB@21:1/5 to marc nicole via Python-list on Sun Mar 31 00:39:09 2024
    On 2024-03-31 00:09, marc nicole via Python-list wrote:
    I am creating a memoization example with a function that adds up / averages the elements of an array and compares it with the cached ones to retrieve them in case they are already stored.

    In addition, I want to store only if the result of the function differs considerably (passes a threshold e.g. 500000 below).

    I created an example using a decorator to do so, the results using the decorator is slightly faster than without the memoization which is OK, but
    is the logic of the decorator correct ? anybody can tell me ?

    My code is attached below:



    import time


    def memoize(f):
    cache = {}

    def g(*args):
    if args[1] == "avg":
    sum_key_arr = sum(list(args[0])) / len(list(args[0]))

    'list' will iterate over args[0] to make a list, and 'sum' will iterate
    over that list.

    It would be simpler to just let 'sum' iterate over args[0].

    elif args[1] == "sum":
    sum_key_arr = sum(list(args[0]))
    if sum_key_arr not in cache:
    for (
    key,
    value,
    ) in (
    cache.items()
    ): # key in dict cannot be an array so I use the sum of the array as the key

    You can't use a list as a key, but you can use a tuple as a key,
    provided that the elements of the tuple are also immutable.

    if (
    abs(sum_key_arr - key) <= 500000
    ): # threshold is great here so that all values are approximated!
    # print('approximated')
    return cache[key]
    else:
    # print('not approximated')
    cache[sum_key_arr] = f(args[0], args[1])
    return cache[sum_key_arr]

    return g


    @memoize
    def aggregate(dict_list_arr, operation):
    if operation == "avg":
    return sum(list(dict_list_arr)) / len(list(dict_list_arr))
    if operation == "sum":
    return sum(list(dict_list_arr))
    return None


    t = time.time()
    for i in range(200, 15000):
    res = aggregate(list(range(i)), "avg")

    elapsed = time.time() - t
    print(res)
    print(elapsed)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MRAB@21:1/5 to marc nicole on Sun Mar 31 20:23:50 2024
    On 2024-03-31 09:04, marc nicole wrote:
    Thanks for the first comment which I incorporated

    but when you say "You can't use a list as a key, but you can use a
    tuple as a key,
    provided that the elements of the tuple are also immutable."

    does it mean  the result of sum of the array is not convenient to use
    as key as I do?
    Which tuple I should use to refer to the underlying list value as you suggest?

    I was suggesting using `tuple` on the argument:

    def memoize(f):
         cache = {}

         def g(*args):
             key = tuple(args[0]), args[1]

             if key not in cache:
                 cache[key] = f(args[0], args[1])

             return cache[key]

         return g

    Anything else is good in my code ?

    Thanks

    Le dim. 31 mars 2024 à 01:44, MRAB via Python-list <[email protected]> a écrit :

    On 2024-03-31 00:09, marc nicole via Python-list wrote:
    > I am creating a memoization example with a function that adds up
    / averages
    > the elements of an array and compares it with the cached ones to
    retrieve
    > them in case they are already stored.
    >
    > In addition, I want to store only if the result of the function
    differs
    > considerably (passes a threshold e.g. 500000 below).
    >
    > I created an example using a decorator to do so, the results
    using the
    > decorator is slightly faster than without the memoization which
    is OK, but
    > is the logic of the decorator correct ? anybody can tell me ?
    >
    > My code is attached below:
    >
    >
    >
    > import time
    >
    >
    > def memoize(f):
    >      cache = {}
    >
    >      def g(*args):
    >          if args[1] == "avg":
    >              sum_key_arr = sum(list(args[0])) / len(list(args[0]))

    'list' will iterate over args[0] to make a list, and 'sum' will
    iterate
    over that list.

    It would be simpler to just let 'sum' iterate over args[0].

    >          elif args[1] == "sum":
    >              sum_key_arr = sum(list(args[0]))
    >          if sum_key_arr not in cache:
    >              for (
    >                  key,
    >                  value,
    >              ) in (
    >                  cache.items()
    >              ):  # key in dict cannot be an array so I use the
    sum of the
    > array as the key

    You can't use a list as a key, but you can use a tuple as a key,
    provided that the elements of the tuple are also immutable.

    >                  if (
    >                      abs(sum_key_arr - key) <= 500000
    >                  ):  # threshold is great here so that all
    values are
    > approximated!
    >                      # print('approximated')
    >                      return cache[key]
    >              else:
    >                  # print('not approximated')
    >                  cache[sum_key_arr] = f(args[0], args[1])
    >          return cache[sum_key_arr]
    >
    >      return g
    >
    >
    > @memoize
    > def aggregate(dict_list_arr, operation):
    >      if operation == "avg":
    >          return sum(list(dict_list_arr)) / len(list(dict_list_arr))
    >      if operation == "sum":
    >          return sum(list(dict_list_arr))
    >      return None
    >
    >
    > t = time.time()
    > for i in range(200, 15000):
    >      res = aggregate(list(range(i)), "avg")
    >
    > elapsed = time.time() - t
    > print(res)
    > print(elapsed)


    --
    https://mail.python.org/mailman/listinfo/python-list


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)