by FreakLegion 1006 days ago
The point of the article is a handful of lines. The rest is accoutrement like the URL list and timing code. But sure, if

    tasks = {}
    for url in URLs:
        future = executor.submit(fetch_url, url)
        tasks[future] = url
bothers you, this is perfectly Pythonic (some would say more so than the original):

    tasks = {executor.submit(fetch_url, url): url for url in URLs}
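For anyone who wants to run it, here is a minimal self-contained sketch of how that mapping is then consumed with `as_completed`. The `URLs` list and the `fetch_url` body are hypothetical stand-ins (no network I/O), not the article's actual code, and `collect_titles` is a name made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical placeholders, not the article's actual URL list.
URLs = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]


def fetch_url(url):
    # Stand-in for the real fetch: derive a fake "title" from the URL.
    return "title of " + url.rsplit("/", 1)[-1]


def collect_titles(urls):
    titles = {}
    with ThreadPoolExecutor() as executor:
        # Map each future back to the URL that produced it.
        tasks = {executor.submit(fetch_url, url): url for url in urls}
        # as_completed yields futures as they finish, in no guaranteed order.
        for future in as_completed(tasks):
            titles[tasks[future]] = future.result()
    return titles


if __name__ == "__main__":
    for url, title in collect_titles(URLs).items():
        print(f"URL: {url}\nTitle: {title}")
```

The future-to-URL dict is what lets the loop recover which URL a finished future belongs to, since `as_completed` hands back futures in completion order rather than submission order.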
2 comments

I have found another way in the documentation for `concurrent.futures`. You can use `Executor.map` (https://docs.python.org/3/library/concurrent.futures.html#co...). It eliminates the need to wait on the futures explicitly.

  def main():
      with ThreadPoolExecutor(max_workers=len(URLs)) as executor:
          for url, title in zip(URLs, executor.map(fetch_url, URLs)):
              print(f"URL: {url}\nTitle: {title}")
The default value of `max_workers` since Python 3.8 has been

  min(32, os.cpu_count() + 4)
You should probably avoid

  max_workers=len(items_to_process)
It will not save memory or CPU time when you have few items (workers are created as necessary) and may waste memory when you have many.
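A rough way to see the "workers are created as necessary" part for yourself: the sketch below counts the pool's live threads by name after running a few trivial tasks. `count_pool_threads` and the `demo-pool` prefix are names invented for this example; it relies only on `thread_name_prefix` and `threading.enumerate()`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_pool_threads(n_tasks, max_workers):
    """Run n_tasks trivial jobs and return how many pool threads were started."""
    prefix = "demo-pool"  # hypothetical prefix, used only to find our threads
    with ThreadPoolExecutor(
        max_workers=max_workers, thread_name_prefix=prefix
    ) as executor:
        # list() drains the iterator so every task has finished.
        list(executor.map(lambda x: x, range(n_tasks)))
        # Worker threads stay alive until shutdown, so count them now.
        return sum(
            t.name.startswith(prefix) for t in threading.enumerate()
        )


if __name__ == "__main__":
    # A huge max_workers does not pre-spawn threads for 3 tasks.
    print(count_pool_threads(n_tasks=3, max_workers=1000))
    # With many tasks, max_workers is the cap on threads started.
    print(count_pool_threads(n_tasks=50, max_workers=4))
```

The exact counts can vary between runs and Python versions, but the first never exceeds the number of tasks and the second never exceeds `max_workers`. With `max_workers=len(items_to_process)` and thousands of slow items, that cap is effectively gone.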
As a side note, using a future as a map key struck me as a bit weird, though perfectly valid. It'd be more natural IMO to use a list for the futures and have the fetch_url function return a (url, result) tuple. Or use the URL as the map key and just iterate over the map items instead of using as_completed on the keys.
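Roughly what those two alternatives could look like, as a sketch only. `fetch_url` is again a hypothetical stand-in with no network I/O, and `fetch_url_pair`, `titles_from_list`, and `titles_from_url_keys` are names made up here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical placeholders, not the article's actual URL list.
URLs = ["https://example.com/a", "https://example.com/b"]


def fetch_url(url):
    # Stand-in for the real fetch: derive a fake "title" from the URL.
    return "title of " + url.rsplit("/", 1)[-1]


def fetch_url_pair(url):
    # Variant that carries the URL along with the result.
    return url, fetch_url(url)


def titles_from_list(urls):
    # Alternative 1: plain list of futures, each resolving to (url, title).
    with ThreadPoolExecutor() as executor:
        futures = [executor.submit(fetch_url_pair, url) for url in urls]
        return dict(future.result() for future in as_completed(futures))


def titles_from_url_keys(urls):
    # Alternative 2: URL as the key, future as the value.
    with ThreadPoolExecutor() as executor:
        tasks = {url: executor.submit(fetch_url, url) for url in urls}
        # result() blocks on each future in submission order.
        return {url: future.result() for url, future in tasks.items()}


if __name__ == "__main__":
    print(titles_from_list(URLs))
    print(titles_from_url_keys(URLs))
```

One trade-off: the second version handles results in submission order, so a slow first URL delays everything behind it, whereas `as_completed` lets you act on whichever finishes first.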