by Twirrim 3216 days ago
Considering everything involved in making a request over the internet, multithreading would have to be spectacularly slow for the serial approach to come anywhere close to being quicker:

  $ python quicktest.py 
  ['http://www.google.com', 'http://news.bbc.co.uk', 'http://news.ycombinator.com', 'http://www.cnn.com', 'http://www.foxnews.com', 'http://www.msnbc.com']
  [<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
  Serial: 1.23853206635
  [<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
  Multiprocess: 0.912357807159
  [<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
  Multithreaded: 0.708998918533
edit: Here's the code:

  import requests
  import time
  from multiprocessing import Pool
  # Thread-backed pool with the same API as multiprocessing.Pool
  from multiprocessing.dummy import Pool as ThreadPool
  
  
  session = requests.Session()
  
  urllist = ['http://www.google.com',
             'http://news.bbc.co.uk',
             'http://news.ycombinator.com',
             'http://www.cnn.com',
             'http://www.foxnews.com',
             'http://www.msnbc.com']
  # Warm-up pass: prime DNS caches and the session's connections before timing
  responses = []
  for url in urllist:
      responses.append(session.get(url))
  
  print(urllist)
  
  start = time.time()
  responses = []
  
  for url in urllist:
      responses.append(session.get(url))
  
  print(responses)
  print("Serial: {}".format(time.time()-start))
  
  start = time.time()
  
  pool = Pool()
  responses = pool.map(requests.get, urllist)
  
  print(responses)
  print("Multiprocess: {}".format(time.time()-start))
  
  start = time.time()
  pool = ThreadPool()
  responses = pool.map(requests.get, urllist)
  
  print(responses)
  print("Multithreaded: {}".format(time.time()-start))
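
For anyone who wants to try the same comparison without hitting real sites, here's a rough sketch using concurrent.futures from the Python 3 standard library. To keep it runnable offline, fake_fetch is a made-up stand-in that just sleeps for 50 ms instead of calling requests.get, the timed helper is my own naming, and the example.com URLs are placeholders. The numbers only illustrate the shape of the result, not real network timings.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.05):
    """Hypothetical stand-in for requests.get: sleep to mimic network latency."""
    time.sleep(delay)
    return url


def timed(label, fn):
    """Run fn(), print how long it took, and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print("{}: {:.3f}s".format(label, elapsed))
    return result, elapsed


# Placeholder URLs; nothing is actually fetched.
urls = ["http://example.com/{}".format(i) for i in range(6)]

# One request after another: total time is roughly the sum of the delays.
serial_result, serial_time = timed(
    "Serial", lambda: [fake_fetch(u) for u in urls]
)

# One worker thread per URL: the waits overlap, so total time is roughly
# one delay plus thread overhead.
with ThreadPoolExecutor(max_workers=len(urls)) as executor:
    threaded_result, threaded_time = timed(
        "Threaded", lambda: list(executor.map(fake_fetch, urls))
    )
```

Since the work here is pure waiting, the threaded run should finish in roughly the time of a single request, which is the same effect the real benchmark shows once network latency dominates.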