Considering everything that is involved in making a request over the internet, multithreading would have to be spectacularly slow for the serial approach to even come close to being quicker:
$ python quicktest.py
['http://www.google.com', 'http://news.bbc.co.uk', 'http://news.ycombinator.com', 'http://www.cnn.com', 'http://www.foxnews.com', 'http://www.msnbc.com']
[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
Serial: 1.23853206635
[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
Multiprocess: 0.912357807159
[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
Multithreaded: 0.708998918533
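To put those numbers in perspective, here is a quick sketch that turns the three timings into speedup ratios relative to the serial run. The values are copied from the output above; the dictionary and variable names are just for illustration, and a single run like this is noisy, so treat the ratios as rough.

```python
# Timings (in seconds) copied from the single run shown above.
timings = {
    "Serial": 1.23853206635,
    "Multiprocess": 0.912357807159,
    "Multithreaded": 0.708998918533,
}

# Speedup = serial time divided by the time of each approach.
speedups = {name: timings["Serial"] / t for name, t in timings.items()}

for name in ("Multiprocess", "Multithreaded"):
    print("{}: {:.2f}x faster than serial".format(name, speedups[name]))
```

For this particular run that works out to roughly 1.36x for the process pool and 1.75x for the thread pool.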
edit: Here's the code:
import requests
import time
from multiprocessing import Pool
# multiprocessing.dummy exposes the same Pool API, but backed by threads
from multiprocessing.dummy import Pool as ThreadPool
session = requests.Session()
urllist = ['http://www.google.com',
'http://news.bbc.co.uk',
'http://news.ycombinator.com',
'http://www.cnn.com',
'http://www.foxnews.com',
'http://www.msnbc.com']
# Warm up?
responses = []
for url in urllist:
    responses.append(session.get(url))
print(urllist)
start = time.time()
responses = []
for url in urllist:
    responses.append(session.get(url))
print(responses)
print("Serial: {}".format(time.time()-start))
start = time.time()
pool = Pool()
responses = pool.map(requests.get, urllist)
print(responses)
print("Multiprocess: {}".format(time.time()-start))
start = time.time()
pool = ThreadPool()
responses = pool.map(requests.get, urllist)
print(responses)
print("Multithreaded: {}".format(time.time()-start))
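If you want to see the same effect without hitting the network, here is a minimal sketch under some assumptions: `fake_fetch` is a hypothetical stand-in for `requests.get` that just sleeps for a fixed latency, the `example.com` URLs are placeholders, and the 0.1 second latency and 6 workers are arbitrary choices. It is not the benchmark above, only an illustration of why waiting-bound work overlaps well in a thread pool.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed Pool


def fake_fetch(url, latency=0.1):
    """Hypothetical stand-in for requests.get: sleep to imitate network latency."""
    time.sleep(latency)
    return "fetched " + url


def time_serial(urls):
    """Fetch the URLs one after another and return (results, elapsed seconds)."""
    start = time.time()
    results = [fake_fetch(url) for url in urls]
    return results, time.time() - start


def time_threaded(urls, workers=6):
    """Fetch the URLs through a thread pool; pool start-up is included in the timing."""
    start = time.time()
    pool = ThreadPool(workers)
    try:
        results = pool.map(fake_fetch, urls)  # map keeps the input order
        elapsed = time.time() - start
    finally:
        pool.close()
        pool.join()
    return results, elapsed


# Placeholder URLs; nothing is actually requested.
urls = ["http://example.com/{}".format(i) for i in range(6)]

serial_results, serial_time = time_serial(urls)
threaded_results, threaded_time = time_threaded(urls)

print("Serial: {:.3f}".format(serial_time))
print("Multithreaded: {:.3f}".format(threaded_time))
```

Because each simulated request spends its time waiting rather than computing, the threaded run should take roughly one latency period while the serial run takes about six. Real requests add DNS lookups, TLS handshakes and server variance, so actual ratios will differ.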