|
Former LimeWire engineer here. I implemented randomized chunk selection for swarmed downloads in LimeWire. I think the proper swarmed streaming solution is to make the percentage chance of requesting the rarest chunk be a smooth function of the number of replicas of the rarest chunk. If the rarest chunk has only one source, the probability should be 1.0. If there are multiple chunks that are all the most rare, you probably want to randomly select which chunk to request next, with an approximately exponential distribution rather than uniform distribution. Y probability of selecting the first chunk, Y^2 for the second, Y^3 for the third, etc. You'd want to run some simulations to fine-tune the probability function and also the Y percentage. What I did for LimeWire was (1) if the MIME type wasn't on a streamable whitelist, download all chunks in randomized order. (2) if the file was a streming type and less than 10% complete, start downloading the available chunk closest to the front of the file that isn't currently in progress (3) if the file is a streamable type and 10% to N% complete, randomly select either in-order or randomized selection with probability X. (4) beyond N% complete, always use random chunk selection. I'm pretty sure N was 50 and nearly certain X was 0.5. I originally proposed making X a smooth function of the % downloaded instead of 0, 0.5. 1.0 stair-steps, but the lead developer strongly preferred stair-steps. The random selection algorithm actually tried to keep the number of ranges of bytes (extents) below 5. So even after 50% downloaded, you still have a 25% chance of getting in-order downloading. The reason I used randomization instead of rarest-first was that it was my first change to LimeWire, and this was the least invasive change to make. At that time, LimeWire had a global list of verified downloaded chunks and a global list of in-progress chunks, but no global counter for number of replicas. |
I vaguely seem to remember WMV and ASF also needing some information from footers in the file, and therefor also prioritizing the last MB of the file.
This was all implemented using the Strategy object oriented design pattern, to make it easier to play around with many alternatives and make specialized strategies for specific MIME types.