by KMag 3729 days ago
No blog entries specifically about LimeWire, but I do have a few observations that maybe I'll blog about:

    (1) Merkle trees are tough to get right
      (a) BitTorrent's BEP 30 is vulnerable
      (b) A small tweak would have allowed Gnutella's THEX to carry a proof of file length [0]
      (c) Use the Sakura tree construction [0]
      (d) There was an attack against LW where one could respond quickly with a bogus THEX root for a popular SHA-1
      (e) The THEX root should have been the unique identifier in both DHT and query responses
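
To illustrate why (1) is tricky, here is a minimal sketch of one classic pitfall: hashing leaves and internal nodes the same way. It is not a claim about what exactly is wrong with BEP 30, and it does not show the length-proof tweak from [0]. Assumptions: SHA-256 stands in for THEX's Tiger hash (not reliably available in Python's hashlib), the 0x00 leaf / 0x01 node prefixes follow the THEX convention, and `merkle_root` is a hypothetical helper name.

```python
import hashlib


def h(data: bytes) -> bytes:
    # SHA-256 is a stand-in; THEX itself uses Tiger.
    return hashlib.sha256(data).digest()


def merkle_root(blocks, leaf_prefix=b"", node_prefix=b""):
    """Root of a binary hash tree; an unpaired node is promoted unchanged."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [h(leaf_prefix + block) for block in blocks]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(h(node_prefix + level[i] + level[i + 1]))
        if len(level) % 2:
            nxt.append(level[-1])  # odd node moves up as-is
        level = nxt
    return level[0]


a, b = b"block-a", b"block-b"

# Without prefixes, a one-block "file" made of the two child hashes
# collides with the real two-block file.
forged = h(a) + h(b)
naive_collision = merkle_root([a, b]) == merkle_root([forged])

# With distinct leaf/node prefixes the same trick no longer works.
tagged_real = merkle_root([a, b], b"\x00", b"\x01")
tagged_forged = merkle_root([h(b"\x00" + a) + h(b"\x00" + b)], b"\x00", b"\x01")
tagged_collision = tagged_real == tagged_forged
```

Here `naive_collision` comes out true and `tagged_collision` false, which is the whole point of the prefixes.
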
    (2) Using HTTP for data transfer was definitely the right choice
      (a) It uses X-alts and X-nalts "experimental" HTTP headers for swarm control
      (b) I prototyped an Apache plugin to allow it to transparently participate in Gnutella swarms
      (c) HTTP/2 would be ideal now
    (3) Gnutella uses query broadcast
      (a) exponential fan-out means most traffic is in the last hop
      (b) if the fan-out is 19:1, about 95% (18/19) of traffic is in the last hop
      (c) LW used Bloom filters to often skip the last hop
      (d) We should have used multiple hash functions in the Bloom filter
      (e) Adding new hash functions is backward-compatible, at the cost of increased query traffic during transition
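
A rough sketch of the arithmetic in (3b) and of one possible reading of (3e). Assumptions: every node forwards to `fanout` fresh peers, the hop count of 4 is arbitrary, and the Bloom filter below uses salted SHA-256, which is not LimeWire's actual query-routing hash. The backward-compatibility reading (new nodes set bits for all k hashes, old nodes only test the first one) is my interpretation, not something stated above.

```python
import hashlib


def last_hop_fraction(fanout: int, hops: int) -> float:
    """Share of all query messages that are sent on the final hop."""
    per_hop = [fanout ** k for k in range(1, hops + 1)]
    return per_hop[-1] / sum(per_hop)


class BloomFilter:
    def __init__(self, size_bits: int, num_hashes: int):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, word: str, k: int):
        # Hash i is SHA-256 over "i:word"; hash 0 plays the "legacy" hash.
        for i in range(k):
            digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, word: str) -> None:
        for pos in self._positions(word, self.k):
            self.bits |= 1 << pos

    def might_contain(self, word: str, k=None) -> bool:
        # k lets an "old" peer test only the first hash function.
        k = self.k if k is None else k
        return all((self.bits >> pos) & 1 for pos in self._positions(word, k))


fraction = last_hop_fraction(19, 4)  # close to 18/19

new_style = BloomFilter(size_bits=1024, num_hashes=3)
for word in ("limewire", "gnutella", "thex"):
    new_style.add(word)

# A legacy checker that only knows hash 0 still gets no false negatives;
# the price is a denser filter, hence more false positives and traffic.
legacy_ok = new_style.might_contain("limewire", k=1)
```
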
    (4) LW connection handshake includes the 32-bit serial number of the latest XML version message
      (a) The message is signed using DSA
      (b) Newly signed XML messages propagate to 95% of the network within 60 seconds
      (c) We accidentally DDoSed our servers by having everyone come for updates at the same time
      (d) So we added user alert time randomization parameters in the XML message
      (e) There was no mechanism to roll over or expand version message serial numbers.
      (f) We could have locked ourselves out of asking users to upgrade by signing an INT_MAX serial XML message.
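
The serial-number lock-out in (4f) and the randomization in (4d) can be sketched as follows. This is only an illustration: it assumes the serial is compared as a signed 32-bit integer (hence INT_MAX = 2**31 - 1), that the DSA signature has already been verified elsewhere, and that the randomization parameter is a single window length. `VersionMessage`, `alert_window_seconds`, `should_accept`, and `alert_delay` are all made-up names.

```python
import random
from dataclasses import dataclass

INT_MAX = 2**31 - 1  # assumption: serial held in a signed 32-bit int


@dataclass
class VersionMessage:
    serial: int
    alert_window_seconds: int  # hypothetical randomization parameter


def should_accept(current_serial: int, msg: VersionMessage) -> bool:
    # Only strictly newer messages replace the one we already hold.
    return msg.serial > current_serial


def alert_delay(msg: VersionMessage, rng: random.Random) -> float:
    # Each client waits a random time so they do not all hit the
    # update servers at the same moment.
    return rng.uniform(0, msg.alert_window_seconds)


normal_update = should_accept(41, VersionMessage(42, 3600))

# Once a client holds an INT_MAX message, no representable serial is newer.
locked_out = not should_accept(INT_MAX, VersionMessage(INT_MAX, 3600))

rng = random.Random(0)
msg = VersionMessage(42, 3600)
delays = [alert_delay(msg, rng) for _ in range(1000)]
```
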
    (5) We wrote a minimal C++ agent capable of downloading the latest free LW version from LW nodes
      (a) SHA-1 of the free installer is part of the signed XML version message above
      (b) SHA-1 was checked before running the full installer, preventing malware injection
      (c) It was great for saving bandwidth and reducing legacy support
    (6) I misplaced a paren in LimeWire QueryKey crypto code (later fixed)
      (a) QueryKeys prevent turning the LW network into a DDoS botnet
      (b) I knew the code wasn't behaving quite right
      (c) I convinced myself that my reasoning was wrong and the code must be right
    (7) Random seeks are tough on equipment
      (a) Apache would kernel-panic OS X on random HTTP range requests (ca. 2006)
      (b) Anecdotally, random block download order wasn't great for hard drive life
      (c) Random download order code tried to minimize number of file extents
         (i) Saves bandwidth in describing what you have
         (ii) Might be better for hard drive life 
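
One way the "random order, but few extents" idea could look, as a sketch only. This is not LimeWire's actual selection code; the `max_extents` cap and both function names are hypothetical. The idea: pick any missing block at random while the number of extents is under the cap, otherwise only pick blocks that touch an existing extent, which can never increase the count.

```python
import random


def count_extents(have) -> int:
    """Number of maximal runs of downloaded blocks."""
    return sum(
        1 for i, got in enumerate(have) if got and (i == 0 or not have[i - 1])
    )


def pick_next_block(have, max_extents: int, rng: random.Random):
    missing = [i for i, got in enumerate(have) if not got]
    if not missing:
        return None  # download complete
    if count_extents(have) >= max_extents:
        # At the cap: only consider blocks next to something we already have.
        adjacent = [
            i
            for i in missing
            if (i > 0 and have[i - 1]) or (i + 1 < len(have) and have[i + 1])
        ]
        if adjacent:
            missing = adjacent
    return rng.choice(missing)


rng = random.Random(1)
have = [False] * 64
peak_extents = 0
while True:
    nxt = pick_next_block(have, max_extents=4, rng=rng)
    if nxt is None:
        break
    have[nxt] = True
    peak_extents = max(peak_extents, count_extents(have))
```

With this rule the download still starts at random offsets, but describing what you have never takes more than `max_extents` ranges.
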
[0] http://kmagsoftware.blogspot.hk/2016/02/on-content-addressed...