by vitus 908 days ago
tcpdump only uses BPF, not eBPF. BPF is a simpler language that, among other things, is guaranteed to run in finite time because it doesn't have backward jumps, and has limitations on program size (4096 instructions). (The "e" in eBPF stands for "extended", as it extends BPF to remove those limitations, among other changes.)

It compiles your filter expression into a series of instructions, using libpcap. For instance, the output of `tcpdump -d -y EN10MB 'ip and tcp port 80'` (which is rather similar to the use case in the OP, but not identical, since it doesn't strip headers), on my machine, is:

    # Load the 2-byte ethernet protocol into A (the accumulator register).
    (000) ldh      [12]
    # If IPv4, continue to line 2. Else, jump to line 12.
    (001) jeq      #0x800           jt 2 jf 12
    # Load the one-byte IP protocol into A.
    (002) ldb      [23]
    # If TCP, continue. Else, jump to line 12.
    (003) jeq      #0x6             jt 4 jf 12
    # Load the 2 bytes corresponding to IP flags / fragment offset into A.
    (004) ldh      [20]
    # If the fragment offset is nonzero, jump to line 12. Else, continue.
    (005) jset     #0x1fff          jt 12 jf 6
    # Load the IPv4 header length in bytes into X. (The IHL field is the bottom
    # 4 bits of the first byte of the IPv4 header and counts 4-byte words, hence
    # the multiplication by 4.)
    (006) ldxb     4*([14]&0xf)
    # Load the source port into A.
    (007) ldh      [x + 14]
    # If 80, jump to 11. Else, continue.
    (008) jeq      #0x50            jt 11 jf 9
    # Load the dest port into A.
    (009) ldh      [x + 16]
    # If 80, jump to 11. Else, jump to 12.
    (010) jeq      #0x50            jt 11 jf 12
    # Accept. Return 262144 bytes, the default snaplen.
    (011) ret      #262144
    # Reject. Literally, return 0 bytes.
    (012) ret      #0
If you know how to read assembly, it should be fairly straightforward to follow a typical program (you'll need various protocol header wire formats handy if you haven't memorized the offsets).
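
For anyone who would rather read Python than BPF assembly, here is a rough sketch of the same decision logic, instruction by instruction. The names `run_filter` and `make_frame` are made up for illustration (they are not part of libpcap or tcpdump), and the frame builder only fills in the fields the filter actually reads.

```python
import struct

SNAPLEN = 262144  # the value returned by instruction (011) in the listing


def run_filter(frame: bytes) -> int:
    """Mirror the listing: return SNAPLEN to accept, 0 to reject."""
    try:
        # (000)-(001): the EtherType at offset 12 must be IPv4 (0x0800).
        (ethertype,) = struct.unpack_from("!H", frame, 12)
        if ethertype != 0x0800:
            return 0
        # (002)-(003): the IP protocol byte at offset 23 must be TCP (6).
        if frame[23] != 6:
            return 0
        # (004)-(005): reject if any of the 13 fragment-offset bits is set.
        (flags_frag,) = struct.unpack_from("!H", frame, 20)
        if flags_frag & 0x1FFF:
            return 0
        # (006): X = IPv4 header length in bytes (IHL counts 4-byte words).
        x = 4 * (frame[14] & 0x0F)
        # (007)-(010): accept if either TCP port is 80.
        src_port, dst_port = struct.unpack_from("!HH", frame, x + 14)
        return SNAPLEN if 80 in (src_port, dst_port) else 0
    except (struct.error, IndexError):
        # A load past the end of the packet makes a classic BPF filter return 0.
        return 0


def make_frame(src_port, dst_port, *, ethertype=0x0800, proto=6,
               frag_off=0, ihl=5):
    """Hypothetical helper: build a minimal Ethernet/IPv4/TCP frame for testing."""
    eth = bytes(12) + struct.pack("!H", ethertype)  # two MACs, then EtherType
    ip = bytearray(ihl * 4)
    ip[0] = (4 << 4) | ihl                          # version 4, header length
    struct.pack_into("!H", ip, 6, frag_off)         # flags / fragment offset
    ip[9] = proto                                   # IP protocol
    tcp = struct.pack("!HH", src_port, dst_port) + bytes(16)
    return eth + bytes(ip) + tcp
```

With these definitions, `run_filter(make_frame(12345, 80))` accepts the frame, while `run_filter(make_frame(12345, 443))` and `run_filter(make_frame(12345, 80, proto=17))` both return 0.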

Could you recommend any book/article/video about how eBPF works? - I got a bit interested in this topic but couldn't find anything technical.
There are a number of good resources at https://ebpf.io, including a couple of links to books. I haven't read those books personally, but I would be surprised if "BPF Performance Tools" by Brendan Gregg isn't worthwhile.
Wow.. that was an epic response :-D
eBPF programs are still verified to terminate, not just BPF programs. That requirement is not relaxed in eBPF.
That's a good point. I should clarify that no such verification is necessary in classic BPF due to the absence of backward jumps -- you can trivially show that the maximum number of steps executed by a BPF program is 4096, since you'll execute each instruction at most once, and there are at most 4096 instructions.

Meanwhile, the verification that an eBPF program terminates is dependent on the correctness of the verifier, and similarly there's no guarantee that a program with appropriately-bounded complexity will be accepted by the verifier.

To be clear: I'm not trying to throw shade at the verifier; on the contrary, I think it's an impressive piece of software. But there's a difference between being able to prove in one sentence that a program always terminates, and needing to rely on the correctness of some verification software.
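
To make the one-sentence argument for classic BPF concrete, here is a small sketch of it. It assumes a simplified representation that I made up for illustration: each instruction is just the tuple of instruction indices it can go to next (empty for `ret`), using the absolute targets that `tcpdump -d` prints. This is not how the kernel encodes programs, and `max_steps` is a hypothetical name. If every target is strictly greater than the current index, the worst-case step count is the longest path through the program, which can never exceed the number of instructions.

```python
def max_steps(successors):
    """Worst-case number of instructions executed by a forward-only program.

    `successors[i]` lists the indices instruction i can transfer control to.
    Raises ValueError if any jump goes backward or out of range, since the
    termination argument only holds for forward jumps.
    """
    n = len(successors)
    if n == 0:
        raise ValueError("empty program")
    steps = [0] * n
    # Walk backward so every target has already been computed.
    for i in range(n - 1, -1, -1):
        for target in successors[i]:
            if not i < target < n:
                raise ValueError(f"instruction {i}: bad jump to {target}")
        steps[i] = 1 + max((steps[t] for t in successors[i]), default=0)
    return steps[0]


# Control flow of the 'ip and tcp port 80' filter quoted earlier in the thread.
LISTING = [
    (1,),      # (000) ldh
    (2, 12),   # (001) jeq  jt 2  jf 12
    (3,),      # (002) ldb
    (4, 12),   # (003) jeq  jt 4  jf 12
    (5,),      # (004) ldh
    (12, 6),   # (005) jset jt 12 jf 6
    (7,),      # (006) ldxb
    (8,),      # (007) ldh
    (11, 9),   # (008) jeq  jt 11 jf 9
    (10,),     # (009) ldh
    (11, 12),  # (010) jeq  jt 11 jf 12
    (),        # (011) ret
    (),        # (012) ret
]

assert max_steps(LISTING) <= len(LISTING)
```

For that filter the worst case is 12 of the 13 instructions, since only one of the two `ret` instructions ever runs. Nothing comparable works for eBPF in general, which is why it needs a verifier.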