On Python 2.7.10:

    In [2]: %timeit a+b+c+d
    The slowest run took 6.66 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 247 ns per loop

    In [4]: %timeit "{}{}{}{}".format(a, b, c, d)
    The slowest run took 6.37 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 709 ns per loop

On Python 3.6.1:

    In [3]: %timeit a+b+c+d
    355 ns ± 18.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    In [4]: %timeit "{}{}{}{}".format(a, b, c, d)
    630 ns ± 38.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    In [5]: %timeit f"{a}{b}{c}{d}"
    21.8 ns ± 1.4 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Edit: that last %timeit made me suspicious, so I dug a bit deeper with the dis module:

    In [8]: import dis

    In [9]: def f1():
       ...:     return a+b+c+d
       ...:

    In [10]: def f2():
        ...:     return "{}{}{}{}".format(a, b, c, d)
        ...:

    In [11]: def f3():
        ...:     return f"{a}{b}{c}{d}"
        ...:

    In [12]: dis.dis(f1)
      2           0 LOAD_GLOBAL              0 (a)
                  2 LOAD_GLOBAL              1 (b)
                  4 BINARY_ADD
                  6 LOAD_GLOBAL              2 (c)
                  8 BINARY_ADD
                 10 LOAD_GLOBAL              3 (d)
                 12 BINARY_ADD
                 14 RETURN_VALUE

    In [13]: dis.dis(f2)
      2           0 LOAD_CONST               1 ('{}{}{}{}')
                  2 LOAD_ATTR                0 (format)
                  4 LOAD_GLOBAL              1 (a)
                  6 LOAD_GLOBAL              2 (b)
                  8 LOAD_GLOBAL              3 (c)
                 10 LOAD_GLOBAL              4 (d)
                 12 CALL_FUNCTION            4
                 14 RETURN_VALUE

    In [14]: dis.dis(f3)
      2           0 LOAD_GLOBAL              0 (a)
                  2 FORMAT_VALUE             0
                  4 LOAD_GLOBAL              1 (b)
                  6 FORMAT_VALUE             0
                  8 LOAD_GLOBAL              2 (c)
                 10 FORMAT_VALUE             0
                 12 LOAD_GLOBAL              3 (d)
                 14 FORMAT_VALUE             0
                 16 BUILD_STRING             4
                 18 RETURN_VALUE

    In [15]: %timeit f1()
    415 ns ± 30.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    In [16]: %timeit f2
    31.4 ns ± 1.45 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

    In [17]: %timeit f2()
    727 ns ± 43.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    In [18]: %timeit f3()
    344 ns ± 27.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

My understanding: in the bare %timeit f"..." example, the expression must have been reduced to a constant before the timing loop ran, so [5] measured little more than loading a finished string. Wrapping each expression in a function forces the interpreter to evaluate the f-string on every call, which makes for a more apples-to-apples comparison. The dis output also confirms that none of the three functions has its result folded into a constant.

Timings are now closer, but the f-string is still a bit faster.
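
For anyone who wants to repeat the function-wrapped comparison outside IPython, here is a minimal sketch using the standard `timeit` and `dis` modules. The values of `a`..`d` are an assumption (the session above never shows them), and the function names and the `has_folded_result` helper are made up for illustration. Absolute numbers will differ by machine and Python version.

```python
import dis
import timeit

# Assumed sample values; the original session does not show what a..d held.
a, b, c, d = "spam", "eggs", "ham", "bacon"

def concat_plus():
    return a + b + c + d

def concat_format():
    return "{}{}{}{}".format(a, b, c, d)

def concat_fstring():
    return f"{a}{b}{c}{d}"

def has_folded_result(func, expected):
    """Return True if any bytecode instruction carries the finished string."""
    return any(ins.argval == expected for ins in dis.get_instructions(func))

expected = a + b + c + d
calls = 100_000
for func in (concat_plus, concat_format, concat_fstring):
    # All three variants must produce the same string.
    assert func() == expected
    folded = has_folded_result(func, expected)
    seconds = timeit.timeit(func, number=calls)
    print(f"{func.__name__:15s} folded={folded} {seconds / calls * 1e9:.0f} ns per call")
```

Passing the function object to `timeit.timeit` means every iteration pays for a real call, which matches the `%timeit f1()` style of measurement rather than the bare-expression one.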

Edit 2 (final edit?): for plain string concatenation, "".join(...) is still king:

    In [19]: %timeit "".join([a, b, c, d])
    365 ns ± 39.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

    In [20]: l = [a, b, c, d]

    In [21]: %timeit "".join(l)
    229 ns ± 21.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Note that [19] also measures the time to build the list, which is why I also tested with a prebuilt list in [21]. In practice, building the list is a cost you have to pay, so join is not entirely free.
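
The same split between "build the list and join" and "join only" can be sketched with the plain `timeit` module, using its `setup` argument to keep list construction out of the timed statement. The sample strings and the name `parts` are assumptions for illustration, not values from the session above.

```python
import timeit

# Setup runs once and is not timed. The strings are assumed sample data.
setup = 'a, b, c, d = "spam", "eggs", "ham", "bacon"; parts = [a, b, c, d]'
calls = 200_000

# Timed statement builds a fresh list on every iteration, like In [19].
build_and_join = timeit.timeit('"".join([a, b, c, d])', setup=setup, number=calls)

# Timed statement reuses the list created in setup, like In [21].
join_only = timeit.timeit('"".join(parts)', setup=setup, number=calls)

print(f"build + join: {build_and_join / calls * 1e9:.0f} ns per loop")
print(f"join only:    {join_only / calls * 1e9:.0f} ns per loop")
```

The difference between the two figures is roughly the cost of constructing the four-element list each time.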

Still, all these optimizations make more sense for really long strings and many fragments (substrings that you want to join). An extreme case with 256 one-character strings:

    In [43]: %timeit l[0] + l[1] + l[2] + l[3] + l[4] + l[5] + l[6] + l[7] + l[8] + l[9] + l[10] + l[11] + l[12] + l[13] + l[14] + l[15] +
        ...: l[16] + l[17] + l[18] + l[19] + l[20] + l[21] + l[22] + l[23] + l[24] + l[25] + l[26] + l[27] + l[28] + l[29] + l[30] + l[31] +
        ...: l[32] + l[33] + l[34] + l[35] + l[36] + l[37] + l[38] + l[39] + l[40] + l[41] + l[42] + l[43] + l[44] + l[45] + l[46] + l[47] +
        ...: l[48] + l[49] + l[50] + l[51] + l[52] + l[53] + l[54] + l[55] + l[56] + l[57] + l[58] + l[59] + l[60] + l[61] + l[62] + l[63] +
        ...: l[64] + l[65] + l[66] + l[67] + l[68] + l[69] + l[70] + l[71] + l[72] + l[73] + l[74] + l[75] + l[76] + l[77] + l[78] + l[79] +
        ...: l[80] + l[81] + l[82] + l[83] + l[84] + l[85] + l[86] + l[87] + l[88] + l[89] + l[90] + l[91] + l[92] + l[93] + l[94] + l[95] +
        ...: l[96] + l[97] + l[98] + l[99] + l[100] + l[101] + l[102] + l[103] + l[104] + l[105] + l[106] + l[107] + l[108] + l[109] + l[110] + l[111] +
        ...: l[112] + l[113] + l[114] + l[115] + l[116] + l[117] + l[118] + l[119] + l[120] + l[121] + l[122] + l[123] + l[124] + l[125] + l[126] + l[127] +
        ...: l[128] + l[129] + l[130] + l[131] + l[132] + l[133] + l[134] + l[135] + l[136] + l[137] + l[138] + l[139] + l[140] + l[141] + l[142] + l[143] +
        ...: l[144] + l[145] + l[146] + l[147] + l[148] + l[149] + l[150] + l[151] + l[152] + l[153] + l[154] + l[155] + l[156] + l[157] + l[158] + l[159] +
        ...: l[160] + l[161] + l[162] + l[163] + l[164] + l[165] + l[166] + l[167] + l[168] + l[169] + l[170] + l[171] + l[172] + l[173] + l[174] + l[175] +
        ...: l[176] + l[177] + l[178] + l[179] + l[180] + l[181] + l[182] + l[183] + l[184] + l[185] + l[186] + l[187] + l[188] + l[189] + l[190] + l[191] +
        ...: l[192] + l[193] + l[194] + l[195] + l[196] + l[197] + l[198] + l[199] + l[200] + l[201] + l[202] + l[203] + l[204] + l[205] + l[206] + l[207] +
        ...: l[208] + l[209] + l[210] + l[211] + l[212] + l[213] + l[214] + l[215] + l[216] + l[217] + l[218] + l[219] + l[220] + l[221] + l[222] + l[223] +
        ...: l[224] + l[225] + l[226] + l[227] + l[228] + l[229] + l[230] + l[231] + l[232] + l[233] + l[234] + l[235] + l[236] + l[237] + l[238] + l[239] +
        ...: l[240] + l[241] + l[242] + l[243] + l[244] + l[245] + l[246] + l[247] + l[248] + l[249] + l[250] + l[251] + l[252] + l[253] + l[254] + l[255]
    27.9 µs ± 2.07 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

    In [44]: %timeit "".join(l)
    3.71 µs ± 197 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
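
Rather than typing the 256-term expression by hand, it can be generated. The following is a rough sketch under a few assumptions: the list contents are invented (the session does not show what `l` held), and `plus_expr` and `env` are names introduced here. It relies on the `globals` argument of `timeit.timeit`, available since Python 3.5.

```python
import timeit

# Assumed data: 256 one-character strings, named l as in the session above.
l = [chr(ord("a") + i % 26) for i in range(256)]

# Generate the source text "l[0] + l[1] + ... + l[255]".
plus_expr = " + ".join(f"l[{i}]" for i in range(len(l)))

# Sanity check: the generated expression yields the same string as join.
assert eval(plus_expr, {"l": l}) == "".join(l)

env = {"l": l}
calls = 10_000
plus_time = timeit.timeit(plus_expr, globals=env, number=calls)
join_time = timeit.timeit('"".join(l)', globals=env, number=calls)

print(f"chained +: {plus_time / calls * 1e6:.2f} µs per loop")
print(f"join:      {join_time / calls * 1e6:.2f} µs per loop")
```

Generating the expression keeps the bytecode identical in shape to the hand-written version (one subscript and one add per term), so the comparison with join should stay comparable to [43] and [44].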