by shakna 1475 days ago
It was just paragraphs of text, with p, strong, em, and q mingled at most. No figures or images or anything of the like to radically shift DOM computations. That the effect can be seen at all is probably due to the scale of the document; as I noted, it's a little larger than most things.

All paragraphs had a blank line between them, both with and without the p end tag. The p opening tag was always at the top-left, with no gap between it and the content.

So, for example:

    <p>Cheats open the doorway for casual play. They make it easier for disabled players to enjoy the same things as their peers, and allow people to skip parts of a game that <em>they bought</em> that they find too difficult.</p>

    <p>Unfortunately, cheats are going away, because of extensive online play, and a more corporate approach to developing games that despises anything hidden.</p>

Versus:

    <p>Cheats open the doorway for casual play. They make it easier for disabled players to enjoy the same things as their peers, and allow people to skip parts of a game that <em>they bought</em> that they find too difficult.

    <p>Unfortunately, cheats are going away, because of extensive online play, and a more corporate approach to developing games that despises anything hidden.

(You can also discount CSS from having a major effect. Less than a hundred lines of styles, where most rules are no more complicated than: `p { font-family: sans-serif; }`. No whitespace rules.)

However, if you wanted to look at this in a more scientific way, it should be entirely possible to generate test cases fairly easily, given the simplicity of the text data I got my results with.
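
As a rough illustration of how such test cases could be generated, here is a minimal sketch that builds the two variants described above (blank line between paragraphs, with and without the p end tag) from a list of paragraph strings. The function name `buildVariants` is hypothetical and the paragraph texts are whatever you feed in; this is not the markup generator used for the original observation.

```javascript
// Sketch only: `buildVariants` is a made-up helper, not from the original test.
// Each paragraph starts with <p> directly followed by its content, and
// paragraphs are separated by one blank line, as in the examples above.
function buildVariants(paragraphs) {
  const withEndTags = paragraphs.map((text) => `<p>${text}</p>`).join('\n\n');
  const withoutEndTags = paragraphs.map((text) => `<p>${text}`).join('\n\n');
  return { withEndTags, withoutEndTags };
}

// Example input (assumed filler text, any plain prose would do).
const variants = buildVariants([
  'First paragraph with <em>some</em> inline markup.',
  'Second paragraph of plain text.',
]);
console.log(variants.withEndTags);
console.log(variants.withoutEndTags);
```

Feeding both strings to the same browser and comparing timings keeps everything except the end tags identical.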

2 comments

Yay, thanks for the info and inspiration. Sure, it seems like a fun weekend project.

(BTW your snippet's content sounds interesting and feels relatable, definitely intrigued.)

Finally did some synthetic measurements of (hopefully) parse times (not render, CSSOM, or anything like that). The differences seem microscopic but are overall aligned with my initial expectations (omitting the closing tag actually shaves a bit of the yak's hair), so I suspect that the real overhead you observed is caused by something happening after the parse, where the absence of trailing white-space in DOM nodes (a consequence of the closing tags) helps in some way. I guess it is something around white-space handling or text layout. (Speaking of insignificant white-space, you could probably gain a few more microseconds if you stuck the paragraphs together (`..</p>\n\n<p>..` -> `..</p><p>..`), though such minification seems like a nuisance.)
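
For what it's worth, the "stick paragraphs together" step could be sketched with a single regular expression. This is a naive illustration, assuming plain `<p>` opening tags without attributes and the hypothetical function name `stickParagraphs`; it is not a general HTML minifier.

```javascript
// Naive sketch: removes white-space that sits between a closing </p> and the
// next plain <p>. Assumes opening tags carry no attributes.
function stickParagraphs(html) {
  return html.replace(/<\/p>\s+<p>/g, '</p><p>');
}

console.log(JSON.stringify(stickParagraphs('<p>a</p>\n\n<p>b</p>\n\n<p>c</p>')));
```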

Tested only on Windows, in browser consoles.

Numbers:

Firefox (Nightly) (performance.now is clamped to milliseconds)

    total; median; average; snippet
    2279.0; 4.0; 4.558; '<p>_'
    2652.0; 4.0; 5.304; '<p>_</p>'
    2471.0; 4.0; 4.942; '<p>_abcd'
    2387.0; 4.0; 4.774; '<p>_\n'
    3615.0; 5.0; 7.230; '<p>_</p>\n'
    2380.0; 4.0; 4.760; '<p>_abcd\n'
    3093.0; 5.0; 6.186; '<p>_\n</p>\n'
    3107.0; 5.0; 6.214; '<p>_</p>\n\n'
    2317.0; 4.0; 4.634; '<p>_abcd\n\n'
    2344.0; 4.0; 4.688; '<p>_\n\n'

Google Chrome (performance.now is sub-millisecond)

    total; median; average; snippet
    2870.4; 5.2; 5.741; '<p>_'
    2895.2; 5.4; 5.790; '<p>_</p>'
    2684.7; 5.2; 5.369; '<p>_abcd'
    2845.4; 5.2; 5.690; '<p>_\n'
    3836.7; 7.3; 7.673; '<p>_</p>\n'
    2837.8; 5.2; 5.676; '<p>_abcd\n'
    4022.5; 7.4; 8.045; '<p>_\n</p>\n'
    4044.3; 7.3; 8.089; '<p>_</p>\n\n'
    2928.4; 5.2; 5.857; '<p>_abcd\n\n'
    2805.3; 5.2; 5.611; '<p>_\n\n'

Test config

    Snippets per document: 5000
    Rounds: 500
    Wrap: '<!doctype html>(items-paragraphs)'
    Content each item (_): bunch of random digits chunks, something like '1943965927 52 27 5 51664138859173 5161 7226 5 15 2 55679 6553712585'
Code: https://gist.github.com/myfonj/57a6a8fcb1c5686527412543a897c...
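
For readers who don't want to open the gist, here is a minimal sketch of what a measurement of this shape might look like. It is not the gist's code: the names `summarize` and `measureSnippet` and the `parse` / `now` options are hypothetical, and as a simplification it repeats one content string instead of generating fresh random digit chunks per item.

```javascript
// Sketch only, under the assumptions stated above.
// Computes the three columns of the tables: total, median, average.
function summarize(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const total = samples.reduce((sum, t) => sum + t, 0);
  return { total, median, average: total / samples.length };
}

// Builds one document from `snippet` (where "_" marks the content slot),
// then times `parse(html)` once per round.
function measureSnippet(snippet, content, options) {
  const { snippetsPerDocument, rounds, parse, now = () => performance.now() } = options;
  const item = snippet.replace('_', () => content);
  const html = '<!doctype html>' + item.repeat(snippetsPerDocument);
  const samples = [];
  for (let i = 0; i < rounds; i++) {
    const start = now();
    parse(html);
    samples.push(now() - start);
  }
  return summarize(samples);
}
```

In a browser console, `parse` could be something like `(html) => new DOMParser().parseFromString(html, 'text/html')`, with `snippetsPerDocument: 5000` and `rounds: 500` to mirror the config above. Passing the parser in as a function keeps the timing loop independent of the environment.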

(Before realizing I could use a synthetic DOMParser, I made something that measures document load time in an iframe (http://myfonj.github.io/tst/html-parsing-times.html), but it gives quite unconvincing results, although they are probably closer to the real world. Understandably, a synthetic DOMParser can crunch much more code than a visible iframe.)