
by kgtm 3566 days ago

Maybe it will make more sense once it fully sinks in, but I think in general it is a mistake to make developers think about when and where certain things can be omitted. It's more straightforward to simply do one thing, consistently, following the "explicit is better than implicit" mantra.

What happened to optimizing for mental overhead instead of file size? This simply should be a build step, part of your minification and concatenation dance, not having to consider all of these when trying to decide if I should close my <p> tag or not:

A p element's end tag may be omitted if the p element is immediately followed by an address, article, aside, blockquote, details, div, dl, fieldset, figcaption, figure, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, hr, main, menu, nav, ol, p, pre, section, table, or ul element, or if there is no more content in the parent element and the parent element is an HTML element that is not an a, audio, del, ins, map, noscript, or video element, or an autonomous custom element.
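To make the size of that rule concrete, here is a minimal Python sketch of it as a predicate. The function name and its two parameters are hypothetical, and it assumes the thing following the p is an element (not text) and that the parent is an HTML element. Treating any parent name containing a hyphen as an autonomous custom element is a heuristic, not a full check.

```python
# Tag names copied from the rule quoted above.
CLOSES_P = frozenset("""
    address article aside blockquote details div dl fieldset figcaption
    figure footer form h1 h2 h3 h4 h5 h6 header hgroup hr main menu nav
    ol p pre section table ul
""".split())

# Parents for which the "no more content in the parent" case does not apply.
PARENT_EXCEPTIONS = frozenset("a audio del ins map noscript video".split())


def can_omit_p_end_tag(next_sibling, parent):
    """Return True if </p> may be omitted (hypothetical helper).

    next_sibling: tag name of the element right after the p, or None if
                  the p is the last content in its parent.
    parent:       tag name of the parent element.
    """
    if next_sibling is not None:
        return next_sibling.lower() in CLOSES_P
    parent = parent.lower()
    # Heuristic: custom element names contain a hyphen.
    is_custom = "-" in parent
    return parent not in PARENT_EXCEPTIONS and not is_custom


print(can_omit_p_end_tag("div", "body"))   # followed by a div
print(can_omit_p_end_tag("span", "body"))  # followed by a span
print(can_omit_p_end_tag(None, "a"))       # last content inside an a
```

Even in this simplified form, two lookup tables and a special case are needed just to decide whether one end tag is safe to drop.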

8 comments

Arnavion 3566 days ago

This reasoning is why I write all the web pages for my personal projects using XHTML. I can't be bothered to remember which tags are self-closing, which tags need explicit closing tags which can't be combined into the opening tag, etc. Everything's consistent in XHTML.

nayuki 3566 days ago

Agreed. Years ago I started doing all my projects in XHTML because I found that debugging silent HTML errors was not fun.

Silent errors include things like malformed tags and attributes, incorrect nesting structure (thus also messing up where CSS rules are applied), and unescaped left-angles and ampersands.
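As a rough illustration of the difference, the sketch below feeds a few snippets to Python's standard XML parser, which rejects the kinds of mistakes listed above instead of silently recovering. The sample strings are made up for this example, and the check covers well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET

# Made-up samples: one well-formed, three with errors an HTML parser would accept.
SAMPLES = {
    "well-formed": "<p>Fish &amp; chips <br/> today</p>",
    "unescaped ampersand": "<p>Fish & chips</p>",
    "bad nesting": "<p><b><i>text</b></i></p>",
    "unquoted attribute": "<p class=intro>text</p>",
}


def check(markup):
    """Return None if markup is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(markup)
        return None
    except ET.ParseError as exc:
        return str(exc)


results = {name: check(src) for name, src in SAMPLES.items()}
for name, error in results.items():
    print(f"{name}: {'ok' if error is None else error}")
```

Only the first sample parses; the other three fail loudly with a line and column number, which is the debugging experience being described.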

prodigal_erik 3566 days ago

This is why I've always advocated DTD-validation of HTML (which is shockingly underused).

icedchai 3566 days ago

I've never actually seen anyone validate their HTML. If you suggested this in most companies, they would look at you like you had two heads.

jimminy 3566 days ago

About a decade ago it was a pretty commonplace thing to happen.

HTML 4/4.01 was kind of messy, and could have rendering issues. So going with an (X)HTML validator was a common thing, as well as a marketable value proposition to clients.

HTML 5 had much "saner" implementations, so validators fell by the wayside as they weren't as necessary for compatibility.

intrasight 3566 days ago

A good text editor will validate against the DTD as you type. And after you publish, you can use https://validator.w3.org/

jsingleton 3565 days ago

The Firefox source viewer (not the developer tools DOM viewer) does validation. It will highlight bad tags in red and if you hover over them it shows the error.

luhn 3566 days ago

I'm pretty sure Moz has an HTML validator built into its SEO tool, so it may be more common than you think solely because of that. We validate HTML at my company; if we don't, we'll hear about it the next time our boss runs an SEO check.

amckinlay 3565 days ago

How can you validate HTML when most of the HTML these days is templated and generated dynamically?

ChoHag 3565 days ago

It turns into a complete document eventually.

TheRealPomax 3566 days ago

It's why I stopped using hand coded markup at all, aside from markdown for article data. Everything else is data pushed into templates that generate "whatever the code is that the client needs to receive", and let the build tools figure it out. That's what they're for.

Dylan16807 3566 days ago

As long as you're sure it will never be interpreted as HTML, you can do that. Which is harder than it should be, because doctype declarations are ignored. One lost header or unforeseen embedding and everything after that <script /> tag gets eaten.

timlyo 3566 days ago

I've started using haml recently, it handles this for you and works well for me.

BurningFrog 3566 days ago

You shouldn't have to remember it, but your editor could and should.

Arnavion 3566 days ago

XML validators are much more common in editors than HTML validators, probably because XML is both easier to parse and used for a lot more than XHTML.

DonHopkins 3564 days ago

I write and edit all my Genshi [1] templates as xhtml, so I can validate and process them as crisp clean hi-fidelity xml, and then pump them out to browsers with the html serializer [2].

If I were inclined to follow Google's guidelines on omitting optional tags, it would be easy to write a stream filter that removed them [3].

But I prefer source templates to have all the explicit properly indented structure, so they're easier to validate and process with XML tools (and by eye), and unintentional mistakes don't sneak through as easily.

For the same reason, I also prefer not to write minified JavaScript source code: that should be done by post-processors, not humans. ;)

[1] https://genshi.edgewall.org/

[2] https://genshi.edgewall.org/wiki/ApiDocs/genshi.output

[3] https://genshi.edgewall.org/wiki/Documentation/streams.html#...

scrollaway 3566 days ago

If you are writing <p>something <div>like this</div></p> then your editor knows you are making a mistake and can highlight it.

If on the other hand you are not closing tags that autoclose, how can your editor tell you? There is no way to know it's not intended.

BurningFrog 3565 days ago

The editor could have a setting for that.

But mostly I meant that you don't have to close the autoclosing tags.

franze 3566 days ago

interesting - which mime/type do you use?

Arnavion 3566 days ago

application/xhtml+xml as it's supposed to be, though it's not that I have a choice since Github sets all the headers.

zodiakzz 3566 days ago

So you don't? Coz it would be nuts.. I'd wager 90%+ of the 3rd party JavaScript out there will choke on it.

berns 3565 days ago

You can use text/html. Technically it wouldn't be an XML resource, but it's correct for HTML5. You can also use an xhtml doctype[1]. And don't forget that the HTML5 namespace is http://www.w3.org/1999/xhtml! [2] So basically you use your xhtml tools and just publish as HTML5.

[1] https://www.w3.org/TR/html5/syntax.html#obsolete-permitted-d...

[2] https://www.w3.org/TR/html5/infrastructure.html#namespaces

hhsnopek 3566 days ago

"I can't be bothered to remember which tags are self-closing, which tags need explicit closing tags which can't be combined into the opening tag, etc"

You're right, let's not bother ourselves with these small things, cause === and == do the exact same comparison in Javascript and all browsers are exact replicates when implementing html, css and javascript.

Beyond all the sarcasm, in reality, web programming is a hassle. But other programming languages and markups have their quirks as well. I'm glad you found a solution, but it doesn't mean we shouldn't look at the fine details of a specification.

Arnavion 3566 days ago

>cause === and == do the exact same comparison in Javascript

I also don't bother remembering how == works. I use === everywhere. The reason is the same - lower cognitive overhead.

>all browsers are exact replicates when implementing html, css and javascript

The browsers I care about all parse XML correctly.

>I'm glad you found a solution, but it doesn't mean we shouldn't look at the fine details of a specification.

I'm only talking about myself, yes. I only make websites for my personal projects. I'm certainly not a web dev by profession or even by hobby.

luhn 3566 days ago

I don't understand your argument. Yes, web programming has lots of warts and subtle behaviors and inconsistencies. So shouldn't we jump on a chance to remove a small part of that from our day-to-day development? OP isn't advocating ignorance of the spec, just a way not to need to reason with it as often.

hhsnopek 3566 days ago

No argument, just a comment that displays my disapproval of not fully complying with the spec before blaming it.

MaulingMonkey 3565 days ago

How is anyone not complying with the spec by not micro-optimizing away legal-but-redundant tags?

AgentME 3566 days ago

You already have to consider all of those cases about the <p> tag: because they auto close when they hit one of those elements, that means that <p> tags can't contain any of them. If you don't know about this while using <p> tags, you can be in for a world of fun mysterious issues.

luhn 3566 days ago

But all those tags are things that no sane developer would put inside a p tag anyways, so you don't really have to think about them.

The real mental overhead is incurred when reasoning about the tag following the p, which could be anything. "Hmmm, I have a nav tag coming after this p tag. Does that implicitly close it?"

Although if you had a good autoindenter, you could catch any mistakes by how it was indented. "Oh, that nav tag is on the same indentation level as the p tag, I guess it does implicitly close it."

tinco 3566 days ago

I have done web dev on and off for over 15 years and I've never even thought about what happens when you put an h1 in a p. In my opinion the browser should crash and the operating system should BSOD. I have always been severely annoyed by the amount of shit browsers put up with. I don't understand why XHTML strict didn't get the traction it deserved and why they didn't continue along that line with HTML5.

sopooneo 3565 days ago

Because the world is made up of messy people. And the value of allowing messy content was perceived as outweighing the value of consistency and reliability. I happen to agree.

AgentME 3565 days ago

I ran into this when working on some software that put user comments in <p> tags. I added some allowed markup that came out as <div> tags for a collapsible section. It didn't strike me as a particularly insane feature, but I about lost my mind trying to figure out why the <div> tags appeared to negate the <p> tag styling for all of the text after it.
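A small sketch of what happens in that situation, built on Python's html.parser. It is not a real HTML5 tree builder: the class and function names are made up here, the set of p-closing tags is only a subset, and scoping rules are ignored. It only tracks which elements are open when each piece of text is seen.

```python
from html.parser import HTMLParser

# Subset of the start tags that implicitly close an open <p>.
CLOSES_P = {"div", "p", "ul", "ol", "table", "pre", "blockquote", "h1", "h2", "h3"}


class Outline(HTMLParser):
    """Record each text chunk together with the stack of open elements."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in CLOSES_P and "p" in self.stack:
            # Implicit </p>: pop everything up to and including the open p.
            while self.stack.pop() != "p":
                pass
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # An end tag with no matching open element is simply ignored here.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if data.strip():
            self.chunks.append((data.strip(), "/".join(self.stack)))


def outline(markup):
    """Return (text, open-element path) pairs for the given markup."""
    parser = Outline()
    parser.feed(markup)
    parser.close()
    return parser.chunks


print(outline('<p class="comment">before<div>collapsible</div>after</p>'))
# [('before', 'p'), ('collapsible', 'div'), ('after', '')]
```

The text after the div ends up outside any p, so CSS rules that target the comment paragraph no longer apply to it.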

tedmiston 3566 days ago

> This simply should be a build step

This is a great point, but when I think of build steps, I think of something like minifying which comes with a performance gain.

I'm not sure I see what the obvious gain to omitting optional tags in the way Google suggests is.

Edit: To clarify, I'm wondering if there's some performance gain by the browser not having to parse the implicit optional tags.

jacobevelyn 3566 days ago

How is a build step that turns your HTML into a smaller amount of HTML with the exact same behavior (by removing optional tags) different from a "minification" step that turns your HTML or JS into a smaller amount of HTML/JS with the exact same behavior?

This is minification, isn't it?

tedmiston 3566 days ago

Updated to clarify my comment as a reference to browser performance not file size.

the_duke 3566 days ago

Smaller file size -> faster loading (in theory... if gzipped, it's probably redundant).

Possibly faster parsing, because the parser has less HTML to go through. (Also probably not valid, because I'd be pretty sure that reading a string from memory is not the bottleneck in parsing, compared to logic, memory allocations, etc.)

It could make a difference for Google's server infrastructure though.

If they have to download a tiny bit less, and save a tiny bit on CPU cycles and memory for each page, it might still lead to considerable savings.

nevir 3566 days ago

> I'm wondering if there's some performance gain by the browser not having to parse the implicit optional tags.

The motivation behind this style is not browser parsing perf - it's network perf. The smaller your HTTP response, the fewer packets (and round trips) required to transmit it.

DougWebb 3566 days ago

If your output is compressed (which it should be if you're worried about response size) then omitting end tags has much less impact, I believe. All of the tags should get compressed well because they're repeated so often, and they should be much smaller overall than your non-repetitive content.

TheRealPomax 3566 days ago

But note that at the scale of moving as much data around as Google does, or even "the web as a whole", shaving even a few bytes off of every single gzip packet stream can still equate to significant network relief.

DougWebb 3566 days ago

I suspect their advice is for their benefit, not other website devs. They can save a lot of space in their archive if everyone's pages were smaller. Nothing compared to better image compression though.

dgacmu 3566 days ago

No - a few bytes on a web page are insignificant compared to the data volume of images and movies. This is all about getting pages to load faster on mobile.

nevir 3563 days ago

If those extra bytes drop you from two packets to one, that's a _significant_ reduction in traffic

(which, IIRC, was the original rationale behind that style guide rule)
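The packet argument is simple arithmetic, sketched below. The 1460-byte payload per segment is an assumption (a common figure for TCP over Ethernet with a 1500-byte MTU), and the sketch ignores HTTP headers, TLS overhead, and congestion-window effects.

```python
import math


def segments_needed(payload_bytes, mss=1460):
    """Number of TCP segments needed for a payload, assuming a fixed MSS."""
    return max(1, math.ceil(payload_bytes / mss))


# A hypothetical 1500-byte response, before and after dropping ten "</p>" tags.
before = segments_needed(1500)
after = segments_needed(1500 - 10 * len("</p>"))
print(before, after)  # 2 1
```

The saving only shows up when the response sits just above a segment boundary; a page that is nowhere near one gains nothing from the same 40 bytes.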

h1d 3566 days ago

If you would gzip your output like you should, how much does that even buy? There's usually something better to use your time for instead of trying to shave 500 bytes out of your page.

TheRealPomax 3566 days ago

It's not a competition, though. If there's something better to do, also do that. However, that does still leave the question of how many bytes are actually saved in transport, especially with gzipping. The benefit here is absolutely not individual developers or even individual sites, but the data transferred by entire data centers over the course of a day, week, month etc. If this recommendation can bring down the total byte transmission for "the web" by 0.001% for instance, that's still a boatload of bytes that don't bog down the network anymore.

Dylan16807 3566 days ago

When you're looking at fractions of a percent, remember to consider other options. Set up brotli, for example. Or redesign your site to have a leaner layout. You might not ever reach the efficiency level where optimizing optional tags is the best use of dev time.

And the overhead of tracking which tags are optional in which circumstances is not particularly small. Consider that the extra complexity could impede more optimizations in the future, especially now that your markup requires a more complex parser than it could have needed.

the_duke 3566 days ago

Have you looked at the size of Youtube and Netflix videos?

According to this study [1], 70% of web traffic is video streaming. Only 8% is web browsing (which might include images, because they are not mentioned anywhere else - didn't find any info on that).

This is not going to make any difference.

TheRealPomax 3566 days ago

Just because the vast majority of roads are for cars doesn't mean we should therefore not try to optimize the bike and pedestrian lanes.

Sure, a lot of the traffic is streamed data rather than HTML, but 30% of close to a zettabyte of data in a single day (for the internet as a whole) is still hundreds of petabytes that can be made drastically smaller. When the numbers are that large, even optimizing for something as "insignificant" as 0.01% of the traffic means 10s or even 100s of terabytes not pumped through the network every day.

M2Ys4U 3565 days ago

compression and encryption often don't play nice with each other. See CRIME and BREACH, for example.

bcoates 3566 days ago

The double-negative phrasing Google and the spec use makes it sound weirder than it is. You could phrase it as "only use tags that are needed for the document to be parsed correctly", which makes explicitly including an <html> tag with no attributes or an information-free stack of closing tags seem like a strange thing to do if it weren't tradition.

rtpg 3566 days ago

file size? It's not much, but it would still strip some stuff.

I'm still bitter that HTML/XML works based off of explicit closing tags (where you can mistakenly close the wrong tag) instead of something like braces.

TheRealPomax 3566 days ago

Use a build tool (which you should be doing anyway if you hand-write any markup, because you need to validate it) and make it rewrite </> to the relevant closing tag, if necessary... problem solved? (and yes, you'd be free to even leave </> off in many, many places: https://www.w3.org/TR/html5/syntax.html#optional-tags).
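A minimal sketch of such a rewrite step in Python. Note that </> is not valid HTML; it is a hypothetical authoring shorthand that the build step expands. The function name is made up, and the sketch assumes every non-void element is closed explicitly or with </>, that attribute values contain no ">" characters, and that there is no script, style, or comment content containing angle brackets.

```python
import re

# HTML void elements: they never take an end tag, so they are never pushed.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

# Matches start tags, end tags, and the bare "</>" shorthand.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9-]*)?([^<>]*)>")


def expand_short_closers(markup):
    """Rewrite each "</>" to the end tag of the innermost open element."""
    stack = []

    def fix(match):
        slash, name, rest = match.groups()
        if not slash and name:
            # Start tag: remember it unless it is void or self-closed.
            if name.lower() not in VOID and not rest.rstrip().endswith("/"):
                stack.append(name)
        elif slash and name:
            # Explicit end tag: unwind the stack down to the matching element.
            if name in stack:
                while stack.pop() != name:
                    pass
        elif slash and not rest.strip() and stack:
            # Bare "</>": substitute the innermost open element's name.
            return f"</{stack.pop()}>"
        return match.group(0)

    return TAG.sub(fix, markup)


print(expand_short_closers("<div><p>hi</><img src=x></>"))
# <div><p>hi</p><img src=x></div>
```

A production build step would sit on top of a real HTML parser rather than a regular expression, but the stack logic would be the same.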

Alternatively, don't use HTML at all. Use pug (formerly "jade") or something and now you're free from all those inconvenient angle brackets.

lgamero 3566 days ago

After using pug, I don't think I can go back to plain HTML. I didn't know how much I hated closing tags.

a3n 3566 days ago

Maybe it reduces the load on Google's crawls of the web.

sotojuan 3566 days ago

Less HTML to load? Probably makes no difference in most cases, but it is less to load.

Many React/Webpack flows do something like this (minify or use a barebones template HTML).

bpicolo 3566 days ago

Seems like something you can add a build step for though. Less human overhead

sotojuan 3566 days ago

Sorry I wasn't very clear, that's what I meant by "React/Webpack" flow! See: https://github.com/ampedandwired/html-webpack-plugin

userbinator 3566 days ago

> but I think in general it is a mistake to make developers think about when and where certain things can be omitted.

Yes, sometimes it is better to make developers think about when and where certain things are required.

nchelluri 3566 days ago

I agree, but maybe a transformer step could do it automatically. Write full HTML, generate less.

h1d 3566 days ago

After 20 years of composing HTML, the world can do better than write full HTML but use technology like jade and not worry about what goes to the browser...

CoryG89 3566 days ago

Why not worry about what goes to the browser? In my eyes, what actually runs in the browser is the only thing that matters in the end. You could still write in something like Jade, transpile, minify, then strip unneeded tags, all with automation.

h1d 3566 days ago

I didn't mean not to care what goes to the browser; I meant that if tools like jade do it right for us, the rest of us no longer have to care about those little details.

Frankly I'm amazed the HTML way of verbose writing still stands after all these years in a fast paced industry.

tux1968 3566 days ago

Hadn't heard of Jade myself so your post inspired me to go looking. On http://learnjade.com/ the front page example shows that Jade doesn't take advantage of this ability to omit the closing </p> tag. So while I agree with you that html is not the best form for authors to write in, Jade itself still has room for improvement.

city41 3566 days ago

Would that be jade's job? I'd argue jade's job is to make writing html easier. Another tool should take on html optimization.

RubyPinch 3566 days ago

> A p element's end tag may be omitted if the p element is immediately followed by block-ish element, or if there is no more content in the parent element.

> This doesn't apply if you are doing weird stuff in a non-block-ish element, or a media element, or a custom element.

is the easier way to think about it usually

h1d 3566 days ago

It's really just better to keep a closing p tag, so you don't have to care about consequences when you edit that part later... Does not typing </p> save anything? No.

RubyPinch 3566 days ago

I honestly prefer it when editing

    <p>
        It naturally acts as a clean way to segment
        paragraphs of text
    <p>
        And most of the tag-closing rules are roughly
        matched with the rules of using p tags altogether.
    <p>
        e.g. you can't have a div within a paragraph, so
        closing or not closing, divs can only come after
        paragraphs!

hellcow 3566 days ago

It actually saves at least 4 bytes per closing tag. On a larger webpage, that could easily add up to saving hundreds or thousands of bytes per request. That's a significant savings, especially for mobile.

h1d 3566 days ago

gzip makes it insignificant.

I just took a sample page out of here which has a bunch of p tags open and closed, gzipped the original and the one with </p> stripped, and the difference was 39 bytes.

https://en.wikipedia.org/wiki/C_(programming_language)
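The same kind of comparison can be sketched in a few lines of Python. This does not reproduce the Wikipedia measurement: the page below is synthetic (200 generated paragraphs from a small made-up vocabulary), so the numbers it prints will differ from the 39 bytes above.

```python
import gzip
import random


def sample_page(paragraphs=200, seed=0):
    """Build a synthetic HTML page of <p>...</p> blocks (made-up content)."""
    rng = random.Random(seed)
    words = ("html tag parser browser markup element closing optional "
             "spec gzip byte page network mobile build step").split()
    body = "\n".join(
        "<p>" + " ".join(rng.choice(words) for _ in range(rng.randint(20, 60))) + "</p>"
        for _ in range(paragraphs)
    )
    return f"<!DOCTYPE html>\n<title>sample</title>\n{body}\n".encode("utf-8")


original = sample_page()
stripped = original.replace(b"</p>", b"")  # every </p> here is optional

raw_saved = len(original) - len(stripped)
gz_saved = len(gzip.compress(original)) - len(gzip.compress(stripped))

print(f"raw bytes saved:     {raw_saved}")
print(f"gzipped bytes saved: {gz_saved}")
```

The raw saving is exactly 4 bytes per paragraph, while the gzipped saving is smaller because the repeated end tags were already being encoded cheaply.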

emn13 3566 days ago

Ironically, if end tags were truly non-optional, html might actually compress better, because it would have less entropy (fewer choices). In practice, it would allow for a compression filter to represent the tree structure in a less redundant form with fewer corner cases to deal with (much like compressors do for binaries, for example).

rhizome 3566 days ago

Thousands? Over 250 p tags on a page?

thinkloop 3565 days ago

It's possible/likely that's what they mean. However the final markup is generated, make it minimal, shout-out to react, packagers, minifiers, etc

meerita 3565 days ago

I write HAML. It's comfortable, concise, and strict. It outputs nicely formatted HTML.