Static Asset Compilation


	Static Asset Compilation (autoref.com)
	48 points by autoref 5068 days ago

15 comments

coenhyde 5068 days ago

Fairly standard stuff. If you're going to have a title with "you're doing it wrong" you should have some unique insight to support your dramatised title.


jmtulloss 5068 days ago

I felt the same way. I was hoping the article would be about a radically different approach, but it's mostly just best practices.


lsh 5065 days ago

Agreed - getting sick of these thoroughly underwhelming "done right" and "you're doing it wrong" blog posts. Nuts to 'em.


voidfiles 5068 days ago

With such a complicated system I think you are missing out on the most significant speed optimization technique: reducing HTTP requests.

For reference: http://developer.yahoo.com/blogs/ydn/posts/2007/04/rule_1_ma...

It's laudable that you are paying attention to caching, but you don't compile all your files into one file. It seems like you could pick up a lot of ground here by at least concatenating all CSS into one file, and all JS into one file.

Also you could load jQuery from the Google AJAX API endpoint. That way users have a higher chance of having already loaded jQuery.

Also using the same CSS/JS products across multiple pages would help.
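
For anyone who wants to try the concatenation idea above, here is a minimal sketch in Python. It is not the article's build step; the function name and file names are made up for illustration, and it assumes the sources are UTF-8 text whose order matters.

```python
from pathlib import Path


def concatenate(sources, destination, separator="\n"):
    """Join the source files, in the order given, into one destination file."""
    parts = [Path(src).read_text(encoding="utf-8") for src in sources]
    Path(destination).write_text(separator.join(parts), encoding="utf-8")
    return Path(destination)
```

For JS, passing separator=";\n" is a cheap guard against a file that ends without a semicolon.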


autoref 5068 days ago

An excellent point, but you have to consider warm cache vs cold cache optimizations. For a cold cache, it's better to combine assets and reduce HTTP requests. We do that on our homepage.

For a warm cache, it's better to split assets up so they are cached in finer chunks. If I added jQuery into every page's JS, there would be fewer HTTP requests but it would pull jQuery every time, making the payload much larger. There's a balance. I'll write another post about warm vs cold cache optimizations soon.

"using the same CSS/JS products across multiple pages would help."

Definitely. Using jQuery on half your site and YUI on the other half is pretty bad from all angles.

"you could load jquery from the Google AJAX API endpoint."

Yeah. Two reasons we don't: 1. I'm in security, and trust no one. 2. HTTPS connection reuse vs negotiation with another host. I have yet another post in the pipe about SSL optimizations.


codeka 5068 days ago

I usually have three "chunks" of CSS/JS per page, one for CSS/JS that's shared across all pages of the whole site, one that's common among a "group" of pages and one that's unique to just that page.

Of course, not all pages get all three chunks, but I find it's a reasonable tradeoff between reducing the number of requests and not just including everything on every page.
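
A rough sketch of the three-chunk layout described above, in Python. The bundle names and the page-to-group mapping are hypothetical; the point is only that each page gets the site-wide bundle plus, optionally, a group bundle and a page bundle.

```python
SITE_BUNDLE = "site"

# Hypothetical mapping from page name to the group bundle it shares.
GROUP_BUNDLES = {"search": "group-listings", "listing": "group-listings"}

# Hypothetical set of pages that have a bundle of their own.
PAGE_BUNDLES = {"search", "home"}


def bundles_for(page):
    """Return the bundle names to load for a page, most widely shared first."""
    names = [SITE_BUNDLE]
    group = GROUP_BUNDLES.get(page)
    if group:
        names.append(group)
    if page in PAGE_BUNDLES:
        names.append(f"page-{page}")
    return names
```

A page that appears in neither mapping loads only the site-wide bundle, which matches the remark that not all pages get all three chunks.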


seriocomic 5068 days ago

Hmm, I've just left a comment suggesting combining assets. Based on what you've written here you've already answered one of my questions... I look forward to the warm/cold cache post.


eli 5068 days ago

You don't need to rename the files. As of a few months ago, you can configure CloudFront to take query strings into account when caching, so you can simply link to the file as normal but append "?<your_hash_here>" to the filename. (I actually prefer using the last modified timestamp over a hash.) IMHO, this is better because it requires less magic on the origin server. And even ancient references to a file (a logo someone hotlinked, for example) will still render rather than 404, so long as the name hasn't changed. No need to keep tons of old revisions of files around.
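
A minimal sketch of the query-string variant with a last-modified timestamp, in Python. The function name and paths are illustrative, and it assumes the CDN is already configured to include query strings in its cache key.

```python
import os


def timestamped_url(url_path, path_on_disk):
    """Append the file's last-modified time (whole seconds) as a query string."""
    version = int(os.path.getmtime(path_on_disk))
    return f"{url_path}?{version}"
```

The filename itself never changes, so old references keep resolving; only the query string differs between versions.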


autoref 5068 days ago

This isn't recommended since many browsers and proxies do not cache resources that are referenced using a query string, even if a cache-control or expires header is set appropriately. Google says Squid up to 3.0 will not do so:

https://developers.google.com/speed/docs/best-practices/cach...


eli 5068 days ago

Fair point and thanks for the link -- I've only seen this phrased as a warning against "some proxies" that I've never personally encountered. But I can live with it. If your proxy is broken in this way, you will have to fetch the asset from the CDN rather than benefit from your local proxy.


latchkey 5068 days ago

Squid 3.0 was released in 2007. It can be argued that this is an out of date recommendation. My experience using query strings has been fine.


eli 5068 days ago

To be fair, I think it was still the current version up through 2011.

The more important point is that the failure mode is simply that the assets load from the CDN as if there were no squid proxy. This is not ideal, but it's not so bad either.


ludwigvan 5068 days ago

Another reason query strings are not recommended: if you rename the files, you can upload them to the server first and then reload your app.

If you don't rename them, there may be a short window before your app is reloaded in which the old app serves the newer resources.


captn3m0 5068 days ago

Could someone say if using HttpGzipStaticModule really helps? Gzipping small static resources on-the-fly should not take down your cpu by much.

Surely a nice thing to have, but does it help?
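
For context on the question above: the trade-off is compressing on every request versus compressing once at build time. A sketch of the build-time half in Python, assuming the server is set up to serve a ".gz" file that sits next to the original (which is how nginx's gzip_static module works, as far as I know). The function name is made up.

```python
import gzip
from pathlib import Path


def write_gzip_sibling(path, level=9):
    """Write `<path>.gz` next to `path` and return the new file's path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    target.write_bytes(gzip.compress(source.read_bytes(), compresslevel=level))
    return target
```

Because this runs once per build, the highest compression level costs nothing at request time.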


howardr 5068 days ago

I found that renaming CSS files using the hash of their contents does not always work, because changes to dependencies (e.g. images) won't always bubble up to the CSS. I forget all of the reasons why it didn't always work, but I think it had to do with CDN invalidation for files that I could not rename (e.g. index.html).

The process I use computes the hash of every file and creates a dependency map. I then rename each file using the hash of its contents together with its dependencies.
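
One way such a dependency-aware hash could look, sketched in Python. This is a guess at the shape of the process, not the commenter's actual code: the two dictionaries are assumed inputs, and the dependency graph is assumed to have no cycles.

```python
import hashlib


def combined_digest(name, contents, dependencies):
    """Hash a file's bytes together with the digests of its dependencies.

    contents:     dict mapping file name -> bytes
    dependencies: dict mapping file name -> list of file names it references
    """
    digest = hashlib.sha1(contents[name])
    # Sort so the result does not depend on the order dependencies are listed.
    for dep in sorted(dependencies.get(name, [])):
        digest.update(combined_digest(dep, contents, dependencies).encode("ascii"))
    return digest.hexdigest()
```

With this, changing an image changes the digest of every CSS file that references it, even if the CSS bytes are identical.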


autoref 5068 days ago

Right. Images and fonts have to be written and hashed first, then used in the template rendering of the CSS file. The CSS references the assets with hashes in the filenames.
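
The ordering above (hash images and fonts first, then render the CSS) can be sketched in Python. The "name.digest.ext" naming scheme, the 10-digit prefix, and the regular expression are assumptions for illustration; a real template step would likely be more careful about url() syntax.

```python
import hashlib
import re
from pathlib import PurePosixPath

# Matches url(...) with optional quotes; a simplification of real CSS syntax.
URL_RE = re.compile(r"url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)")


def hashed_name(name, data, digits=10):
    """Insert a truncated SHA-1 of the content before the file extension."""
    path = PurePosixPath(name)
    digest = hashlib.sha1(data).hexdigest()[:digits]
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def rewrite_css(css, assets):
    """Hash the assets first, then point the CSS at the hashed names.

    assets: dict mapping the path used in the CSS -> file bytes
    """
    renamed = {name: hashed_name(name, data) for name, data in assets.items()}

    def replace(match):
        target = match.group(1)
        # References to unknown files are left pointing at the original name.
        return f"url({renamed.get(target, target)})"

    return URL_RE.sub(replace, css), renamed
```

The rewritten CSS can then be hashed itself, so its own name changes whenever any referenced asset does.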


nestlequ1k 5068 days ago

Can someone with knowledge of both this and Rails 3.1 explain the difference? They seem very similar.


amalag 5068 days ago

Yes, this article is written for people not using Rails & Sprockets. It's pretty amazing how many best practices the Rails asset pipeline enforces. It will also concatenate the JS & CSS files to reduce HTTP requests and automatically writes .gz files to disk. When used with the asset_sync gem it can also push these to S3 or your CDN to avoid your web server altogether.


sirn 5068 days ago

In the Python world we have webassets[1], which does something similar (to Jammit, anyway). It is a little bit more complicated to use than Sprockets, but I'd argue that it is also a bit more flexible, thanks to filter chaining: e.g. compile SASS, merge the files, add vendor prefixes, optimize, compress, then gzip, all as a single chain.

[1] http://elsdoerfer.name/docs/webassets/


byroot 5068 days ago

You're right, they basically reimplemented Sprockets, a.k.a. the Rails assets manager.


moonboots 5068 days ago

Good tips. I've found that http://pngquant.org generates smaller pngs than optipng, but the former is lossy (reduced color palette). I can't tell the difference though.


tetravus 5068 days ago

You can have lossy compression that results in zero difference to the end image on a pixel by pixel basis.

E.g. if the PNG was 32-bit and had a full color palette but was filled with a single 8-bit color, you could safely, and "loss-ily", convert the PNG to 8-bit and replace the entire color palette with the single entry for the color that is actually used.

That said, PNGQuant uses dithering so there will often be changes apparent if you perform a pixel by pixel comparison in code.

Just like you, I can't visually identify the difference between a PNGQuant image and the 'raw' PNG that was used to create it (at least not on any images that I've seen so far).


ciniglio 5068 days ago

To nitpick a little: If your source and final are the same, I don't think you can call it lossy, by definition.


rthprog 5068 days ago

Aside from minifying javascript, you should probably also consider using Google's Closure Compiler in 'Advanced-Compilation' mode. I believe it does a much better job than traditional minification.

https://developers.google.com/closure/compiler/docs/api-tuto...


brown9-2 5068 days ago

Is putting hash digests in filenames really easier than sending Last-Modified headers in the response, parsing If-Modified-Since headers and returning 304 when applicable, and/or using ETags?

I would have thought that most web frameworks do all these things for you automatically by now.


jmtulloss 5068 days ago

Putting the hash in the filename allows the browser to skip even the request that would result in a 304 response. It also works behind badly behaved proxies and caches that don't properly respect cache headers.
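
To make the difference concrete, here is a small Python sketch of the two strategies as a toy handler. The function, its arguments, and the one-year max-age are assumptions for illustration, not any framework's API.

```python
import hashlib

ONE_YEAR = 365 * 24 * 60 * 60


def respond(request_headers, body, fingerprinted):
    """Return (status, headers, body) for a static asset request."""
    if fingerprinted:
        # The URL changes whenever the content does, so the browser may reuse
        # its copy for a long time without contacting the server at all.
        return 200, {"Cache-Control": f"public, max-age={ONE_YEAR}"}, body

    etag = '"' + hashlib.sha1(body).hexdigest() + '"'
    if request_headers.get("If-None-Match") == etag:
        # Content unchanged: no body is sent, but a round trip still happened.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Cache-Control": "no-cache"}, body
```

In the ETag branch the saving is bandwidth only; in the fingerprinted branch a warm cache avoids the request entirely.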


lotyrin 5068 days ago

It also allows for pages that were generated and cached before changes were made to still have resources, as well as other cases where you might have divergent sets of resources (split tests, rolling deployments, etc.)


malyk 5068 days ago

We use the git commit hash of the checkin that is pushed to production as part of the folder structure for our assets. Has worked really well for us.

It does mean we use more space on s3 though, but it guarantees we won't miss re-seeding some of the files.
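
A sketch of what per-commit asset URLs might look like, in Python. The CDN host, commit values, and function name are placeholders, not the commenter's setup.

```python
def asset_url(cdn_base, commit, relative_path):
    """Build the URL of an asset stored under a per-commit folder."""
    return f"{cdn_base.rstrip('/')}/{commit}/{relative_path.lstrip('/')}"
```

Every asset URL changes with every push under this scheme, whether or not the file itself changed.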


autoref 5068 days ago

The bad part is no file is cached between pushes, right?


ryetoasthumor 5068 days ago

Interesting series http://autoref.com/blog/2012/09/07/the-tech-behind-autoref-p...

Full disclosure: I do bizdev at AutoRef.


cbhl 5068 days ago

Why is it safe to include a subset of the SHA1 digest instead of the whole digest? What's the reasoning behind this? Would it make sense to use a shorter hash (e.g. CRC32) instead if your filenames have to be that short?


patio11 5068 days ago

Because SHA1 tends to have every byte of the digest change if so much as one byte of the message changes (if you can disprove that, you have a much more important result than "Oops our caching is slightly borked"). Accordingly, 10 hex digits is sufficient to guarantee that a change breaks the old cache (1 - 1 / 2^40) of the time. You wouldn't be at risk of birthday-paradoxing your caches even with billions of files in your site's history.
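
A quick way to check the arithmetic, sketched in Python. It assumes the truncated digest behaves like a uniformly random value and uses the standard birthday approximation; the function name is made up. Since a stale cache entry can only clash with another version of the same base filename, n here is best read as the number of versions of one file rather than the number of files on the site.

```python
import math


def collision_probability(n_versions, hex_digits=10):
    """Approximate chance that any two of n distinct contents share a prefix.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * space)).
    """
    space = 16 ** hex_digits  # 10 hex digits -> 2**40 possible prefixes
    return -math.expm1(-n_versions * (n_versions - 1) / (2 * space))
```

For two versions this reduces to roughly 1 / 2^40, the per-change figure quoted above.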


csense 5068 days ago

You could probably shave a few bytes off your URLs while achieving the same collision resistance (or alternatively increase the collision resistance in the same number of bytes) if you base64-encoded the hash.
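
The saving comes from bits per character: hex carries 4 bits, base64 carries 6, so 7 base64 characters (42 bits) cover the 40 bits of a 10-digit hex prefix. A Python sketch, using the URL-safe alphabet so the tag is usable in a path; the function names and default lengths are illustrative.

```python
import base64
import hashlib


def hex_tag(data, chars=10):
    """Truncated hex SHA-1: 4 bits of digest per character."""
    return hashlib.sha1(data).hexdigest()[:chars]


def base64_tag(data, chars=7):
    """Truncated URL-safe base64 SHA-1: 6 bits of digest per character."""
    digest = hashlib.sha1(data).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")[:chars]
```

One caveat: base64 tags are case-sensitive, so they lose some of their distinguishing power if the files ever land on a case-insensitive filesystem.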


dazbradbury 5068 days ago

If you're using .net, RequestReduce [1] is an excellent tool for managing your static assets.

[1] - https://github.com/mwrock/RequestReduce


andrewdavey 5068 days ago

Also check out Cassette http://getcassette.net/


seriocomic 5068 days ago

+1 for this link. I can't understand how I haven't come across it before...


kmfrk 5068 days ago

Is ImageOptim still a good choice to go with? I really like the simplicity of the GUI.


kevinconroy 5068 days ago

Yes, ImageOptim is a wrapper around OptiPNG and several other programs. It tries them all and goes with the one that provides you with the optimal compression for that specific file. It also supports JPEG and GIF files.


bryne 5068 days ago

If you're doing it manually, ImageOptim is probably the best GUI available.
