Hacker News
by causi 1666 days ago
When you usually try to download an image, your browser opens a connection to the server and sends a GET request asking for the image.

I'm not a web designer, but that seems rather ass-backwards. I'm already looking at the image, therefore the image is already residing either in my cache or in my RAM. Why is it downloaded a second time instead of just being copied onto my drive?

7 comments

Oh no, it's still downloading the one it's displaying on screen. You can even see a spinny thing as the icon of the tab on Chrome.

The format allows for showing images when they are partially downloaded, and also allows pushing data that doesn't actually change the image.

Okay? So we still seem to have an accurate representation of the image we want. Why can't I just download that, and what's the point of the rest of the data? If we are already seeing the image, the rest of the data is pointless, no?
Certainly so, yes. But your browser doesn't know that.
But the browser doesn't know that the image is already done, and since there's still data coming in, the browser is obliged to continue downloading.
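
A minimal sketch of the idea described in these comments, assuming the server sends the complete image first and then keeps the response open with bytes the decoder ignores. The function name `never_ending_image`, the chunk size, and the zero-byte filler are all hypothetical; the thread does not say what the site actually sends.

```python
import itertools
from typing import Iterator


def never_ending_image(image_bytes: bytes, chunk_size: int = 1024,
                       filler: bytes = b"\x00") -> Iterator[bytes]:
    """Yield the real image data in chunks, then filler chunks forever.

    Assumption: the filler is data the image decoder ignores, so the
    picture on screen is already complete once the real bytes arrive.
    """
    for start in range(0, len(image_bytes), chunk_size):
        yield image_bytes[start:start + chunk_size]
    # The response never ends, so the client never sees the download finish.
    while True:
        yield filler


# Take only the first six chunks; the generator itself never stops.
stream = never_ending_image(b"FAKE-IMAGE-DATA", chunk_size=4)
first_six = list(itertools.islice(stream, 6))
```

A real server would presumably pause between filler chunks; the sketch leaves that out so it stays easy to run.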

You could right-click and copy the image, rather than save as. It achieves what you wanted: saving a copy of the image.

You can totally "download" the image in your RAM by right clicking / long pressing -> "copy image" or equivalent in most browsers. It's just not going to be a byte by byte identical file, and may be in a different format, e.g. you get a public.tiff on the clipboard when you copy an image from Chrome or Safari on macOS, even if the source image is an image/svg+xml.
That's the first thing I tried: "copy image", then, in GIMP, File -> Create -> From Clipboard.

And it just worked, with no hassle.

As far as I remember from a previous project from a few years ago, the browser doesn't include a referrer for the download request, which can be used to tell it apart from the request made by the page. (You'll have to disable caching and ETags for this to work.)

However, this is easily defeated by the use of the console: Select the sources tab, locate the image and simply drag-and-drop the image from there, which will use the local cache instance for the source. Works also with this site, at least with Safari.
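
The referrer distinction mentioned above could be sketched roughly like this. It rests entirely on the commenter's recollection that a "save image" request arrives without a `Referer` header while the page's own image load carries one; current browsers may behave differently. The function name `looks_like_save_request` is made up for illustration.

```python
from typing import Mapping


def looks_like_save_request(headers: Mapping[str, str]) -> bool:
    """Guess whether a GET came from "save image" rather than the page.

    Assumption (from the comment above, not verified): requests made while
    rendering the page carry a Referer header, save requests do not.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return not normalized.get("referer", "").strip()


page_load = {"Referer": "https://example.com/gallery", "Accept": "image/*"}
save_as = {"Accept": "*/*"}
```

Here `looks_like_save_request(page_load)` is `False` and `looks_like_save_request(save_as)` is `True`. Anything that strips the header, such as a privacy extension or a strict referrer policy, would also look like a save request.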

> [...] which will use the local cache instance for the source

I don't understand why browsers aren't always doing this. They already have the image, why redownload it?

I guess this is for historical reasons. Mind that there is no such thing as a single, cached image. There's the downloaded content, a decoded bitmap derived from this, and a buffer for any instance of the image, which may be clipped or distorted (and may have local color management applied, e.g., converted to 8-bit color range). (At least, it used to be that way. I faintly remember that this used to be a 4-step process.) When memory wasn't ample, any of these except the instance buffer(s) may have been purged, and an instance buffer doesn't represent the original image anymore. So it makes sense to get a new clean image in the original encoding.
> They already have the image, why redownload it?

They don’t already have the image. They have part of the image. Because the connection hasn’t closed, as far as the browser is concerned, it’s still in the process of downloading it.

> When you usually try to download an image, your browser opens a connection to the server and sends a GET request asking for the image.

I can't vouch for chromium-*, but my Firefox does NOT do that. I've just tested it.

I have trouble understanding what problem this is solving.

When the image is on my screen I can just screenshot it.

This is a common problem: using something in an insecure environment. That's why companies go to such lengths to encrypt movies along the whole chain from source to display, and even those are regularly dumped.

It's not "solving" anything, just demonstrating an interesting gimmick
What’s the gimmick? I just saved that image to Photos on iOS.
Definitely a gimmick. Interesting might be a bit of a stretch
And even if they figured out some DRM method to prevent screenshotting/screen recording, I can still point my phone camera at my monitor and capture it that way, if I really want to. There is always a way around whatever they try to do.

If I can see it, I can make a copy of it.

> I can still point my phone camera at my monitor and capture it that way

Back in the late 1990s/early 2000s (this was so long ago that I cannot quickly find a reference), there were proposals to require all non-professional audio and video recorders to detect a watermark and disable recording when one was found. Needless to say this was a terrible idea, for several reasons.

But because they try, the rest of us suffer the consequences: more expensive and slower hardware, and all kinds of other problems.
Yes. DRM always hurts the legitimate users more than the "pirates". Same with disabling right click or otherwise trying to prevent downloading images.
I don't know about browser internals, but I would guess that the browser decodes the image once into a format that can be shown on the page (so from PNG/JPG/WEBP into an RGBA buffer) and then discards the original file. This saves a bit of memory in 99.99% of cases when the image is not immediately saved afterwards.
More likely the original file is saved in the browser cache. That's why it loads faster when you reload the page, and slower when you do a full reload by holding down shift. In Firefox you can see the files with about:cache, and find them in ~/.cache/mozilla/firefox/e1wkkyx3.default/cache2/entries/ or similar (they have weird names with no extension, but the file command will identify them, in their original format). In Chrome they're packed into files with metadata like the URL at the start. You can extract the original file by looking at a file in the cache folder [1] and snipping the header off (you can guess where it is by looking at the file contents with xxd or a hex editor).

More info (and link to a Windows viewer tool) here: https://stackoverflow.com/questions/6133490/how-can-i-read-c...

[1] For me on Linux, Chrome's is ~/.cache/google-chrome/Default/Cache/
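
The "snip the header off" step can be approximated by searching the cache entry for a known image signature and keeping everything from there on. This is only a sketch: `strip_cache_header` is a hypothetical helper, it knows nothing about Chrome's actual cache layout, it does not trim anything stored after the image, and a signature that happens to appear inside the metadata would give a false match.

```python
from typing import Optional, Tuple

# Leading bytes ("magic numbers") of a few common image formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
}


def strip_cache_header(blob: bytes) -> Optional[Tuple[str, bytes]]:
    """Return (format, payload) starting at the earliest image signature.

    Returns None when no known signature is found in the blob.
    """
    best: Optional[Tuple[str, int]] = None
    for name, magic in SIGNATURES.items():
        offset = blob.find(magic)
        if offset != -1 and (best is None or offset < best[1]):
            best = (name, offset)
    if best is None:
        return None
    name, offset = best
    return name, blob[offset:]


# A made-up cache entry: some metadata, then a PNG signature and data.
entry = b"https://example.com/a.png\x00\x00" + b"\x89PNG\r\n\x1a\n" + b"rest"
found = strip_cache_header(entry)
```

With the made-up entry above, `found` is `("png", b"\x89PNG\r\n\x1a\nrest")`. For real files you would read the bytes from the cache folder and write the payload out under a sensible extension.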

Interesting if that is the explanation. I wonder if any browsers offer a "privacy mode" where the original images are saved, thereby preventing the server from knowing which specific images you chose to save and were therefore interested in. I wonder how often that information is logged, and whether those logs, if they exist, have ever been put to a purpose such as in a court case.
I'm pretty sure it only discards the original after x number of other (new) images have been decoded. (Or perhaps it's memory footprint based?)

I ran into a Chrome performance bug years ago with animations, because the animation had more frames than the decoded cache size. Everything ground to a halt on the machine when it happened. Meanwhile older unoptimized browsers ran it just fine.
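
Some back-of-the-envelope arithmetic shows why decoded frames can outgrow a cache. It assumes 4 bytes per pixel for RGBA; the 512 MiB budget is an invented number for illustration, not Chrome's real limit.

```python
def decoded_rgba_bytes(width: int, height: int) -> int:
    """Size of one decoded frame at 4 bytes per pixel (R, G, B, A)."""
    return width * height * 4


def frames_fit(width: int, height: int, frame_count: int,
               budget_bytes: int) -> bool:
    """True if all decoded frames fit in the given cache budget."""
    return decoded_rgba_bytes(width, height) * frame_count <= budget_bytes


HYPOTHETICAL_BUDGET = 512 * 1024 * 1024  # 512 MiB, assumed for the example

one_frame = decoded_rgba_bytes(1920, 1080)  # 8_294_400 bytes
short_clip_ok = frames_fit(1920, 1080, 10, HYPOTHETICAL_BUDGET)
long_clip_ok = frames_fit(1920, 1080, 100, HYPOTHETICAL_BUDGET)
```

Ten full-HD frames fit in that budget, a hundred do not, and once they don't fit every loop of the animation has to decode frames again.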

One cool related thing is that (I believe) modern graphics cards (even Intel) can store and use JPG blocks directly from GPU memory, so it's not necessarily beneficial in the long term to convert to RGBA in advance. Though I think no modern browser actually does this, especially given how power-cheap decoding jpeg (with SIMD) already is and how likely it is that gpu bugs would interfere.
I don't think they can use JPG directly; that would be a waste of transistors given that the graphics world uses other compression formats like ETC1, BC, ASTC and so on.

It is however perfectly possible to decode blocks of JPG on a GPU by using shader code.

I'm pretty sure that Safari (and probably most browsers) on macOS renders JPEGs via CoreImage, and I have seen hints that CoreImage has various GPU-accelerated pathways, though I don't know whether those include DCT or JFIF on the GPU.
This used to be common behavior, but changed over time in most browsers.

Your guess is as good as mine as to why.