| HN Mirror



	by scatters 1022 days ago
	`&unused=.htm`. It usually works.

2 comments

jraph 1022 days ago

At this point you might as well use the -o option (-o file.htm). It's simpler to use and easier to understand.

I'd prefer wget to be a bit more clever when handling URL query strings, but I guess changing this behavior now might break some scripts.


darnir 1022 days ago

The -O option, not the -o option. The capital O sets the output file, while the small o in your comment sets the log filename.
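To make the distinction concrete, here is a minimal sketch that only builds the command line and does not run anything. The helper name `wget_argv`, the example URL and the file names are made up for illustration; the only thing taken from wget itself is that `-O` names the downloaded document and `-o` names the log file.

```python
import shlex


def wget_argv(url, output_file, log_file=None):
    """Build a wget argument list (hypothetical helper, nothing is executed).

    -O <file> : write the downloaded document to <file>
    -o <file> : write wget's log messages to <file>
    """
    argv = ["wget", "-O", output_file]
    if log_file is not None:
        argv += ["-o", log_file]
    argv.append(url)
    return argv


# Example URL and file names are placeholders.
cmd = wget_argv("https://example.com/page?id=1", "file.htm", log_file="wget.log")
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal
```

Mixing the two up is easy to miss: with a small `-o file.htm`, the log would land in file.htm while the download itself keeps its URL-derived name.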


jraph 1022 days ago

Yep, thanks for the correction. I meant big -O, I don't know how I ended up writing small -o.


em-bee 1022 days ago

well, depends on the use case. sometimes you want the whole url, like when i want to mirror a site and it has stuff like foo.html?page=1, foo.html?page=2, ...

wget does have options to use the name proposed by the server, and so another option to remove the query arguments would be useful, and in line with those.


darnir 1022 days ago

A new option to strip query parameters from the output filename would be interesting. But it's not so simple. When combined with recursion, one will often see a lot of pages with the same name but different query parameters. How should they be stored on disk? There are a couple of different issues I can think of.

However, if the potential issues can be resolved with sane defaults, I think this would be a great new switch to add.


em-bee 1022 days ago

yes, exactly. i think that the option would have to be ignored when doing recursion. or alternatively use the .1 .2 ... method like with all cases where a file of that name already exists.
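As a rough illustration of the second idea, here is a sketch of a naming rule that drops the query string and falls back to .1, .2, ... when the name is already taken. This is not wget code and no such option exists in the thread; `stripped_name`, the `taken` set and the example URLs are all hypothetical, and the `index.html` fallback for an empty path is an assumption of the sketch.

```python
import posixpath
from urllib.parse import urlsplit


def stripped_name(url, taken):
    """Hypothetical rule: ignore the query, then number duplicates .1, .2, ..."""
    # Last path component only; the query string is discarded on purpose.
    base = posixpath.basename(urlsplit(url).path) or "index.html"
    name, n = base, 0
    while name in taken:
        n += 1
        name = f"{base}.{n}"
    taken.add(name)
    return name


taken = set()
urls = [
    "https://example.com/foo.html?page=1",
    "https://example.com/foo.html?page=2",
    "https://example.com/foo.html?page=3",
]
names = [stripped_name(u, taken) for u in urls]
print(names)  # ['foo.html', 'foo.html.1', 'foo.html.2']
```

The output also shows the catch: once the query is gone, nothing in the name says which page a file came from, and the numbered copies no longer end in .html.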


Karellen 1022 days ago

So... you're adding more noise to the filename?

What?


scatters 1022 days ago

It's a simple solution to give the file the right extension, and preserving query parameters can be the right thing to do if you hit the same path repeatedly e.g. for pagination.
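A small sketch of why the trick works, assuming the saved name is the last path component plus the query string. That is only an approximation of wget's default naming (it ignores escaping and other options), and `default_name` and the example URL are invented for the illustration.

```python
import os.path
import posixpath
from urllib.parse import urlsplit


def default_name(url):
    """Approximate a URL-derived filename: last path component plus query."""
    parts = urlsplit(url)
    base = posixpath.basename(parts.path) or "index.html"
    return f"{base}?{parts.query}" if parts.query else base


plain = default_name("https://example.com/view?id=42")
padded = default_name("https://example.com/view?id=42&unused=.htm")

# The dummy parameter changes nothing for the server (assuming it ignores
# unknown parameters) but makes the local name end in ".htm".
print(plain, os.path.splitext(plain)[1])    # view?id=42 (no extension)
print(padded, os.path.splitext(padded)[1])  # view?id=42&unused=.htm .htm
```

The query parameters stay in the name, so paginated requests to the same path still get distinct files.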


Karellen 1021 days ago

> It's a simple solution to give the file the right extension,

Oh, I see now.

Do you work with many tools that can't work with files if they don't have the "right" extension? I thought that was mostly a Windows problem.
