Tell you, might not believe it, but

Thursday, 25 January 2007

I would like to have a simple way of mirroring a webpage, including any referenced .css, .js, and images.

However, to complicate matters, I wish to mandate that the file be saved as “index.html”, regardless of what it was originally called.

This appears to rule wget out, as the --output-document=index.html option trumps the --page-requisites flag (which is used to download images, etc. which are referenced).
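One possible workaround, sketched below under some assumptions (it is mine, not from any manual): let wget name the page itself, then rename the result afterwards. The "snapshot" output directory is an arbitrary choice, and renaming after --convert-links could break any self-references in the page:

```shell
#!/bin/sh
# Sketch: mirror the page plus its requisites into a directory,
# then rename whatever the main file was called to index.html.
# Assumes the URL ends in a plain filename (no query string).
url="$1"
wget --page-requisites --convert-links --no-directories \
     --directory-prefix=snapshot "$url"

# wget names the page after the last component of the URL;
# if the URL ends in a slash it already uses index.html.
main=$(basename "$url")
if [ -e "snapshot/$main" ] && [ "$main" != "index.html" ]; then
    mv "snapshot/$main" snapshot/index.html
fi
```

The renaming step is the fragile part; a URL like http://example.com/foo?id=3 would need more careful handling than basename provides.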

Is there a simple tool which will download a single webpage, save it to a user-defined local filename, and also download any referenced images, CSS files, and JavaScript files? (Rewriting the references in the file so they still work, too.)

Using Perl I could pull down the page, and I guess I could parse the HTML manually – but that seems non-trivial, and I’d imagine there is a tool out there which already does the job.
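To illustrate why the manual route is non-trivial, here is a deliberately naive shell sketch (my own, using a made-up sample page) of the "find the requisites" step – it already misses CSS @import rules, unquoted attributes, relative-path resolution, and more:

```shell
#!/bin/sh
# Naive sketch: extract candidate requisite URLs from a page.
# Write a small sample page purely for illustration.
cat > /tmp/page.html <<'EOF'
<html><head><link href="style.css" rel="stylesheet">
<script src="app.js"></script></head>
<body><img src="logo.png"></body></html>
EOF

# Pull out src="..." and href="..." values; real HTML parsing
# would need to handle far more cases than this regex does.
grep -oE '(src|href)="[^"]+"' /tmp/page.html |
    sed -E 's/^(src|href)="//; s/"$//'
```

Running this prints style.css, app.js, and logo.png – and falls over on anything less tidy, which is exactly why a dedicated tool is preferable.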

So far I’ve looked at curl, httrack, and wget.

If I’m missing an obvious solution, please point me at it.

(Yes, this is so that I can take “snapshots” of links added to my bookmark server.)
