<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Joe's Amazing Technicolor Weblog &#187; httrack</title>
	<atom:link href="http://slagwerks.com/blog/index.php/tag/httrack/feed/" rel="self" type="application/rss+xml" />
	<link>http://slagwerks.com/blog</link>
	<description></description>
	<lastBuildDate>Fri, 23 Jul 2010 22:31:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>HTTrack: new go-to program for web mirroring / archiving</title>
		<link>http://slagwerks.com/blog/index.php/2009/04/02/httrack-new-go-to-program-for-web-mirroring-archiving/</link>
		<comments>http://slagwerks.com/blog/index.php/2009/04/02/httrack-new-go-to-program-for-web-mirroring-archiving/#comments</comments>
		<pubDate>Thu, 02 Apr 2009 19:10:44 +0000</pubDate>
		<dc:creator>joe</dc:creator>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[archive]]></category>
		<category><![CDATA[httrack]]></category>
		<category><![CDATA[mirror]]></category>
		<category><![CDATA[wget]]></category>

		<guid isPermaLink="false">http://slagwerks.com/blog/?p=265</guid>
		<description><![CDATA[Faced with a big site full of URLs like http://mysite.com/Internal1.asp?id=357 to mirror &#38; archive, I recently tried out a new (to me) tool, HTTrack. I&#8217;ve fiddled with wget for this sort of job in the past, but it always takes me ages of man-page reading to get my options right, and even then not everything [...]]]></description>
			<content:encoded><![CDATA[<p>Faced with a big site full of URLs like <code>http://mysite.com/Internal1.asp?id=357</code> to mirror <span class="amp">&amp;</span> archive, I recently tried out a new (to me) tool, <a href="http://www.httrack.com/">HTTrack</a>. I&#8217;ve fiddled with wget for this sort of job in the past, but it always takes me ages of man-page reading to get my options right, and even then not everything seems to work&nbsp;out.</p>
<p>This time around, for example, I&#8217;d convinced myself that <code>wget -r -N -l inf --no-remove-listing -E -k -p http://mysite.com</code> would do the trick. It mostly did, except for seemingly random pages that didn&#8217;t get all of their links&nbsp;converted.</p>
<p>HTTrack, on the other hand, did The Right Thing without any switches or arguments whatsoever. It was a bit more of a pain to get running: even though it&#8217;s in MacPorts, the port currently lags behind the available versions, so I had to actually type <code>./configure</code> and <code>make</code> myself. Well worth it for a usable&nbsp;mirror.</p>
]]></content:encoded>
			<wfw:commentRss>http://slagwerks.com/blog/index.php/2009/04/02/httrack-new-go-to-program-for-web-mirroring-archiving/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
