
Fetch a page through a proxy using the Go language

I’ve been playing with the Go programming language for a while now, and so far I love it. I figured I’d push some code snippets from time to time.
Today I spent some time building something simple: not even a crawler, just a website fetcher.

The idea is very simple: download a page, run an XPath query on it and spit out the results. I was looking for a decent XPath library for Go and at first couldn’t find one. I tried xmlpath, but it didn’t work for me; I couldn’t even run queries like "id('product-details')/div[@class='product-price']". Then I found something nicer, Gokogiri, which works pretty well, but I couldn’t find any examples for it except this small article.

The only catch with Gokogiri is that it depends on libxml2. That is not a big deal on Linux-based systems, but on Mac OS X you have to install it via Homebrew:
brew install libxml2