Fetching a page through a proxy using the Go language
I've been playing with The Go Programming Language for a while now, and so far I love it. I figured I'd post some code snippets from time to time.
Today I spent some time creating something simple: not even a crawler, just a website fetcher.
The idea is very simple: download a page, run an XPath query on it, and spit out the results. I was looking for a decent XPath library for Go and couldn't find any. I tried xmlpath, but it sucks: I couldn't even run queries like id('product-details')/div[@class='product-price']
Then I found something nicer, Gokogiri, which works pretty well, although I couldn't find any examples except this small article.
The only problem with running Gokogiri is that it uses libxml2, which is not a big deal on Linux-based systems, but on Mac OS X you have to install it via Homebrew:
brew install libxml2