Friday, October 24, 2008

Mashups: Beyond 'Hello World'

Remember the 'Hello World' screen-cast I posted a few days back? Well, now it's time to move on and get some real work done with the Mashup Server. I'm sure most of you are familiar with a little technique called screen scraping. Do I have your attention now? Good. In this screen-cast, Jonathan explains how to scrape data from all those awesome sites.

You see, most sites today are late to the party when it comes to developer APIs, but they do have great content nevertheless. Wouldn't it be great if we could somehow harvest this data and use it in our mashups? The Mashup Server has a nice little scraping function, which allows you to:
  • Harvest data off web pages
  • Fill in and submit forms on sites and click links, which in turn allows you to
  • Navigate through pages to get to the 'really' interesting areas
So have a look and start scraping. As a prerequisite, I would recommend you install Firebug, a great Firefox extension that will, among many other things, allow you to read the XPath expressions of selected web page elements.
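
To give a feel for what an XPath expression does once you have one, here is a minimal sketch in Python. It is not the Mashup Server's scraping function, just an illustration using Python's standard library. The page fragment, the scrape_links helper and the expression itself are all made up for the example, and the fragment is deliberately well-formed XML, which real pages often are not.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment standing in for a fetched web page.
PAGE = """
<html>
  <body>
    <div id="headlines">
      <ul>
        <li><a href="/story/1">First story</a></li>
        <li><a href="/story/2">Second story</a></li>
      </ul>
    </div>
    <div id="footer"><a href="/about">About</a></div>
  </body>
</html>
"""

# Hypothetical expression: every link inside the list in the "headlines" div.
HEADLINE_LINKS = ".//div[@id='headlines']/ul/li/a"


def scrape_links(page, xpath):
    """Return (link text, href) pairs for every element matching the XPath."""
    root = ET.fromstring(page)
    return [(a.text, a.get("href")) for a in root.findall(xpath)]


links = scrape_links(PAGE, HEADLINE_LINKS)
for title, href in links:
    print(title, href)
```

Note that ElementTree only understands a limited subset of XPath, so an expression copied straight out of Firebug may need adjusting before it works in a sketch like this. The footer link is skipped because the expression only matches links under the "headlines" div.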

Tip: Please pay attention to the Terms of Use of the sites you scrape, or you might annoy the owners. Trust me on this one.