aku-aku: v. To move a tall, flat-bottomed object (such as a bookshelf) by swiveling it alternately on its corners in a "walking" fashion. [After the book by Thor Heyerdahl theorising that the statues of Easter Island were moved in this fashion.] Source: LangMaker.com. Aku Aku also has another meaning to the islanders: a spiritual guide.
19 of 52: Yet To Be Named Travel Research Site.
Posted by dav at 2008 February 4 12:31 AM
File under: Geek

This week's project is a web site. Unfortunately, while I got most of the development done on it over the weekend, I didn't have time to deploy it to a permanent hosting server tonight. It's running only on my development workstation at the moment. I can describe it here though and will update when it goes live.

The idea for the site was born out of a personal need. I'm planning on taking a vacation in March with my family and we were trying to figure out where to go.

Traveling with Tesla effectively rules out some options, like staying in hostels, so I was hoping to rent an apartment somewhere like I did when I lived in Brasil for a few months in 2005. There's a site Jay Allen introduced me to called VRBO.com (Vacation Rentals By Owner). It's got thousands of listings of apartments for rent all around the globe. Unfortunately the site looks like it's had the same UI design since last century, and it's hard to get a quick idea of how much a 2BR apartment costs in various cities.

Another issue in deciding where to go is how much it costs to fly there. I was checking the airfare for various cities as I thought of them but that was tedious.

What I really wanted was a site where I could put in my home airport and it would assemble the cost of a three-week vacation in every city that had a VRBO listing, plus the cost of airfare to that city. It would then display that list sorted from cheapest to most expensive. Additionally, it would be nice if the site allowed crowd-sourced compilation of various costs in the cities (typical meals, fare from the airport to the city center, a gallon of milk, etc.) and included that in the assembled costs. This is the site I started to build. For a first iteration I limited it to only South American destinations, but eventually I will make it cover all of the VRBO listings.
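
The ranking described above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the `Destination` struct, the `trip_cost` and `rank_destinations` helpers, and every number below are made up for the example and are not taken from the site's code or from real prices.

```ruby
# Hypothetical data shape: one record per city with a VRBO listing.
# local_costs holds the crowd-sourced extras (meals, airport transfer, ...).
Destination = Struct.new(:city, :airfare_per_person, :lodging_total, :local_costs)

# Total trip cost: airfare for every traveler, plus lodging for the whole
# stay, plus whatever crowd-sourced local costs exist for the city.
def trip_cost(dest, travelers)
  dest.airfare_per_person * travelers + dest.lodging_total + dest.local_costs.values.sum
end

# Cheapest destination first.
def rank_destinations(destinations, travelers)
  destinations.sort_by { |d| trip_cost(d, travelers) }
end

# Made-up numbers, purely for illustration.
sample = [
  Destination.new("Buenos Aires", 900, 1200, { meals: 600, airport_transfer: 40 }),
  Destination.new("Rio de Janeiro", 1000, 1500, { meals: 700, airport_transfer: 50 }),
  Destination.new("Lima", 700, 1100, { meals: 500, airport_transfer: 30 })
]

rank_destinations(sample, 3).each do |d|
  puts format("%-15s $%d", d.city, trip_cost(d, 3))
end
```

Keeping the cost function separate from the sort makes it easy to add more cost components later without touching the ranking.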

The first step was scraping the apartment data from the VRBO site. For this I used the open-uri and hpricot ruby libraries to write a scraper. In production this scraper would run maybe once a day to get the latest listings and put them in the site database.

The second step was getting the airfares. I had found a great site, farecompare.com, that finds low-price airfares. I started attempting to use the mechanize ruby library to manipulate the search form and scrape the results, but the site developers seemed to have gone to great lengths to make this difficult to do. Undaunted, I fired up Wireshark and started sniffing the network traffic. If my browser can make the form work, I can write code to do it, as long as I can analyze and reproduce all the necessary network traffic.

So I started getting into that, but then decided maybe I should just try another site in the interest of time. I looked at a few other sites, but they all seemed to have a similar level of difficulty for scraping. Finally I noticed one site had almost exactly what I needed. Kayak.com is an airfare search site that has one search mode where you can put in your home airport, select a region of the world (like South America), and it finds the lowest fares to all the major cities in that region. Perfect! Except this search flow was also designed in a way that was difficult to scrape. Kayak had another feature that made it desirable though: it has a developer API, so network traffic analysis (not a lot of fun) wouldn't be necessary! Even though the API did not expose a way to do the regional search, I could make separate searches from a home airport to a list of other airports. It is not as elegant, but it should work.

The Kayak API has some flaws though. First off, it seems buggy: sometimes the API calls work, but most of the time I get a bogus error that says "anonymous access to kayak API denied" even though I am using a non-anonymous developer key. If I rerun the same API call several times in a row, eventually it works. This bumps me up against another problem with the Kayak API though: they limit API queries to about 41 per hour (that's 1000/day). Since I have to make separate queries between the home airport and each city in a region, this means I can effectively only do one regional search per hour. They say you can request a higher limit, so I have an email request pending regarding that.
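
The interaction between the retries and the hourly limit can be sketched as a retry loop that charges every attempt against a query budget. This is a hypothetical sketch, not the site's code and not the Kayak client: `QueryBudget`, `ApiDenied` and `with_retries` are invented names, the block stands in for a real API call, and it assumes that failed calls count against the limit, which the post does not confirm.

```ruby
# Tracks how many API queries are left in the current hour.
class QueryBudget
  attr_reader :remaining

  def initialize(limit)
    @remaining = limit
  end

  # Spend one query; returns false when the budget is exhausted.
  def spend
    return false if @remaining <= 0
    @remaining -= 1
    true
  end
end

# Stand-in for the intermittent "access denied" failure.
class ApiDenied < StandardError; end

# Runs the block until it succeeds, max_attempts is reached, or the
# budget runs out. Returns the block's result, or nil when giving up.
def with_retries(budget, max_attempts: 5)
  max_attempts.times do
    return nil unless budget.spend
    begin
      return yield
    rescue ApiDenied
      next # the attempt failed; loop around and try again
    end
  end
  nil
end

# Usage with a stubbed call that fails twice before succeeding.
budget = QueryBudget.new(41) # 1000 per day is roughly 41 per hour
calls = 0
fare = with_retries(budget) do
  calls += 1
  raise ApiDenied, "anonymous access to kayak API denied" if calls < 3
  512 # made-up fare
end
puts "fare=#{fare.inspect} queries_left=#{budget.remaining}"
```

With a budget this small, capping the attempts per city matters: a few stubborn failures could otherwise use up the whole hour's quota before the other cities are queried.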

By the way, I think we're going to Buenos Aires.

System Overview

When a user comes to the site, she enters her home airport code, number of travelers and departure/return dates. The system fires off a BackgrounDRb worker in a separate process to start hitting the Kayak API with airfare queries from the specified airport to each South American city listed in the VRBO database table. The user is shown a 'searching' web page that gets updated every 5 seconds as the airfare results come in. Once the low price is found for each city, the background worker is released and the system correlates the airfares with the VRBO data already scraped and in the database. The Kayak results are cached for some time so that the queries don't need to be re-run if the same home airport is entered. The lodging price is determined using the number of travelers to know what size apartments to look at, and the length of stay to know whether to use the nightly, weekly or monthly rates.
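
The lodging part of that calculation might look something like the following. It is a minimal sketch under assumptions the post does not state: the `Rates` struct, the 7- and 28-night thresholds, the per-night proration, and the "two travelers per bedroom" rule are all hypothetical choices made for the example.

```ruby
# Hypothetical rate card for one listing; weekly or monthly may be nil
# when the owner does not publish that rate.
Rates = Struct.new(:nightly, :weekly, :monthly)

# Picks the rate tier from the length of stay and prorates it per night.
# Thresholds (7 and 28 nights) are assumptions for illustration.
def lodging_cost(rates, nights)
  if nights >= 28 && rates.monthly
    rates.monthly * nights / 30.0
  elsif nights >= 7 && rates.weekly
    rates.weekly * nights / 7.0
  else
    rates.nightly * nights
  end
end

# Assumed rule of thumb: two travelers share a bedroom.
def bedrooms_needed(travelers)
  (travelers / 2.0).ceil
end

# Made-up listing: $80 per night, $450 per week, $1500 per month.
rates = Rates.new(80, 450, 1500)
puts "bedrooms for 3 travelers: #{bedrooms_needed(3)}"
puts "three-week stay: #{lodging_cost(rates, 21)}"
```

Falling back to the nightly rate when a weekly or monthly rate is missing keeps listings with incomplete rate cards in the comparison instead of dropping them.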

Eventually I'd like to expand it to cover other housing options (hostels, hotels, chartered boats).

Anyone got any good names for such a site? It's so hard getting a decent domain name these days.

Comments:

What about combining sojourn and journey to get: sojourney.org? (sojourney.com is already taken)

Jenn and I would really like something like that. We were using europeandestinations.com last night to gauge the different costs of destinations, but that's all prepackaged tours and the apartment idea is much more attractive.

I have one suggestion for a feature that you may have already thought of. Have the option for a specific departure date, but also give the option of a travel window, e.g. "I'd like to find an apartment and flight for 7 days between the dates of June 1st and Sept 1st." This could potentially find cheaper flights when the airlines put all those kooky price restrictions on departure dates, and also find more apartment options when some owners have rigid time slots and only rent Saturday to Saturday, or Sunday to Sunday, etc.

Posted by: Paul on February 4, 2008 01:10 PM

Hi,

I am getting the same error that you mentioned with Kayak (my code is the ksearchJavaExample). However, even though I run it multiple times I have never received a good result. I also have two developer keys -- and have tried both.

Any idea?

Thanks
Dean.


anonymous access to kayak API denied.

Posted by: mousing on April 22, 2008 03:07 PM
