mt hack: pageparser
Posted by dav at 2003 April 9 03:03 PM
File under: Geek

Update on April 10: new zip package fixes bug in IMDB module

Previously I hacked up a system to make it easier to post movie-related blog entries to Movable Type. You first browse to a movie page at the Internet Movie Database, then simply use your MT Post To Blog bookmark to bring up a blog entry form pre-filled with the HTML needed to create an entry with the movie title, IMDB link and IMDB image.

Recently I made the system more general so it is easier to add support for sites other than IMDB, such as Powells, the online independent bookseller of new and used books. Support is added by creating a PageParser module (I now have ones for IMDB and Powells) and storing it under your extlib MT dir. The PageParser retrieves the HTML for the site page and parses it for whatever information is necessary to create the blog entry presets.
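
To give a rough idea of what a parser does, here is a minimal sketch. It is written in Python purely for illustration (the actual modules are Perl), and everything in it is an assumption: the sample HTML, the regexes, and the names parse_page and entry_html are made up and are not taken from the real IMDB or Powells parsers. It also works on a string already in hand rather than fetching the page.

```python
import re

# Hypothetical sample page; the markup of a real site will differ.
SAMPLE_HTML = """<html><head><title>Brazil (1985)</title></head>
<body><img src="http://example.com/brazil.jpg" alt="cover"></body></html>"""

# Illustrative patterns only: first <title> element and first <img> src.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)
IMG_RE = re.compile(r'<img[^>]+src="([^"]+)"', re.IGNORECASE)


def parse_page(html, url):
    """Pull out the fields an entry preset needs; missing fields become ''."""
    title = TITLE_RE.search(html)
    image = IMG_RE.search(html)
    return {
        "title": title.group(1).strip() if title else "",
        "link": url,
        "image": image.group(1) if image else "",
    }


def entry_html(fields):
    """Build the HTML snippet that would pre-fill the entry body.

    Values are inserted as-is, with no HTML escaping, to keep the sketch short.
    """
    parts = ['<a href="%s">%s</a>' % (fields["link"], fields["title"])]
    if fields["image"]:
        parts.append('<img src="%s" alt="%s" />' % (fields["image"], fields["title"]))
    return "\n".join(parts)
```

Calling entry_html(parse_page(SAMPLE_HTML, page_url)) gives a link to the page followed by an image tag, which is roughly the kind of preset the entry form gets filled with.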

The Powells module also allows you to set the partner id # which will automatically be used in the links so you can get click-thru credit for purchases originating from your blog. To set your partner id, open up the extlib/Dav/MT/parser/parsers/PowellParser.pm file and replace my partner id at the top of the file with your own.

To use the system you first need to alter the CMS.pm file included in your MT distribution. In the download ZIP below, I have included already-altered CMS.pm files for MT versions 2.51 and 2.63, plus instructions on how to do the alteration yourself if necessary. It's easy.

To install this system along with the two modules I have so far, download MTPageParser-0.9.1.zip and unzip it in your MT home directory. It will put the PageParser system modules in the extlib/ subdirectory. If you are using MT v2.51 or v2.63 and have never altered your CMS.pm file before, then simply replace the lib/MT/App/CMS.pm file with the appropriate CMS.pm.#.##. If you are using a different version, read the CMS.pm.README for instructions on how to alter your lib/MT/App/CMS.pm file (it's really easy).

That's it! Now you can surf to IMDB or Powells and easily post to your blog. If you want to create a module for a new site, take a look at the extlib/Dav/MT/parser/parsers/*Parser.pm files. It's pretty simple if you're familiar with Perl and regex. I'm not even a Perl programmer (I [heart] Java) and I figured it out :)


Hmm... I'm getting weirdness with this. When I click on the bookmarklet, I get the following:

Can't use string ("3/8") as a HASH ref while "strict refs" in use at lib/MT/App/CMS.pm line 637.

followed by:

MT::App::CMS=HASH(0x835290c) Reference found where even-sized list expected at extlib/Dav/MT/parser/parsers/IMDBParser.pm line 47.

Posted by: Jim on April 10, 2003 12:32 PM

That's the bug I fixed with version 0.9.1

Sorry about that.

Posted by: Dav Coleman on April 10, 2003 12:57 PM

This is a little tangential, but all these links to the imdb on everybody's blog are a bit of a peeve of mine. I know how to look up a movie on imdb easily enough without a million bloggers' assistance. Really, what you ought to be linking to is the official movie site (which is almost always listed on imdb). I suppose that I know how to look that up on imdb.com, too, but it takes 4 or more steps. But just in principle, shouldn't the movie's official site, and not the movie's imdb entry, be the Web page that "stands for" the movie?

Posted by: Danzig on April 15, 2003 08:14 PM

I think it is better to lead to a standard format for the links. I like the organizational aspect of that.

Posted by: Dav on April 20, 2003 10:37 PM

